Refactor the EndOfTurnPenaltyItem logic to improve clarity and functionality. Group related penalty items with descriptive comments for better maintainability. Adjust penalties for situations with Smart Turn and VAD to improve detection accuracy, including new conditions for SMART_TURN_FALSE and ACTIVE combinations. This change is necessary to fine-tune the configuration for complex speech patterns and ensure better end-of-turn detection in the transcription process.
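A minimal sketch of how grouped penalty items might combine, offered only as an illustration. `SMART_TURN_FALSE` and `ACTIVE` are named in the commit message; everything else here (the `Flag` enum, the `PenaltyItem` shape, the multiplier values and `end_of_turn_delay`) is hypothetical and not taken from the SDK.

```python
from dataclasses import dataclass
from enum import Enum


class Flag(Enum):
    """Hypothetical annotation flags; only these two names appear in the commit."""

    SMART_TURN_FALSE = "smart_turn_false"
    ACTIVE = "active"


@dataclass(frozen=True)
class PenaltyItem:
    """Hypothetical stand-in for EndOfTurnPenaltyItem."""

    penalty: float          # multiplier applied to the end-of-turn wait
    flags: frozenset[Flag]  # every flag must be present for the item to apply


PENALTIES = [
    # --- Smart Turn only ---
    PenaltyItem(1.5, frozenset({Flag.SMART_TURN_FALSE})),
    # --- Smart Turn combined with VAD ---
    PenaltyItem(2.0, frozenset({Flag.SMART_TURN_FALSE, Flag.ACTIVE})),
]


def end_of_turn_delay(base_delay: float, seen: set[Flag]) -> float:
    """Multiply the base delay by every penalty whose flags are all present."""
    delay = base_delay
    for item in PENALTIES:
        if item.flags <= seen:
            delay *= item.penalty
    return delay
```

Under these assumed values, a turn flagged only `SMART_TURN_FALSE` waits 1.5 times the base delay, while one that is also `ACTIVE` matches both items and waits 3 times as long.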
* Add No Signal Penalty for Smart Turn
* Update Penalty to Extend TTL
…chmatics-python-sdk into fix/smart-turn
# Conflicts:
# sdk/voice/speechmatics/voice/_models.py
Introduce `test_no_feou_fix.py` to validate scenarios where Forced End Of Utterance (FEOU) is disabled. This test checks for correct behavior when the end-of-utterance mode is set to FIXED in the `VoiceAgentConfig`. It uses additional vocabulary and message logging to aid debugging. Skipped in CI to avoid unnecessary API calls without a valid key.
Add `validate_config` method to `VoiceAgentConfig` to perform cross-field validation after merging. This makes configurations more robust by checking for inconsistencies, such as invalid combinations of end-of-utterance mode, VAD, and sample rate. Presets now validate the merged configuration, so custom configurations derived from presets are checked before use instead of failing at runtime. Drop use of `model_validator` for a clearer validation flow, and raise specific exceptions for validation failures.
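The merge-then-validate flow described above can be sketched roughly as follows. This is an assumption-laden illustration, not the SDK's code: `AgentConfig`, `EouMode`, `from_preset` and both validation rules are hypothetical stand-ins, and only the `validate_config` name comes from the commit message.

```python
from dataclasses import dataclass, replace
from enum import Enum


class EouMode(Enum):
    """Hypothetical end-of-utterance modes."""

    FIXED = "fixed"
    ADAPTIVE = "adaptive"


@dataclass(frozen=True)
class AgentConfig:
    """Hypothetical stand-in for VoiceAgentConfig."""

    end_of_utterance_mode: EouMode = EouMode.FIXED
    vad_enabled: bool = False
    sample_rate: int = 16000

    def validate_config(self) -> None:
        """Check cross-field rules and report all failures in one exception."""
        errors = []
        # Assumed rule: only 8 kHz and 16 kHz audio is accepted.
        if self.sample_rate not in (8000, 16000):
            errors.append(f"sample_rate must be 8000 or 16000, got {self.sample_rate}")
        # Assumed rule: adaptive mode depends on VAD being enabled.
        if self.end_of_utterance_mode is EouMode.ADAPTIVE and not self.vad_enabled:
            errors.append("adaptive end-of-utterance mode requires vad_enabled=True")
        if errors:
            raise ValueError("; ".join(errors))


def from_preset(preset: AgentConfig, **overrides) -> AgentConfig:
    """Merge overrides into a preset, then validate the merged result."""
    merged = replace(preset, **overrides)
    merged.validate_config()
    return merged
```

Validating after the merge, instead of per field at construction time, means a rule that spans two fields is checked against the final values even when the preset and the overrides each supply one of them.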
Set `use_forced_eou` to True in EndOfTurnConfig to ensure correct behavior for utterance detection. Previously, `use_forced_eou` was set to False, which could lead to inaccurate turn-taking scenarios. Added validation in `validate_config` to prevent setting `use_forced_eou` to False, ensuring configurations remain consistent with intended usage and avoiding potential run-time errors.
Remove redundant flags and streamline end-of-utterance (EOU) and voice activity detection (VAD) handling in the VoiceAgentClient class. Changes include:
- Rename confusing boolean flags to improve clarity.
- Simplify logic for determining when to listen to EOU messages.
- Remove unused code paths and clean up comments for better readability.
- Combine similar conditional logic to avoid duplicated checks.

These changes are intended to make the codebase more maintainable, reduce potential for errors, and improve overall performance.
Remove the `use_forced_eou` setting from the `EndOfTurnConfig` in several test files to simplify configurations. Forced end-of-utterance must always be true (the default), so setting it explicitly is redundant.
…n VoiceAgentConfig
Remove the conditional validation logic for 'use_forced_eou' within the 'VoiceAgentConfig.validate_config' method. This logic was enforcing that 'EndOfTurnConfig.use_forced_eou' cannot be False, which is no longer required. This change streamlines the validation process, aligning it with updated requirements, and ensuring clarity around utterance handling configurations. Additionally, cleanup of imports of 'EndOfTurnConfig' in test files reflects this update.
Refactor turn management logic to ensure better handling of forced End of Utterance (FEOU) configurations. While FEOU cannot be disabled in normal use, it can be disabled for testing by setting the config value directly: `config.end_of_turn_config.use_forced_eou = False`
Remove unnecessary boolean conversion for 'end_of_utterance_mode' check and update the conditional logic for '_listen_to_eou_messages'. This resolves a logical error that prevented proper handling of 'fixed' end of utterance mode, and ensures the client correctly listens or doesn't listen to EOU messages based on '_listen_to_eou_messages' state. These changes enhance the processing of end of utterance events, improving overall speech-to-text functionality.
Extract the configuration setup into a separate `config` variable to improve readability and maintainability. Add debug print statements for configuration details to aid in debugging. Move client disconnect logic to the end of the test to ensure the connection is properly closed, improving resource management.
Change speechmatics-rt version specifier from a minimum version requirement to an exact version pin (==0.5.3).
Add FFT-based resampling in SmartTurnDetector for non-16kHz audio. Parameterise Silero VAD chunk/context sizes to handle both 8kHz and 16kHz natively. Refactor forced end-of-utterance control: replace the testing flag with a declarative `_use_forced_eou` derived from config. Defer audio format initialisation in AsyncClient until start_session is called, and return the FEOU timestamp for diagnostic logging. Rename `_vad_evaluation` to `_speaker_start_stop_evaluation` and remove unused `EndOfTurnConfig` from presets.
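As a rough illustration of what parameterising the Silero VAD chunk and context sizes by sample rate could look like, the sketch below maps each supported rate to its frame parameters. The values (512/64 samples at 16 kHz, 256/32 at 8 kHz) are an assumption based on the Silero VAD v5 reference wrapper, not on this PR, and `VadFrameParams`, `vad_frame_params` and `split_into_frames` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VadFrameParams:
    chunk_size: int    # new samples consumed per inference step
    context_size: int  # trailing samples of the previous chunk prepended


# Assumed values; both rates give a 32 ms chunk.
_PARAMS = {
    16000: VadFrameParams(chunk_size=512, context_size=64),
    8000: VadFrameParams(chunk_size=256, context_size=32),
}


def vad_frame_params(sample_rate: int) -> VadFrameParams:
    """Look up frame parameters, rejecting unsupported sample rates."""
    try:
        return _PARAMS[sample_rate]
    except KeyError:
        raise ValueError(f"unsupported sample rate: {sample_rate}") from None


def split_into_frames(samples: list[float], sample_rate: int) -> list[list[float]]:
    """Build model inputs of context + chunk, dropping a trailing partial chunk."""
    params = vad_frame_params(sample_rate)
    context = [0.0] * params.context_size  # the first frame has silent context
    frames = []
    for start in range(0, len(samples) - params.chunk_size + 1, params.chunk_size):
        chunk = samples[start:start + params.chunk_size]
        frames.append(context + chunk)
        context = chunk[-params.context_size:]
    return frames
```

Keying the sizes on the sample rate lets 8 kHz audio go to the model directly instead of being resampled to 16 kHz first.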
Disable smart turn cutoff skip that prevented re-evaluation. Improve multiple speakers test with accumulated error reporting and turn boundary tracking.
# Conflicts:
# sdk/rt/speechmatics/rt/_async_client.py
# tests/voice/test_17_eou_feou.py
…f utterance padding
TBD