Run details

refactor: client/voice (audio playback TTS) (#16)

## PR Type

What kind of change does this PR introduce?

```
[ ] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[x] Refactoring (no functional changes, no api changes)
[ ] Build related changes
[ ] CI related changes
[ ] Documentation content changes
[ ] Other... Please describe:
```

## What is the current behavior?

The voice functionality is currently tightly coupled to the React layer:

- Audio capture logic is embedded in `use-maelstrom-voice.ts` (using inline AudioWorklet code)
- Audio playback handling is mixed with state management
- VAD logic lives partially in React hooks
- Complex orchestration is spread across multiple files
- Test coverage for audio operations is limited

## What is the new behavior?

This PR refactors the voice architecture by extracting concerns into dedicated packages:

1. **`audio-capture`**: New standalone package with an `AudioCapture` class handling microphone access, AudioWorklet management, and audio streaming
2. **`audio-playback`**: New standalone package with an `AudioPlayback` class for TTS audio queue management and scheduling
3. **`voice-session`**: New package containing a `VoiceSession` class that orchestrates STT/TTS flow, VAD integration, and state management
4. **Simplified React hooks**: `use-maelstrom-voice.ts` reduced from ~800 lines to ~200 lines by delegating to `VoiceSession`
5. **Improved VAD**: Better Silero VAD integration with configurable timeout options
6. **Enhanced testing**: Added comprehensive test suites for audio-capture (328 lines), audio-playback (194 lines), and voice-session (1238 lines)
7. **Better concurrency**: Fixed race conditions in audio playback and capture
8. **Code organization**: Moved all audio processing logic from the React layer to the client library

The API remains unchanged for consumers of the React hook.

## Does this PR introduce a breaking change?

```
[ ] Yes
[x] No
```

## Other information

- Removed ~700 lines from `use-maelstrom-voice.ts` and ~777 lines from `voice-connector.ts`
- Total net addition of ~1,300 lines (mostly new tests and documentation)
- All existing voice functionality preserved
- Improves maintainability by separating audio concerns from React UI logic
- Enables reuse of voice logic in non-React contexts
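
For readers unfamiliar with what "TTS audio queue management and scheduling" involves, here is a minimal sketch of gapless chunk scheduling. It is not the `AudioPlayback` class from this PR: the names `PlaybackScheduler`, `ScheduledChunk`, `enqueue`, and `clear` are hypothetical, and the clock is injected as a plain function so the logic runs without a browser (in a real client it would presumably be something like `AudioContext.currentTime`).

```typescript
// Hypothetical sketch; none of these names are taken from the PR.
interface ScheduledChunk {
  id: number;
  startTime: number; // seconds on the playback clock
  duration: number; // seconds
}

class PlaybackScheduler {
  private now: () => number;
  private nextStartTime = 0;
  private nextId = 0;

  constructor(now: () => number) {
    // The clock is injected so the scheduler can be tested deterministically.
    this.now = now;
  }

  // Schedule a chunk directly after the previous one, or at the current
  // clock time if the queue has already drained (avoids starting in the past).
  enqueue(duration: number): ScheduledChunk {
    const startTime = Math.max(this.nextStartTime, this.now());
    this.nextStartTime = startTime + duration;
    return { id: this.nextId++, startTime, duration };
  }

  // Forget the pending schedule, e.g. when the user interrupts TTS output.
  // The next chunk will then start at the current clock time.
  clear(): void {
    this.nextStartTime = 0;
  }
}

// Usage: back-to-back chunks are placed end to end on the clock.
let exampleClock = 0;
const exampleScheduler = new PlaybackScheduler(() => exampleClock);
const firstChunk = exampleScheduler.enqueue(0.5);
const secondChunk = exampleScheduler.enqueue(0.25);
console.log(firstChunk.startTime, secondChunk.startTime);
```

Keeping the scheduling arithmetic free of Web Audio calls is one way to make this kind of logic unit-testable outside React and outside the browser, which matches the stated goal of the refactor.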

nx run-many -t e2e

Succeeded

CI Pipeline Execution

nx run-many -t e2e

2 CPU cores

read-write access token used

ddbd47eb main
