naxodev/mar
Latest CI Pipeline Executions
Succeeded
main
ddbd47eb refactor: client/voice (audio playback TTS) (#16)

## PR Type

What kind of change does this PR introduce?

```
[ ] Bugfix
[ ] Feature
[ ] Code style update (formatting, local variables)
[x] Refactoring (no functional changes, no api changes)
[ ] Build related changes
[ ] CI related changes
[ ] Documentation content changes
[ ] Other... Please describe:
```

## What is the current behavior?

The voice functionality is currently tightly coupled in the React layer with:

- Audio capture logic embedded in `use-maelstrom-voice.ts` (using inline AudioWorklet code)
- Audio playback handling mixed with state management
- VAD logic partially in React hooks
- Complex orchestration spread across multiple files
- Limited test coverage for audio operations

## What is the new behavior?

This PR refactors the voice architecture by extracting concerns into dedicated packages:

1. **`audio-capture`**: New standalone package with `AudioCapture` class handling microphone access, AudioWorklet management, and audio streaming
2. **`audio-playback`**: New standalone package with `AudioPlayback` class for TTS audio queue management and scheduling
3. **`voice-session`**: New package containing `VoiceSession` class that orchestrates STT/TTS flow, VAD integration, and state management
4. **Simplified React hooks**: `use-maelstrom-voice.ts` reduced from ~800 lines to ~200 lines by delegating to `VoiceSession`
5. **Improved VAD**: Better Silero VAD integration with configurable timeout options
6. **Enhanced testing**: Added comprehensive test suites for audio-capture (328 lines), audio-playback (194 lines), and voice-session (1238 lines)
7. **Better concurrency**: Fixed race conditions in audio playback and capture
8. **Code organization**: Moved all audio processing logic from React layer to client library

API remains unchanged for consumers of the React hook.

## Does this PR introduce a breaking change?

```
[ ] Yes
[x] No
```

## Other information

- Removed ~700 lines from `use-maelstrom-voice.ts` and ~777 lines from `voice-connector.ts`
- Total net addition of ~1,300 lines (mostly new tests and documentation)
- All existing voice functionality preserved
- Improves maintainability by separating audio concerns from React UI logic
- Enables reuse of voice logic in non-React contexts
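As a rough illustration of the "queue management and scheduling" that the PR description attributes to `AudioPlayback`, the sketch below shows one common way to schedule TTS chunks back to back. It is not taken from the repository: `PlaybackScheduler`, `AudioChunk`, `ScheduledChunk`, and `leadTimeSec` are hypothetical names, and the clock is passed in as a plain number so the logic runs without a browser. In a Web Audio setup, `now` would typically be `AudioContext.currentTime` and `startAt` would be handed to `AudioBufferSourceNode.start()`.

```typescript
// Hypothetical sketch: all names here are illustrative, not the PR's actual API.

/** A decoded TTS chunk; only its duration matters for scheduling. */
interface AudioChunk {
  id: string;
  durationSec: number;
}

/** A chunk paired with the time window in which it should play. */
interface ScheduledChunk {
  id: string;
  startAt: number;
  endAt: number;
}

class PlaybackScheduler {
  // Time at which the most recently scheduled chunk finishes.
  private nextStartAt = 0;
  // Small safety margin used when (re)starting playback on an empty queue.
  private readonly leadTimeSec: number;

  constructor(leadTimeSec: number = 0.05) {
    this.leadTimeSec = leadTimeSec;
  }

  /**
   * Schedule a chunk directly after the previous one. If the queue has
   * already drained (the clock passed nextStartAt), start slightly in
   * the future instead of in the past.
   */
  schedule(chunk: AudioChunk, now: number): ScheduledChunk {
    const startAt = Math.max(this.nextStartAt, now + this.leadTimeSec);
    const endAt = startAt + chunk.durationSec;
    this.nextStartAt = endAt;
    return { id: chunk.id, startAt, endAt };
  }

  /** Forget pending timing, for example when playback is stopped. */
  reset(): void {
    this.nextStartAt = 0;
  }
}
```

Keeping the timing arithmetic in a small class with an injected clock is one way to make this kind of logic unit-testable outside React, which is in the spirit of the separation the PR describes.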
18 hours ago
by Javier Cab...
Succeeded
refactoring/client/voice
b90ab012 adding max user speech as option in vad fix
21 hours ago
by Javier Cab...
Failed
refactoring/client/voice
Fix ready
55b9fd68 adding max user speech as option in vad
21 hours ago
by Javier Cab...
Failed
refactoring/client/voice
Fix ready
a66595e8 long hard limit
21 hours ago
by Javier Cab...
Failed
refactoring/client/voice
Fix ready
83055c82 fix vad sould be able to decide if interim is final after timeout fix
21 hours ago
by Javier Cab...
Failed
refactoring/client/voice
Fix ready
3e2f8fd4 lab results
22 hours ago
by Javier Cab...