215b6b40 fix(ai-openai): migrate WebRTC realtime adapter to OpenAI GA API (#699)
* fix(ai-openai): migrate WebRTC realtime adapter to OpenAI GA API
* Merge branch 'main' into fix/openai-realtime-ga-migration
* fix(ai-openai): complete realtime Beta-to-GA migration
Completes the GA migration started in this PR so the whole realtime flow
works against OpenAI's GA API (the Beta shape was shut down 2026-05-12):
- openaiRealtimeToken() mints ephemeral keys via POST
/v1/realtime/client_secrets (the Beta /v1/realtime/sessions endpoint is
retired) and parses the GA top-level value/expires_at response shape
- session.update payloads use the GA shape via a new pure
buildSessionUpdate() helper: required session.type, audio.input.*,
audio.output.voice, output_modalities, max_output_tokens; temperature
(removed in GA) is dropped with a debug log instead of getting the whole
update rejected with unknown_parameter
- server events handled under GA names (response.output_audio_transcript.*,
response.output_audio.*, output_text/output_audio content parts)
- removed the now-unused model local in createWebRTCConnection (the GA
/calls endpoint rejects ?model=; the model is bound to the ephemeral key)
- default model gpt-realtime; dead gpt-4o-(mini-)realtime-preview ids
(shut down 2026-05-07) removed from OpenAIRealtimeModel, docs, and
examples
- unit tests for the session.update payload and client-secret
request/response shapes; changeset added
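As an illustration of the session.update bullet above, a pure builder could look roughly like the sketch below. This is not the adapter's actual buildSessionUpdate(): the SketchConfig type, the function name, the injected debug callback, and the 'realtime' value for session.type are all assumptions made for the example. Only the field names (session.type, audio.output.voice, output_modalities, max_output_tokens) and the drop-temperature behavior come from the notes above.

```typescript
// Hypothetical, simplified config type; the adapter's real
// RealtimeSessionConfig may carry different or additional fields.
interface SketchConfig {
  instructions?: string;
  voice?: string;
  outputModalities?: Array<'audio' | 'text'>;
  maxOutputTokens?: number;
  temperature?: number; // Beta-only; removed in GA per the notes above
}

interface SketchSessionUpdate {
  type: 'session.update';
  session: {
    type: 'realtime'; // assumed value for the required session.type
    instructions?: string;
    output_modalities?: Array<'audio' | 'text'>;
    max_output_tokens?: number;
    audio?: { output: { voice: string } };
  };
}

// Pure: performs no I/O. Logging is injected so the result depends only
// on the arguments and the function is easy to unit test.
function buildSessionUpdateSketch(
  config: SketchConfig,
  debug: (msg: string) => void = () => {},
): SketchSessionUpdate {
  const session: SketchSessionUpdate['session'] = { type: 'realtime' };

  if (config.instructions !== undefined) {
    session.instructions = config.instructions;
  }
  if (config.outputModalities !== undefined) {
    session.output_modalities = config.outputModalities.slice();
  }
  if (config.maxOutputTokens !== undefined) {
    session.max_output_tokens = config.maxOutputTokens;
  }
  if (config.voice !== undefined) {
    // GA nests the voice under audio.output instead of the top level.
    session.audio = { output: { voice: config.voice } };
  }
  if (config.temperature !== undefined) {
    // Drop the field instead of sending it, so one unsupported parameter
    // does not get the whole update rejected.
    debug('dropping temperature: not supported by the GA realtime API');
  }

  return { type: 'session.update', session };
}
```

Keeping the builder free of side effects is what lets the payload shape be asserted in unit tests without opening a WebRTC connection.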
Live-verified against the OpenAI API: client_secrets 200 (ek_ token),
/v1/realtime/calls 201 with SDP answer, and session.updated echoing voice,
semantic VAD, tools, output_modalities, and max_output_tokens.
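For the client_secrets response mentioned above, parsing the GA top-level value/expires_at shape might be sketched as follows. The function name, the returned field names, and the assumption that expires_at is numeric are hypothetical and chosen for the example; this is not the code in openaiRealtimeToken().

```typescript
// Hypothetical result type for the example.
interface EphemeralKeySketch {
  token: string;
  expiresAt: number;
}

// Validates an already-decoded JSON body and extracts the GA top-level
// `value` and `expires_at` fields. Assumes `expires_at` is a number.
function parseClientSecretSketch(body: unknown): EphemeralKeySketch {
  if (typeof body !== 'object' || body === null) {
    throw new Error('client_secrets response is not a JSON object');
  }
  const record = body as Record<string, unknown>;
  const value = record['value'];
  const expiresAt = record['expires_at'];
  if (typeof value !== 'string' || typeof expiresAt !== 'number') {
    // A body without top-level value/expires_at is rejected rather than
    // silently producing an undefined token.
    throw new Error('client_secrets response lacks top-level value/expires_at');
  }
  return { token: value, expiresAt };
}
```

Failing loudly on an unexpected shape makes a future response-format change surface at token minting time instead of as an opaque connection failure later.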
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* fix(ai-openai): collapse output modalities to single GA-supported value
The GA realtime API only accepts ['audio'] or ['text'] for
output_modalities; the Beta API accepted ['audio', 'text'] and the
provider-agnostic RealtimeSessionConfig still legitimately produces it
(e.g. the example UI's audio+text mode). Sending both got the whole
session.update rejected with: Invalid modalities: ['audio', 'text'].
Collapse to ['audio'] when audio is requested — GA audio replies still
stream text via response.output_audio_transcript.* events, so visible
behavior is unchanged. Live-verified: session.updated accepted.
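The collapse rule described above can be sketched as a small pure function. The name and the handling of an empty request (returning undefined so the caller can omit the field) are assumptions for the example, not necessarily what the adapter does.

```typescript
type Modality = 'audio' | 'text';

// Reduces a provider-agnostic modality list to the single value the GA
// API accepts. Audio wins when present, since audio replies still stream
// their text through transcript events.
function collapseOutputModalitiesSketch(
  requested: ReadonlyArray<Modality>,
): Modality[] | undefined {
  if (requested.some((m) => m === 'audio')) return ['audio'];
  if (requested.some((m) => m === 'text')) return ['text'];
  // Assumption: nothing requested means the field is left out entirely.
  return undefined;
}
```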
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
* Merge branch 'main' into fix/openai-realtime-ga-migration