Overview - Ai

ai

Latest CI Pipeline Executions

Succeeded
feat/otel-middleware
3a513b6d fix(ai-client): capture abort signal before await to prevent race condition (#377)
* capture abort signal before await to prevent race condition
* add unit tests for abort signal race condition fix
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* replace setTimeout with deterministic promise await in test
  Capture the nested append() promise and await it directly instead of relying on a fixed 50ms setTimeout.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Added changeset
* ci: apply automated fixes
* test(ai-client): exercise actual reload() race + add asChunk casts
  Rewrites the second race-condition test to use reload() instead of a queued append(); append() early-returns when isLoading and queues via queuePostStreamAction, so it never reassigns this.abortController mid-flight. reload() calls cancelInFlightStream() synchronously then starts a new streamResponse(), which is the actual code path that triggers the race the fix protects against.
  Adds asChunk() casts so the new yields satisfy the strict AGUIEvent typing introduced by AG-UI core interop.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ai-client): bail out when stream is cancelled during onResponse await
  Capturing the AbortController signal locally avoided the original null deref but exposed a latent deadlock: when stop() runs during the onResponse await, cancelInFlightStream() calls resolveProcessing() before waitForProcessing() has set processingResolve, so the call is a no-op. Post-fix, streamResponse no longer crashed on the now-null controller, reached waitForProcessing() (creating a fresh resolver nothing would resolve), and hung on `await processingComplete`, breaking the ai-react useChat unmount test.
  Add a `signal.aborted` check after the onResponse await to short-circuit cancelled or superseded streams cleanly, restoring main's pre-fix flow control without relying on a thrown TypeError.
  Update the two race tests to reflect the correct semantics: cancelled streams must not invoke the connection layer or surface errors.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Tom Beckenham <34339192+tombeckenham@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
by Francisco ...
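The race described in 3a513b6d can be illustrated with a minimal sketch. `StreamClient`, `connected`, and the return values below are hypothetical stand-ins for illustration, not the actual ai-client API; only the idea (capture the signal before the await, then check `signal.aborted` after it) comes from the commit message.

```typescript
// Hypothetical minimal client; names are illustrative, not the ai-client API.
class StreamClient {
  private abortController: AbortController | null = null;
  public connected = 0;

  async streamResponse(
    onResponse: () => Promise<void>,
  ): Promise<"done" | "cancelled"> {
    const controller = new AbortController();
    this.abortController = controller;
    // Capture the signal BEFORE awaiting: this.abortController may be
    // nulled or replaced while the await below is pending.
    const signal = controller.signal;

    await onResponse();

    // Bail out if stop() ran during the await, instead of reaching
    // the connection layer with a cancelled stream.
    if (signal.aborted) return "cancelled";

    this.connected += 1; // stands in for opening the connection
    return "done";
  }

  stop(): void {
    this.abortController?.abort();
    this.abortController = null;
  }
}
```

Reading `this.abortController.signal` after the await instead would dereference a controller that `stop()` has already cleared, which is the null dereference the commit describes.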
Succeeded
feat/otel-middleware
3a513b6d Merge edec1909b85f56c3d093b610eee3b50566c74b76 into 13cceaedf64e398ca15b8dbbbfe215329ea26794
by Alem Tuzlak
Succeeded
feat/otel-middleware
c9f692f0 fix(ai-isolate-cloudflare): port worker from unsafe_eval to worker_loader (#523)
* fix(ai-isolate-cloudflare): port worker from unsafe_eval to worker_loader
  Cloudflare gates the `unsafe_eval` binding for all customer prod accounts (no public entitlement); the previous driver was unusable in production and broken in `wrangler dev` on current Wrangler 4.x.
  Swap `env.UNSAFE_EVAL.eval(code)` for the supported `worker_loader` (Dynamic Workers) binding: load the wrapped code as an ES module into a fresh child Worker isolate via `env.LOADER.load({...}).getEntrypoint().fetch(...)` and read the structured result back as JSON.
  The HTTP tool-callback protocol, driver, and public API are unchanged. ~120 LOC change in worker; tests + wrangler.toml + README updated.
  Workers Paid plan is required for any edge usage (deploy or `wrangler dev --remote`); local `wrangler dev` works on the Free plan. The custom Miniflare `dev-server.mjs` is removed since `wrangler dev` now binds `worker_loader` natively.
  Closes #522.
* fix(ai-isolate-cloudflare): cancel in-flight fetch on timeout + happy-path tests
  Address CodeRabbit review on #523:
  1. Promise.race timeout left `entrypoint.fetch` running, leaking the loaded child Worker isolate. Add an AbortController whose signal flows into the Request passed to entrypoint.fetch; the timeout now actually cancels the in-flight request. Promise.race remains as a belt-and-suspenders guard.
  2. Add three integration tests against a mocked LOADER binding:
     - happy path: full load → getEntrypoint → fetch chain, asserts the load() arguments (mainModule, modules, globalOutbound) and that the Request carries an AbortSignal
     - need_tools: forwards toolCalls + continuationId from sandbox
     - TimeoutError: AbortSignal-driven cancellation triggers the right error shape
  Tests: 39/39 pass.
* fix(ai-isolate-cloudflare): tighten timeout test + happy-path assertions
  Address CodeRabbit second-pass review:
  1. happy-path test: hoist load() argument assertions out of the synchronous mock. Inside load() they get swallowed by the worker's outer try/catch and surface as a generic 500. Capture options into a local + assert after worker.fetch() resolves.
  2. timeout test: `expect(receivedSignal).not.toBeNull()` is trivially true per the Fetch spec (Request.signal is always present). Drop it from the happy-path test and instead assert `signal.aborted === true` in the timeout test, which actually proves the outer worker's AbortController fired.
  3. worker fix: when the AbortController fires first, fetchPromise rejects before timeoutPromise. Detect the timeout via either TIMEOUT_SENTINEL or `controller.signal.aborted` so the right error surfaces regardless of which branch of the race wins.
  Tests: 39/39 pass.
* Updated wrangler
* chore(ai-isolate-cloudflare): drop unused esbuild and miniflare devDeps
  These were only used by the deleted dev-server.mjs, which existed to work around wrangler dev not surfacing the old unsafe_eval binding. With the worker_loader port, wrangler dev handles dev natively.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Tom Beckenham <34339192+tombeckenham@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
by Sriket Komali
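The timeout handling described in c9f692f0 (an AbortController that cancels the in-flight work, with Promise.race kept as a second guard) might look roughly like the sketch below. `withTimeout`, `run`, and this `TimeoutError` class are hypothetical names for illustration; `run` stands in for the `entrypoint.fetch` call, and the sketch rejects on timeout rather than resolving a sentinel as the worker reportedly does.

```typescript
// Hypothetical error type; the real worker's error shape is not shown in the source.
class TimeoutError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "TimeoutError";
  }
}

// `run` is a stand-in for the in-flight request that accepts an AbortSignal.
async function withTimeout(
  run: (signal: AbortSignal) => Promise<string>,
  ms: number,
): Promise<string> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;

  const timeoutPromise = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      controller.abort(); // actually cancels the in-flight work
      reject(new TimeoutError(`timed out after ${ms}ms`));
    }, ms);
  });

  try {
    // Promise.race stays as a guard in case `run` ignores the signal.
    return await Promise.race([run(controller.signal), timeoutPromise]);
  } catch (err) {
    // The aborted work may reject before timeoutPromise does, so detect
    // the timeout via the signal as well as via the error type.
    if (controller.signal.aborted) {
      throw new TimeoutError(`timed out after ${ms}ms`);
    }
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

Checking `controller.signal.aborted` in the catch block means the caller sees a timeout error regardless of which branch of the race settles first.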
Succeeded
feat/otel-middleware
c9f692f0 Merge 4093f71d9c0317cc1e1bd259b3bd773226521496 into a9fc316592291d8176fa4c8cbc9406d80bd81726
by Alem Tuzlak
Succeeded
feat/otel-middleware
d3275020 fix(ai, ai-anthropic): thinking blocks missing on turn 2+ in tool loops (#391)
* fix(ai, ai-anthropic): thinking blocks missing on turn 2+ in tool loops
  - Track thinking per-step via stepId instead of merging into single ThinkingPart
  - Capture Anthropic signature_delta and preserve through the full stack
  - Server-side TextEngine accumulates thinking + signatures per iteration
  - Include thinking blocks in Anthropic message history for multi-turn context
  - Add interleaved-thinking-2025-05-14 beta header when thinking is enabled
  - Add tests for multi-step thinking, backward compat, and result aggregation
  Closes TanStack/ai#340
* ci: apply automated fixes
* fix(ai, ai-anthropic): scope interleaved-thinking betas to beta endpoint; clear stale pendingThinkingStepId
  - Move `betas: ['interleaved-thinking-2025-05-14']` out of the shared mapper and onto the `beta.messages.create` call site in chatStream. Prevents the non-beta structuredOutput endpoint from receiving an invalid `betas` field when a thinking budget is configured.
  - Clear `pendingThinkingStepId` when a later STEP_STARTED takes the active-message branch, and also in `resetStreamState`, so a stale pending id can't misattribute a later STEP_FINISHED's delta to an earlier step.
  - Add covering test for the pendingThinkingStepId leak (red-green verified).
* test(e2e), changeset: multi-step thinking scenario and release note
  - Add `thinking-multi-step` mock scenario to the e2e harness emitting STEP_STARTED/STEP_FINISHED pairs for two distinct stepIds with provider signatures, followed by a text message and RUN_FINISHED.
  - Expose thinkingPartCount / thinkingStepIds on the mock chat page via data-* attributes for assertion.
  - Add tests/thinking.spec.ts asserting two ThinkingParts with distinct stepIds and matching signatures are produced (pre-PR behavior merged them into a single part).
  - Add .changeset/thinking-blocks-per-step.md bumping @tanstack/ai, @tanstack/ai-anthropic, and @tanstack/ai-client.
* fix(ai): consume pending stepId in REASONING_MESSAGE_CONTENT
  When STEP_STARTED arrives before the assistant message exists, its stepId is stashed in pendingThinkingStepId. handleStepFinishedEvent already consumes it, but handleReasoningMessageContentEvent did not, so reasoning deltas were keyed by the reasoning messageId and the matching signature from STEP_FINISHED landed on a different ThinkingPart. With Anthropic's interleaved thinking around tool calls this produced two ThinkingParts per block (one unsigned content, one signed empty). Consume the pending stepId here too so both event paths attribute to the same step.
* ci: apply automated fixes
* refactor(ai): extract consumePendingThinkingStep helper
  Both handleStepFinishedEvent and handleReasoningMessageContentEvent need to promote a pending stepId from a STEP_STARTED that arrived before the assistant message existed. Pull the shared logic into a small private helper instead of duplicating it.
* fix(ai-anthropic): wrap signed STEP_FINISHED yield in asChunk
  The bare yield bypassed the file's existing StreamChunk cast helper, so after the merge from main StepFinishedEvent now extends the stricter AG-UI base (requires stepName, no signature) and CI failed to build. Use asChunk like every other yield in this generator and add stepName.
* test(ai): cover multi-turn anthropic reasoning
* chore: adjust thinking callback changeset bump
* fix(ai): satisfy lint for tool name fallback
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Alem Tuzlak <t.zlak@hotmail.com>
by Isaac Sher...
Succeeded
feat/otel-middleware
68213037 ci: apply automated fixes
by autofix-ci...
Succeeded
feat/otel-middleware
063517ab ci: apply automated fixes
by autofix-ci...
Succeeded
feat/otel-middleware
063517ab Merge 673cd4ec7536c462e9090e974f046b0b6542d89b into ff338557d9ea54c960617f52c89c488140d60f85
by Alem Tuzlak
Succeeded
feat/otel-middleware
c33138b7 feat(ai): emit GenAI semconv attributes on otel spans for PostHog compatibility
Adds `gen_ai.input.messages` / `gen_ai.output.messages` attribute form alongside existing span events so backends that read prompt/completion content from attributes (e.g. PostHog LLM Analytics) render correctly. Also drops `gen_ai.operation.name` from the root span to avoid duplicate generation events, and stamps tool args/results onto tool spans.
by Tom Beckenham
Succeeded
feat/otel-middleware
c33138b7 Merge 0ce2c75c53bb07ac3c0f4aa0af665b9edf057427 into ff338557d9ea54c960617f52c89c488140d60f85
by Alem Tuzlak
Succeeded
feat/otel-middleware
428ab8d5 ci(ai): ignore @opentelemetry/api in knip for the ai workspace
@opentelemetry/api is an intentional optional peer dependency of @tanstack/ai; it is referenced from src/middlewares/otel.ts but users who don't import the otel subpath never load it. Knip's default rule flags referenced optional peers as an error; add a workspace-scoped ignoreDependencies entry so knip stays happy without forcing the peer to be non-optional.
by Alem Tuzlak
Succeeded
feat/otel-middleware
428ab8d5 Merge 5359a60484991c09bcc6f4aca781de2a22bf6d59 into dc71c721a01d3b5d73d09e36fc2d87873b206b1b
by Alem Tuzlak
Succeeded
feat/otel-middleware
e896c656 fix(ai): address second CodeRabbit review pass on otel middleware
- otel.ts: remove token histogram recording from onChunk; the chat runner always follows RUN_FINISHED-with-usage by runOnUsage, so recording in both hooks double-counted every histogram. onChunk keeps attribute setting, onUsage is the canonical histogram source.
- otel.ts: wrap JSON.stringify in onAfterToolCall with safeCall. A tool result containing circular refs or BigInt used to throw out of the handler body and skip toolSpan.end() + state.toolSpans cleanup, leaving the span dangling. Failed serialization now yields a sentinel string and the handler always finalizes the span.
- fake-otel.ts: cache spanId once per FakeSpan so repeat spanContext() calls return a consistent id.
- api.middleware-test.ts (e2e): gate both the POST 'otel' middleware mode and the GET capture fetch behind an OTEL_TEST_ENABLED env check so the endpoint cannot be used as an oracle outside E2E runs. Track 'ended' separately so the capture span's isRecording() flips after end() like real OTel. Reorder imports.
- otel-capture.ts: document the shallow Object.assign patch behavior on recordOtelSpan so future callers know nested fields replace instead of merging.
- middleware.spec.ts: discriminate root-vs-iteration spans by SpanKind (INTERNAL vs CLIENT) instead of presence of the 'tanstack.ai.iteration' attribute; factor the capture-fetch into a helper that validates testId and response.ok.
by Alem Tuzlak
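The serialization fix in e896c656 (a failed JSON.stringify yields a sentinel and the span is always finalized) can be sketched as follows. `safeStringify`, `finishToolSpan`, the `SpanLike` interface, the `"tool.result"` attribute key, and the sentinel text are all assumptions made for illustration; the source does not show the actual safeCall helper or sentinel value.

```typescript
// Hypothetical sentinel; the real sentinel string is not given in the source.
const UNSERIALIZABLE = "[unserializable]";

// JSON.stringify throws on circular references and BigInt values,
// and returns undefined for values such as functions.
function safeStringify(value: unknown): string {
  try {
    const json: string | undefined = JSON.stringify(value);
    return json === undefined ? UNSERIALIZABLE : json;
  } catch {
    return UNSERIALIZABLE;
  }
}

// Minimal span shape assumed for this sketch (not the OpenTelemetry Span type).
interface SpanLike {
  setAttribute(key: string, value: string): void;
  end(): void;
}

function finishToolSpan(span: SpanLike, result: unknown): void {
  try {
    // "tool.result" is an illustrative attribute key.
    span.setAttribute("tool.result", safeStringify(result));
  } finally {
    // Always finalize, so a bad tool result cannot leave the span dangling.
    span.end();
  }
}
```

Putting `span.end()` in a `finally` block is one way to guarantee finalization even if attribute setting itself throws.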
Failed
feat/otel-middleware
e896c656 fix(ai): address second CodeRabbit review pass on otel middleware
- otel.ts: remove token histogram recording from onChunk; the chat runner always follows RUN_FINISHED-with-usage by runOnUsage, so recording in both hooks double-counted every histogram. onChunk keeps attribute setting, onUsage is the canonical histogram source.
- otel.ts: wrap JSON.stringify in onAfterToolCall with safeCall. A tool result containing circular refs or BigInt used to throw out of the handler body and skip toolSpan.end() + state.toolSpans cleanup, leaving the span dangling. Failed serialization now yields a sentinel string and the handler always finalizes the span.
- fake-otel.ts: cache spanId once per FakeSpan so repeat spanContext() calls return a consistent id.
- api.middleware-test.ts (e2e): gate both the POST 'otel' middleware mode and the GET capture fetch behind an OTEL_TEST_ENABLED env check so the endpoint cannot be used as an oracle outside E2E runs. Track 'ended' separately so the capture span's isRecording() flips after end() like real OTel. Reorder imports.
- otel-capture.ts: document the shallow Object.assign patch behavior on recordOtelSpan so future callers know nested fields replace instead of merging.
- middleware.spec.ts: discriminate root-vs-iteration spans by SpanKind (INTERNAL vs CLIENT) instead of presence of the 'tanstack.ai.iteration' attribute; factor the capture-fetch into a helper that validates testId and response.ok.
by Alem Tuzlak
Succeeded
feat/otel-middleware
963cc456 fix(ai): address otel-middleware review feedback
Critical
- C1: onUsage was a no-op in production: RUN_FINISHED closed the iteration span before runOnUsage fired, dropping gen_ai.usage.* attrs and the token histogram. Fix: keep the iteration span open through tool execution and onUsage; close it on the next onConfig(beforeModel), on onFinish, or on onError/onAbort. Token histogram is also recorded directly from chunk.usage at RUN_FINISHED and redundantly from onUsage so neither hook-order variant loses data.
- C3: otelMiddleware is now exported from the dedicated subpath @tanstack/ai/middlewares/otel instead of the main middlewares barrel, so importing toolCacheMiddleware or contentGuardMiddleware no longer eagerly requires @opentelemetry/api.
- C4: redactor failures fail closed to the literal sentinel "[redaction_failed]" and log a warning; raw content can no longer leak when a PII redactor throws.
- C2: added two middleware.spec.ts scenarios that exercise otel end-to-end through the real chat() runner (basic-text + with-tool), guarding against the C1 regression and verifying tool-span nesting.
Important
- I1: safeCall now logs callback failures via console.warn with a label, matching the docs' "a thrown callback becomes a log line" promise.
- I2: replaced gen_ai.completion.reason='cancelled' (not a valid semconv attribute) with tanstack.ai.completion.reason.
- I3: onAbort now records gen_ai.client.operation.duration with error.type='cancelled'.
- I4: docs + changeset corrected: duration histogram is per-run, token histogram is per-iteration.
- I5: error/abort tests now assert the full onSpanEnd sequence (iteration, tool, chat) with each span captured before .end().
- I6: tool spans still open at onFinish are swept with tanstack.ai.tool.outcome='unknown'.
- I7: OtelSpanInfo is now a proper discriminated union; tool-only fields narrow inside callbacks and the internal 'as OtelSpanInfo<"tool">' casts are mostly gone.
- I8: iteration spans now use 'chat <model> #<iteration>' so they are distinguishable in trace viewers.
- I9: gen_ai.response.model dropped from the duration histogram attrs (high-cardinality).
Misc
- fake-otel: resolve parent via the explicit context arg passed to startSpan, eliminating the ;(span as any).parent = ... test hack. createHistogram also captures options so unit/description are assertable.
- redactException paths use 'as Exception' instead of 'as Error' so non-Error throwables are preserved.
- OtelSpanKind kept as a deprecated alias of OtelSpanScope to avoid shadowing OTel's built-in SpanKind.
- Public types documented with JSDoc.
- Assistant text buffer is now capped at maxContentLength (default 100k).
by Alem Tuzlak
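The fail-closed redaction behavior listed under C4 in 963cc456 can be sketched in a few lines. `redactOrFailClosed` and the `Redactor` type are hypothetical names for illustration; only the sentinel string "[redaction_failed]" and the warning log are taken from the commit message.

```typescript
// Sentinel quoted in the commit message.
const REDACTION_FAILED = "[redaction_failed]";

// Hypothetical redactor signature for this sketch.
type Redactor = (content: string) => string;

function redactOrFailClosed(content: string, redact: Redactor): string {
  try {
    return redact(content);
  } catch (err) {
    // Fail closed: never fall back to the raw content when redaction throws.
    console.warn("redactor threw; content replaced with sentinel", err);
    return REDACTION_FAILED;
  }
}
```

The design choice is that a broken redactor costs observability (the content is lost from the span) rather than privacy (the raw content is recorded).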
Failed
feat/otel-middleware
963cc456 Merge 54666a99e44add4b3b5896e6e3385720c47a00e6 into dc71c721a01d3b5d73d09e36fc2d87873b206b1b
by Alem Tuzlak
Succeeded
feat/otel-middleware
a77339d7 ci: apply automated fixes
by autofix-ci...
Failed
feat/otel-middleware
a77339d7 Merge 37ee76dc0493047dd954548b2e5b4a76bed01b87 into 54523f5e9a9b4d4ea6c49e4551936bc2cc25593a
by Alem Tuzlak