Latest CI Pipeline Executions
c9f692f0 fix(ai-isolate-cloudflare): port worker from unsafe_eval to worker_loader (#523)
* fix(ai-isolate-cloudflare): port worker from unsafe_eval to worker_loader
Cloudflare gates the `unsafe_eval` binding for all customer prod
accounts (no public entitlement); the previous driver was unusable in
production and broken in `wrangler dev` on current Wrangler 4.x.
Swap `env.UNSAFE_EVAL.eval(code)` for the supported `worker_loader`
(Dynamic Workers) binding — load the wrapped code as an ES module into
a fresh child Worker isolate via `env.LOADER.load({...}).getEntrypoint()
.fetch(...)` and read the structured result back as JSON.
The HTTP tool-callback protocol, driver, and public API are unchanged.
~120 LOC change in worker; tests + wrangler.toml + README updated.
Workers Paid plan is required for any edge usage (deploy or
`wrangler dev --remote`); local `wrangler dev` works on the Free plan.
The custom Miniflare `dev-server.mjs` is removed since `wrangler dev`
now binds `worker_loader` natively.
Closes #522.
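The binding swap described above can be sketched roughly as follows. This is a minimal illustration, not the worker's actual code: `LoaderBinding` is a hypothetical hand-written type covering only the calls named in this commit (`load`, `getEntrypoint`, `fetch`), and the module name `sandbox.js`, the URL `http://sandbox/run`, the `globalOutbound: null` value, and the in-memory `fakeLoader` are all assumptions made so the example runs outside Workers.

```typescript
// Hypothetical minimal types: only the slice of the binding named in the commit.
interface LoadOptions {
  mainModule: string
  modules: Record<string, string>
  globalOutbound?: unknown // assumption: null is taken to disable outbound network access
}
interface Entrypoint {
  fetch(request: Request): Promise<Response>
}
interface LoaderBinding {
  load(options: LoadOptions): { getEntrypoint(): Entrypoint }
}

// Load the wrapped code as an ES module into a child isolate and read JSON back.
async function runInChild(loader: LoaderBinding, wrappedCode: string): Promise<unknown> {
  const child = loader.load({
    mainModule: 'sandbox.js',
    modules: { 'sandbox.js': wrappedCode },
    globalOutbound: null,
  })
  const response = await child.getEntrypoint().fetch(new Request('http://sandbox/run'))
  return response.json()
}

// In-memory stand-in for env.LOADER, for illustration only.
const fakeLoader: LoaderBinding = {
  load: (options) => ({
    getEntrypoint: () => ({
      fetch: async () =>
        new Response(JSON.stringify({ status: 'done', loaded: options.mainModule })),
    }),
  }),
}
```

In a real Worker the first argument would be the `env.LOADER` binding rather than the fake.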
* fix(ai-isolate-cloudflare): cancel in-flight fetch on timeout + happy-path tests
Address CodeRabbit review on #523:
1. Promise.race timeout left `entrypoint.fetch` running, leaking the loaded
child Worker isolate. Add an AbortController whose signal flows into the
Request passed to entrypoint.fetch — the timeout now actually cancels the
in-flight request. Promise.race remains as a belt-and-suspenders guard.
2. Add three integration tests against a mocked LOADER binding:
- happy path: full load → getEntrypoint → fetch chain, asserts the
load() arguments (mainModule, modules, globalOutbound) and that the
Request carries an AbortSignal
- need_tools: forwards toolCalls + continuationId from sandbox
- TimeoutError: AbortSignal-driven cancellation triggers the right
error shape
Tests: 39/39 pass.
* fix(ai-isolate-cloudflare): tighten timeout test + happy-path assertions
Address CodeRabbit second-pass review:
1. happy-path test: hoist load() argument assertions out of the synchronous
mock. Inside load() they get swallowed by the worker's outer try/catch and
surface as a generic 500. Capture options into a local + assert after
worker.fetch() resolves.
2. timeout test: `expect(receivedSignal).not.toBeNull()` is trivially true
per the Fetch spec (Request.signal is always present). Drop it from the
happy-path test and instead assert `signal.aborted === true` in the
timeout test, which actually proves the outer worker's AbortController
fired.
3. worker fix: when the AbortController fires first, fetchPromise rejects
before timeoutPromise. Detect the timeout via either TIMEOUT_SENTINEL or
`controller.signal.aborted` so the right error surfaces regardless of
which branch of the race wins.
Tests: 39/39 pass.
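The timeout handling described across the two review passes can be sketched as below. This is a simplified illustration under assumptions: `withTimeout` and the `TimeoutError` class are hypothetical names, and the real worker passes the signal into a `Request` rather than into a generic task callback.

```typescript
const TIMEOUT_SENTINEL = Symbol('timeout')

class TimeoutError extends Error {
  constructor(ms: number) {
    super(`Execution timed out after ${ms}ms`)
    this.name = 'TimeoutError'
  }
}

// Runs `task` with an AbortSignal; rejects with TimeoutError once `ms` elapses.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  ms: number,
): Promise<T> {
  const controller = new AbortController()
  let timer: ReturnType<typeof setTimeout> | undefined
  const timeoutPromise = new Promise<typeof TIMEOUT_SENTINEL>((resolve) => {
    timer = setTimeout(() => {
      controller.abort() // cancels the in-flight work, not just the wait
      resolve(TIMEOUT_SENTINEL)
    }, ms)
  })
  try {
    // The race stays as a second guard in case the task ignores the signal.
    const winner = await Promise.race([task(controller.signal), timeoutPromise])
    if (winner === TIMEOUT_SENTINEL) throw new TimeoutError(ms)
    return winner as T
  } catch (error) {
    // The abort can make the task reject before the sentinel is observed,
    // so an aborted signal is also treated as a timeout.
    if (controller.signal.aborted) throw new TimeoutError(ms)
    throw error
  } finally {
    clearTimeout(timer)
  }
}
```

Checking both the sentinel and `controller.signal.aborted` means the caller sees the same error shape whichever branch of the race settles first.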
* Updated wrangler
* chore(ai-isolate-cloudflare): drop unused esbuild and miniflare devDeps
These were only used by the deleted dev-server.mjs, which existed to
work around wrangler dev not surfacing the old unsafe_eval binding.
With the worker_loader port, wrangler dev handles dev natively.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: Tom Beckenham <34339192+tombeckenham@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
d3275020 fix(ai, ai-anthropic): thinking blocks missing on turn 2+ in tool loops (#391)
* fix(ai, ai-anthropic): thinking blocks missing on turn 2+ in tool loops
- Track thinking per-step via stepId instead of merging into single ThinkingPart
- Capture Anthropic signature_delta and preserve through the full stack
- Server-side TextEngine accumulates thinking + signatures per iteration
- Include thinking blocks in Anthropic message history for multi-turn context
- Add interleaved-thinking-2025-05-14 beta header when thinking is enabled
- Add tests for multi-step thinking, backward compat, and result aggregation
Closes TanStack/ai#340
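Per-step tracking, as opposed to merging everything into one part, might look like the sketch below. The `ThinkingPart` shape and the `appendThinking` helper are hypothetical and only illustrate keying by `stepId` while carrying a signature; the actual types in @tanstack/ai may differ.

```typescript
// Hypothetical shape; the real ThinkingPart may carry other fields.
interface ThinkingPart {
  type: 'thinking'
  stepId: string
  content: string
  signature?: string
}

// Append a delta to the part for `stepId`, creating the part on first sight.
function appendThinking(
  parts: ThinkingPart[],
  stepId: string,
  delta: string,
  signature?: string,
): ThinkingPart[] {
  const existing = parts.find((part) => part.stepId === stepId)
  if (!existing) {
    return [...parts, { type: 'thinking', stepId, content: delta, signature }]
  }
  return parts.map((part) =>
    part === existing
      ? { ...part, content: part.content + delta, signature: signature ?? part.signature }
      : part,
  )
}

// Two steps produce two parts, each keeping its own signature.
let parts: ThinkingPart[] = []
parts = appendThinking(parts, 'step-1', 'Check the weather tool. ')
parts = appendThinking(parts, 'step-1', '', 'sig-a')
parts = appendThinking(parts, 'step-2', 'Summarise the result.', 'sig-b')
```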
* ci: apply automated fixes
* fix(ai, ai-anthropic): scope interleaved-thinking betas to beta endpoint; clear stale pendingThinkingStepId
- Move `betas: ['interleaved-thinking-2025-05-14']` out of the shared mapper
and onto the `beta.messages.create` call site in chatStream. Prevents the
non-beta structuredOutput endpoint from receiving an invalid `betas` field
when a thinking budget is configured.
- Clear `pendingThinkingStepId` when a later STEP_STARTED takes the
active-message branch, and also in `resetStreamState`, so a stale pending
id can't misattribute a later STEP_FINISHED's delta to an earlier step.
- Add covering test for the pendingThinkingStepId leak (red-green verified).
* test(e2e), changeset: multi-step thinking scenario and release note
- Add `thinking-multi-step` mock scenario to the e2e harness emitting
STEP_STARTED/STEP_FINISHED pairs for two distinct stepIds with
provider signatures, followed by a text message and RUN_FINISHED.
- Expose thinkingPartCount / thinkingStepIds on the mock chat page via
data-* attributes for assertion.
- Add tests/thinking.spec.ts asserting two ThinkingParts with distinct
stepIds and matching signatures are produced (pre-PR behavior merged
them into a single part).
- Add .changeset/thinking-blocks-per-step.md bumping @tanstack/ai,
@tanstack/ai-anthropic, and @tanstack/ai-client.
* fix(ai): consume pending stepId in REASONING_MESSAGE_CONTENT
When STEP_STARTED arrives before the assistant message exists, its
stepId is stashed in pendingThinkingStepId. handleStepFinishedEvent
already consumes it, but handleReasoningMessageContentEvent did not,
so reasoning deltas were keyed by the reasoning messageId and the
matching signature from STEP_FINISHED landed on a different
ThinkingPart. With Anthropic's interleaved thinking around tool calls
this produced two ThinkingParts per block (one unsigned content, one
signed empty). Consume the pending stepId here too so both event
paths attribute to the same step.
* ci: apply automated fixes
* refactor(ai): extract consumePendingThinkingStep helper
Both handleStepFinishedEvent and handleReasoningMessageContentEvent
need to promote a pending stepId from a STEP_STARTED that arrived
before the assistant message existed. Pull the shared logic into a
small private helper instead of duplicating it.
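A rough sketch of the shared helper follows, assuming a much-simplified processor. `ThinkingStepTracker`, its `activeStepId` field, and the method signatures are hypothetical; only `pendingThinkingStepId` and `consumePendingThinkingStep` are names taken from these commits.

```typescript
// Hypothetical, simplified stand-in for the stream processor's step bookkeeping.
class ThinkingStepTracker {
  private pendingThinkingStepId: string | null = null
  private activeStepId: string | null = null

  // STEP_STARTED: stash the id when no assistant message exists yet.
  onStepStarted(stepId: string, hasAssistantMessage: boolean): void {
    if (hasAssistantMessage) {
      this.activeStepId = stepId
      this.pendingThinkingStepId = null // drop any stale pending id
    } else {
      this.pendingThinkingStepId = stepId
    }
  }

  // Promote a pending id to the active step; shared by both handlers below.
  private consumePendingThinkingStep(): string | null {
    if (this.pendingThinkingStepId !== null) {
      this.activeStepId = this.pendingThinkingStepId
      this.pendingThinkingStepId = null
    }
    return this.activeStepId
  }

  // Returns the key that reasoning deltas are attributed to.
  onReasoningContent(fallbackMessageId: string): string {
    return this.consumePendingThinkingStep() ?? fallbackMessageId
  }

  // Returns the key that the signature from STEP_FINISHED is attributed to.
  onStepFinished(stepId: string): string {
    return this.consumePendingThinkingStep() ?? stepId
  }
}
```

With both handlers going through the same helper, the reasoning content and the signature resolve to the same key, so they end up on one part rather than two.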
* fix(ai-anthropic): wrap signed STEP_FINISHED yield in asChunk
The bare yield bypassed the file's existing StreamChunk cast helper.
After the merge from main, StepFinishedEvent extends the stricter AG-UI
base (requires stepName, no signature), so CI failed to build. Use
asChunk like every other yield in this generator and add stepName.
* test(ai): cover multi-turn anthropic reasoning
* chore: adjust thinking callback changeset bump
* fix(ai): satisfy lint for tool name fallback
---------
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: Alem Tuzlak <t.zlak@hotmail.com>
963cc456 fix(ai): address otel-middleware review feedback
Critical
- C1: onUsage was a no-op in production — RUN_FINISHED closed the
iteration span before runOnUsage fired, dropping gen_ai.usage.* attrs
and the token histogram. Fix: keep the iteration span open through tool
execution and onUsage; close it on the next onConfig(beforeModel), on
onFinish, or on onError/onAbort. Token histogram is also recorded
directly from chunk.usage at RUN_FINISHED and redundantly from onUsage
so neither hook-order variant loses data.
- C3: otelMiddleware is now exported from the dedicated subpath
@tanstack/ai/middlewares/otel instead of the main middlewares barrel,
so importing toolCacheMiddleware or contentGuardMiddleware no longer
eagerly requires @opentelemetry/api.
- C4: redactor failures fail closed to the literal sentinel
"[redaction_failed]" and log a warning — raw content can no longer
leak when a PII redactor throws.
- C2: added two middleware.spec.ts scenarios that exercise otel end-to-end
through the real chat() runner (basic-text + with-tool), guarding
against the C1 regression and verifying tool-span nesting.
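The fail-closed behaviour from C4 could look roughly like this. `safeRedact`, the `Redactor` type, and the `[otel]` log label are hypothetical names for illustration; only the sentinel string comes from the commit message.

```typescript
const REDACTION_FAILED = '[redaction_failed]'

type Redactor = (text: string) => string

// Never return the raw text if the redactor throws; return the sentinel instead.
function safeRedact(redact: Redactor, text: string): string {
  try {
    return redact(text)
  } catch (error) {
    console.warn('[otel] redactor threw; content replaced with sentinel', error)
    return REDACTION_FAILED
  }
}

// A working redactor passes through; a throwing one fails closed.
const masked = safeRedact((text) => text.replace(/\d/g, '#'), 'card 42')
const failed = safeRedact(() => {
  throw new Error('boom')
}, 'secret value')
```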
Important
- I1: safeCall now logs callback failures via console.warn with a label,
matching the docs' "a thrown callback becomes a log line" promise.
- I2: replaced gen_ai.completion.reason='cancelled' (not a valid semconv
attribute) with tanstack.ai.completion.reason.
- I3: onAbort now records gen_ai.client.operation.duration with
error.type='cancelled'.
- I4: docs + changeset corrected — duration histogram is per-run, token
histogram is per-iteration.
- I5: error/abort tests now assert the full onSpanEnd sequence
(iteration, tool, chat) with each span captured before .end().
- I6: tool spans still open at onFinish are swept with
tanstack.ai.tool.outcome='unknown'.
- I7: OtelSpanInfo is now a proper discriminated union; tool-only fields
narrow inside callbacks and the internal 'as OtelSpanInfo<"tool">'
casts are mostly gone.
- I8: iteration spans now use 'chat <model> #<iteration>' so they are
distinguishable in trace viewers.
- I9: gen_ai.response.model dropped from the duration histogram attrs
(high-cardinality).
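The narrowing that I7 refers to can be illustrated with a small discriminated union. The field names and the non-generic `OtelSpanInfo` shape here are assumptions (the real type appears to be generic over the span scope), and the `execute_tool` prefix for tool spans is likewise assumed; only the `chat <model> #<iteration>` naming is taken from I8.

```typescript
// Hypothetical fields; the point is narrowing on the `kind` discriminant.
type OtelSpanInfo =
  | { kind: 'chat'; model: string }
  | { kind: 'iteration'; model: string; iteration: number }
  | { kind: 'tool'; toolName: string; toolCallId: string }

// Inside each case the compiler knows which fields exist, so no casts are needed.
function spanName(info: OtelSpanInfo): string {
  switch (info.kind) {
    case 'chat':
      return `chat ${info.model}`
    case 'iteration':
      return `chat ${info.model} #${info.iteration}`
    case 'tool':
      return `execute_tool ${info.toolName}`
  }
}

const iterationName = spanName({ kind: 'iteration', model: 'example-model', iteration: 2 })
```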
Misc
- fake-otel: resolve parent via the explicit context arg passed to
startSpan, eliminating the ;(span as any).parent = ... test hack.
createHistogram also captures options so unit/description are assertable.
- redactException paths use 'as Exception' instead of 'as Error' so
non-Error throwables are preserved.
- OtelSpanKind kept as a deprecated alias of OtelSpanScope to avoid
shadowing OTel's built-in SpanKind.
- Public types documented with JSDoc.
- Assistant text buffer is now capped at maxContentLength (default 100k).
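A capped buffer of the kind mentioned in the last item might be sketched like this. `CappedTextBuffer` is a hypothetical class, and measuring the cap in UTF-16 code units via `String.length` is an assumption; the commit only states a `maxContentLength` default of 100k.

```typescript
// Accumulates streamed text but stops growing once the cap is reached.
class CappedTextBuffer {
  private text = ''
  private readonly maxContentLength: number

  constructor(maxContentLength = 100_000) {
    this.maxContentLength = maxContentLength
  }

  append(delta: string): void {
    const remaining = this.maxContentLength - this.text.length
    if (remaining <= 0) return // already full: drop the delta
    this.text += delta.slice(0, remaining) // keep only what still fits
  }

  toString(): string {
    return this.text
  }
}

const buffer = new CappedTextBuffer(5)
buffer.append('abc')
buffer.append('defgh')
```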