LibreChat

mirror of https://github.com/danny-avila/LibreChat.git synced 2026-05-13 07:46:47 +00:00

Author	SHA1	Message	Date
Danny Avila	68d80f3324	✨ v0.8.6-rc1 (#13094 )	2026-05-12 21:40:23 -04:00
Danny Avila	8eb9de011f	📦 chore: bump `@librechat/agents` to v3.1.86, npm audit, build fix (#13105 ) * 📦 chore: Bump `@librechat/agents` to v3.1.86 in package-lock.json and package.json files * 📦 chore: Update dependencies in package-lock.json to latest versions, including @protobufjs/codegen, @protobufjs/inquire, @protobufjs/utf8, and protobufjs * 📦 chore: Add `librechat-data-provider` dependency in package.json and package-lock.json, and update build dependencies in turbo.json	2026-05-12 16:19:55 -04:00
Danny Avila	6b5596ec36	🍪 refactor: Refresh CloudFront Media Cookies (#13091 ) * fix: refresh CloudFront media cookies * fix: satisfy changed-file lint * fix: centralize CloudFront image retry * fix: honor base path for CloudFront refresh * fix: bypass auth refresh for CloudFront cookie retry * fix: pass app auth header to CloudFront retry * test: cover CloudFront refresh with OpenID reuse * fix: avoid duplicate CloudFront refresh retries * fix: clear CloudFront scope cookie with matching flags	2026-05-12 13:26:05 -04:00
Ravi Kumar L	05d4e90f91	🌩️ feat: Strict CloudFront signed cookie enforcement via `requireSignedAccess` (#13078 ) * feat(cloudfront): add requireSignedAccess to enforce strict signed access Introduces cloudfront.requireSignedAccess (default false). When enabled, initializeCloudFront requires both CLOUDFRONT_KEY_PAIR_ID and CLOUDFRONT_PRIVATE_KEY, rejects the unimplemented imageSigning="url" mode, and initializeFileStorage throws to block startup on any CloudFront init failure. OSS path is unchanged: missing keys still log-and-continue when requireSignedAccess is false. Adds low-noise startup and cookie-issuance logs without leaking signed URLs, policies, signatures, private keys, or cookie values. * fix(cloudfront): reject requireSignedAccess unless imageSigning is "cookies" Previously requireSignedAccess=true was accepted with imageSigning="none" or "url", but setCloudFrontCookies() only runs for "cookies" — leaving strict mode toothless: CloudFront stayed publicly accessible, or image delivery broke on a distribution that actually requires signed access. Adds a Zod refinement plus a runtime guard in initializeCloudFront so the only currently-functional strict configuration is imageSigning "cookies". Signed URL mode can lift this restriction once implemented. * fix(cloudfront): resolve strict access type checks * chore(cloudfront): reduce strict startup log noise --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-05-11 23:30:01 -04:00
Danny Avila	3e7262cfe0	📦 chore: Bump `@librechat/agents` to v3.1.85 and `mermaid` to v11.15.0 (#13079 ) * 📦 chore: Update @librechat/agents to version 3.1.85 in package-lock.json and package.json files * 📦 chore: Update mermaid to version 11.15.0 in package.json and package-lock.json	2026-05-11 19:14:18 -04:00
Danny Avila	0a7255b234	🎭 feat: Support OpenID Audience On Refresh Grants (#13077 )	2026-05-11 17:40:30 -04:00
Danny Avila	c385f2ba88	📦 feat: Configure Skill Import Size Limit (#13073 ) * fix: configure skill import size limit * fix: validate skill import size in ui * fix: align skill import size boundary * fix: show exact skill import limit	2026-05-11 16:24:04 -04:00
Danny Avila	8735c1763c	🧵 fix: Preserve Upload Context Across Multipart Routes (#13072 ) * fix skill multipart imports under strict isolation * fix file upload context after multipart parsing * fix skill upload tenant resolution * fix rejected upload cleanup	2026-05-11 15:46:48 -04:00
Danny Avila	0449c423a2	🗝️ fix: Enforce Skill Share Role Permission (#13062 ) * fix: enforce skill share role permission * fix: preserve share capability bypass * refactor: move share policy middleware to api package * style: order share middleware imports * fix: satisfy share middleware type checks * test: cover share policy resource types	2026-05-11 09:39:58 -04:00
Danny Avila	7631366f52	🪵 chore: Log Subagent Limit Hits (#13068 )	2026-05-11 09:25:08 -04:00
Danny Avila	70b6bb69d3	🧬 fix: Bound Subagent Expansion (#13064 ) * fix: Bound subagent expansion * fix: Preserve subagent path depth	2026-05-11 08:53:53 -04:00
Danny Avila	9797c85e23	🧮 fix: Count Rejected Skill Import Bytes (#13063 )	2026-05-11 08:40:31 -04:00
Danny Avila	822ad6c36a	🧯 fix: Bound Permission Superset Cache Inputs (#13065 )	2026-05-11 08:39:37 -04:00
Danny Avila	52ccb1379b	🪪 refactor: Require Remote OIDC Audience for Agents API OAuth (#13066 )	2026-05-11 08:38:13 -04:00
Danny Avila	7129b1b1e4	📜 refactor: Improve Skill Handling Logs (#13057 ) * refactor: Streamline batch upload error handling in `uploadCodeEnvFile` * refactor: Enhance session info error logging in `getSessionInfo` * refactor: Update error logging to use `logAxiosError` in various agent handlers and skill file processing functions * refactor: Consolidate missing resource checks in `createToolExecuteHandler` for better clarity	2026-05-11 02:15:51 -04:00
Danny Avila	846eb0aa2c	🛰️ fix: Validate Vertex Endpoint Overrides (#13054 ) * fix: Validate Vertex endpoint overrides * fix: Allow Vertex PSC endpoint overrides * fix: Allow restricted Vertex PSC endpoints	2026-05-11 01:11:14 -04:00
Danny Avila	5bab22d236	🛡️ fix: Gate Bash PTC Capabilities (#13053 )	2026-05-10 21:23:02 -04:00
Danny Avila	c3ec23f9b8	🌐 feat: Support Vertex AI Multi-Region Endpoints (#13044 ) * feat: support Vertex AI multi-region endpoints * fix: sync Vertex endpoint with final location	2026-05-10 13:41:58 -04:00
Danny Avila	8fc68ebac0	🧬 refactor: Align OpenRouter Reasoning Payloads (#13039 ) * fix: Align OpenRouter reasoning payloads * test: Update OpenRouter reasoning expectations * fix: Preserve xhigh for future Claude models * fix: Preserve OpenRouter Responses verbosity * test: Type OpenRouter verbosity fixture * fix: Preserve custom verbosity values	2026-05-09 21:04:21 -04:00
Danny Avila	715a4a5fc1	🧰 refactor: Use Bash PTC for Agent Tools (#13042 ) * fix: Use Bash PTC for programmatic agent tools * fix: Preserve legacy PTC event calls	2026-05-09 16:31:09 -04:00
Danny Avila	2e683f112b	🦘 fix: Skip OpenAI Model Fetch For User-Provided Keys (#13038 ) * fix: skip OpenAI model fetch if using user-provided key There was a check present (via `opts.userProvidedOpenAI`), but it wasn't working because `loadDefaultModels()` doesn't provide that parameter. As a result, the server would repeatedly try to request models from OpenAI and get 401 errors in return. We now check the env var directly, which matches how `getAnthropicModels()` works. * chore: remove unused OpenAI model option * fix: honor explicit OpenAI key for model fetch * fix: fall back from empty OpenAI option key --------- Co-authored-by: Dan Lew <daniel@mightyacorn.com>	2026-05-09 16:12:25 -04:00
Danny Avila	80ce956c94	📜 fix: Scope Read File Prompt For Code Agents (#13040 ) * fix: Scope read_file prompt for code agents * fix: Align code read_file prompt behavior	2026-05-09 16:09:56 -04:00
Danny Avila	c67e2b54dc	🔐 feat: Mint Code API Auth Tokens (#13028 ) * feat: Mint CodeAPI auth tokens * style: Format CodeAPI download route * fix: Prune CodeAPI token cache * fix: Propagate CodeAPI managed auth * test: Mock CodeAPI auth in traversal suite * fix: Pass auth context to invoked skill cache * feat: Mint CodeAPI plan context * chore: Refresh CodeAPI auth guidance * fix: Guard OpenID JWT fallback * fix: Default CodeAPI JWT tenant in single-tenant mode * chore: Update @librechat/agents to version 3.1.84 in package-lock.json and package.json files * chore: Standardize references to Code API in comments and tests	2026-05-09 16:09:10 -04:00
Danny Avila	8a654dc8b1	🧭 feat: Add OpenRouter Prompt Cache Setting (#13029 ) * feat: add OpenRouter prompt cache setting * fix: type OpenRouter schema lookup * fix: honor proxied OpenRouter prompt cache * refactor: flatten endpoint schema fallback * chore: Bump `@librechat/agents` to version 3.1.82 * fix: Default OpenRouter prompt cache params * test: Align OpenRouter config expectations * test: Update OpenRouter default cache expectation * fix: Align OpenRouter Detection * chore: Bump `@librechat/agents` to version 3.1.83 * docs: Remove OpenRouter prompt cache setup note * refactor: Use provider enum for OpenRouter defaults * style: Format OpenRouter defaults guard	2026-05-09 11:46:09 -04:00
Dustin Healy	0d5c2b339a	🛟 fix: Allow Empty `modelSpecs.list` to Unstick Admin-Panel Saves (#13036 ) * 🛟 fix: Allow empty modelSpecs.list to unstick admin-panel saves The unconditional `.min(1)` on `specsConfigSchema.list` rejected an empty list even when `enforce: false`, leaving admin panels (which save fields path-granularly) with no atomic way to clear the list once it had been populated. Once an admin reached `list: [entry]` and deleted the only entry, every subsequent save failed schema validation and the section became stuck. Relax the schema to `.default([])`. The `.min(1)` was added in #5218 as part of bundled cleanup, not as a deliberate rule. Every consumer of `modelSpecs.list` already handles the empty/undefined case (`?.list`, `?? []`, length-checked), and `processModelSpecs` short-circuits to `undefined` when the list is empty so the runtime treats it as "no specs configured." No call site is load-bearing on length >= 1. Tighten the `buildEndpointOption.js` enforce guard from `?.list && ?.enforce` to `?.list?.length && ?.enforce`. Empty arrays are truthy in JS, so the existing guard would have entered the enforce branch on `list: []` and returned "No model spec selected" or "Invalid model spec" had `processModelSpecs` ever been bypassed. Add a runtime warn in `processModelSpecs` when `enforce: true` is configured alongside an empty list, so operators see the resulting "enforcement disabled" state in logs rather than silently getting a permissive runtime. Add coverage for the empty-list parse path in `config-schemas.spec.ts` and for the empty-list-with-enforce branch in `buildEndpointOption.spec.js`. * chore: update import order in config-schemas.spec.ts	2026-05-09 11:39:15 -04:00
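The guard change described in the commit above hinges on empty arrays being truthy in JavaScript. A minimal sketch of the two guards follows; the interface and function names are hypothetical illustrations, not the code in `buildEndpointOption.js`.

```typescript
// Hypothetical shape standing in for the relevant part of the modelSpecs config.
interface ModelSpecsSketch {
  enforce?: boolean;
  list?: Array<{ name: string }>;
}

// Old guard shape: `?.list && ?.enforce`. An empty array is truthy,
// so `list: []` with `enforce: true` still enters the enforce branch.
function legacyEnforceGuard(modelSpecs?: ModelSpecsSketch): boolean {
  return Boolean(modelSpecs?.list && modelSpecs?.enforce);
}

// Tightened guard shape: `?.list?.length && ?.enforce`. A length of 0 is falsy,
// so an empty list skips enforcement.
function tightenedEnforceGuard(modelSpecs?: ModelSpecsSketch): boolean {
  return Boolean(modelSpecs?.list?.length && modelSpecs?.enforce);
}
```

With `{ enforce: true, list: [] }`, the legacy guard returns `true` and the tightened guard returns `false`, which is the difference the commit message describes.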
Danny Avila	c7a4e6d418	📦 chore: Bump `@babel/preset-env` to v7.29.5 (#13034 )	2026-05-08 19:51:06 -04:00
Danny Avila	b922187abb	🛟 fix: Summarization Provider misses `vertexai` + case-mismatched custom endpoints (#13025 ) `resolveSummarizationProvider` calls `getProviderConfig` to translate the agent's resolved provider into an initializer + client overrides. Three real-world inputs were unsupported and fell through to "raw provider" fallback (silently dropping client overrides): 1. `vertexai` — not in `providerConfigMap` at all. Vertex shares initialization with Google (auth-only runtime distinction). Map `Providers.VERTEXAI` to `initializeGoogle`. 2. `openrouter` (and other known custom providers) with CamelCase custom endpoint names — agent main flow looks up endpoints case-sensitively (case-preserving keys are how `loadCustomEndpointsConfig` lets users have distinct entries differing only in case). Once it succeeds, `agent.provider` is normalized to lowercase. Downstream resolvers re-enter `getProviderConfig` with the lowercased value and miss configs whose `name` is camel-cased. Add a case-insensitive fallback, narrowly scoped to known custom providers and only after the case-sensitive direct lookup fails. 3. Ambiguous case-insensitive matches (codex review feedback) — if the user has e.g. `OpenRouter` and `OPENROUTER` (neither lowercase) and the agent runtime passes `openrouter`, the case-insensitive fallback could silently route to whichever entry appears first in the array (potentially different baseURL/apiKey). Detect multiple case-insensitive matches and throw a clear error with both names rather than picking arbitrarily. 
## Tests `providers.spec.ts` — new file, 7 tests: - vertexai → Google initializer - google (API key) → Google initializer (regression guard) - case-insensitive fallback when only CamelCase entry exists - exact-case match preserved when both casings exist (case identity) - exact-case lowercase entry still resolves - throws on ambiguous case-insensitive matches when no exact-case exists - still throws when no match at all	2026-05-08 18:52:01 -04:00
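The lookup order described in the commit above (exact-case match first, then a case-insensitive fallback that refuses to pick between ambiguous entries) can be sketched as follows. This is an illustration under assumptions: `CustomEndpointSketch` and `resolveEndpointSketch` are hypothetical names, and the sketch omits the restriction to known custom providers that the commit message mentions.

```typescript
// Hypothetical minimal endpoint shape; real configs carry more fields.
interface CustomEndpointSketch {
  name: string;
  baseURL: string;
}

function resolveEndpointSketch(
  provider: string,
  endpoints: CustomEndpointSketch[],
): CustomEndpointSketch {
  // 1. A case-sensitive match always wins, preserving case identity.
  const exact = endpoints.find((e) => e.name === provider);
  if (exact) {
    return exact;
  }

  // 2. Fall back to a case-insensitive comparison.
  const wanted = provider.toLowerCase();
  const [first, ...rest] = endpoints.filter((e) => e.name.toLowerCase() === wanted);
  if (!first) {
    throw new Error(`No custom endpoint matches "${provider}"`);
  }

  // 3. Refuse to guess when several entries differ only by case.
  if (rest.length > 0) {
    const names = [first, ...rest].map((e) => e.name).join(', ');
    throw new Error(`Ambiguous custom endpoint "${provider}": ${names}`);
  }
  return first;
}
```

Throwing on ambiguity matters because the candidate entries may point at different `baseURL` values, so silently taking the first one could route requests to the wrong backend.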
Dustin Healy	e262219c8f	🔄 feat: Cross-Origin Admin OAuth Refresh (#13007 ) * feat(admin-panel): add /api/admin/oauth/refresh endpoint for cross-origin BFF refresh The cookie-based /api/auth/refresh controller can't be reached cross-origin from a separately-hosted admin panel because the refresh-token cookie isn't sent on cross-origin fetches. Add a dedicated POST /api/admin/oauth/refresh endpoint that accepts the refresh token in the request body, exchanges it at the IdP via openid-client refreshTokenGrant, and returns the same response shape as /api/admin/oauth/exchange. Implementation lives in packages/api/src/auth/refresh.ts as the applyAdminRefresh helper. It validates the refreshed tokenset, looks up the admin user by openidId (with optional user_id disambiguation when multiple user docs share an openidId), mints the bearer via an injected mintToken hook, and runs an optional onRefreshSuccess hook for downstream forks that need to update server-side session state. The default mintToken passed by the OSS route signs an HS256 LibreChat JWT via generateToken so admin panel callers continue to use the existing local JWT strategy. Forks that prefer to hand back an IdP-signed token (e.g. for deployments where the JWT auth gate is JWKS-only) override mintToken without changing the helper or the route. Also threads expiresAt through AdminExchangeData and AdminExchangeResponse so admin panel clients can drive proactive refresh before the bearer expires. Defaults the OSS exchange flow to Date.now() + sessionExpiry. * fix(admin-panel): address review feedback on /api/admin/oauth/refresh mintToken now returns {token, expiresAt} so the minter is authoritative for the bearer's lifetime instead of deriving it from the IdP `exp` claim. The refresh response would otherwise lie to the admin panel and trigger premature or late refresh cycles. 
The helper now falls back to the inbound refresh_token when the IdP omits one on rotation (Auth0 with rotation off, Microsoft personal accounts). Without this the admin panel loses its refresh capability after one cycle. Other hardening: resolveAdminUser validates user_id with Types.ObjectId.isValid before hitting Mongoose, avoiding a CastError that would surface as a generic 500 with no useful information for the client. If user_id resolves to a user whose openidId does not match the refreshed sub, throw USER_ID_MISMATCH (401) instead of silently swapping in a different user matching the sub. Wrap tokenset.claims() in readClaims so an IdP that returns a tokenset without a usable id_token gets mapped to CLAIMS_INCOMPLETE (502) rather than bubbling a raw exception. findUsers now uses the same SAFE_USER_PROJECTION as getUserById so the fallback path no longer pulls password/totpSecret/backupCodes into memory. Removed dead fields (email on AdminRefreshClaims, id_token on RefreshTokenset) and fixed import ordering per AGENTS.md. Adds packages/api/src/auth/refresh.spec.ts: 18 tests covering the happy path, userId disambiguation (match, invalid ObjectId, null, mismatch), all error branches (IDP_INCOMPLETE, CLAIMS_INCOMPLETE for both throw and missing sub, USER_NOT_FOUND, mintToken/onRefreshSuccess propagation), and refresh-token preservation under rotation/no-rotation. * chore(admin-panel): polish per re-review on /api/admin/oauth/refresh readClaims now logs the original error name/message at warn before mapping to CLAIMS_INCOMPLETE so a programming bug doesn't get silently rebadged as an IdP problem in production logs. The route handler's JSDoc now enumerates every error response (status + error_code) so admin-panel implementors can plan for each branch without reading the source. 
Tightens the helper's surface: removed the now-dead `exp` field from `AdminRefreshClaims` (only `sub` is read since the v2 mintToken refactor), and tightened `AdminRefreshDeps.findUsers`'s projection parameter from `string | null` to `string` so the contract matches actual usage. Test polish: the userId-resolves-to-null fallthrough test now asserts the exact `findUsers` and `getUserById` call arguments so a regression in the fallthrough query shape is caught. The "skips onRefreshSuccess" test now asserts a populated response shape rather than just `toBeDefined`. Declined per prior triage and re-confirmed: a role guard inside `applyAdminRefresh` (downstream `/api/admin/` already enforces ACCESS_ADMIN via requireCapability) and moving the IdP grant call out of the JS route into TypeScript (matches existing oauth.js / openidStrategy pattern; package-boundary refactor belongs in a separate PR). fix(admin-panel): reject /api/admin/oauth/refresh tokensets from foreign issuers When the route handler can resolve the configured OpenID issuer, it now threads it into applyAdminRefresh as expectedIssuer. The helper compares that against the tokenset claims iss (after normalizeOpenIdIssuer on both sides to absorb trailing-slash differences) and throws ISSUER_MISMATCH (401) on mismatch. The check is skipped when either side is unset so behavior is unchanged for IdPs that don't return iss on a refresh-grant id_token, and for older deployments where the OpenID config doesn't expose serverMetadata. This is a defense-in-depth measure for the refresh path only. The deeper OIDC posture fix (binding IUser lookup to (sub, iss) as a pair) is pre-existing debt across openidStrategy.js and the regular exchange flow as well, and belongs in a separate PR with the schema change and backfill migration. 
* fix(admin-panel): bind refresh user lookup to (sub, iss) and handle getOpenIdConfig throw Two fixes raised on the PR thread that I previously misdescribed: The user lookup in resolveAdminUser was keyed on openidId alone, so a tokenset from a different issuer that happened to share the same sub could resolve to a local user from a different IdP. Now exports getIssuerBoundConditions and isUserIssuerAllowed from openid.ts (the helpers findOpenIDUser already uses) and reuses them. The findUsers filter becomes ($or of getIssuerBoundConditions for openidId) when an expectedIssuer is provided, with the same legacy backward-compat clause for users whose openidIssuer field was never populated. The direct user_id path now also checks isUserIssuerAllowed and throws USER_ID_MISMATCH if the stored openidIssuer disagrees with the configured issuer. The route's getOpenIdConfig() call was previously documented as returning null when uninitialized; the actual implementation throws. That made the if (!openIdConfig) guard unreachable, and an unconfigured server would surface as 500 INTERNAL_ERROR rather than 503 OPENID_NOT_CONFIGURED. Wraps the call in try/catch so the documented 503 response is what callers actually receive. Adds 4 tests covering the new lookup binding behavior. * fix(admin-panel): re-check ACCESS_ADMIN on /api/admin/oauth/refresh The IdP refresh token can outlive a capability/role change, so the initial requireAdminAccess on the OAuth callback isn't sufficient. Inject canAccessAdmin via the existing capability model (hasCapability with SystemCapabilities.ACCESS_ADMIN, matching requireAdminAccess so custom roles and user grants are honored) and gate token minting on it. Capability backend errors are warn-and-denied to keep the bearer-mint path fail-closed. * fix(admin-panel): scope /api/admin/oauth/refresh to the request tenant The same (openidId, openidIssuer) pair is allowed across tenants by the user schema's unique index. 
The refresh helper was wrapping both the direct getUserById and the fallback findUsers in runAsSystem, bypassing tenant isolation, so an IdP identity that exists in two tenants could resolve to the wrong tenant's user and mint a JWT bound to that tenant. Drop the runAsSystem wrappers, add a trusted tenantId option to applyAdminRefresh, AND it into the fallback findUsers filter, and assert it against the direct getUserById result. Mount preAuthTenantMiddleware on the refresh route so the deployment's X-Tenant-Id header drives the trusted tenant via ALS. Single-tenant deploys (no header) keep the existing openidId-only behaviour. Adds TENANT_MISMATCH (401) and a regression covering duplicate (sub, iss) across tenants plus the direct-userId tenant assertion. * fix(admin-panel): gate /api/admin/oauth/refresh on OPENID_REUSE_TOKENS The OSS refreshController only refreshes OpenID tokensets when OPENID_REUSE_TOKENS is enabled. The body-based admin variant was unconditionally calling refreshTokenGrant, which made the flag ineffective for the admin OAuth flow and let admin sessions keep renewing in deployments that explicitly turned token reuse off. Add the same isEnabled(process.env.OPENID_REUSE_TOKENS) check up front and return 403 TOKEN_REUSE_DISABLED so the admin panel BFF can surface the configuration mismatch instead of silently churning through retries.	2026-05-08 17:23:02 -04:00
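The issuer comparison described in the commit above (normalize both sides, skip the check when either side is unset) might look roughly like this. The function names are hypothetical, and the trailing-slash stripping is only an assumption about what `normalizeOpenIdIssuer` does, based on the commit message; the real helper throws `ISSUER_MISMATCH` rather than returning a boolean.

```typescript
// Assumed normalization: strip trailing slashes so that
// "https://idp.example.com/" and "https://idp.example.com" compare equal.
function normalizeIssuerSketch(issuer: string): string {
  return issuer.replace(/\/+$/, '');
}

// Returns true when the tokenset may be accepted with respect to its issuer.
function isIssuerAcceptableSketch(expectedIssuer?: string, claimedIssuer?: string): boolean {
  // The check is skipped when either side is unset, matching the
  // backward-compatible behavior described in the commit message.
  if (!expectedIssuer || !claimedIssuer) {
    return true;
  }
  return normalizeIssuerSketch(expectedIssuer) === normalizeIssuerSketch(claimedIssuer);
}
```

Skipping the comparison when `iss` is absent keeps older deployments working, at the cost of making this a defense-in-depth measure rather than a hard guarantee.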
Danny Avila	a107520109	📦 chore: Bump `@librechat/agents` to v3.1.81 & npm audit fix (#13027 ) * 📦 chore: Bump `@librechat/agents` to v3.1.81 * chore: npm audit fix	2026-05-08 16:20:03 -04:00
Danny Avila	3d5e5348a4	🧵 fix: Include Code Outputs in Thread File Lookup (#13023 ) Code-execution outputs land on `messages.attachments` (set by `processCodeOutput`), while user uploads land on `messages.files`. The threadFileIds switch (#13004) walked only `files`, so on a single linear thread: Turn 1: assistant produces sample.xlsx → attachment with codeEnvRef Turn 2: user says "add 2 rows" → primeCodeFiles: file_ids=0 resourceFiles=0 → /exec sent files=[] → sandbox: FileNotFoundError: 'sample.xlsx' The `getThreadData` walk found zero file_ids because the assistant's codeEnvRef was on `attachments`, not `files`. Compounded by the DB select string `'messageId parentMessageId files'` which didn't pull `attachments` into memory in the first place — so even fixing the walk in isolation wouldn't have surfaced them. Both layers fixed: - `ThreadMessage` type adds `attachments?: Array<{ file_id?: string }>` - `getThreadData` walks both arrays, dedups via the same Set - `initialize.ts` selects `'messageId parentMessageId files attachments'` ## Test plan `packages/api/src/utils/message.spec.ts` (+6 cases): - collects file_ids from `attachments` - walks both `files` and `attachments` on the same message - regression: linear thread with code-output attachments across user→assistant→user→assistant produces the right file_ids - dedupes shared ids that appear in both arrays - skips attachments without file_id (mirrors `files` behavior) - empty `attachments` array `packages/api/src/agents/__tests__/initialize.test.ts` (+1 case): - locks the DB select string includes `attachments` alongside `files` / `messageId` / `parentMessageId` - [x] `npx jest src/utils/message.spec.ts` — 39/39 pass - [x] `npx jest src/agents/__tests__/initialize.test.ts` — 33/33 pass - [x] lint clean on all four touched files	2026-05-08 12:29:46 -04:00
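The thread walk described in the commit above, which collects `file_id` values from both `files` and `attachments` and deduplicates them through a single `Set`, can be sketched like this. It is an illustration under assumptions: the type and function names are hypothetical, and the actual `getThreadData` signature and return shape may differ.

```typescript
interface FileRefSketch {
  file_id?: string;
}

// Hypothetical message shape, limited to the fields named in the commit message.
interface ThreadMessageSketch {
  messageId: string;
  parentMessageId?: string | null;
  files?: FileRefSketch[];
  attachments?: FileRefSketch[];
}

function collectThreadFileIdsSketch(messages: ThreadMessageSketch[], leafId: string): string[] {
  const byId = new Map<string, ThreadMessageSketch>();
  for (const message of messages) {
    byId.set(message.messageId, message);
  }

  const fileIds = new Set<string>();
  const visited = new Set<string>(); // guards against parent cycles
  let current = byId.get(leafId);

  // Walk from the leaf back to the root, following parentMessageId.
  while (current && !visited.has(current.messageId)) {
    visited.add(current.messageId);
    // User uploads live on `files`; code-execution outputs live on `attachments`.
    const refs = (current.files ?? []).concat(current.attachments ?? []);
    for (const ref of refs) {
      if (ref.file_id) {
        fileIds.add(ref.file_id); // entries without a file_id are skipped
      }
    }
    current = current.parentMessageId ? byId.get(current.parentMessageId) : undefined;
  }
  return Array.from(fileIds);
}
```

Messages on sibling branches are never visited because the walk only follows `parentMessageId` from the chosen leaf, which is why the lookup has to be anchored on the file ids referenced in the thread.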
Danny Avila	119ac9c944	📦 chore: bump `@librechat/agents` to v3.1.80 (#13021 )	2026-05-08 12:29:44 -04:00
Danny Avila	eb20d8805d	🐛 refactor: anchor code-generated file lookup on threadFileIds for branched conversations (#13004 ) * 🐛 fix: anchor getCodeGeneratedFiles on threadFileIds, not threadMessageIds In a branched conversation (regenerations producing the same code-output filename), `getCodeGeneratedFiles` would silently exclude files whose File-record `messageId` lived on a sibling branch. The user-visible symptom: "the previous file isn't persisted" — the LLM tries `load_workbook("output.xlsx")` on turn 2 and gets `FileNotFoundError` because LC sent `_injected_files: []` to codeapi instead of priming the prior turn's output. `claimCodeFile` is keyed by `(filename, conversationId, context)` — not by messageId. When sibling A first creates `output.csv`, the File record persists with `messageId = A`. When sibling N (a regeneration of A's parent) recreates `output.csv`, the claim finds A's record and `processCodeOutput` deliberately preserves `messageId = A` to keep file→original-creator provenance intact (correct behavior for the linear case where the original creator is in-thread). Turn N+1's `parentMessageId = N`. `getThreadData` walks back from N: the thread is `[N, root]` — sibling A is NOT in it. The pre-fix query filtered by `messageId IN [N, root]`, so the file was excluded. `getCodeGeneratedFiles` already lives next to `getUserCodeFiles`, which has always filtered by `file_id IN threadFileIds` (the file_ids referenced by `messages.files[]` arrays during the thread walk). The asymmetry — user-uploaded files anchored on the message's reference, code-generated files anchored on the File's own creator — was the bug. Anchoring both functions on `threadFileIds` reaches the right files regardless of which sibling first generated them. `File.messageId` stays informational ("who first generated this") for provenance and `processCodeOutput`'s "preserve original messageId on update" logic stays as-is — only the lookup key for thread-scoped fetches changes. 
- `packages/data-schemas/src/methods/file.ts`: signature + filter change. JSDoc spells out the branched-conversation rationale. - `packages/api/src/agents/initialize.ts`: pass `threadFileIds` instead of `threadMessageIds`. The local `threadMessageIds` declaration is removed since the only consumer is gone. - `packages/data-schemas/src/methods/file.spec.ts`: 5 new cases: - basic happy-path (file referenced by current thread) - the regression: file's creator messageId is on a sibling branch but file_id is in threadFileIds → finds it - empty/missing threadFileIds returns [] - cross-conversation isolation - non-execute_code context filter still applies (a chat attachment won't be returned even if its file_id is in threadFileIds — that's `getUserCodeFiles`'s job) Applies cleanly on top of dev. When LC #12960 (the typed CodeEnvRef cutover) lands, the only conflict is the legacy `metadata.fileIdentifier` metadata key flipping to `metadata.codeEnvRef` — same line, trivial resolve. - [x] `cd packages/data-schemas && npx jest src/methods/file.spec` — 42/42 pass (including the 5 new regression cases) - [x] `cd packages/api && npx jest src/agents` — 722/722 pass (modulo 2 pre-existing summarization e2e failures unrelated) - [x] `cd api && npx jest server/services/Files server/controllers/agents` — 432/432 pass - [x] `npx tsc --noEmit -p packages/api/tsconfig.json` — clean - [ ] Manual: branched conversation reproducer — generate a file in turn 1, regenerate the parent (sibling), then in turn N+1 ask the agent to read the file. Pre-fix: `FileNotFoundError`. Post-fix: the file is primed and load_workbook succeeds. * 🧪 test: lock initialize.ts → getCodeGeneratedFiles call shape Integration-level regression test asserting initializeAgent passes `threadFileIds` (not `threadMessageIds`) to getCodeGeneratedFiles in branched-conversation scenarios. 
Locks in the API shape from the previous commit, sitting one layer above the data-schemas unit test — so a future refactor to the priming chain can't silently revert to the messageId-based filter without surfacing a test failure here. Two cases: - The full call shape: agent.tools=['execute_code'], resendFiles=true, threadData mock returns distinct messageIds and fileIds. Asserts the call uses fileIds, and that getUserCodeFiles uses the same array (the symmetric design that closes the sibling-branch hole). - Empty threadFileIds: getCodeGeneratedFiles is still called with [] (its own internal early-return handles the empty case); getUserCodeFiles is gated at the call site and stays unscheduled.	2026-05-08 12:29:44 -04:00
Danny Avila	e7dbae32e5	🚦 fix: Preserve URL Auto-Submit Startup Config (#13017 ) * fix: Preserve URL auto-submit startup config * test: Cover URL auto-submit interface defaults	2026-05-08 12:29:44 -04:00
Danny Avila	0fe203aaca	🧠 fix: charge Gemini reasoning tokens in agent usage accounting (#13014 ) * 🧠 fix: charge Gemini reasoning tokens in agent usage accounting Resolves #13006. `usage.ts` previously billed `usage.output_tokens` directly. For Vertex AI Gemini thinking models, `@langchain/google-common`'s streaming path emits `output_tokens = candidatesTokenCount` only, dropping `thoughtsTokenCount`. Reasoning was billed at zero and the `total_tokens === input_tokens + output_tokens` invariant was broken. The fix lives in agents (danny-avila/agents#157) — but this is also a defense-in-depth backstop in case agents misses a path or another provider exhibits the same shape. `resolveCompletionTokens(usage)` adds `output_token_details.reasoning` back when (and only when) the gap is present (`total - input > output`), so providers that already include reasoning in `output_tokens` (OpenAI o-series, Anthropic, the Google-API wrapper) are no-ops — no double-counting. - `SplitUsage` gains a `completion` field; all four billing call sites in `processUsageGroup` use it instead of `usage.output_tokens`. - `total_output_tokens` in the result also reflects the corrected count. - `UsageMetadata` interface in `IJobStore.ts` adds the `output_token_details` field for type safety. - 4 new tests in `usage.spec.ts` cover: Vertex undercount fix, OpenAI no-double-count, structured spend path with cache + reasoning, no-op when no details present. * 🩹 fix: simplify reasoning correction to invariant-based gap check Initial fix gated the correction on `output_token_details.reasoning > 0`, which doesn't help in the live failure case: when google-common's stream emits the buggy fallback usage_metadata, output_token_details is empty ({}) and the gate exits early. Live debugging showed the reliable signal is the documented invariant itself: `total_tokens === input_tokens + output_tokens`. When buggy streams undercount output, total exceeds input + output by exactly the unbilled reasoning. 
Use `total - input` as the corrected output. This is provider-agnostic and stays a no-op for compliant providers (OpenAI/Anthropic/Google-via-CustomChatGoogleGenerativeAI), where the gap is zero. Live verified end-to-end against gemini-3-flash-preview: - With agents fix in place: output_tokens=437 → billed 437 (no-op) - Backstop only (no agents fix, buggy input): raw 135, billed 297 (= total 309 - input 12, matches actual API charge) Updated tests to cover both scenarios.	2026-05-08 12:29:43 -04:00
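The invariant-based correction described in the commit above can be sketched as follows. The function name and the `UsageSketch` shape are hypothetical; the sketch only assumes the three token fields named in the commit message and is not the code in `usage.ts`.

```typescript
// Hypothetical subset of the usage metadata fields named in the commit message.
interface UsageSketch {
  input_tokens: number;
  output_tokens: number;
  total_tokens?: number;
}

// Compliant providers satisfy total_tokens === input_tokens + output_tokens.
// When a stream undercounts output, `total - input` exceeds `output_tokens`
// by the unbilled reasoning, so the gap is used as the corrected count.
function resolveCompletionTokensSketch(usage: UsageSketch): number {
  if (usage.total_tokens == null) {
    return usage.output_tokens; // nothing to compare against
  }
  const implied = usage.total_tokens - usage.input_tokens;
  return implied > usage.output_tokens ? implied : usage.output_tokens;
}
```

Using the figures quoted in the commit message, a buggy usage record with input 12, raw output 135, and total 309 resolves to 297, while a compliant record whose total equals input plus output is returned unchanged.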
Danny Avila	93c4ef4ba8	🧱 refactor: typed CodeEnvRef + kind discriminator + principal-aware sandbox cache (#12960 ) * 🧱 refactor: typed CodeEnvRef + kind discriminator + tenant-aware sandbox cache Final cutover for the LibreChat ↔ codeapi sandbox file identity. Replaces the magic string `${session_id}/${file_id}?entity_id=...` with a typed, discriminated `CodeEnvRef`. Pre-release lockstep deploy with codeapi #1455 and agents #148; no legacy aliases retained. ## Final shape ```ts type CodeEnvRef = | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number } | { kind: 'agent'; id: string; storage_session_id: string; file_id: string } | { kind: 'user'; id: string; storage_session_id: string; file_id: string }; ``` `kind` drives codeapi's sessionKey: `<tenant>:<kind>:<id>[:v:<version>]` for shared kinds, `<tenant>:user:<userId>` for user-private (auth context provides `userId`). `version` is statically required for `kind: 'skill'` and forbidden otherwise via discriminated union — constraint holds at compile time on every consumer, not just codeapi's runtime validator. `id` is sessionKey-meaningful for `'skill'` / `'agent'`; informational only for `'user'` (codeapi resolves user identity from auth context). ## What changed - `packages/data-provider/src/codeEnvRef.ts` — discriminated union + `CODE_ENV_KINDS` const-tuple keeps the runtime list and TS union locked together. - Schemas: `metadata.codeEnvRef` and `SkillFile.codeEnvRef` enums tightened to `['skill', 'agent', 'user']`. - `primeSkillFiles` writes `kind: 'skill'`, `id: skill._id`, `version: skill.version`. Cache-hit path reads `codeEnvRef` directly. Bumping `skill.version` on edit naturally invalidates the prior cache entry under the new sessionKey. - `processCodeOutput` writes `kind: 'user'`, `id: req.user.id`. Output bucket is always user-scoped, regardless of which skill the execution invoked. New regression test pins the asymmetry. 
- `primeFiles` reupload preserves `kind`/`id`/`version?` from the existing ref so a skill-cache-miss reupload doesn't silently demote to user bucket. - `crud.js` upload functions (`uploadCodeEnvFile` / `batchUploadCodeEnvFiles`) thread `kind`/`id`/`version?` to the multipart form (codeapi #1455 option α). Without these on the wire, codeapi falls back to user bucketing and skill-cache invalidation never fires. Client-side validation mirrors codeapi's validator. - `Files/process.js` — chat attachments use `kind: 'user'`; agent setup files use `kind: 'agent'`. - Drops `entity_id` everywhere (struct, schema sub-docs, write paths, upload form fields). Drops `'system'` from the kind enum (no emitter ever existed). ## Test plan - [x] `cd packages/data-provider && npx jest src/codeEnvRef.spec` — 4 / 4 - [x] `cd packages/data-schemas && npx jest` — 1447 / 1447 - [x] `cd packages/api && npx jest src/agents` — 81 / 81 in skillFiles + handlers + resources - [x] `cd api && npx jest server/services/Files server/controllers/agents` — 436 / 436 - [x] `cd api && npx jest server/services/Files/Code` — 98 / 98 (incl. new "outputs are user-scoped regardless of which skill the execution invoked" regression and "reupload forwards kind/id/version from existing ref") - [x] `npx tsc --noEmit -p packages/data-{provider,schemas}/tsconfig.json && npx tsc --noEmit -p packages/api/tsconfig.json` — clean (only pre-existing unrelated dev errors in storage/balance, untouched here) ## Deploy notes - 24h cache-miss burst on first deploy. Inputs (skill caches re-prime under new sessionKey shape) and outputs (any pre-Phase C skill-output cached files become unreadable). Bounded by codeapi's 24h TTL. - Lockstep with codeapi #1455 and agents #148. Either repo can land first since no aliases to drain, but the three deploys must overlap within the same maintenance window. - `@librechat/agents` bump to `3.1.79-dev.0` required after agents #148 lands and is published. 
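The "Final shape" and the const-tuple pattern described in this commit can be illustrated with a short sketch. The `CodeEnvRef` union and the `CODE_ENV_KINDS` name come from the commit message; `isCodeEnvKind` and `sessionKeySketch` are hypothetical helpers, and the `:v:` version separator is an assumption inferred from the sessionKey examples in this log, not codeapi's confirmed implementation.

```typescript
// Runtime list of kinds; the union type is derived from it so the two cannot drift apart.
const CODE_ENV_KINDS = ['skill', 'agent', 'user'] as const;
type CodeEnvKind = (typeof CODE_ENV_KINDS)[number];

// Shape as given in the commit message: `version` exists only on the 'skill' variant.
type CodeEnvRef =
  | { kind: 'skill'; id: string; storage_session_id: string; file_id: string; version: number }
  | { kind: 'agent'; id: string; storage_session_id: string; file_id: string }
  | { kind: 'user'; id: string; storage_session_id: string; file_id: string };

// Hypothetical runtime guard backed by the same tuple.
function isCodeEnvKind(value: unknown): value is CodeEnvKind {
  return typeof value === 'string' && (CODE_ENV_KINDS as readonly string[]).includes(value);
}

// Hypothetical derivation mirroring the described key layout.
function sessionKeySketch(tenant: string, authUserId: string, ref: CodeEnvRef): string {
  switch (ref.kind) {
    case 'skill':
      // Shared kind with a version suffix (separator is an assumption).
      return `${tenant}:skill:${ref.id}:v:${ref.version}`;
    case 'agent':
      return `${tenant}:agent:${ref.id}`;
    case 'user':
      // User identity comes from the auth context; ref.id is informational only.
      return `${tenant}:user:${authUserId}`;
  }
}

const skillKey = sessionKeySketch('legacy', 'u1', {
  kind: 'skill', id: 'abc', storage_session_id: 's1', file_id: 'f1', version: 59,
});
```

Because `version` only exists on the `'skill'` variant, passing it with `kind: 'agent'` or `kind: 'user'` fails at compile time, which is the constraint the commit describes.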
## What this enables Auth bridge work (JWT-based tenant/user identity between LC and codeapi) — codeapi now derives sessionKey purely from `req.codeApiAuthContext.{ tenantId, userId}`, so the next chapter is replacing the header-asserted user identity with a verified-claim path. * 🩹 fix: persist execute_code uploads under codeEnvRef metadata key Codex review P1 (chatgpt-codex-connector). `Files/process.js` was storing the upload result under `metadata.fileIdentifier` even though: - `uploadCodeEnvFile` now returns `{ storage_session_id, file_id }`, not the legacy magic string. - The post-cutover schema (`File.metadata.codeEnvRef`) only declares `codeEnvRef` — mongoose strict mode silently strips unknown keys. - All readers (`primeFiles`, `getCodeFilesByIds`, `categorizeFileForToolResources`, controller filtering) check `metadata.codeEnvRef`. Net effect of the bug: chat-attached and agent-setup execute_code files would lose their sandbox reference on save, and primeFiles would skip them on subsequent code-execution turns — the file blob would still be available locally but never re-mounted in the sandbox. Fix: construct the full `CodeEnvRef` (`{ kind, id, storage_session_id, file_id }`) at the write site and persist under `metadata.codeEnvRef`. `BaseClient`'s "is this a code-env file" presence check accepts the new shape alongside the legacy `fileIdentifier` for back-compat with any pre-cutover records still in the database. Mirrors the same change in `processAttachments.spec.ts` (which re-implements the BaseClient logic for testability). New regression tests in `process.spec.js` cover three cases: - chat attachments (`messageAttachment=true`) → `kind: 'user'` - agent setup (`messageAttachment=false`) → `kind: 'agent'` - legacy `fileIdentifier` key is NOT persisted (would be schema-stripped) * 🩹 fix: read storage_session_id on primed file refs (Codex P1) Codex review (chatgpt-codex-connector). 
After Phase B's per-file `session_id` → `storage_session_id` rename, `primeFiles` emits the new field — but `seedCodeFilesIntoSessions` was still reading `files[0].session_id` for the representative session and `f.session_id` for the dedupe key. In runs with only primed attachments (no skill seed), `representativeSessionId` was `undefined`, the function returned the unchanged map, and `seedCodeFilesIntoSessions` silently dropped the entire batch. The first `execute_code` call then started without `_injected_files` and the agent couldn't see prior-turn artifacts. Fix: - `codeFilesSession.ts`: read `f.storage_session_id` for both the dedupe key and the representative session id. JSDoc updated to match the new field name. - `callbacks.js`: the two output-file persistence paths read `file.session_id` to pass to `processCodeOutput` — switch to `file.storage_session_id`. The original comment explicitly says this should be the STORAGE session, which is exactly the field Phase B renamed. - `codeFilesSession.spec.ts`: fixture builder uses `storage_session_id` and `kind: 'user'` to match the post-cutover `CodeEnvFile` shape. Lockstep coordination: this matches the post-bump shape of `@librechat/agents` 3.1.79+. CI tsc errors against the currently-pinned 3.1.78 are expected and resolve when the dep bumps in this PR before merge. * 📦 chore: Bump `@librechat/agents` to version 3.1.80-dev.0 in package-lock and package.json files * 🪪 fix: thread kind/id/version through codeapi /download URLs (Phase C α) Symmetric fix for the upload-side wire change in 537725a. Codeapi's `sessionAuth` middleware now requires `kind`/`id`/`version?` on every download/freshness URL — without them it 400s with "kind must be one of: skill, agent, user" before serving the file. Three sites construct codeapi-side URLs that go through `sessionAuth`: - `processCodeOutput` (`Files/Code/process.js`): `/download/<sess>/<id>` for freshly-generated sandbox outputs. 
Always `kind: 'user'` + `id: req.user.id` — code-output files are always user-private, regardless of which skill the run invoked. - `getSessionInfo` (`Files/Code/process.js`): `/sessions/<sess>/objects/<id>` for the 23h freshness check. Pulls kind/id/version straight off the `codeEnvRef` already in scope — skill files stay skill-bucketed, user files stay user-bucketed. - `/code/download/:session_id/:fileId` LC route (`routes/files/files.js`): proxies to codeapi for manual downloads. Code-output files only on this route, so `kind: 'user'` + `id: req.user.id`. The `getCodeOutputDownloadStream` helper in `crud.js` now takes an `identity` param, validated by a `buildCodeEnvDownloadQuery` helper that mirrors `appendCodeEnvFileIdentity`'s shape rules: kind required from the closed `{skill, agent, user}` set, version required for 'skill' and forbidden otherwise. Bad callers fail fast on the client instead of round-tripping a 400. Also cleans up two log-noise sources reported alongside the 400: - `logAxiosError` in `packages/api/src/utils/axios.ts` was dumping `error.response.data` raw. With `responseType: 'arraybuffer'` that's a `Buffer` (~4 chars per byte after JSON-serialization); with `responseType: 'stream'` it's a `Readable` whose internal state serializes the entire ring buffer + socket. New `renderResponseData` decodes small buffers as UTF-8 (truncated past 2KB) and stubs streams as `'[stream]'`. Diagnostics stay useful, log lines stop being megabytes. - `/code/download` route's catch was bare `logger.error('...', error)`, bypassing the redactor. Switched to `logAxiosError` so it benefits from the same buffer/stream handling. Tests updated to match the new contract: - crud.spec: `getCodeOutputDownloadStream` fixtures pass `userIdentity`; new cases cover skill identity (with version), bad kind rejection, skill-without-version rejection. - process.spec: `getSessionInfo` test passes a full `codeEnvRef` object. 
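The identity rules this sub-commit describes (closed kind set, `version` required for `'skill'` and forbidden otherwise) can be sketched as a small validator that fails fast before any request is sent. This is an illustrative stand-in, not the actual `buildCodeEnvDownloadQuery`: the function name, the error messages other than the quoted 400 text, and the exact query layout are assumptions.

```typescript
// Hypothetical identity input; `kind` is a plain string so that bad callers can be rejected at runtime.
interface IdentitySketch {
  kind: string;
  id: string;
  version?: number;
}

const ALLOWED_KINDS: readonly string[] = ['skill', 'agent', 'user'];

// Validates the identity and returns a URL query string (layout is an assumption).
function buildDownloadQuerySketch(identity: IdentitySketch): string {
  const { kind, id, version } = identity;
  if (!ALLOWED_KINDS.includes(kind)) {
    throw new Error('kind must be one of: skill, agent, user');
  }
  if (!id) {
    throw new Error('id is required');
  }
  if (kind === 'skill' && version == null) {
    throw new Error("version is required for kind 'skill'");
  }
  if (kind !== 'skill' && version != null) {
    throw new Error(`version is not allowed for kind '${kind}'`);
  }
  let query = `?kind=${encodeURIComponent(kind)}&id=${encodeURIComponent(id)}`;
  if (version != null) {
    query += `&version=${encodeURIComponent(String(version))}`;
  }
  return query;
}

const skillQuery = buildDownloadQuerySketch({ kind: 'skill', id: 'a b', version: 3 });
```

Rejecting a bad identity on the client avoids a round trip that would only end in a 400 from `sessionAuth`.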
* ♻️ refactor: extract codeEnv identity helpers into packages/api Per the project convention that new backend code lives in TypeScript under `packages/api`, moves `appendCodeEnvFileIdentity` and `buildCodeEnvDownloadQuery` from `api/server/services/Files/Code/crud.js` into a new `packages/api/src/files/code/identity.ts` module. Both helpers are pure validators that mirror codeapi's `parseUploadSessionKeyInput` server-side rules (closed kind set, `version` required for `'skill'` and forbidden otherwise) — they deserve TS support and a dedicated spec rather than living as JSDoc-typed helpers in the legacy `/api` workspace. The new module: - Exports a `CodeEnvIdentity` interface using the `librechat-data-provider` `CodeEnvKind` discriminated union. - Adds 13 unit tests in `identity.spec.ts` covering the validation matrix (skill+version, agent, user, and every rejection path) plus URL encoding for the download query. - Re-exported from `packages/api/src/files/code/index.ts` alongside `classify`, `extract`, and `form`. Consumer updates: - `api/server/services/Files/Code/crud.js`: drops the local helpers and imports them from `@librechat/api`. Net -64 lines. - `api/server/services/Files/Code/process.js`: same. - Test mocks for `@librechat/api` in three spec files now stub the helpers' validation behavior locally rather than pulling them through `requireActual` (which would drag in provider-config init-time side effects). The package's `exports` field only surfaces the root barrel, so leaf imports aren't reachable from legacy `/api` test setup. No runtime behavior change. Identity validation rules and emitted form/query shapes are byte-for-byte identical pre/post. * 🪪 fix: emit resource_id alongside id on _injected_files (skill 403 fix) Companion to codeapi #1455 fix and agents 3.1.80-dev.1 — the wire shape for shared-kind files now requires `resource_id` distinct from the storage `id`. 
Without this LC change, codeapi's sessionKey re-derivation on every shared-kind /exec rejects with 403 session_key_mismatch: cached: legacy:skill:69dcf561...:v:59 (signed at upload, skill _id) derived: legacy:skill:ysPwEURuPk-...:v:59 (storage nanoid) Emit sites updated: - `primeInvokedSkills` cache-hit path: `resource_id: ref.id` (the persisted skill `_id` from `codeEnvRef.id`); `id: ref.file_id` unchanged (storage uuid). - `primeInvokedSkills` fresh-upload path: `resource_id: skill._id.toString()` on every primed file (the `allPrimedFiles` builder type now carries the field). - `processCodeOutput`'s `pushFile` (Code/process.js): `resource_id: ref.id` — for `kind: 'user'` this is informational (codeapi derives sessionKey from auth context) but emitted for shape uniformity with shared kinds. Bumps `@librechat/agents` to `^3.1.80-dev.1` (the version that ships the matching `CodeEnvFile.resource_id` field). ## Test plan - [x] `cd packages/api && npx jest src/agents` — 67 / 67 pass (skillFiles fixtures updated to assert `resource_id` on the emitted CodeSessionContext.files). - [x] `cd api && npx jest server/services/Files server/controllers/agents` — 445 / 445 pass (process.spec fixtures updated for the reupload + cache-hit emission). - [x] `npx tsc --noEmit -p packages/api/tsconfig.json` — clean. * fix(skill-tool-call): carry resource_id through primeSkillFiles → artifact Codeapi was 400ing every /exec following a `handle_skill` tool call with `resource_id is invalid` (`type: 'undefined'`). Both code paths in `primeSkillFiles` (cache-hit + fresh-upload) returned files without `resource_id`/`kind`/`version`, and the artifact in `handlers.ts` forwarded the stripped shape into `tc.codeSessionContext.files` → `_injected_files`. `primeInvokedSkills` (the NL-detected loader) had already been fixed end-to-end; this commit aligns the tool-invoked path with the same contract: `resource_id` = `skill._id.toString()`, `kind: 'skill'`, `version` = the skill's monotonic counter. 
Tests added to `skillFiles.spec.ts` lock the contract on `primeSkillFiles` directly so future refactors can't silently drop the resource identity again. * fix(handlers.spec): align session_id → storage_session_id rename + kind discriminator Pre-existing TS errors against the post-rename `CodeEnvFile` shape: the test file still used `session_id` on per-file objects (renamed to `storage_session_id` in agents Phase B/C) and was missing the `kind` discriminator the discriminated union requires. Both inputs and the matching `expect.toEqual(...)` mirrors updated together so the runtime equality check still holds. Lines 723-732 stay as-is — they sit behind `as unknown as ToolCallRequest` and TS already skipped them. * chore: fix `@librechat/agents`, correct version to 3.1.80-dev.0 in package.json files * chore: bump `@librechat/agents` to version 3.1.80-dev.1 in package.json and package-lock.json * chore: bump `@librechat/agents` to version 3.1.80-dev.2 * feat(observability): trace file priming chain from primeCodeFiles to _injected_files Diagnosing the user-upload "files=[] on first /exec" bug requires seeing where in the LC chain a file ref disappears. Prior to this patch the chain (primeCodeFiles → primedCodeFiles → initialSessions → CodeSessionContext → _injected_files) was opaque end-to-end: - primeCodeFiles silently dropped files without `metadata.codeEnvRef` - reuploadFile catches all errors and continues with no signal - the handlers.ts handoff to codeapi never logged what it was sending After this patch, a single grep on `[primeCodeFiles]` plus `[code-env:inject]` shows the full per-file path: [primeCodeFiles] in: file_ids=N resourceFiles=M [primeCodeFiles] file=<id> path=skip reason=no-codeenvref filename=... [primeCodeFiles] file=<id> path=cache-hit-by-session storage_session_id=... [primeCodeFiles] file=<id> path=reupload reason=no-uploadtime ... [primeCodeFiles] file=<id> path=reupload reason=stale ... 
[primeCodeFiles] file=<id> path=reupload-success oldSession=... newSession=... newFileId=... [primeCodeFiles] file=<id> path=reupload-failed session=... [primeCodeFiles] file=<id> path=fresh-active storage_session_id=... [primeCodeFiles] out: returned=N skippedNoRef=M reuploadFailures=K [code-env:inject] tool=<name> files=N missingResourceId=K (debug) [code-env:inject] M/N files missing resource_id ... (warn) [code-env:inject] tool=<name> _injected_files=0 ... (warn) The boundary log warns when LC sends zero injected files on a code-execution tool call — that's the user's actual symptom showing up at the LC side instead of having to correlate against codeapi's `Request received { files: [] }`. Tag chosen as `[code-env:inject]` rather than `[handoff:exec]` to avoid collision with the app-level "handoff" semantic (subagent handoff workflow). Structural cleanup in primeFiles: replaced the `if (ref) { ... }` nesting with an early `if (!ref) continue` so the per-path instrumentation hooks land at top-level scope instead of indented inside a conditional. Behavior unchanged; pushFile / reuploadFile identical. Spec fixtures (handlers.spec.ts, codeFilesSession.spec.ts) updated to include `resource_id` on `CodeEnvFile` literals — required by the post-3.1.80-dev.2 type now installed. 
## Test plan - [x] `cd packages/api && npx jest src/agents/handlers.spec.ts src/agents/codeFilesSession.spec.ts src/agents/skillFiles.spec.ts` — 69/69 pass - [x] `cd api && npx jest server/services/Files/Code/process.spec.js` — 84/84 pass - [x] `npx tsc --noEmit -p packages/api` — clean - [x] `npx eslint` on all four touched files — clean * chore: add CONSOLE_JSON_STRING_LENGTH to .env.example for JSON log string length configuration * fix(files): align codeapi upload filename with LC's sanitized DB filename User-attached files for code execution were uploading to codeapi under `file.originalname` (raw upload filename, may contain spaces / special chars) while LC's DB record stored the sanitized form (`sanitizeFilename(file.originalname)`, underscores). Codeapi preserves whatever filename the upload sent, so the sandbox saw `/mnt/data/<originalname>` while LC's `primeFiles` toolContext text + `_injected_files.name` referenced `file.filename` (sanitized). Visible failure: agent gets system prompt saying /mnt/data/librechat_code_api_-_active_customer_-_2025-11-05.xlsx …tries that path, hits `FileNotFoundError`, then notices the sandbox's actual `Available files` line says /mnt/data/librechat code api - active customer - 2025-11-05.xlsx …retries with spaces, succeeds. Wastes a tool call per upload and leaks raw filenames into model context. Fix: sanitize once and use the sanitized form in both the codeapi upload AND the LC DB record. Sandbox path = LC toolContext text = in-memory ref name. No drift. Reupload path (`Code/process.js` line 867 `filename: file.filename`) already uses the sanitized DB name, so it stays consistent with the fresh-upload path after this change. ## Test plan - [x] `cd api && npx jest server/services/Files/process` — 32/32 pass - [x] `npx eslint` on the touched file — clean * chore: bump `@librechat/agents` to version 3.1.80-dev.3 in package.json and package-lock.json	2026-05-08 12:29:43 -04:00
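The filename fix at the end of this commit ("sanitize once and use the sanitized form in both the codeapi upload AND the LC DB record") can be illustrated with a small sketch. `sanitizeFilenameSketch` is a hypothetical stand-in for LibreChat's `sanitizeFilename`, whose real rules are not shown in this log; the regex is an assumption chosen only because it reproduces the example path from the commit message.

```typescript
// Hypothetical stand-in for `sanitizeFilename`: replaces anything outside [A-Za-z0-9._-] with '_'.
function sanitizeFilenameSketch(name: string): string {
  return name.replace(/[^a-zA-Z0-9._-]/g, '_');
}

interface UploadPlanSketch {
  uploadFilename: string; // name sent to the sandbox upload
  dbFilename: string;     // name stored on the DB record
  sandboxPath: string;    // path the model is told about
}

// Sanitize once, then reuse the same value everywhere so the names cannot drift.
function planUpload(originalname: string): UploadPlanSketch {
  const filename = sanitizeFilenameSketch(originalname);
  return {
    uploadFilename: filename,
    dbFilename: filename,
    sandboxPath: `/mnt/data/${filename}`,
  };
}

const plan = planUpload('librechat code api - active customer - 2025-11-05.xlsx');
```

Here `plan.sandboxPath` is `/mnt/data/librechat_code_api_-_active_customer_-_2025-11-05.xlsx`, the same path the system prompt would reference, so the first tool call no longer hits `FileNotFoundError`.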
Danny Avila	9441563b95	🛡️ refactor: Scope `allowedAddresses` By Port (#13022) * fix: Scope allowedAddresses by port * test: Fix SSRF agent spec typing	2026-05-08 12:28:34 -04:00
Danny Avila	b39bf837a7	📦 chore: Update `@librechat/agents` to v3.1.79 (#13000)	2026-05-07 16:27:17 -04:00
Danny Avila	40a05bbf83	📦 chore: npm audit fixes and Mongoose 8.23 TypeScript follow-ups (#12996) * chore: Update axios dependency to version 1.16.0 across multiple package files * chore: Update express-rate-limit and ip-address dependencies to versions 8.5.1 and 10.2.0 in package-lock.json and package.json * chore: Update mongoose and hono dependencies to versions 8.23.1 and 4.12.18 across multiple package files * fix: Add type parameters to mongoose lean queries in accessRole and aclEntry methods * fix: Add type parameters to mongoose lean queries in action, agent, and agentCategory methods * chore: Update moduleResolution to 'bundler' in tsconfig.json for api and data-schemas packages * fix: Update mongoose lean queries to include type parameters across various methods for improved type safety	2026-05-07 09:47:40 -04:00
Danny Avila	1bc2692a15	🌥️ feat: Add Optional Region-aware S3/CloudFront Storage Keys (#12987) * feat(files): add optional region-aware storage keys * test(files): fix region storage CI fixtures * feat(files): finalize inline CloudFront asset namespaces * fix(files): allow wildcard region CloudFront cookies * fix(files): preserve legacy storage key compatibility * fix(files): align CloudFront clear cookie cleanup * fix(files): clear legacy CloudFront cookie scopes * chore(files): clean up storage review nits * fix(files): keep inline namespaces CloudFront-only	2026-05-06 23:16:56 -04:00
Danny Avila	ddf5879ccd	⏱️ fix: Align Auto-Refill Next Date (#12980) * fix: Align auto-refill next date * style: Fix auto-refill lint formatting * refactor: Share auto-refill eligibility date * refactor: Consolidate refill interval units * fix: Guard malformed refill interval units * fix: Preserve refill unit fallback label	2026-05-06 21:40:18 -04:00
Danny Avila	f0ab71f4f4	⏳ fix: Preserve Temporary Chat Retention Config (#12985)	2026-05-06 19:53:53 -04:00
Danny Avila	9c81792d25	🔐 feat: Add Signed CloudFront File Downloads (#12970) * feat: add signed CloudFront downloads * fix: preserve local IdP avatar paths * fix: address signed download review findings * fix: harden CloudFront cookie scope validation * fix: preserve URL save API compatibility * fix: store CDN SSO avatars under shared prefix * fix: Harden CloudFront tenant file access * fix: Preserve CloudFront download compatibility * fix: Address CloudFront review follow-ups * fix: Preserve file URL fallback user paths * fix: Address download review hardening * fix: Use file owner for S3 RAG cleanup * fix: Address final download review nits * fix: Clear stale avatar CloudFront cookies * fix: Align download filename helpers with dev * fix: Address final CloudFront review follow-ups * fix: Stream S3 URL uploads * fix: Set S3 stream upload length * fix: Preserve download metadata filepath * fix: Avoid remote content length for stream uploads * fix: Use bounded multipart URL uploads * fix: Harden S3 filename boundaries	2026-05-06 19:48:30 -04:00
Danny Avila	f2de3a219c	🌐 fix: Preserve Unicode Filenames (#12977) * fix: Preserve unicode filenames * fix: Cap unicode filenames by bytes * fix: Preserve clean artifact directories * fix: Disambiguate normalized artifact names	2026-05-06 14:57:38 -04:00
Danny Avila	56b87f70bd	🧮 feat: Add GPT-5.5 Token Definitions (#12973) * fix: add gpt-5.5 token definitions * fix: align gpt-5.5 context limit	2026-05-06 10:50:16 -04:00
Danny Avila	6c6c72def7	🚀 feat: Decouple File Attachment Persistence from Preview Rendering (#12957) * 🗂️ feat: add `status` lifecycle to file records for two-phase previews Schema and model foundation for decoupling the agent's final response from CPU-heavy office-format HTML extraction. - `MongoFile.status: 'pending' | 'ready' | 'failed'` (indexed) and `previewError?: string` mirror the lifecycle: phase-1 emits the file record at `pending` so the response is unblocked; phase-2 transitions to `ready` (with text/textFormat) or `failed` (with previewError) in the background. Absent for legacy records — clients treat that as `ready` for back-compat. - Mirror types added to `TFile` in data-provider so frontend cache consumers see the new fields. - New `sweepOrphanedPreviews(maxAgeMs)` method on the file model recovers stale `pending` records left behind by a process restart mid-extraction; transitions them to `failed` with `previewError: 'orphaned'`. Cheap because `status` is indexed. * ⚡ feat: two-phase code-execution preview flow (unblocks final response) The agent's final response no longer waits on CPU-heavy office HTML extraction. Phase-1 (download + storage save + DB record at `status: 'pending'`) is awaited as before; phase-2 (extract + `updateFile`) runs in the background with a hard 60s ceiling. Three flows, all funneling through `processCodeOutput` and updated to the new `{ file, finalize? }` return shape: - `callbacks.js` (chat-completions + Open Responses streaming): emit the phase-1 attachment immediately (carries `status: 'pending'` for office buckets so the UI shows "preparing preview…"), then fire-and-forget `finalize()`. If the SSE stream is still open when phase-2 lands, push an `attachment` update event with the same `file_id` so the client merges over the placeholder in place. - `tools.js` direct endpoint: same split — return the phase-1 metadata immediately, run extraction in the background. Client polls for the resolved record. 
`finalize()` wraps the existing 12s per-render timeout in a 60s outer `withTimeout`. The HTML-or-null contract from #12934 is preserved: office types that fail extraction transition to `status: 'failed'` with `previewError: 'parser-error' | 'timeout'` rather than falling back to plain text (would be an XSS vector). Promises continue running after the HTTP response closes (Node doesn't kill them). The boot-time orphan sweep covers the only case that loses progress — actual process restart mid-extraction. `primeFiles` annotates the agent's `toolContext` line for prior-turn files: `(preview not yet generated)` for pending, `(preview unavailable: <reason>)` for failed. The model can volunteer "you can still download it" instead of pretending the preview is fine. `hasOfficeHtmlPath` exported from `@librechat/api` so `processCodeOutput` can decide whether a file expects a preview at all. * 🔍 feat: `GET /api/files/:file_id/preview` endpoint and boot orphan sweep - New `GET /api/files/:file_id/preview` route returns `{ status, text?, textFormat?, previewError? }`. The frontend's `useFilePreview` React Query hook polls this while phase-2 is in flight, then auto-stops on terminal status. ACL identical to the download route (reuses `fileAccess` middleware). Defaults `status` to `'ready'` for legacy records so back-compat is implicit. `text` only included when `status === 'ready'` and non-null — preserves the HTML-or-null security contract from #12934. - `sweepOrphanedPreviews()` invoked on boot in both `server/index.js` and `server/experimental.js`. Recovers any `pending` records left behind by a process restart mid-extraction (the only case the in-process two-phase flow can't handle on its own). Fire-and-forget so a transient sweep failure doesn't block startup. 
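The response contract of the preview endpoint described above can be sketched as a pure mapping from a file record to the payload. The `{ status, text?, textFormat?, previewError? }` shape, the `'ready'` default for legacy records, and the rule that `text` is only included for `'ready'` with a non-null value come from the commit message; the type names, the `toPreviewResponse` function, and the gating of `textFormat` and `previewError` are assumptions.

```typescript
type PreviewStatus = 'pending' | 'ready' | 'failed';

// Hypothetical subset of a stored file record; `status` is absent on legacy records.
interface FileRecordSketch {
  status?: PreviewStatus;
  text?: string | null;
  textFormat?: string;
  previewError?: string;
}

interface PreviewResponseSketch {
  status: PreviewStatus;
  text?: string;
  textFormat?: string;
  previewError?: string;
}

function toPreviewResponse(record: FileRecordSketch): PreviewResponseSketch {
  // Legacy records have no status and are treated as ready.
  const status: PreviewStatus = record.status ?? 'ready';
  const response: PreviewResponseSketch = { status };
  // Text is exposed only for ready records with non-null content (HTML-or-null contract).
  if (status === 'ready' && record.text != null) {
    response.text = record.text;
    // Assumption: the format travels with the text.
    if (record.textFormat != null) {
      response.textFormat = record.textFormat;
    }
  }
  // Assumption: the error reason is only surfaced for failed records.
  if (status === 'failed' && record.previewError != null) {
    response.previewError = record.previewError;
  }
  return response;
}

const legacy = toPreviewResponse({ text: '<p>hi</p>', textFormat: 'html' });
const pending = toPreviewResponse({ status: 'pending', text: 'stale text' });
```

A pending record never leaks text from an earlier render, which keeps the client from showing content that the deferred render has not yet confirmed.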
* 🖥️ feat: frontend two-phase preview consumer (polling + UI states) Wires the React side to the new lifecycle so the user sees what's happening with their file while phase-2 extraction runs in the background and after the response stream closes. - `useAttachmentHandler` upserts by `file_id` (was append-only) so the phase-2 SSE update event merges over the pending placeholder in place. Lightweight attachments without a `file_id` (web_search / file_search citations) keep the legacy append path. - `useFilePreview(file_id)` React Query hook with `refetchInterval: (data) => data?.status === 'pending' ? 2500 : false` so polling auto-stops on the first terminal response without the caller having to flip `enabled`. - `useAttachmentPreviewSync(attachment)` bridges polled data into `messageAttachmentsMap`. Polling enabled iff `status === 'pending' && isAnySubmitting` — per the design ask: active polling while the LLM is still generating, then quiet. Process-restart and post-stream cases are covered by polling on the next interaction. - `Attachment.tsx` renders a small `PreviewStatusIndicator` (spinner + "Preparing preview…" for pending, alert icon + "Preview unavailable" for failed) inside `FileAttachment`. Download button stays fully functional in both states. Two new English locale keys. - Data-provider scaffolding: `TFilePreview` type, `endpoints.filePreview`, `dataService.getFilePreview`, `QueryKeys.filePreview`. * 🧪 fix: stub `useAttachmentPreviewSync` in pre-existing Attachment test mocks The new `useAttachmentPreviewSync` hook is called unconditionally inside `FileAttachment` (added in the prior commit). Two pre-existing test files mock `~/hooks` to provide `useLocalize` only — the un-mocked preview hook reference resolved to undefined and crashed render with `(0 , _hooks.useAttachmentPreviewSync) is not a function` on the Ubuntu/Windows CI runners. 
Fix is local to the test mocks: add a no-op stub that returns `{ status: 'ready' }` so the component renders the legacy chip path. The two-phase preview behavior itself has its own dedicated suites (`useAttachmentHandler.spec.tsx`, `useAttachmentPreviewSync.spec.tsx`). * 🐛 fix: route phase-2 attachment update to current-run messageId Codex P1 review on PR #12957. `processCodeOutput` intentionally preserves the original DB `messageId` across cross-turn filename reuse so `getCodeGeneratedFiles` can still trace a file back to the assistant message that originally produced it. The phase-1 SSE emit already routes by the current run's messageId — `processCodeOutput` runtime-overlays it via `Object.assign(file, { messageId, toolCallId })` and the callback writes `result.file` directly. Phase-2 was passing the raw `updateFile` return through `attachmentFromFileMetadata`, which read `messageId` straight off the DB record. On a turn-N run that re-emitted a filename from turn-1 (e.g. agent writes `output.csv` again), the phase-2 SSE update routed to `turn-1-msg` instead of `turn-N-msg`. Frontend's `useAttachmentHandler` upserts under the wrong messageAttachmentsMap slot — turn-N's pending chip stays stuck at "preparing preview…" while turn-1's already-resolved attachment gets re-merged. Fix: thread `runtimeMessageId` through `attachmentFromFileMetadata` and pass `metadata.run_id` from the phase-2 emit site. Mirrors how phase-1 sources its messageId. Tests cover the cross-turn reuse case plus the writableEnded / null-finalize / no-finalize paths to lock in the broader phase-2 emit contract. * 🛠️ refactor: address codex audit findings (wire-shape parity, DRY, defensive catch) Comprehensive audit on PR #12957. Resolves all valid findings: - MAJOR #1 — Wire-shape parity: phase-1 ships the full `fileMetadata` record over SSE; phase-2 was using a tight `attachmentFromFileMetadata` projection. 
Drop the projection and have phase-2 spread `{...updated, messageId, toolCallId}` so both events match the long-standing legacy phase-1 shape clients depend on. - MAJOR #2 — DRY: extract `runPhase2Finalize({ finalize, fileId, onResolved })` into `process.js` (alongside `processCodeOutput` whose contract it pairs with). Both `callbacks.js` paths and `tools.js` now flow through it. Single catch path eliminates divergence surface — the fix landed in 01704d4f0 (cross-turn messageId routing) was a symptom of this duplication risk. - MINOR #3 — JSDoc accuracy: `finalizePreview`'s buffer is bounded by `fileSizeLimit`, not the 1MB extractor cap. Updated and added a note about peak heap from queued buffers. - MINOR #4 — Defensive catch: `runPhase2Finalize`'s catch attempts a best-effort `updateFile({ status: 'failed', previewError: 'unexpected' })` for the file_id, so a programming bug in `finalizePreview` doesn't leave the record stuck `'pending'` until the next boot-time orphan sweep. - NIT #6 — Stale PR refs: 12952 → 12957 in 3 places. - NIT #7 — Schema bound: `previewError` capped at `maxlength: 200` to prevent a future codepath from accidentally persisting a stack trace. Skipped per audit verdict (non-blocking): - #5 (memory pressure): documented in JSDoc; impl change was reviewer's "consider", not actionable. - #8 (double DB query per poll): low cost, indexed by_id, polling is gated narrow. - #9 (TAttachment cast): the union type is intentional; the casts are safe widening, refactoring TAttachment is invasive and out of scope. Tests: 11 new (7 `runPhase2Finalize` unit tests covering happy path, null-finalize, throws, double-fail, no-fileId, no-onResolved; +4 wire-shape parity assertions in the existing cross-turn test). 328 backend tests pass; 528 frontend tests pass; lint and typecheck clean. 
* 🛡️ refactor: address codex P1+P2 + rename to drop phase-1/2 jargon Codex round 2 review on PR #12957 caught two race conditions and one recovery gap, all triggered by cross-turn filename reuse (`claimCodeFile` intentionally returns the same `file_id` for the same `(filename, conversationId)` across turns). Plus naming cleanup the user requested — internal "phase 1 / phase 2" vocabulary leaks across sprints, replace it everywhere with terms describing what's actually happening. P1 — stale render overwrites newer revision (process.js) Two turns reusing `output.csv` share a `file_id`. If turn-1's background render resolves AFTER turn-2's persist step, the unconditional `updateFile` writes turn-1's stale text/status over turn-2's pending placeholder. Fix: stamp a fresh `previewRevision` UUID on every emit, thread it through `finalizePreview`, and make the commit conditional via a new optional `extraFilter` argument on `updateFile` (`{ previewRevision: <expected> }`). The defensive `updateFile` in `runPreviewFinalize`'s catch uses the same guard so a programming error from an older render also can't override a newer turn. P1 — stale React Query cache on pending remount (queries.ts) Same root cause from the frontend side. Cache key `[QueryKeys.filePreview, file_id]` may hold a prior turn's `'ready'` payload; with `refetchOnMount: false` and the polling gate on `pending`, polling never starts for the new placeholder. Fix: `useAttachmentHandler` invalidates that query whenever an attachment with a `file_id` arrives. Both initial-emit and update events trigger invalidation — uniform gate. P2 — quick-restart orphans skipped by boot sweep (files.js) Boot `sweepOrphanedPreviews` uses a 5-min cutoff for multi-instance safety. A crash + restart inside the cutoff leaves `pending` records that never get touched again. 
Fix: lazy sweep inside the preview endpoint — if a polled record is `pending` and `updatedAt` is older than 5 min, mark it `failed:orphaned` on the spot before responding. Conditional on the same `updatedAt` we observed so a concurrent legitimate update wins. Cheap, bounded by user activity. Naming cleanup - `runPhase2Finalize` → `runPreviewFinalize` - `PHASE_TWO_TIMEOUT_MS` → `PREVIEW_FINALIZE_TIMEOUT_MS` - All `phase-1` / `phase-2` / `two-phase` prose replaced with "the immediate emit", "the deferred render", "the persist step", "the deferred preview", etc. Skill-feature `phase 1/2` references (different feature) left alone. Tests: 10 new (4 lazy-sweep × preview endpoint, 3 cache-invalidation × useAttachmentHandler, 3 extraFilter × updateFile data-schemas). Backend 332/332, frontend 531/531, data-schemas 37/37, lint clean. * 🛠️ refactor: address comprehensive review (round 3) — stale-cache MAJOR + 3 minors Comprehensive review on PR #12957 caught a P1 follow-on bug from the prior `invalidateQueries` fix, plus 3 maintainability findings. MAJOR: stale React Query cache not actually fixed by `invalidateQueries` The previous fix called `invalidateQueries` to flush stale cached preview data on cross-turn filename reuse. But `useFilePreview` had `refetchOnMount: false`, which made the new observer read the stale-marked 'ready' data without refetching. The polling `refetchInterval` then evaluated against stale 'ready' → returned `false` → polling never started → user stuck on stale content. Fix (belt-and-suspenders): a) `useAttachmentHandler` switched to `removeQueries` — drops the cache entry entirely so the next mount has nothing to read and must fetch. b) `useFilePreview` no longer sets `refetchOnMount: false`, so the React Query default (`true`) kicks in — second line of defense if any future codepath observes stale data before the handler has a chance to evict. 
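The `previewRevision` guard introduced for the stale-render race (two turns sharing one `file_id`) can be pictured with an in-memory analogue of a conditional update. This is a sketch under stated assumptions: the `Map` stands in for the database, and `updateFileIfRevision` is a hypothetical name that mimics `updateFile` with an `extraFilter` of `{ previewRevision: <expected> }`.

```typescript
interface StoredFileSketch {
  file_id: string;
  status: 'pending' | 'ready' | 'failed';
  text?: string;
  previewRevision?: string;
}

// In-memory stand-in for the file collection.
const store = new Map<string, StoredFileSketch>();

// Applies the update only if the stored revision still matches the one the render started with.
function updateFileIfRevision(
  file_id: string,
  expectedRevision: string,
  update: Partial<StoredFileSketch>,
): StoredFileSketch | null {
  const current = store.get(file_id);
  if (!current || current.previewRevision !== expectedRevision) {
    return null; // a newer emit owns this record; drop the stale commit
  }
  const next: StoredFileSketch = { ...current, ...update };
  store.set(file_id, next);
  return next;
}

// Turn 1 emits a placeholder, then turn 2 reuses the same file_id with a fresh revision.
store.set('f1', { file_id: 'f1', status: 'pending', previewRevision: 'rev-1' });
store.set('f1', { file_id: 'f1', status: 'pending', previewRevision: 'rev-2' });

// Turn 1's render finishes late and must not overwrite turn 2's placeholder.
const staleCommit = updateFileIfRevision('f1', 'rev-1', { status: 'ready', text: 'turn-1 html' });
const freshCommit = updateFileIfRevision('f1', 'rev-2', { status: 'ready', text: 'turn-2 html' });
```

The late turn-1 commit returns `null` and leaves the record untouched, while the turn-2 commit succeeds.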
MINOR: `finalizePreview` JSDoc missing `previewRevision` param Added with explanation of the conditional update guard. MINOR: asymmetric stream-writable guard between SSE protocols Chat-completions delegated the gate to `writeAttachmentUpdate`; Open Responses inlined `!res.writableEnded && res.headersSent`. Extracted `isStreamWritable(res, streamId)` predicate; both paths + `writeAttachmentUpdate` now share the single source of truth. NIT: `(data as Partial<TFile>).file_id` cast repeated 4 times Extracted to a `fileId` local at the top of the handler. Tests: existing 9 invalidate-tests rewritten as remove-tests; +1 new lock-in test asserts removeQueries is called and invalidateQueries is NOT (regression guard against round-3 finding). 332 backend pass, 532 frontend pass, lint clean. Skipped findings (deferred / acceptable): - MINOR: post-submission pending state has no auto-recovery — the `isAnySubmitting` polling gate was the user's explicit design; LLM context surfaces failed/pending so the model can volunteer. Worth a follow-up if real users hit it. - NIT: double DB query per preview poll — reviewer marked acceptable; changing `fileAccess` middleware is out of scope. * 🛡️ test: address comprehensive review NITs (initial-emit guard + isStreamWritable coverage) NIT — chat-completions initial emit skips writableEnded check The Open Responses initial emit was switched to use the new `isStreamWritable` predicate in the round-3 commit, but the chat-completions initial emit kept the older narrower check (`streamId || res.headersSent`). On a client disconnect mid-stream (`writableEnded === true`) it would still hit `res.write` and raise `ERR_STREAM_WRITE_AFTER_END` — caught by the outer IIFE catch but logged as noise. Switch this site to `isStreamWritable` too so both initial-emit paths share the same gate as the deferred update emits. 
NIT — `isStreamWritable` not directly unit-tested The predicate was only covered indirectly via the deferred-preview SSE tests (writableEnded skip, headersSent check). Export from `callbacks.js` and add 5 parametric tests pinning down each branch (streamId truthy, res null, !headersSent, writableEnded, happy path) so a future condition addition can't silently regress. * 🐛 fix: stuck "Preparing preview…" + inline the chip subtitle Two related fixes for a stuck-spinner bug a user reported in manual testing of PR #12957. Stuck spinner (the bug) The deferred preview render can complete a few seconds AFTER the SSE stream closes (typical case: PPTX render finishes ~3s after the LLM emits FINAL). When that happens, the SSE update is silently dropped (`isStreamWritable` returns false on a closed stream) and polling is the only recovery path. The earlier polling gate was `status === 'pending' && isAnySubmitting`, which mirrored the original design intent ("only query while the LLM is still generating"). But `isAnySubmitting` flips false the moment the model emits FINAL — milliseconds before the deferred render commits. Polling never runs, the chip stays "Preparing preview…" forever even though the DB has `status: 'ready'` with valid HTML. Drop the `isAnySubmitting` part of the gate. `useFilePreview`'s `refetchInterval` is already a function-form that returns `false` on the first terminal response, so polling auto-stops within one tick of resolution. The server-side render ceiling (60s) plus the lazy sweep in the preview endpoint cap the worst case to ~24 polls per pending attachment. Polling itself never blocks UX — the gate's purpose was "don't waste cycles", and capping by terminal status is the correct expression of that. Inline the chip subtitle (the visual) The previous design rendered "Preparing preview…" as a loose-feeling spinner+text BELOW the file chip. The chip itself looked done while a floating annotation said it wasn't. 
`FileContainer` gains an optional `subtitle?: ReactNode` prop that overrides the default file-type label. `Attachment.tsx` passes a `PreviewStatusSubtitle` (spinner + "Preparing preview…" / alert + "Preview unavailable") into that slot when the file's preview is pending or failed. The chip footprint stays identical to its `'ready'` form — just the second row swaps from "PowerPoint Presentation" to the status indicator. No floating element, no layout shift. Tests: regression test pinning down "polling stays enabled after the LLM finishes" so a future revert can't reintroduce the stuck-spinner bug. Existing FileContainer tests pass unchanged (subtitle override is opt-in). 522 frontend tests pass; lint clean. * 🐛 fix: deferred-preview survives reload + matches artifact card chrome Fixes the remaining stuck-pending case after the polling gate fix: on a reloaded conversation, message.attachments come from the DB frozen at the immediate-persist `status: 'pending'`, but `messageAttachmentsMap` is empty because no SSE handler ever fired for that messageId. Polling now INSERTS a new live entry when no record matches the file_id, and `useAttachments` merges live entries onto DB entries by file_id so the resolved text/textFormat reach `artifactTypeForAttachment` and the chip routes through the proper PanelArtifact card. Also replaces the small file chip used during the pending state with a PreviewPlaceholderCard that mirrors ToolArtifactCard chrome, so the transition to the resolved PanelArtifact no longer reshapes the UI. * ✨ feat: auto-open panel when deferred preview resolves pending→ready The legacy auto-open path is gated only on `isSubmitting`, so an office-file preview that resolves after the SSE stream closes would render in place but never auto-open the panel — even though that's exactly the moment the result becomes meaningful to the user. 
Adds a per-file_id one-shot signal that `useAttachmentPreviewSync` flips on the pending→ready edge; `ToolArtifactCard` consumes it on mount and auto-opens regardless of submission state. The signal is only set on the actual transition (history loads of pre-resolved files don't trigger it) and is consumed once (panel close + reopen on the same card stays user-controlled). * 🐛 fix: drop placeholder Terminal overlay + scope auto-open to fresh resolutions Two fixes for issues spotted in manual testing of the deferred-preview auto-open feature: 1. PreviewPlaceholderCard was passing `file={attachment}` to FilePreview, which triggered SourceIcon's Terminal overlay (`metadata.fileIdentifier` is set on every code-execution file). The artifact card itself doesn't show that overlay; the placeholder shouldn't either, so the pending→resolved transition is visually seamless. 2. The `previewJustResolved` flag flipped on every pending→ready transition observed by the polling hook — including stale-pending DB records that resolve via the first poll on a history load. Conversations whose immediate-persist snapshot left attachments at `status: 'pending'` would yank the panel open every revisit. Adds `mountedDuringStreamRef` to the hook (mirroring ToolArtifactCard) so the flag fires only when the hook itself was mounted during an active turn — preserving the pre-PR contract that the panel only auto-opens for results the user is actively waiting on, never for history. * 🐛 fix: don't downgrade preview to failed when only the SSE emit throws Codex P2 finding on PR #12957: the original chain placed `.catch` after `.then(onResolved)`, so a throw inside `onResolved` (transport-side errors — SSE write race after stream close, an emitter listener throwing) would propagate into the finalize catch and persist `status: 'failed'` / `previewError: 'unexpected'`. 
That surfaced "preview unavailable" in the UI for a perfectly valid file, and degraded next-turn LLM context to reflect a non-existent failure. Wraps `onResolved` in its own try/catch so emit errors are logged but do not affect the file's persisted status. Extraction success and emit success are now independent: if extraction succeeds and `finalizePreview` writes the terminal status, the polling layer / next page load surfaces the resolved preview even if this turn's SSE emit didn't land. * 🛡️ fix: run boot-time orphan sweep under system tenant context Codex P2 finding on PR #12957: `File` is tenant-isolated, so under `TENANT_ISOLATION_STRICT=true` the boot-time `sweepOrphanedPreviews` threw `[TenantIsolation] Query attempted without tenant context in strict mode` and the recovery path silently failed every restart. Stale `status: 'pending'` records would be stuck until a user happened to poll the preview endpoint and trigger the lazy sweep — which only covers the file the user is currently looking at, not the bulk candidate set the boot sweep is designed to recover. Wraps the sweep in `runAsSystem(...)` in both boot paths (`api/server/index.js` and `api/server/experimental.js`) and pins the contract with regression tests in `file.spec.ts` — one test asserts the bare call throws under strict mode, the other asserts the `runAsSystem`-wrapped call succeeds. 
* 🧹 chore: trim verbose comments from previous commit * 🧹 chore: address review findings (dead branch, lazy-sweep cutoff, stale JSDoc) - finalizePreview: drop unreachable !isOfficeBucket branch (caller already gates on hasOfficeHtmlPath, so this path is always office) - preview endpoint: drop lazy-sweep cutoff from 5min to 2min — anything past the 60s render ceiling is definitively orphaned, and per-request sweep can be tighter than the per-instance boot sweep - strip stale `isSubmitting` references from JSDoc in 3 spots (the client-side gate was removed in `9a65840`) Skipped: function-length (#3) and client-side polling cap (#4) — refactors without correctness/perf wins; remaining NITs. * 🧹 fix: trim 1 query off pending polls + clear stale lifecycle on cross-shape updates - Preview endpoint: reuse fileAccess middleware's record for the lifecycle check; only re-fetch with text on the terminal ready response. Cuts the typical poll lifecycle from 2(N+1) to N+1 queries, since the vast majority of polls hit while pending and don't need text at all. - processCodeOutput non-office branch: explicitly null out status, previewError, previewRevision (codex P2). Without this, an update at the same (filename, conversationId) where the prior emit was an office file leaves stale lifecycle fields and the client renders the wrong state for the now non-office artifact. - Tests: rewire preview.spec mocks for the new shape, add boundary test pinning the 2min cutoff, add regression test for the cross-shape update. * 🐛 fix: keep polling on transient errors but cap permanently-broken endpoint Codex P2: the previous `data?.status === 'pending' ? 2500 : false` gate killed polling on the first transient error. With `retry: false`, a 500 left `data` undefined, the callback returned false, and the chip was stuck "Preparing preview…" forever — exactly the bug the polling layer was supposed to recover from. 
Inverts the gate: stop on terminal success (`ready`/`failed`) or after 5 consecutive errors. Transient errors keep retrying; a permanently broken endpoint caps at ~12.5s instead of polling forever. Predicate extracted as `previewRefetchInterval` for direct unit testing without fighting React Query's timer machinery. * ✨ feat: render pending-preview files in their own row Pending deferred-preview chips now bucket into a separate row above the resolved attachments — reads as "this is still happening" rather than mixing with completed downloads. Once status flips to ready, the chip re-buckets into panelArtifacts; failed re-buckets into the file row alongside other downloads. * 🎨 fix: render pending-preview chips in the panel-artifact row, not the file row Previous bucketing put pending chips in the file row (since `artifactTypeForAttachment` returns null for empty-text records). The pending placeholder is a future panel artifact — sharing the row keeps the chip in place when it resolves instead of jumping rows. Plain files still get their own row. * 🐛 fix: phase-1 SSE replay must not regress a resolved attachment Codex P1: useEventHandlers.finalHandler iterates responseMessage.attachments at stream end and dispatches each through the attachment handler. Those records are the immediate-persist snapshot (status:pending, text:null) — if a deferred update has already moved the same file_id to ready/failed, the existing merge let the pending fields win and downgraded the resolved record. Result: chip flickers back to pending and polling restarts until the lazy sweep corrects. Pin the terminal lifecycle fields (status, text, textFormat, previewError) when existing is ready/failed and incoming is pending. Other field updates still go through. * 🐛 fix: track preview-poll error cap outside React Query state Codex P2: the previous cap relied on `query.state.fetchFailureCount`, but React Query v4's reducer resets that to 0 on every fetch dispatch (the `'fetch'` action). 
With `retry: false`, each failed poll left count at 1 and the next dispatch reset it back to 0, so the `>= 5` branch never fired and a permanently-broken endpoint polled forever. Track consecutive errors in a module-level Map keyed by file_id, incremented in a thin `fetchFilePreview` wrapper around the data service call. The Map is cleared on success and on cap-stop, so memory is bounded by in-flight pending file_ids per session.	2026-05-06 03:04:19 -04:00
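The polling gate described in the last two bullets of this commit (stop on `ready`/`failed`, keep polling through transient errors, give up after 5 consecutive errors tracked in a module-level Map) can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the PR: `recordPollOutcome` and `previewRefetchIntervalSketch` are hypothetical names, and the signatures are assumptions. Only the 2500 ms interval, the cap of 5, and the clear-on-success / clear-on-cap-stop behaviour come from the commit message.

```typescript
type PreviewStatus = 'pending' | 'ready' | 'failed';

const POLL_INTERVAL_MS = 2500; // interval quoted in the commit message
const MAX_CONSECUTIVE_ERRORS = 5; // cap quoted in the commit message

// Consecutive-error counts keyed by file_id, kept outside React Query state
// so that a per-fetch counter reset cannot hide the streak.
const consecutiveErrors = new Map<string, number>();

/** Hypothetical helper: record the outcome of one poll for a file. */
function recordPollOutcome(fileId: string, ok: boolean): void {
  if (ok) {
    consecutiveErrors.delete(fileId); // any success resets the streak
    return;
  }
  consecutiveErrors.set(fileId, (consecutiveErrors.get(fileId) ?? 0) + 1);
}

/** Returns the next poll delay in ms, or false to stop polling. */
function previewRefetchIntervalSketch(
  fileId: string,
  status: PreviewStatus | undefined,
): number | false {
  if (status === 'ready' || status === 'failed') {
    consecutiveErrors.delete(fileId);
    return false; // terminal status: stop
  }
  if ((consecutiveErrors.get(fileId) ?? 0) >= MAX_CONSECUTIVE_ERRORS) {
    consecutiveErrors.delete(fileId); // cap-stop also clears the entry
    return false;
  }
  return POLL_INTERVAL_MS; // pending, or a transient error: keep polling
}
```

Keeping the counter in a plain Map means the cap does not depend on how the query library resets its own failure count between fetches, which is the failure mode the final bullet describes.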
Danny Avila	cf0657509c	🧵 feat: Enable Anthropic Tool Argument Streaming (#12962 ) * fix: Enable Anthropic Tool Argument Streaming * fix: Honor Anthropic clientOptions drops * fix: Preserve custom Anthropic beta headers * fix: Enable Bedrock Anthropic Tool Streaming	2026-05-06 01:09:14 -04:00
Danny Avila	f839a447e1	🧬 fix: Subagent MCP requestBody Propagation (bump `@librechat/agents to 3.1.78` + cleanup) (#12959 ) * 📦 chore: bump `@librechat/agents` to v3.1.78 v3.1.78 ships [danny-avila/agents#147](https://github.com/danny-avila/agents/pull/147), which makes `SubagentExecutor` inherit the parent invocation's `configurable` (with `thread_id`/`run_id`/`parent_run_id` scrubbed) into the child workflow. Subagent tool dispatches through the parent's `ON_TOOL_EXECUTE` handler now arrive with parent's `requestBody`, `user`, `userMCPAuthMap`, etc. — so `{{LIBRECHAT_BODY_}}` placeholder substitution and per-user MCP connection lookup work for subagent tool calls the same way they do for the parent agent. Note: `package-lock.json` will need an `npm install` refresh once v3.1.78 lands on the registry. The user/user_id injection added in PR #12950 stays as defense-in-depth. 🗑️ refactor: drop redundant user/user_id injection from `loadToolsForExecution` `@librechat/agents@3.1.78` (via danny-avila/agents#147) makes `SubagentExecutor` forward the parent's `configurable` verbatim into the child workflow. Subagent `ON_TOOL_EXECUTE` dispatches now arrive with parent's `user` / `user_id` already in `data.configurable` — making the host-side injection added in #12950 a no-op. Removes: - The conditional `user: createSafeUser(req.user); user_id: req.user.id` block in `loadToolsForExecution` (req.user.id-guarded so the `'api-user'` fallback in Responses/OpenAI controllers is preserved). - The unused `createSafeUser` import. - The 4 unit tests covering the now-deleted behavior. The merge in `handlers.ts` (`{ ...configurable, ...toolConfigurable }`) still produces a `mergedConfigurable` with the right user identity for both parent and subagent paths — the values just come from `configurable` (forwarded by the SDK) rather than `toolConfigurable`. 
Other fixes from #12950 stay (IUser.id narrowing, the env.ts / google/initialize.ts / remoteAgentAuth.ts TS-warning fixes) — they were independent of the subagent identity propagation issue. * 📦 chore: update `@librechat/agents` to v3.1.78 This update reflects the transition from the development version `3.1.78-dev.0` to the stable release `3.1.78`. The package-lock.json has been refreshed to ensure consistency with the new version, including updated integrity checks and resolved URLs for the package. This change is part of ongoing improvements to enhance the functionality and stability of the agents module.	2026-05-05 22:07:26 -04:00
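As a rough illustration of the propagation this commit relies on, the sketch below copies a parent `configurable` into a child with the run-scoped keys removed, then applies the `{ ...configurable, ...toolConfigurable }` merge quoted above. The function names and the `Configurable` type are hypothetical; the real logic lives in `@librechat/agents` and `handlers.ts` and may differ in detail.

```typescript
type Configurable = Record<string, unknown>;

// Run-scoped keys the commit message says are scrubbed before inheritance.
const SCRUBBED_KEYS = ['thread_id', 'run_id', 'parent_run_id'];

/** Hypothetical: build the child workflow's configurable from the parent's. */
function inheritParentConfigurable(parent: Configurable): Configurable {
  const child: Configurable = { ...parent }; // shallow copy, parent untouched
  for (const key of SCRUBBED_KEYS) {
    delete child[key];
  }
  return child;
}

/** Same shape as the merge quoted in the commit: tool-level values win. */
function mergeConfigurable(
  configurable: Configurable,
  toolConfigurable: Configurable,
): Configurable {
  return { ...configurable, ...toolConfigurable };
}
```

Because the spread puts `toolConfigurable` last, a host-side value would still override a forwarded one if both were present; once the SDK forwards `user` and `user_id`, the host-side injection simply has nothing left to add.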
Danny Avila	9efd61d57d	🔐 fix: Forward per-file entity_id through code-env priming (#12958 ) * 🔐 fix: Forward per-file `entity_id` through code-env priming Skill files and persisted code-env files now carry their `entity_id` on the in-memory file refs that seed `Graph.sessions`. Without this, an execute call that mixes a skill file (uploaded with `entity_id=skillId`) and a user attachment (uploaded with no `entity_id`) collapses onto a single request-level entity at the codeapi authorization step and one side 403s. With per-file `entity_id`, codeapi resolves sessionKey per file and both authorize. - `primeSkillFiles` / `primeInvokedSkills`: thread `entity_id` through fresh-upload, cache-hit, and per-skill-batch paths in `packages/api/src/agents/skillFiles.ts`. - `primeFiles` (Code/process.js): parse `entity_id` from the persisted `codeEnvIdentifier` query string once per iteration; forward through `pushFile`, including the reupload path which re-parses the fresh identifier returned by codeapi. - Tests: extend `skillFiles.spec.ts` with two cases — fresh-upload propagation and cached-hot-path parsing. Companion PRs in flight on `@librechat/agents` (forward `entity_id` through `_injected_files`) and codeapi (per-file authorization). All three are wire-back-compat: an absent `entity_id` falls back to the existing request-level resolution. * 🔧 chore: Update dependencies in package-lock.json and package.json - Bump `@librechat/agents` to version `3.1.78-dev.0` across multiple package files. - Upgrade `@langchain/langgraph-checkpoint` to version `1.0.2` and update its peer dependency for `@langchain/core` to `^1.1.44`. - Update `axios` to version `1.16.0` and `follow-redirects` to version `1.16.0`. - Add `@types/diff` as a new dependency at version `7.0.2` and include `diff` at version `9.0.0` in the `@librechat/agents` module. - Introduce optional peer dependency `@anthropic-ai/sandbox-runtime` for `@librechat/agents` with metadata indicating it is optional. 
* 🐛 fix: Make skill code-env cache persistence observable Two changes to surface the skill-bundle re-upload issue without behavioral changes to tenant scoping (root cause to be confirmed via the new warn log): 1. `primeSkillFiles` now awaits `updateSkillFileCodeEnvIds` instead of firing-and-forgetting it. The prior shape could race with the next prime (read-before-write) even when the bulkWrite itself succeeds, producing a silent cache miss. Latency cost: ~10–50ms on first prime; in exchange every subsequent prime can rely on the identifier being persisted by the time it reads. 2. `updateSkillFileCodeEnvIds` now returns `{matchedCount, modifiedCount}` from the underlying bulkWrite. `primeSkillFiles` warn-logs when `modifiedCount < updates.length`, making any silent drop visible — whether the cause is tenant filtering, a `relativePath` mismatch, schema-plugin scoping, or something else. Prior shape returned `Promise<void>` so any zero-modification result was invisible. Tests: - `skill.spec.ts`: real-MongoDB happy path (counts match), no-match case (modifiedCount=0), and empty-input contract. - `skillFiles.spec.ts`: deferred-promise harness proving the call site awaits the persist (prime stays pending until the persist resolves) and forwards partial-write counts. Deliberately narrower than the original draft of this commit, which also bypassed `tenantSafeBulkWrite` for the codeEnvIdentifier write on the speculative diagnosis that tenant filtering was the cause. That change was a behavior shift on tenant scoping without confirmation; reverted pending real-world signal from the new warn log. * 🐛 fix: Justify await for skill code-env persistence under concurrency The await on `updateSkillFileCodeEnvIds` isn't a defensive nicety — it's load-bearing for cache effectiveness under concurrent priming. 
Verified with an out-of-tree harness (`config/test-skill-cache.ts`, not committed) that wires `primeSkillFiles` against a real codeapi stack: - With fire-and-forget (prior shape after this branch's revert): back-to-back primes for the same skill miss the cache. Call N+1 reads SkillFile docs before Call N's write commits, sees no `codeEnvIdentifier`, re-uploads, fires its own forget that Call N+2 also races. Steady-state stays in cache miss for the full burst. - With await: the prime that does the upload commits its persist before resolving, so the next concurrent prime observes the cache pointer instead of racing the read. Latency cost ~10–50ms on the upload prime; subsequent concurrent primes save an entire batch upload. In production with primes seconds apart this race is rare; at scale with many users hitting the same skill in the same second it's the difference between M and N×M uploads. Updates the regression test to assert the await contract (deferred persist promise → prime stays pending until persist resolves). Comment in `skillFiles.ts` rewritten to document the concurrency rationale rather than the weaker "race-with-next-prime" framing the prior commit used.	2026-05-05 18:35:09 -04:00
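The first bullet of this commit mentions parsing `entity_id` from the query string of the persisted `codeEnvIdentifier`. A minimal sketch of that step is given here, assuming the identifier looks like `<session>/<file>?entity_id=<id>`; that shape and the helper name `parseEntityId` are assumptions for illustration, since the commit message only says the value is carried in the identifier's query string.

```typescript
/**
 * Hypothetical helper: extract `entity_id` from an identifier of the assumed
 * form "<session>/<file>?entity_id=<id>". Returns undefined when the
 * parameter is absent or empty, which mirrors the back-compat fallback to
 * request-level resolution described in the commit.
 */
function parseEntityId(codeEnvIdentifier: string): string | undefined {
  const queryStart = codeEnvIdentifier.indexOf('?');
  if (queryStart === -1) {
    return undefined; // no query string at all
  }
  const pairs = codeEnvIdentifier.slice(queryStart + 1).split('&');
  for (const pair of pairs) {
    const eq = pair.indexOf('=');
    const name = eq === -1 ? pair : pair.slice(0, eq);
    if (name !== 'entity_id') {
      continue;
    }
    const value = eq === -1 ? '' : decodeURIComponent(pair.slice(eq + 1));
    return value === '' ? undefined : value;
  }
  return undefined;
}
```

With a per-file value available, a skill file and a plain user attachment in the same execute call can each carry their own (possibly absent) `entity_id` instead of sharing one request-level value.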
Atef Bellaaj	187ab787da	🌩️ feat: CloudFront CDN File Strategy (#12193 ) * 🌩️ feat: CloudFront CDN File Strategy + signed cookies Squashed from PR #12193: - feat(storage): add CloudFront CDN file strategy - feat(auth): add CloudFront signed cookie support Note: package.json/package-lock.json dependency additions are intentionally omitted from this commit and will be re-added via `npm install` after rebase to avoid lock-file merge conflicts. The two new peer deps that need to be re-installed are: - @aws-sdk/client-cloudfront@^3.1032.0 - @aws-sdk/cloudfront-signer@^3.1012.0 Also fixes 4 missing destructured names in AuthService.spec.js (getUserById, generateToken, generateRefreshToken, createSession) that were referenced in tests but not imported from the mocked '~/models'. * 📦 chore: install CloudFront SDK deps for PR #12193 Adds the two AWS CloudFront packages required by the rebased CloudFront CDN strategy: - @aws-sdk/client-cloudfront - @aws-sdk/cloudfront-signer Following the @aws-sdk/client-s3 pattern: - api/package.json: regular dependency (runtime resolution) - packages/api/package.json: peerDependency Generated by `npm install` against the freshly rebased lock file to avoid the merge conflicts that came from the original PR's lock-file edits being made against an older base of dev. * 🐛 fix: CI failures + review findings on CloudFront PR #12193 CI fixes - Rename packages/data-provider/src/__tests__/cloudfront-config.test.ts → src/cloudfront-config.spec.ts. Jest's default testMatch picks up __tests__/ directories even inside dist/, so the compiled .d.ts shell was being executed as an empty test suite. Moving to .spec.ts (matching the rest of the package) avoids the dist/ pickup. - Add cookieExpiry: 1800 to CloudFront crud.test makeConfig: the schema applies a default so CloudFrontFullConfig requires it. 
Review findings addressed - #1 (Codex + comprehensive): Normalize CloudFront domain with /\/+$/ regex (and key with /^\/+/ regex) in buildCloudFrontUrl, matching the cookie code so resource policy and file URLs stay aligned even when the configured domain has multiple trailing slashes. Added tests. - #2: Move DEFAULT_BASE_PATH out of s3Config into shared packages/api/src/storage/constants.ts. ImageService no longer imports S3-specific config. - #3: getCloudFrontConfig() returns Readonly<CloudFrontFullConfig> | null to discourage mutation of the cached signing config. - #4: Add cross-field refinement tests for cloudfrontConfigSchema (invalidateOnDelete-without-distributionId, imageSigning="cookies"-without-cookieDomain). - #6: Revert unrelated MCP comment re-indentation in librechat.example.yaml. - #7: Add azure_blob to the strategy list comment. Skipped - #5 (extractKeyFromS3Url with CloudFront URLs): existing deleteFileFromCloudFront tests already cover the path-equivalence assumption; renaming the helper is real refactor work beyond this PR's scope. - #8, #9 (NIT, low confidence): leaving for author judgement. * 🧹 chore: drop dead DEFAULT_BASE_PATH from s3Config test mock After moving DEFAULT_BASE_PATH to ~/storage/constants, crud.ts no longer reads it from s3Config — so the entry in the s3Config jest mock was misleading dead config. The tests still pass because the unmocked real constants module provides the value. --------- Co-authored-by: Danny Avila <danny@librechat.ai>	2026-05-05 13:21:05 -04:00
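Review finding #1 in this commit names two regexes for normalizing the CloudFront domain and the object key. A minimal sketch of that normalization is given here; `buildCloudFrontUrlSketch` is a hypothetical stand-in, and the real `buildCloudFrontUrl` may take different arguments or do additional work.

```typescript
/**
 * Hypothetical stand-in for the URL builder: strips every trailing slash
 * from the domain and every leading slash from the key, then joins them
 * with exactly one slash.
 */
function buildCloudFrontUrlSketch(domain: string, key: string): string {
  const normalizedDomain = domain.replace(/\/+$/, ''); // trailing slashes
  const normalizedKey = key.replace(/^\/+/, ''); // leading slashes
  return `${normalizedDomain}/${normalizedKey}`;
}
```

Applying the same two replacements wherever the signed-cookie resource policy is built keeps the policy's resource pattern and the file URLs in agreement, which is the alignment the review finding asks for.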
Danny Avila	4583d5a926	⚡ refactor: Bound Concurrent Office-HTML Rendering for Code Artifacts (#12951 ) * 🧰 feat: add `createConcurrencyLimiter` promise utility Lightweight, dependency-free FIFO concurrency limiter for bounding parallelism of expensive async work that fans out from a single producer. Tasks queue rather than reject when the cap is reached; slots release on both fulfillment and rejection so a single failure cannot stall the queue. Each task runs inside a thunk so timeouts and other side effects do not start until a slot is acquired. * ⚡ perf: bound concurrent office-HTML rendering for code artifacts A tool result with N office files (DOCX/XLSX/XLS/ODS/CSV/PPTX) previously fanned out into N parallel mammoth/SheetJS/PPTX renders, all CPU-bound and synchronous. Under bursty agent output this competes with the still-running inference loop for event-loop time and inflates end-of-run "finalize" waits in non-streaming flows (BaseClient chat-completions, non-streaming Responses, the tools.js direct endpoint) which all `await Promise.all(artifactPromises)` before responding. Cap the parser layer at 2 concurrent office-HTML renders process-wide via a shared `createConcurrencyLimiter`. Tasks queue in FIFO and the per-render timeout starts only after a slot is acquired so queue waits do not consume the timeout budget. The HTML-or-null contract (no text fallback for office types) is preserved.	2026-05-05 08:53:21 -04:00
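The limiter described in this commit (FIFO queue, slot released on both fulfillment and rejection, work wrapped in a thunk so nothing starts before a slot is acquired) can be sketched along these lines. `createLimiterSketch` is a hypothetical illustration written from the commit message alone; the actual `createConcurrencyLimiter` signature and internals are not shown in the log and may differ.

```typescript
type Task<T> = () => Promise<T>;

/** Hypothetical FIFO limiter: at most `maxConcurrent` tasks run at once. */
function createLimiterSketch(
  maxConcurrent: number,
): <T>(task: Task<T>) => Promise<T> {
  let active = 0;
  const waiting: Array<() => void> = []; // FIFO queue of deferred starts

  const release = (): void => {
    active--;
    const next = waiting.shift();
    if (next) {
      next(); // hand the freed slot to the oldest waiter
    }
  };

  return function run<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = (): void => {
        active++;
        let pending: Promise<T>;
        try {
          pending = task(); // the thunk runs only once a slot is held
        } catch (err) {
          pending = Promise.reject(err); // treat a sync throw as a rejection
        }
        pending.then(
          (value) => { release(); resolve(value); },
          (err) => { release(); reject(err); },
        );
      };
      if (active < maxConcurrent) {
        start();
      } else {
        waiting.push(start);
      }
    });
  };
}
```

A caller would create one shared instance, for example `const limit = createLimiterSketch(2)`, and wrap each render as `limit(() => renderWithTimeout(file))`, where `renderWithTimeout` is a hypothetical name; because the timeout is created inside the thunk, time spent waiting in the queue does not count against it.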


1206 commits