* 🪟 fix: Re-measure sidebar chat list on width change to fix date-group spacing
When the sidebar is expanded after a reload in its collapsed state, virtualized rows first
measure mid-animation at a narrow width, so date-group headers wrap and cache an
inflated height. CellMeasurerCache(fixedWidth) keys heights by row, not width, so
the stale height persists once full width is reached — leaving gaps under headers.
Invalidate the measurement cache and recompute row heights whenever the measured
list width changes. Adds a Playwright mock e2e (seeds backdated convos across date
groups via a new db helper) that fails without the fix and passes with it.
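A minimal sketch of the invalidation idea, not the actual component code. `RowHeightCache` and `syncWidth` are hypothetical names standing in for a cache that, like CellMeasurerCache with fixedWidth, keys heights by row only. With react-virtualized the equivalent step would presumably be `cache.clearAll()` followed by `list.recomputeRowHeights()`.

```typescript
// Hypothetical stand-in for a row-height cache keyed by row index only.
class RowHeightCache {
  private heights = new Map<number, number>();
  private measuredWidth: number | null = null;

  /**
   * Drops every cached height when the list width differs from the last
   * width seen. Returns true when stale entries were actually discarded.
   */
  syncWidth(width: number): boolean {
    if (this.measuredWidth === width) {
      return false;
    }
    this.measuredWidth = width;
    const hadEntries = this.heights.size > 0;
    this.heights.clear();
    return hadEntries;
  }

  /** Returns the cached height for a row, measuring it on a cache miss. */
  get(row: number, measure: () => number): number {
    let height = this.heights.get(row);
    if (height === undefined) {
      height = measure();
      this.heights.set(row, height);
    }
    return height;
  }
}
```

Calling `syncWidth` from whatever reports the measured list width (a resize observer or auto-sizer callback, for example) keeps the check in one place, so a header measured mid-animation is re-measured once the final width is known.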
* 🧪 test: Harden sidebar e2e (runtime-env path, midnight-safe seed, convo isolation)
Addresses Codex review on PR #13981:
- db.ts honors E2E_RUNTIME_ENV_PATH when locating the runtime Mongo URI.
- Seed timestamps anchor on local noon so the Today group stays in-day near midnight.
- Clear the shared user's conversations before seeding so later date-group headers
are not pushed below the virtualized viewport by other specs' leftover chats.
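The noon anchoring can be sketched as follows. `seedTimestamp` is a hypothetical helper name, not the function in db.ts; the point is only that offsets are applied to local noon rather than to the current instant.

```typescript
/**
 * Returns a timestamp `daysAgo` days before today, anchored on local noon,
 * so a seed created at 23:59 still lands in the intended calendar day.
 */
function seedTimestamp(daysAgo: number, now: Date = new Date()): Date {
  const noon = new Date(now.getFullYear(), now.getMonth(), now.getDate(), 12, 0, 0, 0);
  // setDate handles month and year rollover for us.
  noon.setDate(noon.getDate() - daysAgo);
  return noon;
}
```

Seeding relative to `Date.now()` instead would let a test started seconds before midnight drift into the next day while it runs, which moves a conversation from Today to Yesterday.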
* 🚀 perf: Decouple Pinned Agents from Global Agents Map in Sidebar
Pinned/favorite agents in the sidebar waited for the full global agents map (useListAgentsQuery, which walks every pagination cursor) before rendering. In environments with many agents this left pinned items in a loading state even though their IDs were already known.
FavoritesList now fetches pinned agent IDs directly via getAgentById when the global map is still loading, and falls back to filtering only missing IDs once the map is available. The loading state tracks just the small set of pinned-agent queries instead of the entire catalog, so pinned agents appear as soon as their own data resolves.
Closes #13967
* 🩹 fix: Address Codex review on pinned-agent decoupling
- Stop caching the {found, agent} wrapper under the shared [QueryKeys.agent, id] key; direct fetches now return a plain Agent like useGetAgentByIdQuery, so opening/selecting a pinned agent within the stale window can no longer read a wrapper as an agent. Missing (404/403) agents are detected via the query error state.
- Gate the direct fetches on the agents endpoint being enabled, so pinned agents the endpoint list intentionally hides are not fetched, rendered, or cleaned up when the endpoint is disabled.
- Keep the loading skeleton while a direct fetch fails with a transient (non-404/403) error and the global agents map is still loading, so a pinned agent no longer disappears on a momentary 500/network error during startup.
- Remove the now-unused AgentQueryResult type.
* 🩹 fix: Address Codex round 2 on pinned-agent decoupling
- Keep the loading skeleton (not an empty/collapsed row) while the endpoints query is still loading. The endpoint gate previously treated the default empty config as disabled, so pinned-agent favorites rendered an empty row that could be measured and cached by the CellMeasurer before the config arrived. isAgentsLoading now stays true while isEndpointsLoading is true.
- Replace the blanket retry:false on direct pinned-agent fetches with a predicate that skips missing-agent (404/403) errors but still retries transient 500/network failures, restoring the prior default-retry resilience on the fast path.
- Add data-testid to the favorite skeleton and a regression test for the endpoints-loading window.
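A sketch of the retry predicate described above, written against the `(failureCount, error) => boolean` shape that React Query accepts for its `retry` option. The error shape (`response.status`) and the cap of three attempts are assumptions for illustration.

```typescript
// Assumed error shape: an HTTP client error carrying response.status.
interface HttpLikeError {
  response?: { status?: number };
}

// Assumption: mirror a default of three retries for transient failures.
const MAX_RETRIES = 3;

/** Skips retries for missing/forbidden agents, retries everything else. */
function shouldRetryPinnedAgentFetch(failureCount: number, error: unknown): boolean {
  const status = (error as HttpLikeError | null)?.response?.status;
  if (status === 404 || status === 403) {
    return false;
  }
  return failureCount < MAX_RETRIES;
}
```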
* 🛡️ fix: Don't delete pinned favorites on a global agents 403
GET /api/agents/:id runs the role-level AGENTS.USE check (checkAgentAccess) before the per-agent VIEW ACL, so a temporarily revoked role returns 403 for every agent. Because direct fetches now run while the agents map is undefined, treating those 403s as missing agents made the cleanup effect persist reorderFavorites and wipe all pinned agent favorites.
staleAgentIdsKey now returns early while agentsMap is undefined, restoring the original invariant that favorite cleanup only runs once the global map has loaded successfully (which also proves AGENTS.USE is granted). Rendering of pinned agents while the map loads is unaffected; only deletion is deferred.
Skill files were primed into the sandbox at `/mnt/data/{skillName}/...`,
but the read_file/create_file/edit_file tool descriptions and the
read_file bash-fallback hints all assume the `skills/{skillName}/...`
namespace (sandbox cwd is `/mnt/data`). Agents therefore reached for
`./skills/my-skill/...` in bash and missed ~100% of the time.
- Add shared `SKILL_FILE_PREFIX` to agents/skills.ts (moved out of
handlers.ts; single source of truth across the three layers).
- Prefix the prime upload filenames and session names with `skills/` in
skillFiles.ts so the physical mount matches the model-facing namespace;
recover the bare relativePath by stripping `skills/{name}/`.
- Canonicalize the read_file bash-fallback hints to
`/mnt/data/skills/{skillName}/{relativePath}` so the implicit
`{name}/...` addressing form is corrected too.
Closes #13961
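The path mapping can be sketched like this. The value of `SKILL_FILE_PREFIX` and the three helper names are assumptions; only the `skills/{skillName}/...` namespace and the `/mnt/data` cwd come from the description above.

```typescript
// Assumed value; the real constant lives in agents/skills.ts.
const SKILL_FILE_PREFIX = 'skills';
const SANDBOX_ROOT = '/mnt/data';

/** Name used when priming a skill file into the sandbox. */
function toPrimedName(skillName: string, relativePath: string): string {
  return `${SKILL_FILE_PREFIX}/${skillName}/${relativePath}`;
}

/** Recovers the bare relativePath by stripping `skills/{name}/`. */
function toRelativePath(skillName: string, primedName: string): string {
  const prefix = `${SKILL_FILE_PREFIX}/${skillName}/`;
  return primedName.startsWith(prefix) ? primedName.slice(prefix.length) : primedName;
}

/** Absolute path used in the read_file bash-fallback hints. */
function toSandboxPath(skillName: string, relativePath: string): string {
  return `${SANDBOX_ROOT}/${toPrimedName(skillName, relativePath)}`;
}
```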
Opening the agent editor fetched the full `versions` array (each a complete
config snapshot) alongside the agent, so agents with large histories were slow
to open. Version history is now loaded only when the user opens it.
- Add `getAgentWithVersionCount` (aggregation: version count, no versions array)
and `getAgentVersions` data-schemas methods.
- `getAgentHandler` returns the version count without the heavy array; add
`GET /agents/:id/versions` (EDIT-gated) for lazy retrieval.
- Add `useGetAgentVersionsQuery`; VersionPanel reads current config from the
cached expanded query and fetches versions on open. Revert keeps the expanded
cache and versions query in sync.
The getSharedLink query sorts by updatedAt, but the sharedlinks
collection had no updatedAt index. Azure Cosmos DB for MongoDB
(RU-based) rejects sorts on non-indexed fields, causing an immediate
500 on GET /api/share/link/:conversationId whenever a conversation is
opened. Standard MongoDB is unaffected.
* 🐛 fix: route clipboard paste through upload-option guards
Pasting a file skipped the composer's attachment guards, so unsupported types such as
csv and xlsx reached the provider as document blocks and were rejected. Paste, drag, and
the upload modal now share getViableUploadOptions to decide routing: zero viable
destinations shows a toast, one auto-routes, several open the upload-type modal.
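The zero/one/several rule can be sketched as a pure function. The option names and the `RoutingDecision` type are hypothetical; the real getViableUploadOptions result shape may differ.

```typescript
// Hypothetical destination names, for illustration only.
type UploadOption = 'provider' | 'text' | 'file_search' | 'code_interpreter';

type RoutingDecision =
  | { kind: 'toast' }
  | { kind: 'auto'; option: UploadOption }
  | { kind: 'modal'; options: UploadOption[] };

/** Maps the list of viable destinations to a single routing decision. */
function decideRouting(viable: UploadOption[]): RoutingDecision {
  const [first, ...rest] = viable;
  if (first === undefined) {
    return { kind: 'toast' }; // nothing can accept the file
  }
  if (rest.length === 0) {
    return { kind: 'auto', option: first }; // single destination: route directly
  }
  return { kind: 'modal', options: viable }; // let the user choose
}
```

Keeping the decision separate from the side effects (toast, upload, modal) is what lets paste, drag, and the modal share one code path.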
* 🐛 fix: key ephemeral agent state by NEW_CONVO in upload-option flow
useFileUploadRouter writes ephemeral capability state under
`conversationId ?? Constants.NEW_CONVO`, but useUploadOptions and DragDropModal
read it under `?? ''`, so on a new conversation the option resolver missed
capabilities enabled by auto-routing. Align the reads on Constants.NEW_CONVO.
* 🐛 fix: harden paste upload routing for assistants, custom endpoints, and toasts
Bypass option resolution for Assistants endpoints on paste, matching drag-and-drop,
so non-image assistant uploads use the assistants upload path instead of mis-routing
to context or the unsupported toast. Honor a custom endpoint's configured
supportedMimeTypes for direct provider attach instead of hardcoding image and PDF.
Stop asserting upload success before validation runs; the single-route notice is now
an informational "Attached as text" for the text-extraction case only.
* 🐛 fix: refine paste upload routing for direct chats, custom endpoints, and disabled uploads
Restore Code Interpreter and File Search options in direct and ephemeral chats by
defaulting their permissions to allowed unless a saved agent omits the tool; selecting
one still enables the ephemeral capability. Treat a custom endpoint as broad provider
support only when its file config is permissive (matching the file picker), so an
inherited default no longer offers zip/audio/video for direct attach. Short-circuit
paste with the disabled-upload error before resolving options or opening the modal.
* ✨ feat: Mirror send-path pruning in the over-window context estimate
For a snapshot-less branch whose tokens exceed the window, the send path
prunes oldest-first (getMessagesWithinTokenLimit), so the next call can sit
well under the window. The gauge previously clamped the full sum to 100%,
hiding that headroom. Add prunedBranchTokens — a newest->oldest walk that
keeps messages until the next would overflow the message budget (max minus the
summary baseline), mirroring the pruner — and use it on the estimate path in
place of the clamp. Approximation: omits the instruction/tool overhead and
tool-call pairing the real pruner accounts for (unknowable for a snapshot-less
branch); superseded by an exact snapshot once the branch is generated.
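A sketch of the newest-to-oldest walk, under simplifying assumptions: the input is a plain array of per-message token counts ordered oldest first, and the return value is only the kept message tokens (how the summary baseline is added back for display is not shown). The signature is illustrative, not the one in the codebase.

```typescript
/**
 * Walks the branch from newest to oldest and keeps messages until the next
 * one would overflow the message budget (maxTokens minus summaryBaseline).
 */
function prunedBranchTokens(
  tokenCounts: number[],
  maxTokens: number,
  summaryBaseline = 0,
): number {
  const budget = Math.max(0, maxTokens - summaryBaseline);
  let kept = 0;
  for (let i = tokenCounts.length - 1; i >= 0; i--) {
    const next = tokenCounts[i] ?? 0;
    if (kept + next > budget) {
      break; // this message and everything older is pruned
    }
    kept += next;
  }
  return kept;
}
```

For counts `[3000, 4000, 2000, 1500]` and an 8000-token window, the walk keeps the three newest messages (7500) and drops the oldest, so the gauge shows headroom instead of a clamped 100%.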
* ✨ feat: Reserve cached instruction/tool overhead in the snapshot-less estimate
The over-window prune mirror and the gauge couldn't account for the fixed
instruction + tool-schema overhead the next call always sends, because a
snapshot-less branch has no breakdown. The backend already emits that overhead
in the ON_CONTEXT_USAGE breakdown, so cache it per agent/model (keyed
endpoint::model::agentId, already inclusive of tool schemas) from the live
usage events, then reserve it from the prune budget and add it to used so the
estimate is consistent with snapshots. Falls back to message-only until the
agent has run once this session. Surfaced as a System row in the estimate
breakdown.
* 🩹 fix: Address Codex review on the over-window estimate
- Key the overhead cache by agentId when present. useTokenLimits resolves an
agent to its real provider/model, so the reader keyed `provider::model::agent`
while the writer stored `agents::::agent` — a cache miss for the main agents
case. Both sides now resolve to `agent:<id>` (non-agent configs: endpoint:model).
- Skip the overhead reserve when a summary baseline exists: computeSummaryUsedTokens
already folds instruction/tool overhead into that marker, so adding it again
double-counted on summarized branches.
- Collapse the breakdown's input/output/estimated rows into one pruned Messages
row when over-window pruning ran, so the popover matches the gauge instead of
summing to the discarded pre-prune history.
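The key fix in the first bullet can be sketched as a single resolver used by both the writer and the reader. `overheadCacheKey` and its input shape are hypothetical names; the `agent:<id>` and `endpoint:model` formats are the ones stated above.

```typescript
interface OverheadKeyInput {
  endpoint: string;
  model: string;
  agentId?: string | null;
}

/** Resolves the overhead cache key; agentId wins when present. */
function overheadCacheKey({ endpoint, model, agentId }: OverheadKeyInput): string {
  return agentId ? `agent:${agentId}` : `${endpoint}:${model}`;
}
```

Because the agent ID alone decides the key, it no longer matters that one side sees the `agents` endpoint with an empty model while the other sees the resolved provider and model.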
The message-edit route recomputed a user message's tokenCount from the edited
`text` alone, ignoring its persisted `quotes`. But the send path re-prepends
those quotes into the prompt on every turn (mergeQuotedText), so after editing a
quoted message the stored tokenCount under-reported by the whole quote block,
skewing the context gauge and any other tokenCount consumer.
The full-recount path now fetches the message's quotes and counts the merged
text+quotes via a new `mergeQuotedTextForCount` helper in packages/api (mirrors
the send path), so the stored count stays authoritative. The incremental
content-part path is left as-is: it deltas only the edited part and preserves the
rest of the count (incl. the quote contribution), and applies to content-array
messages rather than text+quotes user turns.
Deferred follow-up from #13953.
* ♻️ refactor: Compute Context Gauge Client-Side, Drop Projection Endpoint
The /api/endpoints/context-projection endpoint re-fetched a conversation's
messages from Mongo and re-tokenized them to project the context gauge for
snapshot-less branches. The browser already holds those messages and their
per-message tokenCounts, so this duplicated work on the request path (an
unbounded read + server-side BPE tokenization until it was later capped).
Move the snapshot-less estimate fully client-side, from the in-memory index:
- sumBranch accumulates an uncalibrated char/4 estimate (estTokens) for
count-less messages (imports / pre-feature) under the same summary cutoff
- useTokenUsage folds estTokens (calibrated via the existing calibrationFamily
ratio) into the existing fallback; known per-message counts render unchanged
- delete the endpoint, controller, rate limiter, route, the getMessageTextStats
data-schemas method, and the data-provider surface (endpoint/key/type/service/query)
No DB read, no server tokenization, no rate-limit knobs; the gauge recomputes
reactively from the index. Net -793 lines.
* 🩹 fix: Count quotes and object-form content in client context estimate
Address Codex review on the client-side context estimate:
- messageChars now reads object-form content text (part.text.value), not
only string text/think, so imported / pre-feature messages whose body
lives in content parts are no longer estimated as zero.
- Count-less user messages include their merged quote excerpts in the
estimate, mirroring what the send path prepends into the prompt.
* 🩹 fix: Cap over-window estimate and surface estimated tokens in breakdown
Address remaining Codex review on the client-side context estimate:
- Clamp the snapshot-less estimate's displayed usedTokens to maxTokens. The
send path prunes an over-window branch before calling the model, so the
gauge never actually exceeds the window; this avoids impossible values
(e.g. 50k / 8k) without re-introducing client-side pruning.
- Surface the calibrated count-less estimate as its own "Estimated" row in
the breakdown popover, so a branch of only count-less imported / pre-feature
messages is no longer shown as Input 0 / Output 0 under a non-zero header.
* 🩹 fix: Refine client context estimate per Codex re-review
- Drop calibration from the snapshot-less estimate. The removed projection
never actually calibrated (the client never sent a ratio), and a ratio
inflated by provider-injected context over-estimates visible imported text.
- Exclude reasoning (think) / error parts from the estimate; the send path
strips them, so they are not part of the next call's context.
- Fold quote text into the estimate even when a tokenCount is present, since
the edit route recounts tokenCount from text only and drops the merged quote.
* 🩹 fix: Recount quoted user turns instead of topping up the stored count
The previous round added quote chars on top of a quoted message's stored
tokenCount, which double-counts the common (unedited) case where the count
already includes the merged quote prompt. Match the removed projection
instead: for quoted user turns, ignore the stored count and estimate the
full merged text. This both avoids the double-count and still corrects the
stale text-only count an edit leaves behind.
* 🩹 fix: Trust stored counts for quoted turns; count tool-call parts
- Quoted user turns: revert to trusting a present tokenCount. The send path's
stored count already includes the merged quote (and any calibration), and
the client's char/4 path is coarser, so recounting regressed normal turns.
Only count-less messages estimate quotes from text.
- Count tool-call name/args/output for count-less assistant messages; the
formatter sends them back as context, so omitting them under-reported
imported branches with tool history.
* 🩹 fix: Exclude in-flight tail from estimate to avoid resume double-count
On resume the live path seeds liveTokens from the partial response and also
writes that content into the messages cache, where the count-less response
is estimated into estTokens too — double-counting the in-flight output on the
snapshot-less estimate path. sumBranch now exposes the tail message's own
estimate (tailEstTokens); the estimate path drops it while a stream is live,
so the in-flight response is counted once (via liveTokens). The breakdown's
Estimated row uses the same in-flight-adjusted value.
* 🩹 fix: Recount quoted user turns in context estimate (match send path)
A quoted user turn's stored tokenCount is unreliable for the gauge: a
text-only Save edit recomputes it from text alone, and the send path
(needsCanonicalTokenCount in agents/client.js) recounts the quote-merged
prompt every turn regardless of the stored value. Mirror that on the client
— estimate quoted turns from the merged text+quotes and ignore the stored
count — so snapshot-less branches don't under-report by the quote block.
Reverts the earlier "trust the count" assumption, which the server disproves.
* 🧹 chore: Route useResumableSSE diagnostics through the frontend logger
Convert the [ResumableSSE]/[Debug] console.log and console.error diagnostics
to the gated frontend `logger` (client/src/utils/logger), splitting the tag
from the message so object arguments are passed through as real args (logged
expandably, not stringified) and the logs stay tag-filterable and off the
production console unless explicitly enabled. All log statements preserved;
nothing removed.
* 🩹 fix: Prefer content over text when estimating count-less messages
A stopped agent response is saved with both a `text` field and a structured
`content` array, and the send path formats from content. messageChars
early-returned on `text`, dropping the content array (and the tool-call tokens
it carries) from the snapshot-less estimate — also making the tool_call
handling dead for such messages. Prefer content when present, fall back to text.
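A rough sketch of the count-less estimate after this series of fixes, with heavily simplified message and part shapes (all type names here are assumptions). It prefers `content` over `text`, reads object-form text, counts tool-call name/args/output, and skips think/error parts. Quote handling is omitted, and rounding up after dividing by four is an assumption.

```typescript
type TextValue = string | { value: string };

// Simplified, hypothetical part shapes.
type Part =
  | { type: 'text'; text: TextValue }
  | { type: 'think'; think: string }
  | { type: 'error'; error: string }
  | { type: 'tool_call'; tool_call: { name?: string; args?: string; output?: string } };

interface Msg {
  text?: string;
  content?: Part[];
}

function partChars(part: Part): number {
  switch (part.type) {
    case 'text':
      return typeof part.text === 'string' ? part.text.length : part.text.value.length;
    case 'tool_call': {
      const { name = '', args = '', output = '' } = part.tool_call;
      return name.length + args.length + output.length;
    }
    default:
      return 0; // think and error parts are stripped by the send path
  }
}

/** Prefers the structured content array; falls back to plain text. */
function messageChars(message: Msg): number {
  if (message.content && message.content.length > 0) {
    return message.content.reduce((sum, part) => sum + partChars(part), 0);
  }
  return message.text?.length ?? 0;
}

/** Uncalibrated char/4 estimate for a count-less message. */
function estimateTokens(message: Msg): number {
  return Math.ceil(messageChars(message) / 4);
}
```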
* 🔑 fix: Honor User-Provided MCP API Key Instead of Forcing OAuth
OAuth auto-detection probes the server without credentials and treats a
`WWW-Authenticate: Bearer` 401 as an OAuth requirement. A static bearer
API-key server answers an unauthenticated probe with the same challenge,
so servers configured with "API Key / each user provides their own / Bearer"
were misclassified as `requiresOAuth: true` and connected via the OAuth path,
ignoring the user's saved key (status stuck yellow, tool calls demand OAuth).
The API-key exemption in detection was scoped to `source === 'admin'` only.
Broaden it to any `apiKey` config in both detection sites (inspector startup
detection and runtime placeholder-URL detection), since API-key and OAuth auth
are mutually exclusive in the schema.
* 🔒 fix: Skip inspection probe for user API keys; honor explicit OAuth
Addresses two Codex findings on the API-key OAuth-detection fix:
- Skip the capability probe during inspection when apiKey.source is 'user'.
The user's key is supplied per-user at connect time, so an unauthenticated
probe at create/update would 401 against a bearer server and fail the save
(servers are inspected on the raw, pre-transform config with no auth header).
Same treatment already applied to customUserVars/obo/OAuth servers.
- Only short-circuit detection to non-OAuth when no explicit 'oauth' block is
configured, so an explicit OAuth config takes precedence if both are set.
Applied to both detection sites for consistency.
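The two gates can be sketched as predicates over the server config. The config shape and both function names are assumptions for illustration; only the `apiKey.source` values and the explicit-`oauth`-wins rule come from the notes above.

```typescript
// Hypothetical, trimmed-down view of an MCP server config.
interface McpServerConfig {
  apiKey?: { source: 'admin' | 'user' };
  oauth?: Record<string, unknown>;
}

/**
 * True when OAuth auto-detection should be skipped and the server treated
 * as non-OAuth: an API key is configured and no explicit oauth block exists.
 */
function skipsOAuthDetection(config: McpServerConfig): boolean {
  return config.apiKey !== undefined && config.oauth === undefined;
}

/**
 * True when the capability probe at create/update should be skipped because
 * the key is only available per-user at connect time.
 */
function skipsInspectionProbe(config: McpServerConfig): boolean {
  return config.apiKey?.source === 'user';
}
```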
DataTable.spec failed with "Too many re-renders" (35 tests). Root cause: @tanstack/react-virtual is measurement-driven, and jsdom has no real layout, so its re-render loop never converges. This went unnoticed because packages/client had no jest CI job (only the client workspace runs jest in frontend-review.yml).
- DataTable: only read the virtualizer (getVirtualItems/getTotalSize) when virtualization is active; the non-virtualized branch renders rows directly, so engaging it for small tables was wasted render-phase work.
- Spec: mock @tanstack/react-virtual, since jsdom can't exercise real virtualization layout.
- Add a test:ci script to @librechat/client and a Tests: @librechat/client CI job so packages/client specs run on every frontend PR.
* ⬆️ chore: Migrate off deprecated @ariakit/react-core to @ariakit/react-components
@ariakit/react-core and its dependency @ariakit/core are deprecated (split into successor packages) and emit install-time warnings. @ariakit/react already ships the non-deprecated @ariakit/react-components transitively; the only direct use of react-core was the SelectRenderer deep import in ControlCombobox, which is now sourced from @ariakit/react-components/select/select-renderer (identical symbol and subpath). Both deprecated packages drop out of the lockfile and react-components dedupes to the single version @ariakit/react pins.
* ✅ test: Resolve ESM-only @ariakit split packages in jest
@ariakit/react-components and its peers are ESM-only (type: module) and declare only an import export condition, so jest's CJS resolver can't load them when @librechat/client's CJS build requires SelectRenderer. Add a custom jest resolver that resolves these @ariakit/* split packages with the import condition, and extend transformIgnorePatterns so babel transpiles them to CJS. Applied to both the client and packages/client jest configs.
* 🧠 feat: Memory Agent Capability with Inline Tools and Ephemeral Badge
Add `AgentCapabilities.memory`, which expands into the inline set_memory/delete_memory tool pair (mirroring the execute_code expansion via registerMemoryTools) when a run-level memoryAvailable gate holds: capability enabled, memory configured, MEMORIES.USE permission, and personalization not opted out. Surfaces the memory artifact as an attachment in the agents tool-end callback.
Adds the ephemeral path (TEphemeralAgent.memory, load/added agent tool injection), a fully-gated memory badge plus tools-dropdown entry, the agent-builder Memory toggle with form round-trip, and a mock e2e test asserting the badge reaches the request payload. Additive to and independent of the existing post-turn memory extraction agent.
* 🩹 fix: Address Codex review on memory capability (gating, validKeys, usage guard)
- Strip the memory capability from the served agents capabilities when memory is not configured/enabled, so the badge, tools dropdown, agent-builder toggle, and backend capability gate stay consistent instead of exposing an inert toggle on default installs (where MEMORIES.USE defaults true).
- Surface configured memory.validKeys in the inline tool definitions so the model is told the allowed keys up front, matching the runtime createMemoryTool schema.
- Append a strict explicit-request usage guard to the agent instructions when inline memory tools are registered, preserving the memory-agent's privacy behavior.
- Add AppService tests covering memory-capability stripping.
* ✅ test: Update AppService capability snapshots for memory strip
AppService now strips the memory capability from the served agents defaults when no memory block is configured; update the spec's expected capability lists to defaultAgentCapabilitiesWithoutMemory for the no-memory-config cases.
* 🛡️ fix: Address Codex re-review on memory capability (round 2)
- Strip the memory capability from the FINAL served agents config, not just defaults; loadEndpoints reparses any endpoints.agents block, so memory was still exposed in that common shape (packages/data-schemas/src/app/service.ts) + regression test.
- Re-check the full memory gate (config, opt-out, MEMORIES.USE) inside handleTools before constructing set_memory/delete_memory, so an unsolicited tool call from a model/custom endpoint can't bypass the runtime gates (api/app/clients/tools/util/handleTools.js).
- Restore the persisted memory toggle for model-spec conversations via applyModelSpecEphemeralAgent (client/src/utils/endpoints.ts).
- Clear LAST_MEMORY_TOGGLE_ on logout and clear-all-chats so a stale memory preference can't leak across users on a shared browser (client/src/utils/localStorage.ts).
* 🧠 fix: Address Codex re-review on memory capability (round 3)
- Serialize set_memory writes and advance a running token total inside createMemoryTool, so parallel batched calls in one event-driven turn can't each pass the limit check against a stale total and collectively exceed memory.tokenLimit (packages/api/src/agents/memory.ts) + tests.
- Inject the keyed memory context (withKeys) instead of withoutKeys when the running agent has the inline memory capability, so delete_memory has a visible key to target (api/server/controllers/agents/client.js).
* 🔐 fix: Address Codex re-review on memory capability (round 4)
- Detect inline memory by tool NAME (set_memory/delete_memory) across an initialized agent's tools + toolDefinitions, since the 'memory' marker is expanded at init and the prior string check never matched; inject the keyed memory context for any primary OR sub-agent that carries the inline memory tools (api/server/controllers/agents/client.js).
- Enforce memory WRITE permissions in the inline tool gate: set_memory requires CREATE+UPDATE and delete_memory requires UPDATE (matching the REST memory routes), so a USE-only role can't mutate/delete memories via agent tool calls (api/app/clients/tools/util/handleTools.js).
* 🔒 fix: Address Codex re-review on memory capability (round 5)
- Gate inline memory registration (memoryAvailable) on the memory WRITE permissions (USE+CREATE+UPDATE), so a read-only-memory role no longer has set_memory/delete_memory shown to the model only for the runtime loader to refuse them (api/server/services/Endpoints/agents/initialize.js).
- Enforce the per-agent memory opt-in at execution: handleTools now refuses to construct set_memory/delete_memory unless the agent actually declared them (toolDefinitions/tools), blocking hallucinated/undeclared memory tool calls from mutating memory.
- Fail closed when getFormattedMemories errors with a configured tokenLimit, instead of writing as if storage were empty and bypassing the cap (api/app/clients/tools/util/handleTools.js).
* 🩹 fix: Address Codex re-review on memory capability (round 6)
- Fix a P1 regression from the prior round: the execution-context agent keeps the raw 'memory' capability marker (not the expanded set_memory/delete_memory names), so the opt-in check now matches the marker. This restores memory writes/deletes AND avoids hijacking an MCP tool that merely shares the set_memory/delete_memory name (api/app/clients/tools/util/handleTools.js).
- Count repeated set_memory writes to the same key as replacements, not additions, against tokenLimit — set_memory upserts, so a same-key rewrite swaps its prior token contribution instead of double-counting (packages/api/src/agents/memory.ts) + test.
- Gate the memory badge, tools dropdown, and agent-builder toggle on the full memory write permissions (USE+CREATE+UPDATE) via a shared useHasMemoryAccess hook, so a read-only-memory role no longer sees an enabled Memory control the backend would refuse to wire up.
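The replacement accounting from the second bullet, combined with the running total introduced in round 3, can be sketched as below. `MemoryBudget` is a hypothetical class, not the createMemoryTool implementation, and it tracks token counts only.

```typescript
/** Tracks per-key token usage so same-key rewrites replace, not add. */
class MemoryBudget {
  private perKey = new Map<string, number>();
  private total = 0;
  private tokenLimit: number;

  constructor(tokenLimit: number, existing: Record<string, number> = {}) {
    this.tokenLimit = tokenLimit;
    for (const [key, tokens] of Object.entries(existing)) {
      this.perKey.set(key, tokens);
      this.total += tokens;
    }
  }

  /** Returns false (and changes nothing) when the write would exceed the limit. */
  trySet(key: string, tokens: number): boolean {
    const prior = this.perKey.get(key) ?? 0;
    const next = this.total - prior + tokens; // swap out the old contribution
    if (next > this.tokenLimit) {
      return false;
    }
    this.perKey.set(key, tokens);
    this.total = next;
    return true;
  }

  get used(): number {
    return this.total;
  }
}
```

Because each accepted write advances `total` before the next check runs, several writes in one turn are checked against the up-to-date figure rather than a stale snapshot.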
* 🧷 fix: Address Codex re-review on memory capability (round 7)
- Recognize inline memory across both execution-context agent shapes: initializeAgent now sets a LibreChat-only memoryToolsRegistered flag on the InitializedAgent, and the opt-in/detection checks accept that flag OR the raw 'memory' marker. Fixes memory failing for processAddedConvo agents (which store the initialized config, marker already expanded) while staying MCP-name-collision-safe (api/app/clients/tools/util/handleTools.js, packages/api/src/agents/initialize.ts, api/server/controllers/agents/client.js).
- Scope keyed memory context to memory-enabled agents only: useMemory now returns both keyed and unkeyed contexts, and buildMessages injects the keyed one (memory keys + token metadata) only to agents that can call delete_memory, while the primary/post-turn path keeps the unkeyed values — so a primary without memory tools no longer sees memory keys it doesn't need.
* 🔏 fix: Address Codex re-review on memory capability (round 8)
- Enforce memory size limits on inline writes: createMemoryTool now rejects keys over 1000 chars and values over memory.charLimit, matching the REST memory routes, so an inline-memory agent can't persist blobs the memory UI/API would reject (packages/api/src/agents/memory.ts, api/app/clients/tools/util/handleTools.js) + test.
- Recheck the agents 'memory' endpoint capability at execution time, so a stale/hallucinated set_memory/delete_memory call can't mutate memory after an admin removes the capability while the agent document still carries the marker (api/app/clients/tools/util/handleTools.js).
* ♻️ refactor: Move inline-memory backend logic into packages/api + share memory load
Workspace boundary: the inline-memory gating/detection logic that had crept into /api now lives in packages/api/src/agents/memory.ts (TS), with /api kept as thin wrappers.
- Add agentHasInlineMemoryTools, isMemoryToolAllowed, and buildInlineMemoryTool to packages/api; handleTools.js now calls buildInlineMemoryTool instead of constructing/gating the tools inline, and client.js imports agentHasInlineMemoryTools instead of redefining it.
- Optimize repeated memory loads: getRequestMemories memoizes getFormattedMemories per request (WeakMap keyed by req), so the run's memory-context load and every memory-enabled agent's set_memory token-usage load share a single DB fetch instead of one per agent.
* 🧠 fix: Invalidate request memory cache after inline writes
Inline set_memory/delete_memory now invalidate the request-scoped
getFormattedMemories cache on a successful write, so a later tool round
in the same response is seeded with the post-write usage total instead
of the stale pre-write one (multi-round writes no longer collectively
exceed tokenLimit, and a set after a delete is not over-counted). The
within-round sharing across multiple memory-enabled agents is preserved.
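Request-scoped memoization with invalidation on write can be sketched with a WeakMap keyed by the request object. `createRequestCache` is a hypothetical helper, not getRequestMemories itself; the loader is passed in so the sketch stays self-contained.

```typescript
type Req = object;

/** Memoizes one async load per request; entries vanish with the request. */
function createRequestCache<T>(load: (req: Req) => Promise<T>) {
  const cache = new WeakMap<Req, Promise<T>>();
  return {
    /** Shares a single in-flight or resolved load per request. */
    get(req: Req): Promise<T> {
      let pending = cache.get(req);
      if (!pending) {
        pending = load(req);
        cache.set(req, pending);
      }
      return pending;
    },
    /** Called after a successful write so the next read refetches. */
    invalidate(req: Req): void {
      cache.delete(req);
    },
  };
}
```

Caching the promise rather than the resolved value means agents that ask concurrently within one round still share a single fetch, while `invalidate` after a set or delete forces the next round to see post-write totals.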
* 🧠 fix: Persist memory capability on saved agents; honor registration flag
- Add Tools.memory to the v1 systemTools allowlist so filterAuthorizedTools
no longer silently drops the memory marker when an agent with the Memory
capability is created/updated/duplicated through the builder (previously
the capability only worked for ephemeral chats, not persisted agents).
- agentHasInlineMemoryTools now honors an explicit memoryToolsRegistered
boolean before falling back to the raw `memory` marker, so an initialized
config whose registration was denied (memoryAvailable false) is not given
keyed memory context just because the marker survives in tools.
* 🧩 fix: Bring memory tool to parity with other ephemeral tools
- Add `memory` to the model-spec schema/type and honor `modelSpec.memory`
in both ephemeral paths (load.ts, added.ts) and the frontend spec
application, so admins can pre-enable Memory from a model spec exactly
like webSearch/fileSearch/executeCode.
- Add LAST_MEMORY_TOGGLE_ to the timestamped-storage cleanup list so stale
per-conversation memory toggles are purged on startup like the others.
- Hide the agent-builder Memory toggle for users who disabled memory in
personalization (memories === false), mirroring the chat badge's opt-out
gate, so the setting isn't shown as inert/misleading.
* ✅ test: Cover memory in applyModelSpecEphemeralAgent spec defaults
Update the exact-object assertions to include the new `memory` field and
add positive coverage that `modelSpec.memory` maps to the ephemeral
agent's `memory` flag. Fixes the shard 2/4 failure from 672a03b05.
* 🪝 feat: HITL Tool Approval Scaffolding
Adds the foundational types, job-state, config schema, and policy module
for human-in-the-loop tool approval. Purely additive — no behavior change
on existing runs. Lands ahead of the agents-SDK interrupt/checkpointer
integration so both tracks can land independently.
- LangChain HumanInterrupt-shaped types in `Agents.*` namespace
(`HumanInterruptPayload`, `ToolApprovalRequest`, `ToolReviewConfig`,
`PendingAction`, `ToolApprovalResolution`); `ToolCall`/`ToolCallDelta`
gain an optional `approval` field.
- New `requires_action` job status (non-terminal) plus `pendingAction`
field on `SerializableJobData` and `GenerationJobMetadata`. Both stores
treat the status as paused-but-alive; Redis `updateJob` has explicit
`requires_action`/`running` transition branches that refresh the hash
TTL, manage the `runningJobs` set, and `HDEL pendingAction` on resume.
Both stores include `requires_action` in `getActiveJobIdsByUser`.
- `GenerationJobManager` gains `markRequiresAction`, `getPendingAction`,
`clearPendingAction`; `getJobCountByStatus` aggregates the new status.
- `endpoints.agents.toolApproval` config (`default`/`required`/`excluded`)
and a policy module exporting `decideToolApproval`, `requiresApproval`,
and `buildPendingAction` (the LangChain-shaped payload builder).
- 20 unit tests covering policy resolution and the manager lifecycle.
* 🧭 refactor: Align HITL Surface with Agents SDK Permissions Model
Reshapes Slice A on top of the agents SDK's now-landed HITL surface
(`createToolPolicyHook`, discriminated `HumanInterruptPayload`, `'bypass'`
mode naming). Host stops reimplementing evaluation logic and becomes a
config mapper + payload wrapper.
Schema (data-provider):
- `toolApproval` shape now mirrors SDK `ToolPolicyConfig` 1:1:
`mode: 'default' | 'dontAsk' | 'bypass'`, plus `allow` / `deny` / `ask`
glob lists and an optional `reason` template. `enabled` is the
LibreChat-only admin kill switch.
- `'bypass'` (not `'bypassPermissions'`) — matches the SDK's surface.
Types (`Agents.*` namespace):
- `HumanInterruptType` extended to `'tool_approval' | 'ask_user_question'`.
- `HumanInterruptPayload` is now a discriminated union — `tool_approval`
carries `action_requests` + `review_configs`; `ask_user_question`
carries a free-form question with optional curated options.
- New: `AskUserQuestionRequest`, `AskUserQuestionOption`,
`AskUserQuestionResolution`.
- `ToolApprovalDecision` (string union) renamed to
`ToolApprovalDecisionType` to free the `Decision` name for the SDK's
discriminated object union later.
- `ToolApprovalResolution` gains `reason?` and `scope?: 'once' | 'session'
| 'always'` so route signatures stabilize before persistence lands.
Policy module (`packages/api/src/agents/hitl/policy.ts`):
- Drop `decideToolApproval` / `requiresApproval` / `ToolRef` — the SDK's
`createToolPolicyHook` handles full evaluation
(`deny → bypass → allow → ask → dontAsk → fallthrough(ask)`).
- Add `isHITLEnabled(policy)` — the kill-switch predicate that gates the
SDK's `humanInTheLoop: { enabled: false }` opt-out in Slice B.
- Add `mapToolApprovalPolicy(policy)` — strips `enabled`, returns a
`ToolPolicyConfig` to feed `createToolPolicyHook`. Structural mirror of
the SDK type so this compiles before the SDK upgrade ships.
- Reshape `buildPendingAction(payload, ctx)` to wrap any
`HumanInterruptPayload` with job context — accepts SDK output directly.
- Add `buildToolApprovalPayload(...)` and `buildAskUserQuestionPayload(...)`
helpers for synthesizing payloads in tests / pre-SDK flows.
Tests:
- 22 new unit tests covering the mapper, predicate, and payload builders;
20 → 27 total pass across policy + manager-lifecycle suites.
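A minimal sketch of the mapper and kill-switch predicate described above. The interfaces are a hypothetical local mirror of the config shape listed in this commit, not the SDK's actual `ToolPolicyConfig`, and the "on only when `enabled` is explicitly true" default is an assumption.

```typescript
// Hypothetical local mirror of the shapes described above. The real
// `ToolPolicyConfig` comes from the agents SDK and may differ.
type ToolPolicyMode = 'default' | 'dontAsk' | 'bypass';

interface ToolPolicyConfig {
  mode?: ToolPolicyMode;
  allow?: string[];
  deny?: string[];
  ask?: string[];
  reason?: string;
}

// Host-side config: the SDK fields plus the LibreChat-only kill switch.
interface ToolApprovalPolicy extends ToolPolicyConfig {
  enabled?: boolean;
}

// Assumption: HITL is on only when `enabled` is explicitly true.
function isHITLEnabled(policy?: ToolApprovalPolicy): boolean {
  return policy?.enabled === true;
}

// Strip the host-only `enabled` flag and pass every other field through.
function mapToolApprovalPolicy(policy: ToolApprovalPolicy): ToolPolicyConfig {
  const { enabled, ...sdkPolicy } = policy;
  void enabled; // intentionally discarded
  return sdkPolicy;
}
```

Keeping the mapper a pure field strip means the host carries no evaluation logic of its own; ordering and precedence stay with the SDK hook.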
* 🪢 chore: Import ToolPolicyConfig From `@librechat/agents`
The SDK type now ships in 3.1.77 (already pinned on `dev`), so the
structural mirror in `policy.ts` is redundant. Drop the local interface
and import directly so future SDK changes to `ToolPolicyConfig` propagate
without our `mapToolApprovalPolicy` going stale.
* 🔑 fix: Carry tool_call_id On ToolReviewConfig (HITL)
`ToolReviewConfig` was joining with `ToolApprovalRequest` by position only.
That breaks the moment a single batch contains the same tool called twice
(e.g. a model fanning out parallel `mcp:server:search` calls): the UI can't
tell which review config applies to which action request once it filters
or reorders.
Mirrors the SDK's `ToolApprovalReviewConfig` shape — `tool_call_id` is the
join key, `action_name` is retained for display only.
Also: add a JSDoc warning on `isHITLEnabled` so a future contributor doesn't
wire `humanInTheLoop: { enabled: true }` without supplying a host
checkpointer — the SDK's `MemorySaver` fallback is process-local and
silently breaks resume across worker hops.
- `Agents.ToolReviewConfig` adds `tool_call_id: string`
- `buildToolApprovalPayload` populates `tool_call_id` per review config
- New test covers the duplicate-tool batch case (two parallel calls to
the same tool); 27 → 28 tests
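To illustrate why the join key matters, here is a sketch that pairs requests with review configs by `tool_call_id` rather than by position. Only `tool_call_id` and `action_name` come from this commit; the other fields and the `pairByToolCallId` helper are hypothetical.

```typescript
// Simplified, hypothetical shapes; only tool_call_id and action_name are
// named in the commit above.
interface ToolApprovalRequest {
  tool_call_id: string;
  name: string;
}

interface ToolReviewConfig {
  tool_call_id: string;
  action_name: string; // display only, not a join key
  allowed_decisions: string[];
}

interface ReviewPair {
  request: ToolApprovalRequest;
  config: ToolReviewConfig | undefined;
}

// Join by id, so filtering or reordering either list cannot mis-pair them.
function pairByToolCallId(
  requests: ToolApprovalRequest[],
  configs: ToolReviewConfig[],
): ReviewPair[] {
  const byId = new Map<string, ToolReviewConfig>();
  for (const config of configs) {
    byId.set(config.tool_call_id, config);
  }
  return requests.map((request) => ({ request, config: byId.get(request.tool_call_id) }));
}
```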
* fix: Address HITL review findings
* fix: Refresh paused HITL Redis state
* test: Stabilize HITL abort fallback specs
* 🎨 style: Sort imports to satisfy dev lint gate (HITL)
* 🏛️ refactor: Deepen HITL approval lifecycle into one race-safe seam
Architecture-review candidate #1 (+ #4). The requires_action lifecycle was
three shallow pass-throughs over updateJob with the legal transitions
smeared across JSDoc, the JobStatus union, and each store adapter — and the
resume transition was NOT race-safe: the Redis Lua script checked existence, not
status, so two concurrent approval submits both drove the run (re-executing
tools / double-billing).
- IJobStore.transitionStatus: atomic compare-and-set status transition that
only fires if the job is currently `from`. InMemory: sync compare. Redis:
single-node Lua script with a status guard (cluster best-effort, matching the
existing posture); reconciles membership sets + TTLs to `to`.
- New ApprovalLifecycle module: pause / peek / resolve / expire — guarded,
race-safe transitions behind one interface. resolve() returns true to
exactly one concurrent caller; the previously-undefined
requires_action → aborted expiry edge is now explicit; peek treats
past-expiresAt as gone (lazy expiry).
- GenerationJobManager exposes `approvals` and delegates; the three shallow
methods (mark/get/clearPendingAction) are removed — callers cross the deep
interface.
- #4: typeContract.spec asserts the SDK <-> data-provider HITL types stay
compatible (fails the build on drift); RedisJobStore validates the
pendingAction shape on deserialize instead of a bare JSON.parse (defends
the cold-resume path against malformed/stale records).
- Tests rewritten at the deep interface: double-resolve wins once,
pause-on-terminal rejected, explicit expiry, lazy-expiry peek.
No Slice B wiring — this deepens the existing scaffolding so the future
resume route and run seam are born crossing one race-safe interface.
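The compare-and-set idea can be sketched with an in-memory store. `InMemoryJobsSketch` and its job shape are hypothetical; only the `transitionStatus(from, to)` contract (fire only if the job is currently `from`) is taken from the description above.

```typescript
type JobStatus = 'running' | 'requires_action' | 'complete' | 'aborted';

// Hypothetical in-memory store showing only the status guard.
class InMemoryJobsSketch {
  private jobs = new Map<string, { status: JobStatus }>();

  create(id: string, status: JobStatus = 'running'): void {
    this.jobs.set(id, { status });
  }

  // Compare-and-set: succeeds only if the job is currently in `from`.
  // The check and the write run in one synchronous step, so no other
  // caller can interleave between them in a single-threaded runtime.
  transitionStatus(id: string, from: JobStatus, to: JobStatus): boolean {
    const job = this.jobs.get(id);
    if (job === undefined || job.status !== from) {
      return false;
    }
    job.status = to;
    return true;
  }

  statusOf(id: string): JobStatus | undefined {
    return this.jobs.get(id)?.status;
  }
}
```

With this guard, two approval submits for the same paused job both call `transitionStatus(id, 'requires_action', 'running')`, and only the first returns true.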
* 🛡️ fix: Address Codex review on the HITL approval lifecycle
Seven findings on the lifecycle deepening (089ba09f9), all valid:
- F3 actionId guard: resolve/expire take an expectedActionId; pause records a
flat `pendingActionId` the atomic CAS guards on, so a stale decision can't
resume a job that has since paused for a different action.
- F4 cluster single-winner: transitionStatus now decides the winner with an
atomic CAS on the single-slot job hash (one Lua, cluster-safe), then
reconciles cross-slot membership sets — two concurrent resolves can no
longer both win on Redis Cluster.
- F1 resume reaping: resolve refreshes `lastActiveAt`; both stores' stale-
running failsafes key off it, so a long-paused approval isn't reaped right
after resuming.
- F2 expire completedAt: expire writes completedAt so terminal cleanup
reclaims the job (InMemory only cleans terminal jobs with completedAt set).
- F5 facade: buildJobFacade copies pendingAction into metadata so status/
resume routes can render the prompt.
- F6 resume metadata: PendingAction + buildPendingAction carry the SDK
interruptId/threadId needed to rebuild Command({ resume }) cross-process.
- F7 mirror: data-provider AskUserQuestionRequest gains optional description.
Tests added at the interface: stale-actionId resolve rejected, expire sets
completedAt. tsc + lint clean; policy + type-contract specs pass.
* 🛡️ fix: Address Codex round 2 on the HITL Redis adapter
Five P2 findings on abf4b86291, all valid Redis-adapter consequences of
round 1:
- G1 terminal cleanup on expiry: transitionStatus's terminal path now runs
the same chunk/run-step/userJobs cleanup as updateJob (extracted into a
shared applyTerminalContentCleanup). Expired approvals no longer leave
Redis stream contents around for the full running TTL.
- G2 pause via updateJob mirrors pendingActionId, so a pause through the
generic path carries the flat field the stale-decision guard compares.
- G3 resume via updateJob refreshes lastActiveAt (and clears pendingActionId),
matching transitionStatus so a long-paused job isn't reaped post-resume.
- G4 getActiveJobIdsByUser excludes a requires_action job whose pendingAction
is past expiry (both stores), via shared isPendingActionExpired — the client
stops polling an expired prompt.
- G5 createJob clears stale pendingAction/pendingActionId/lastActiveAt on a
reused streamId, so a fresh run never exposes a prior run's approval metadata
and cleanup keys off the new createdAt.
Tests added: expired pending-approval excluded from the active set. tsc +
lint clean; policy + type-contract specs pass.
* 🛡️ fix: Address Codex round 3 — approval expiry lifecycle completeness
Three P2 findings on 780833d908, all valid:
- H1 status consistency: /chat/status now treats a non-expired
requires_action job as active (matching /chat/active), so a client
refreshing while an approval is pending resumes/subscribes instead of
treating the run as finished and stranding it.
- H2 active expiry: cleanup now finalizes past-expiry requires_action jobs
(→ aborted) in both stores instead of only filtering them from the active
list — an expired prompt no longer lingers resident until key TTL. Redis
routes through transitionStatus (terminal content cleanup); in-memory marks
terminal + reclaims.
- H3 resumed liveness: in-memory stale-running check uses
max(lastActivity, lastActiveAt, createdAt), so a just-resumed job isn't
reaped on a stale per-chunk lastActivity entry before the next chunk.
Test added: in-memory cleanup finalizes + reclaims a past-expiry approval.
tsc + lint clean; policy + type-contract specs pass.
* 🛡️ fix: Address Codex round 4 — paused-job edge cases across the stack
Five P2 findings on 4324a4e776, all valid:
- I1 message validation: validateMessageReq's active-job read bypass now
accepts a live requires_action job, so a new-conversation run that pauses
before its final save can recover the prompt instead of 404ing.
- I2 expire targets the observed record: resolve()'s expired path passes
`expectedActionId ?? job.pendingAction.actionId`, so a concurrent
resume+re-pause can't let expire abort a different action.
- I3 stale/malformed prompts: new isPendingActionStale (missing OR expired)
drives active-listing exclusion + cleanup expiry in both stores, and the
status route + middleware require a live pendingAction — a requires_action
job whose pendingAction was dropped on deserialize no longer reads active.
- I4 in-memory parity: InMemory updateJob mirrors pendingActionId on pause and
clears it + refreshes lastActiveAt on resume (matching RedisJobStore), so a
pause via the generic path is still resolvable by actionId.
- I5 long approval windows: paused-job live TTL (job/chunks/run-steps) now
covers pendingAction.expiresAt + grace (pauseTtlSeconds), on both the
transitionStatus and updateJob pause paths, so Redis can't evict a paused
job before its decision window closes.
tsc + lint clean; policy + type-contract specs pass.
* 🛡️ fix: Codex round 5 — refuse unresolvable resolves; expose pending action
Two of three findings on c8abd826e1 (the third deferred to Slice B):
- J3 resolve() refuses a requires_action job that has lost its pendingAction
(e.g. a malformed record dropped on deserialize): it expires/finalizes the
job instead of driving a resumed run with no reviewed interrupt payload —
consistent with how active-listing + cleanup already treat a stale prompt.
- J2 /chat/status returns the live pendingAction for a paused stream, so a
client rebuilding from status (reload / cross-replica) has the action id +
payload to render and submit the prompt, not just "paused".
Deferred (Slice B): J1 — emitting a terminal SSE event on approval expiry so
already-subscribed clients close. The store-level lifecycle can't emit
transport events, and there are no live SSE subscribers to a paused stream
until the Slice B runtime wiring exists; tracked for that work.
tsc + lint clean; policy + type-contract specs pass.
* 🛡️ fix: Codex final round — paused-job TTL + pendingAction in resume contract
Two of three findings on e7d9cf21b6 (third deferred to Slice B):
- K2 paused-job TTL: a paused (requires_action) job no longer inherits the
20-minute running TTL — it uses a dedicated requires_action backstop
(default 24h, configurable) so a no-expiry approval (the buildPendingAction
default), which the API treats as live, isn't evicted by Redis mid-window.
A longer pendingAction.expiresAt still extends beyond the backstop.
- K3 resume contract: pendingAction is now carried on the typed ResumeState
(data-provider) and populated by getResumeState for a live paused job, so a
reloading / cross-replica client can rebuild the prompt from resumeState
(the contract useResumeOnLoad actually reads), not just a loose status field.
Deferred (Slice B): K1 — emit a terminal SSE event on expiry so already-
subscribed clients close. Requires the manager/eventTransport layer (the
store-level lifecycle and cleanup loops have no transport access) and has no
live subscriber until the Slice B subscribe/resume path exists; tracked there.
tsc + lint clean; policy + type-contract specs pass.
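One way to read the paused-job TTL rule is as a maximum of the backstop and the decision window. This is a sketch under stated assumptions: `expiresAt` is taken to be epoch milliseconds, the 60-second grace is invented for illustration, and `pausedJobTtlSeconds` is a hypothetical name.

```typescript
const DEFAULT_BACKSTOP_SECONDS = 24 * 60 * 60; // the 24h default mentioned above
const GRACE_SECONDS = 60; // hypothetical grace period

// TTL for a paused job: never shorter than the backstop, and long enough to
// cover a later `expiresAt` plus grace.
function pausedJobTtlSeconds(
  expiresAt: number | undefined,
  now: number,
  backstopSeconds: number = DEFAULT_BACKSTOP_SECONDS,
): number {
  if (expiresAt === undefined) {
    return backstopSeconds; // no-expiry approval: the backstop alone applies
  }
  const untilExpiry = Math.ceil((expiresAt - now) / 1000) + GRACE_SECONDS;
  return Math.max(backstopSeconds, untilExpiry);
}
```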
* ♻️ refactor: dedup HITL transition path + liveness predicate (arch review)
Two follow-ups from the post-hardening architecture re-review — both pure
dedup, no behavior change:
A — collapse the dual status-transition path. transitionStatus is now the
sole membership-aware transition (running ⇄ requires_action). Removed the
updateJob requires_action/running branches and the now-orphaned
transitionToRequiresAction / transitionToRunning / refreshLiveJobTtls, plus
the per-store pause/resume mirror logic that had to be re-synced into parity
across review rounds (G2/G3/I4/I5). updateJob is back to a plain field
writer + terminal cleanup. The Redis integration tests that drove
updateJob({status}) now drive transitionStatus (the real path).
B — one canonical "is this approval live?" predicate. isPendingActionStale /
isPendingActionExpired are exported from @librechat/api and used by the
stores, ApprovalLifecycle (dropped its private isExpired), the /chat/status
route, and validateMessageReq — replacing 3 inlined re-derivations that were
the drift source behind several review findings.
tsc + lint clean; policy + type-contract specs pass. Redis integration specs
(migrated) are CI-verified.
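A sketch of what a single liveness predicate pair could look like, following the "missing OR expired" definition given in round 4. The `PendingActionLike` shape, the epoch-millisecond `expiresAt`, and the inclusive boundary are assumptions.

```typescript
// Hypothetical minimal shape; the real PendingAction carries more fields.
interface PendingActionLike {
  actionId: string;
  expiresAt?: number; // assumed epoch milliseconds; absent means no expiry
}

// Expired: an expiry is set and has passed. A no-expiry action never expires.
function isPendingActionExpired(
  action: PendingActionLike | undefined,
  now: number = Date.now(),
): boolean {
  return action?.expiresAt !== undefined && action.expiresAt <= now;
}

// Stale: missing (for example dropped on deserialize) OR expired.
function isPendingActionStale(
  action: PendingActionLike | undefined,
  now: number = Date.now(),
): boolean {
  return action === undefined || isPendingActionExpired(action, now);
}
```

Every caller that asks "is this approval live?" can then use `!isPendingActionStale(action)` instead of re-deriving the check.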
Adds a "Continue this chat" button to the shared conversation view that forks
the shared conversation into a new conversation owned by the viewer and opens it
to continue (issue #13001).
- POST /api/share/:shareId/fork, gated by requireJwtAuth, the fork rate
limiters, and the canAccessSharedLink ACL (view access = fork access).
- forkSharedConversation clones from the anonymized getSharedMessages payload,
so only share-visible data is copied.
- Strips file ids from cloned files/attachments so a fork grants no more file
access than viewing the read-only share, and honors the global shared-file
kill switch via the snapshotFiles option.
- Reduces the clone to the viewer's active branch, located by its index in the
shared payload (shared ids are re-anonymized per request and createdAt can
collide, while the payload order is stable).
- Resolves config/retention, persists, and reads back under the requesting
user's tenant, not the share owner's; canAccessSharedLink also falls back to
a system-wide share lookup so cross-tenant public shares resolve (ACL still
enforced under the share's own tenant).
- Resolves a usable endpoint/model from the viewer's models config instead of
hard-coding OpenAI, so deployments without OpenAI can send the first message.
- Routes the fork's 401s (logged-out or cold-loaded viewers) through login,
including when the refresh itself is rejected for a stale session.
- Hides the Temporary Chat toggle once a conversation has a real id, and
portals the share-settings theme/language dropdowns above the dialog.
Rebased onto dev; collapses the share-fork feature and its review fixes into a
single commit.
* feat: add terms acceptance timestamp tracking and migration script
* feat: update migration script to use countUsers method for user count
* Update config/migrate-terms-timestamp.js
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* feat: enhance terms acceptance response to include acceptance timestamp
* fix: make terms acceptance idempotent and fail migration on partial errors
Preserve the original termsAcceptedAt on repeat accepts within a terms
cycle so retried or duplicate requests no longer overwrite the first
acceptance time. Exit the migration script with a non-zero status when
any per-user update fails so partial failures are not reported as
successful.
* style: fix import ordering in data-provider mutations
* refactor: record terms acceptance atomically to preserve first-accept time
Replace the read-then-write in acceptTermsController with a single
atomic acceptTerms method that conditionally stamps termsAcceptedAt via
an $ifNull aggregation update. This removes the TOCTOU window where two
concurrent first-time accepts could overwrite the earlier acceptance
timestamp, while still preserving an existing timestamp and backfilling
legacy accepted users.
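For illustration, an update pipeline of roughly this shape would give the preserve-first-timestamp behavior described above. Using `$$NOW` as the fallback is an assumption (the real method may pass an explicit date), and `applyAcceptTerms` is a hypothetical plain-TypeScript model of the stage, not driver code.

```typescript
// Aggregation-pipeline update: `$ifNull` keeps an existing timestamp and
// falls back to the current time only when the field is null or missing.
function buildAcceptTermsUpdate(): Array<Record<string, unknown>> {
  return [
    {
      $set: {
        termsAccepted: true,
        termsAcceptedAt: { $ifNull: ['$termsAcceptedAt', '$$NOW'] },
      },
    },
  ];
}

interface TermsState {
  termsAccepted?: boolean;
  termsAcceptedAt?: Date | null;
}

// Plain model of what that stage does to a single document.
function applyAcceptTerms(user: TermsState, now: Date): TermsState {
  return {
    ...user,
    termsAccepted: true,
    termsAcceptedAt: user.termsAcceptedAt ?? now,
  };
}
```

Because the condition lives inside the single update, there is no gap between reading the old timestamp and writing the new one.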
* fix: run terms timestamp migration under system tenant context
Wrap the count, cursor scan, and per-user updates in runAsSystem so the
tenant isolation plugin does not throw under TENANT_ISOLATION_STRICT or
scope the cross-tenant migration to a non-existent tenant, matching the
other maintenance migrations.
* fix: guard terms backfill against concurrent acceptances
Add the missing-timestamp predicate to the per-user updateOne filter so
a user who accepts through the API between the cursor read and the write
keeps their real acceptance time instead of being overwritten with
createdAt. Track modified vs skipped so the summary reflects skips.
* fix: scope terms backfill to still-accepted users
Add termsAccepted: true to the per-user updateOne filter so a reset that
clears acceptance between the cursor read and the write is not re-stamped
with createdAt, which would otherwise poison the next acceptance cycle
through the $ifNull preserve in acceptTerms.
* 🛡️ fix: Prevent ReDoS in YouTube URL extraction for URL Context
The YouTube detection/strip regexes ran as a single global pass over
authenticated, user-controlled chat text. The engine could restart at every
`youtube.com/watch?` occurrence and the lazy `\S*?&` rescanned the rest of a
long non-whitespace token each time, giving quadratic CPU behavior that blocks
the Node event loop (DoS) for Google/Vertex agents with url_context enabled.
- Tokenize on whitespace and skip tokens longer than a real URL, and cap the
total text scanned, so work is bounded to O(n). URLs never contain whitespace,
so per-token matching is equivalent.
- Replace the lazy unbounded `(?:\S*?&)?` with the delimiter-bounded
`(?:[^\s&]*&)*` (no behavior change for real URLs).
- Apply the same discipline to the strip path.
- Add ReDoS regression tests; a 3MB crafted input now completes in <10ms.
* 🛡️ fix: Bound the YouTube strip scan by the same total budget
Address Codex P1: the strip path applied only the per-token cap, so a valid URL
followed by many sub-cap malformed tokens still regex-scanned the entire message
(~1s on 3MB). Injected ids only come from the first MAX_YOUTUBE_SCAN_CHARS
(extraction's cap), so a link beyond that is never in injectedIds anyway; cap the
strip scan at the same budget and leave the tail verbatim. 3MB PoC: ~1s -> ~14ms.
* 🧬 fix: Make YouTube URL matching linear instead of capping the scan
The previous fix bounded the scan with per-token + total-scan caps, but the
total-scan cap discarded content: a URL near the end of a long prompt was missed
(extraction sliced to 100k), and large prepended file/quote context exhausted the
strip budget before the real URL (strip skipped it). Codex round 2 (P2 x2).
Replace the backtracking-prone matcher with a linear one: a single regex captures
host + path/query (greedy `[^\s]*`, bounded `{1,63}`/`{0,10}` subdomain repetition,
no lazy/ambiguous quantifier), and the video id is parsed from the capture
afterwards. This is O(n) over arbitrary input, so the scan caps (and the content
they discarded) are removed entirely. Extraction and stripping now scan the whole
message linearly.
Benchmarks (no caps): 3MB attack token ~3ms, 3MB many-token ~4ms, valid URL at end
of 3MB found in ~18ms. Adds regression tests for long-prompt extraction and
stripping past large prepended context.
* 🔡 fix: Match adjacent + capitalized YouTube URLs after linear rewrite
Codex round 3 (regressions from the linear matcher):
- Stop the path capture at URL-list delimiters (`,` `)` `]` `<` `>`, none of which
occur in a real YouTube URL) so adjacent links in one token (comma-separated or
markdown `](url1)](url2)`) are matched separately instead of swallowed.
- Lowercase the path segment before matching route names, since the detection regex
is case-insensitive (`/WATCH?v=`, `/EMBED/`).
* 🔒 fix: Allowlist URL chars + bounded path parsing for YouTube matching
Codex round 4:
- Replace the path stop-char blocklist with an allowlist of characters that occur
in real YouTube URLs, so adjacent links separated by any prose delimiter
(`;`, `|`, etc.) are matched separately instead of swallowed.
- Parse the route with anchored, bounded regexes instead of `path.split('/')`, so a
malformed path of millions of slashes no longer allocates a huge array / blocks
the event loop. Also bounds the `v=` param read.
* 🎯 fix: Restrict YouTube matcher to recognized video routes
Codex round 5: a nested video URL inside an unrecognized YouTube URL
(`youtube.com/redirect?q=https://youtu.be/<id>`) was swallowed by the greedy
match and missed. Restrict the matcher to recognized single-video forms
(youtu.be/<id>, /(shorts|live|embed|v)/<id>, /watch?<query>) so an unrecognized
route doesn't match and the global scan continues into the nested link. Stays
linear (verified: 3MB redirect/slash/host floods all <25ms) and keeps the
allowlist tail so adjacent links still split. Adds nested-URL + unrecognized-route
regression tests.
* 🎬 fix: Find nested watch links + skip malformed v= duplicates
Codex round 6 (P3 watch-query edges):
- Drop `:` from the path allowlist. It never occurs in a real YouTube path/query,
but `://` of a nested URL does — so `watch?url=https://youtu.be/<id>` now stops
the watch match and the scan finds the nested link.
- Scan every `v=` param and return the first valid 11-char id, so a malformed
earlier `v=` (e.g. `watch?v=tooShort&v=<valid>`) no longer shadows a later valid one.
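The "scan every `v=` param" rule can be sketched as below. The function name and both regexes are illustrative assumptions, not the matcher used in the fix; the sketch takes only the query string (the part after `?`).

```typescript
// A YouTube video id is 11 characters from this alphabet.
const VIDEO_ID = /^[A-Za-z0-9_-]{11}$/;

// Return the first `v=` value that is a valid 11-char id, skipping
// malformed earlier duplicates.
function firstValidVideoId(query: string): string | null {
  // `[^&#]*` is greedy and delimiter-bounded, so each character is
  // visited once and the scan stays linear in the query length.
  const param = /(?:^|&)v=([^&#]*)/g;
  let match: RegExpExecArray | null;
  while ((match = param.exec(query)) !== null) {
    const candidate = match[1] ?? '';
    if (VIDEO_ID.test(candidate)) {
      return candidate;
    }
  }
  return null;
}
```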
* 🧹 fix: Strip whole YouTube URL incl. colon-containing trailing params
Codex round 7: dropping `:` from the tail (round 6) made the strip path stop mid-URL
on a URL-valued param (`watch?v=<id>&next=https://example.com`), leaving `://example.com`
orphaned. Use a separate strip matcher whose tail re-includes `:` so the whole URL token
is removed, while detection keeps the `:`-excluded tail to still find nested video links.
Also corrects a stale "per-token cap" comment left over from the linear rewrite.
The "Add to chat" popup lingered over an empty caret after a selection collapsed through a path that fires no mouse/key event — most often a streaming markdown re-render replacing the selected text node. The selection state only updated on mouseup/dblclick/keyup/scroll/resize, so a silent collapse left the button stranded ("showing up with nothing selected").
Add a `selectionchange` listener that hides the popup the instant the selection collapses or empties. It only hides, never shows, so an in-progress drag-select still won't flicker the popup.
Adds an e2e that collapses the selection without a mouse event and asserts the popup disappears.
* 🛡️ fix: Guard Prompts and Mention popovers against empty-result navigation
* 🛡️ fix: Prevent Tab default and clear stale filter on empty popover close
* ✨ feat: Add Google url_context Param with Native YouTube Video Understanding
Mirror the web_search grounding wiring for a new Google/Gemini `url_context`
model param (resolves to the native `urlContext` tool). When enabled, YouTube
URLs in the latest user message are injected as Gemini video parts (fileData),
since the URL Context tool does not support YouTube.
* 🎞️ fix: Provider-aware YouTube injection limits for url_context
Address Codex review on the YouTube video-understanding path:
- Cap injected YouTube parts per request by provider/model (Vertex: 1; Gemini
Developer API: 10 on 2.5+, 1 on earlier models) so multi-link messages cannot
exceed the provider limit and get rejected.
- Set a video/mp4 mimeType on Vertex YouTube fileData (matching Vertex samples);
the Developer API still omits it.
* 🧩 fix: Round-trip url_context for Google-compatible custom endpoints
Add url_context to openAIBaseSchema so the per-chat value persists for custom
endpoints configured with customParams.defaultParamsEndpoint: 'google', matching
how web_search is already picked there.
* 🚦 fix: Gate url_context tool to Gemini 2.5+ models
Per Google's URL Context supported-models list (2.5+/3.x only), skip the native
urlContext tool on earlier models (debug-log + no-op) instead of sending it and
triggering a provider 400. This also gates the coupled YouTube video-understanding
injection to 2.5+, since it keys off the resolved urlContext tool.
* ✂️ fix: Strip YouTube URLs from urlContext text; keep url_context out of OpenAI schema
- Remove url_context from the shared openAIBaseSchema (revert): it is Google-only
and would otherwise leak as an unsupported param to OpenAI/Azure/OpenRouter
requests. On Google-compatible custom endpoints url_context is enabled via admin
addParams/defaultParams, same as web_search.
- When injecting YouTube video parts, strip the matched YouTube URLs from the prompt
text so the urlContext tool (which reads URLs from text and cannot fetch YouTube)
does not consume its URL budget on them. Non-YouTube URLs are left intact.
* 🎯 fix: Refine url_context model gating and YouTube injection edges
Address Codex round 4:
- Exclude non-text modality variants (image/live/tts) from URL Context support,
mirroring the Google tool-combination modality exclusion.
- Use the resolved run model (model_parameters.model) for YouTube injection limits
instead of the saved base model.
- Strip only the YouTube links actually routed to video (id-aware); keep over-limit
links in the text so the model can still reason about them.
- Keep timestamped YouTube links (?t=/&start=) in the text so the moment cue survives.
- Recognize youtube-nocookie.com/embed links.
* 🎚️ fix: Exclude audio Gemini variants + preserve pre-id YouTube timestamps
Address Codex round 5:
- Add `audio` to the url_context modality exclusion so audio-only Gemini variants
(e.g. gemini-2.5-flash-preview-native-audio-dialog) skip the tool instead of 400ing.
- Detect YouTube timestamps anywhere in the matched URL (incl. before `v=`, e.g.
watch?t=90&v=<id>), so timestamped links are kept in the prompt text as intended.
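The model gating described across these url_context commits (2.5+ only, non-text modality variants excluded) could be approximated as follows. The regexes, the `supportsUrlContext` name, and the assumption that model ids look like `gemini-<major>[.<minor>]-...` are all illustrative.

```typescript
// Modality markers named in the commits above: image, live, tts, audio.
const NON_TEXT_VARIANT = /(?:image|live|tts|audio)/i;
// Assumed id shape: "gemini-<major>" with an optional ".<minor>".
const GEMINI_VERSION = /gemini-(\d+)(?:\.(\d+))?/i;

function supportsUrlContext(model: string): boolean {
  if (NON_TEXT_VARIANT.test(model)) {
    return false; // non-text variants skip the tool
  }
  const match = GEMINI_VERSION.exec(model);
  if (match === null) {
    return false; // unknown naming: stay conservative
  }
  const major = Number(match[1] ?? '0');
  const minor = Number(match[2] ?? '0');
  // Compare integers rather than floats so "2.10" would not read as 2.1.
  return major > 2 || (major === 2 && minor >= 5);
}
```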
* 🧠 feat: Configurable Reasoning Replay for Custom Endpoints
Adds customParams.includeReasoningContent so OpenAI-compatible custom endpoints (e.g. Xiaomi MiMo, Kimi) can replay reasoning_content on tool-call turns natively, without impersonating the moonshot provider.
* 🔁 feat: Replay reasoning_content across turns for opted-in custom endpoints
Extends the DeepSeek reasoning-content format spoof to honor customParams.includeReasoningContent, so custom OpenAI-compatible endpoints (Xiaomi MiMo, Kimi) reconstruct reasoning_content from persisted history on later turns, matching DeepSeek thinking-mode parity. Adds shouldReplayReasoningContent predicate (tested) and surfaces the flag on the initialized agent.
* 🪢 refactor: Split within-run vs cross-turn reasoning replay flags
moonshot only replays reasoning_content within a run's tool calls, not across turns. Decouples the two: includeReasoningContent = within-run replay (exact moonshot parity), new includeReasoningHistory = cross-turn reconstruction from persisted history (implies includeReasoningContent, since reconstruction is a no-op without the within-run replay flag).
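A small sketch of how the two flags relate, assuming a hypothetical `resolveReasoningReplay` helper; only the flag names and the "history implies content" rule come from the commit above.

```typescript
interface ReasoningParams {
  includeReasoningContent?: boolean;
  includeReasoningHistory?: boolean;
}

interface ReasoningReplay {
  withinRun: boolean; // replay reasoning_content on a run's tool-call turns
  crossTurn: boolean; // reconstruct reasoning_content from persisted history
}

// includeReasoningHistory implies includeReasoningContent, because the
// cross-turn reconstruction does nothing without the within-run replay.
function resolveReasoningReplay(params: ReasoningParams): ReasoningReplay {
  const crossTurn = params.includeReasoningHistory === true;
  const withinRun = crossTurn || params.includeReasoningContent === true;
  return { withinRun, crossTurn };
}
```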
* 🩹 fix: Apply reasoning replay across all param-format branches
Move the within-run includeReasoningContent application out of the OpenAI-only branch in getOpenAIConfig to after the branch dispatch, so custom endpoints using anthropic/google defaultParamsEndpoint gateway modes also honor includeReasoningContent/includeReasoningHistory. Addresses Codex finding.
* chore: Update @librechat/agents to v3.2.46
* 🧽 refactor: De-spoof reasoning replay via explicit preserveReasoningContent
Now that @librechat/agents 3.2.46 exposes an explicit preserveReasoningContent option on formatAgentMessages, pass it directly instead of impersonating provider: deepseek. Behavior is unchanged (shouldReplayReasoningContent still gates DeepSeek + the custom includeReasoningHistory flag); also corrects the comment to reference includeReasoningHistory.
* 🌳 fix: Walk subagents in the reasoning-history replay gate
The gate only checked the primary agent and top-level handoff/parallel configs, so an opted-in custom endpoint used solely as a nested subagent had its persisted reasoning dropped on later turns. New exported anyAgentReplaysReasoningContent walks subagentAgentConfigs (cycle-safe, mirrors anyAgentHasCodeEnv); client.js uses it. Addresses Codex finding.
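A cycle-safe walk of nested subagents might look like the sketch below. `AgentConfigLike`, its `replaysReasoning` flag, and the function name are hypothetical; only the `subagentAgentConfigs` field name comes from the commit above.

```typescript
// Hypothetical minimal config shape.
interface AgentConfigLike {
  id: string;
  replaysReasoning?: boolean;
  subagentAgentConfigs?: AgentConfigLike[];
}

// Depth-first walk with a visited set, so a subagent graph that loops
// back on itself terminates instead of recursing forever.
function anyAgentReplaysReasoning(roots: AgentConfigLike[]): boolean {
  const seen = new Set<AgentConfigLike>();
  const stack: AgentConfigLike[] = [...roots];
  while (stack.length > 0) {
    const agent = stack.pop();
    if (agent === undefined || seen.has(agent)) {
      continue;
    }
    seen.add(agent);
    if (agent.replaysReasoning === true) {
      return true;
    }
    stack.push(...(agent.subagentAgentConfigs ?? []));
  }
  return false;
}
```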
Otherwise, it's possible for a config to override the `isValidAgentId` check.
Without that check, it's possible to query `getAgentById()` with a blank `agent_id`,
which can result in polluting the `QueryKeys.agent` cache with a full list of agents
(instead of just a single agent result).
* 🐛 fix: Prevent Infinite Render Loop on Code-Execution File Preview
Loading a conversation that contains a large (>1MB) code-execution
office file crashed the whole app with React error #185 ("Maximum
update depth exceeded") on hard refresh.
Root cause (client-only): the terminal-write effect in
useAttachmentPreviewSync writes the resolved preview record back into
messageAttachmentsMap with a fresh object identity on every run, and
`attachment` is in the effect's dependency array. useAttachments
re-derives `attachment` ({...db, ...liveEntry}) with a new identity on
every map write, so once polling resolves (pending -> ready on a loaded
conversation) the effect ping-pongs forever:
setAttachmentsMap -> re-derive -> effect -> setAttachmentsMap.
Only files large/slow enough to defer extraction are persisted at
status: 'pending', which is why small documents never triggered it.
Fix: an idempotency gate that bails before setAttachmentsMap when the
merged attachment already carries the resolved status/text/textFormat/
previewError. The write happens once and then settles.
Tests:
- useAttachmentPreviewSync.loop.spec.tsx wires the real
useAttachments -> hook feedback to reproduce the loop (verified to
throw #185 without the gate, settle with it).
- e2e/specs/mock/attachment-preview-loop.spec.ts loads a conversation
with a pending code-exec attachment whose preview resolves ready and
asserts the app does not crash.
Closes #13916
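The idempotency gate can be modeled without React. In this sketch `PreviewFields`, `alreadyCarriesPreview`, and `countWrites` are hypothetical names; the loop stands in for the effect re-running each time the map write produces a new `attachment` identity.

```typescript
// The four fields the gate compares, per the fix description.
interface PreviewFields {
  status?: string;
  text?: string | null;
  textFormat?: string | null;
  previewError?: string | null;
}

// True when the merged attachment already holds the resolved preview, so
// writing it again would change only object identity, not content.
function alreadyCarriesPreview(current: PreviewFields, resolved: PreviewFields): boolean {
  return (
    current.status === resolved.status &&
    current.text === resolved.text &&
    current.textFormat === resolved.textFormat &&
    current.previewError === resolved.previewError
  );
}

// Model of the effect loop: each write creates a fresh object, which would
// re-trigger the effect. The gate stops it after the first write.
function countWrites(resolved: PreviewFields, maxRuns = 50): number {
  let attachment: PreviewFields = { status: 'pending' };
  let writes = 0;
  for (let run = 0; run < maxRuns; run++) {
    if (alreadyCarriesPreview(attachment, resolved)) {
      break; // bail before the state write
    }
    attachment = { ...attachment, ...resolved }; // fresh identity
    writes++;
  }
  return writes;
}
```

Comparing field values rather than object identity is what lets the write settle, since the re-derived `attachment` is a new object on every pass.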
* 🔧 feat: Make Office Preview Extraction Cap Configurable (default 2MB)
The inline code-execution preview extraction ceiling was a hardcoded 1MB
constant (MAX_TEXT_EXTRACT_BYTES). Office/text artifacts over that skip
the inline preview and resolve to "Preview unavailable" (download-only).
Make it configurable via FILE_PREVIEW_MAX_EXTRACT_BYTES and raise the
default to 2MB so larger documents get an inline preview out of the box.
The rendered HTML remains independently capped at MAX_TEXT_CACHE_BYTES
(512KB), so image-heavy files over that still fall back to the existing
"preview too large" banner rather than rendering unbounded output.
- resolveMaxTextExtractBytes(env) parses the override, falling back to
2MB on missing/non-numeric/non-positive values (warns on invalid).
- Documented in .env.example next to the other file-size limits.
- Unit tests cover default, valid override, fractional flooring, and
invalid fallback.
* 🐛 fix: Guard sub-byte preview cap from flooring to zero
A fractional FILE_PREVIEW_MAX_EXTRACT_BYTES in (0, 1) passed the
positive-number check then floored to 0, making MAX_TEXT_EXTRACT_BYTES
zero and treating every non-empty artifact as oversized. Floor first,
then require the result to be >= 1 byte before accepting it; otherwise
fall back to the 2 MB default. Adds coverage for the sub-byte case.
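The floor-then-validate order can be sketched as follows. This is an illustrative reimplementation, not the shipped function: it assumes 2 MB means 2 * 1024 * 1024 bytes and omits the warning the real resolver logs on invalid values.

```typescript
const DEFAULT_MAX_TEXT_EXTRACT_BYTES = 2 * 1024 * 1024; // assumed meaning of "2 MB"

function resolveMaxTextExtractBytes(env: Record<string, string | undefined>): number {
  const raw = env.FILE_PREVIEW_MAX_EXTRACT_BYTES;
  if (raw === undefined || raw.trim() === '') {
    return DEFAULT_MAX_TEXT_EXTRACT_BYTES;
  }
  // Floor first, then validate: a value in (0, 1) floors to 0 and is
  // rejected here instead of becoming a zero-byte cap.
  const floored = Math.floor(Number(raw));
  if (!Number.isFinite(floored) || floored < 1) {
    return DEFAULT_MAX_TEXT_EXTRACT_BYTES;
  }
  return floored;
}
```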
* ✅ test: Make exported-ceiling assertion env-independent
The "exported ceiling" assertion compared MAX_TEXT_EXTRACT_BYTES to a
literal 2 MB, but that const is initialized from
FILE_PREVIEW_MAX_EXTRACT_BYTES at module load — so the suite would
falsely fail when run with the override set. Assert the export tracks
resolveMaxTextExtractBytes(env) for the current environment instead; the
undefined-case test continues to pin the 2 MB default.
* 🖱️ fix: Summon Quote Popup on Double-Click Word Selection
Chromium commits a double-click word selection on the `dblclick` event, after `mouseup` has already read a still-collapsed range, so the "Add to chat" popup never appeared for double-click selections. Listen for `dblclick` in addition to `mouseup`/`keyup`.
Adds an e2e covering a native double-click word selection (measured-coordinate dblclick exercises the real browser path, unlike the programmatic-Range helper).
* 🎯 test: Target Reply Text Node in Double-Click Quote E2E
Walk to the text node containing the needle (not the first text node in .message-render, which may be a select-none screen-reader/model-label header) and measure the needle's first character, so the native double-click lands on the reply word rather than metadata.
* fix: withhold custom endpoint headers for user URLs
* fix: require user key for user custom URLs
* test: type custom endpoint header cases
* fix: prompt for keys on user custom URLs
Resolve the new-chat default spec from the most recent conversation setup
(LAST_CONVO_SETUP_0) instead of reconstructing intent from accumulated
cross-endpoint history. Removes hasStoredModelValue, hasStoredPrefixValue,
hasStoredModelSelection, the sticky LAST_SPEC read, the nested
resolveSoftDefault closure, and the duplicated prioritize/modelSelect branches.
Fixes the soft default being dropped on New Chat ("Select a model") when its
preset endpoint sits outside modelSpecs.addedEndpoints alongside a custom
endpoint: a model lingering in LAST_MODEL for that endpoint no longer
suppresses the soft default.
Clear All Chats now also clears LAST_SPEC/LAST_MODEL/LAST_TOOLS so a new chat
afterward cleanly returns to the soft default. Adds the cross-endpoint unit
case, a clearAllConversationStorage test, and a cold-load e2e regression test.
* fix: Demote user abort logging
* fix: Handle abort causes
* fix: Demote user-aborted agent completion to debug log
The error users still saw originated in AgentClient's completion catch,
which logged every caught error (including user aborts) at error level
before checking the abort signal. Branch on abortController.signal.aborted
so user-initiated aborts log at debug while real failures stay error-classified.
Also give the handleAbortError it.each cases distinct titles.
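A rough sketch of the branch described above, assuming a logger with `debug` and `error` methods. The function name `logCompletionError`, the two interfaces, and the message strings are hypothetical and only illustrate the check on `signal.aborted`.

```typescript
// Hypothetical minimal shapes; a real AbortSignal and logger would satisfy them.
interface AbortSignalLike {
  aborted: boolean;
}

interface LoggerLike {
  debug(message: string, err?: unknown): void;
  error(message: string, err?: unknown): void;
}

// Pick the log level from the abort signal before logging anything.
function logCompletionError(
  err: unknown,
  signal: AbortSignalLike,
  logger: LoggerLike,
): 'debug' | 'error' {
  if (signal.aborted) {
    // User-initiated abort: expected, so keep it out of the error log.
    logger.debug('Completion aborted by user', err);
    return 'debug';
  }
  // Anything else is a real failure and stays at error level.
  logger.error('Completion failed', err);
  return 'error';
}
```

Checking the signal before logging, rather than after, is what keeps user aborts from being recorded as errors.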
* fix: require admin panel session secret
* 🩹 fix: Plain-Expand Admin SESSION_SECRET So Compose Maintenance Commands Run
The `${VAR:?}` required form fails interpolation for every deploy-compose
subcommand (down/pull/config), breaking `npm run update:deployed` for installs
whose .env predates ADMIN_PANEL_SESSION_SECRET. Plain expansion keeps those
commands working; the admin-panel image fails fast on an empty secret, so the
panel still refuses to start without it.
* feat: add useKeyboardShortcuts hook and showShortcutsDialog atom
Implements the core keyboard shortcuts hook with 11 shortcuts:
- General: new chat, focus input, copy last response
- Navigation: toggle sidebar, model selector, search, settings
- Chat: stop generating, scroll to bottom, temporary chat, copy code
Also adds the showShortcutsDialog atom to control dialog visibility.
Closes #3664
* feat: add KeyboardShortcutsDialog component
Renders a modal dialog listing all available keyboard shortcuts
grouped by category (General, Navigation, Chat). Features:
- Platform-aware key labels (⌘ on Mac, Ctrl on others)
- Clean kbd-style key badges with subtle shadows
- Grouped sections with separators
- Sticky footer with shortcut to open the dialog itself
- Single close button, Escape to dismiss
* feat: integrate keyboard shortcuts into Root layout and account menu
- Mount useKeyboardShortcuts and KeyboardShortcutsDialog in Root.tsx
via a KeyboardShortcutsProvider wrapper (only renders post-auth)
- Add 'Keyboard Shortcuts' menu item with Keyboard icon to the
account settings popover for discoverability
* chore: add data-testid to model selector button
Adds data-testid="model-selector-button" to the model selector
trigger for reliable DOM targeting by keyboard shortcuts and tests.
* i18n: add keyboard shortcuts localization keys
Adds 12 new com_shortcut_* translation keys for the keyboard
shortcuts feature: group labels, action labels, and dialog title.
* style: fix keyboard shortcuts dialog dark mode
Replace token-based dark mode styling with explicit white-alpha
values for kbd badges, borders, and separators:
- Kbd: dark:bg-white/[0.06] dark:border-white/[0.08] dark:shadow-none
- Separators: dark:border-white/[0.06]
- Dialog border: dark:border-white/[0.06] dark:shadow-2xl
Ensures the key badges blend naturally into the dark surface
instead of appearing as harsh bright rectangles.
* feat(shortcuts): add definitions for 8 new keyboard shortcuts
Add shortcut definitions and localization keys for:
- Upload file (Cmd/Ctrl+Shift+U)
- Toggle right sidebar (Cmd/Ctrl+Shift+R)
- Regenerate response (Cmd/Ctrl+Shift+E)
- Edit last message (Cmd/Ctrl+Shift+I)
- Scroll to top (Cmd/Ctrl+Shift+↑)
- Archive conversation (Cmd/Ctrl+Shift+A)
- Delete conversation (Cmd/Ctrl+Shift+Backspace)
Addresses #3664
* feat(shortcuts): implement handlers for all new shortcuts
New handlers:
- Upload file: triggers attach-file button click
- Toggle right sidebar: clicks parameters-button
- Regenerate response: clicks regenerate-generation-button
- Edit last message: finds last user-turn and clicks edit button
- Scroll to top: scrolls main[role=main] to top
- Archive conversation: calls archive mutation + navigates to new chat
- Delete conversation: calls delete mutation + navigates to new chat
Improvements:
- Use getMainScrollContainer() helper targeting main[role=main]
instead of fragile class-based selectors
- Use data-testid selectors instead of aria-label substring matching
for stop-generation and model-selector buttons
- Use id-based selectors (button[id^=edit-]) for edit buttons
- Add isEditing guard to skip shortcuts when user is typing in
inputs, textareas, or contentEditable elements
- Refactor handler from if/return chain to switch statement for
cleaner flow control
* fix(shortcuts): increase dialog scroll height for expanded shortcut list
With 20 shortcuts across 3 groups, the previous 480px max was tight.
Increase to 560px / 70vh so all shortcuts are visible without
excessive scrolling.
* refactor(shortcuts): use data-testid selectors for reliable targeting
Add data-testid="nav-settings" to the Settings menu item in
AccountSettings so the open-settings shortcut no longer relies on
fragile text-content matching ('Settings' but not 'Keyboard').
* refactor(shortcuts): two-column layout for shortcuts dialog
Split the shortcuts dialog into a two-column grid layout:
- Left column: General + Navigation groups
- Right column: Chat group (which has the most shortcuts)
Reduces vertical height so the full list is visible without scrolling.
Widen dialog to max-w-4xl (w-11/12) to accommodate both columns.
Simplify Kbd/group styling for cleaner visual density.
* refactor(shortcuts): adjust padding in KeyboardShortcutsDialog content
* feat(shortcuts): customizable keyboard shortcuts with recorder UI
Add per-shortcut overrides stored in localStorage, a recorder component
for capturing new key combos with conflict detection, and a per-row
edit/reset affordance in the shortcuts dialog.
* test(shortcuts): fix specs broken by keyboard shortcut hooks
- ExpandedPanel: add customShortcuts atom to the store mock so
useShortcutDisplay/useShortcutAriaKey can read state
- AttachFileMenu: update queries to the new 'Attach Files' aria-label
- Button (Generations): wrap renders in RecoilRoot now that the
component reads shortcut state
* feat(shortcuts): add panel/submit/bookmark/continue/read-aloud shortcuts
- Wire stop, regenerate, continue, and read-aloud handlers to existing
buttons via data-testid, fixing handlers that previously queried
selectors with no matching DOM nodes.
- Add data-testid='nav-panel-${id}' to expanded sidebar nav buttons so
the panel-opener shortcuts can target them.
- Add new shortcut definitions and handlers: submitMessage,
bookmarkConversation, continueResponse, readAloudLastResponse, and
the open* panel openers (assistants, agents, prompts, memories,
parameters, files, bookmarks, MCP).
- Drop the toggleRightSidebar shortcut — there is no right sidebar to
toggle in this codebase.
- Refresh the KeyboardShortcutsDialog layout and ShortcutRecorder for
the new groups, tighten ShortcutKeyCombo styling, and surface the
shortcuts hint chips in the account menu.
* chore(shortcuts): remove unused translation keys
Drop com_shortcut_dialog_subtitle, com_shortcut_not_set, and
com_shortcut_reset_aria — no remaining references in the codebase.
* fix(shortcuts): resolve keyboard shortcut and footer regressions
- Guard the temporary-chat toggle so the shortcut mirrors the UI, only
toggling when the conversation has no messages and is not submitting.
- Stop Ctrl/Cmd+Enter from double-submitting: the main chat textarea
already submits via its own handler, and submit is blocked from
unrelated inputs while still working in the chat box.
- Ignore repeated keydown events (e.repeat) so held keys no longer
re-run toggles or destructive actions.
- Scope archive/delete shortcuts to the conversation in the active
route using useMatch, preventing mutations of a stale background
conversation on non-chat routes.
- Keep the recorder conflict controls clickable by including the whole
editing row in the outside-click containment check.
- Restore privacy policy and terms of service links on public share
pages via an opt-in Footer prop.
- Expand the sidebar before activating panel shortcuts so they are
visible on mobile, and avoid toggling an already-active panel.
* fix(shortcuts): reject bare non-printable shortcut bindings
A recorded non-printable key (Tab, Enter, Backspace, Delete, arrows,
Space) with no Cmd/Ctrl/Alt was treated as valid, so it could be saved
and then hijack navigation or fire destructive actions since the global
handler preventDefaults it outside text inputs. Require Shift at minimum
for these keys, which keeps Shift+Escape (focusChat) valid while
rejecting bare single-key bindings.
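The rule above might look something like this in isolation. The `Binding` shape, the function name, and the exact key set (including `Escape` and the `KeyboardEvent.key` spellings) are assumptions for illustration; validation of printable keys is out of scope here.

```typescript
// Hypothetical recorded-binding shape, modeled on KeyboardEvent fields.
interface Binding {
  key: string;
  meta?: boolean;
  ctrl?: boolean;
  alt?: boolean;
  shift?: boolean;
}

// Assumed key names, following KeyboardEvent.key (Space is reported as ' ').
const NON_PRINTABLE_KEYS = new Set<string>([
  'Tab', 'Enter', 'Backspace', 'Delete', 'Escape', ' ',
  'ArrowUp', 'ArrowDown', 'ArrowLeft', 'ArrowRight',
]);

// Returns false for a bare non-printable key; Shift alone is enough to pass.
function hasRequiredModifier(binding: Binding): boolean {
  if (!NON_PRINTABLE_KEYS.has(binding.key)) {
    return true; // printable keys are validated elsewhere in this sketch
  }
  return Boolean(binding.meta || binding.ctrl || binding.alt || binding.shift);
}
```

For example, `hasRequiredModifier({ key: 'Backspace' })` is rejected, while `{ key: 'Escape', shift: true }` is accepted.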
* style: fix import order drift across keyboard shortcut files
* fix(shortcuts): guard actions behind dialog and resolve reset conflicts
- Ignore global shortcut actions while the shortcuts dialog is open
(except the toggle that closes it), so a combo like delete/archive
can no longer fire on the conversation behind the modal.
- When resetting a shortcut to its default, unbind any other action
whose custom binding collides with that restored default, so Reset
after a Replace can't leave two rows sharing one binding with one
action unreachable.
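The reset-conflict rule can be sketched as a pure function over the override map. The storage shape (action id mapped to a combo string, with `null` meaning explicitly unbound), the function name, and the combo strings are assumptions for this example, not the actual localStorage format.

```typescript
// Hypothetical override map: action id -> custom combo, or null when unbound.
type Overrides = Record<string, string | null>;

// Reset `action` to its default and unbind any other action whose custom
// binding equals the restored default. Returns a new map; input is untouched.
function resetToDefault(
  action: string,
  defaults: Record<string, string>,
  overrides: Overrides,
): Overrides {
  const restored = defaults[action];
  const next: Overrides = {};
  for (const id of Object.keys(overrides)) {
    if (id === action) {
      continue; // dropping the override restores the default
    }
    const combo = overrides[id] ?? null;
    // A colliding custom binding is cleared rather than left as a duplicate.
    next[id] = combo !== null && combo === restored ? null : combo;
  }
  return next;
}
```

With defaults `{ newChat: 'mod+shift+o', focusSearch: 'mod+/' }` and overrides `{ focusSearch: 'mod+k', newChat: 'mod+/' }`, resetting `focusSearch` removes its override and sets `newChat` to `null`, so no two rows end up sharing `mod+/`.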
* fix(shortcuts): keep attach menu button accessible name stable
The shortcut pass changed the attach menu button's aria-label from the hardcoded "Attach File Options" to localize('com_sidepanel_attach_files') ("Attach Files"), which changed its accessible name and broke the provider-file e2e specs that locate it by name. Restore the original label and keep only the added aria-keyshortcuts.
* fix(shortcuts): gate temporary chat toggle to chat routes
The Root-level listener runs on non-chat routes (search, settings, panels) where the last loaded conversation may be empty, so Ctrl/Cmd+Shift+T could flip the hidden isTemporary state without the TemporaryChat control being visible. Require an active chat route (routeConvoId) before toggling.
* test(shortcuts): align attach menu spec with button accessible name
The attach menu button's aria-label was restored to "Attach File Options" (matching dev and the provider-file e2e specs), so update the unit test's button queries from /attach files/i to /attach file options/i. All 26 cases pass.
* fix(shortcuts): target conversation bookmark and reveal search panel
- Bookmark: query the unique #bookmark-menu-button so the shortcut
bookmarks the current conversation. The previous
querySelector('[data-testid="bookmark-menu"]') matched the sidebar
tag-filter button first (same testid, earlier in the DOM), toggling
the filter instead of bookmarking.
- Focus search: activate the conversations panel before focusing, since
the search input only mounts there and the sidebar renders just the
active panel. Route through the nav-panel-conversations button (the
listener is outside ActivePanelProvider) and settle before focusing,
so Ctrl/Cmd+/ works from any panel.
* fix(shortcuts): preserve footer links, cross-platform bindings, modal guard
- restore unconditional legal footer links (drop showLegalLinks gate)
- keep untouched platform's default when customizing a binding
- round-trip bindings whose key is the plus character
- suppress global shortcuts while any modal dialog is open
- tag read-aloud test id only on assistant turns
* fix(shortcuts): include non-Radix dialogs in the modal guard
The guard only matched Radix dialogs via data-state="open", missing
Headless UI dialogs (e.g. the redesigned Settings modal) that render
role="dialog" without data-state. Iterate all dialog/alertdialog nodes
and treat one as open unless it is inert or data-state="closed", which
also avoids false positives from always-mounted inert panels.
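A simplified sketch of that guard, written against a minimal structural interface so it runs without a DOM. The `DialogLike` interface and the function name are hypothetical; in a browser the input might come from something like `Array.from(document.querySelectorAll('[role="dialog"], [role="alertdialog"]'))`, and inertness inherited from an ancestor is not handled here.

```typescript
// Hypothetical minimal view of a dialog element; HTMLElement satisfies it.
interface DialogLike {
  inert?: boolean;
  getAttribute(name: string): string | null;
}

// A dialog counts as open unless it is inert or explicitly marked closed.
// Dialogs without data-state (non-Radix) are therefore treated as open.
function isAnyModalOpen(dialogs: readonly DialogLike[]): boolean {
  for (const dialog of dialogs) {
    if (dialog.inert === true) {
      continue; // always-mounted inert panels are not open
    }
    if (dialog.getAttribute('data-state') === 'closed') {
      continue; // Radix dialog in its closed state
    }
    return true;
  }
  return false;
}
```

Defaulting to "open" when `data-state` is absent is what lets the guard cover dialogs that never set that attribute.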
* fix(shortcuts): gate temporary chat toggle behind TEMPORARY_CHAT permission
* fix(shortcuts): only prevent native key event when shortcut action runs
* fix(shortcuts): rebind temporary chat, open settings without toggling menu, release no-op keys
* fix(shortcuts): confirm conversation delete, use clipboard fallback, add tests
* fix(shortcuts): navigate to new chat after keyboard-confirmed delete
* fix(shortcuts): copy last response via message button, guard unavailable controls
* fix(shortcuts): keep custom Enter-based submit bindings working in the composer
* fix(shortcuts): restrict shift-only bindings to safe keys
* fix(shortcuts): submit custom Enter chords in the composer without inserting a newline
* fix(shortcuts): block global shortcuts while a menu overlay is focused
* fix(shortcuts): rebind archive off the browser-reserved Ctrl+Shift+A
* fix(shortcuts): honor submitMessage overrides in the composer