feat: Immediate Conversation Title Generation (#13395)

* feat: Immediate Conversation Title Generation

Generate conversation titles as soon as the request is made (in parallel
with the response, from the user's first message) as the new default,
fixing the #13318 race where a transient /gen_title 404 left new chats
stuck on "New Chat".

- Add per-endpoint `titleTiming` ('immediate' | 'final') to baseEndpointSchema;
  `endpoints.all` acts as the global default, unset = immediate. Resolve via
  a new `resolveTitleTiming` helper (`all` takes precedence).
- Fire title generation in parallel with `sendMessage`; `titleConvo` waits
  (bounded, abortable) for the agent run and titles from the user input only.
  Persist after the conversation row exists; defer `disposeClient` until the
  title settles.
- Expose `titleGenerationTiming` via startup config; `useTitleGeneration`
  fetches eagerly in immediate mode with a bounded 404 retry and never treats
  a transient 404 as final. Skip title queueing for temporary conversations.
- Supersedes #13329 while incorporating its bounded 404-retry.
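
The parallel flow described above can be sketched roughly as follows. This is a minimal illustration, not the controller code: `runTurn` and its `sendMessage`, `generateTitle`, `saveTitle`, and `dispose` callbacks are hypothetical stand-ins, and only the `convoReady` gate mirrors a name used in this change.

```javascript
// Hypothetical sketch: `sendMessage`, `generateTitle`, `saveTitle`, and
// `dispose` are caller-supplied stand-ins, not LibreChat APIs.
async function runTurn({ text, sendMessage, generateTitle, saveTitle, dispose }) {
  let resolveConvoReady;
  const convoReady = new Promise((resolve) => {
    resolveConvoReady = resolve;
  });
  let turnFailed = false;

  // Start the response and the title side by side; await neither yet.
  const sendPromise = sendMessage(text);
  const titlePromise = (async () => {
    const title = await generateTitle(text); // user input only
    await convoReady; // the conversation row must exist before saving
    if (turnFailed) {
      return undefined; // no title is persisted for a failed turn
    }
    await saveTitle(title);
    return title;
  })().catch(() => undefined); // a title error must not fail the turn

  // Tear down only after both the response and the title have settled.
  const disposed = Promise.allSettled([sendPromise, titlePromise]).then(() => dispose());

  try {
    const response = await sendPromise;
    return { response, disposed };
  } catch (err) {
    turnFailed = true;
    throw err;
  } finally {
    resolveConvoReady(); // unblock the title's wait on every path
  }
}
```

The gate is resolved on both the success and the failure path so the title task can never hang on it; the `turnFailed` flag plays the role that the abort signal plays in the real controller.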

* 🩹 fix: Address Copilot review findings on title timing

- Guard against an undefined conversationId in addTitle (skip + warn) so the
  gen_title cache key can't collide as `userId-undefined` and saveConvo is
  never called without a conversationId.
- Gate the title `useQueries` on `enabled` so no /gen_title request fires while
  unauthenticated (e.g. after logout) even if the module queue holds IDs.
- Drop the stale `conversationId` param from the titleConvo JSDoc.
- Add a regression test for the undefined-conversationId guard.

* 🧵 fix: Harden immediate-title edge cases from codex review

- Cancel in-flight immediate title generation when the request aborts: thread
  job.abortController.signal through addTitle so pressing Stop on a new chat
  neither consumes the title model nor surfaces a title for a cancelled turn.
- Preserve a locally-applied title when the final SSE event's conversation
  carries no title yet (built before the title was saved), so long immediate-mode
  responses no longer revert the chat to "New Chat" until reload.
- Guarantee one full post-completion gen_title fetch cycle before giving up, so a
  `final`-mode title (generated only after the stream ends) is still fetched under
  a global `immediate` default instead of being stranded.
- Add regression tests for the abort propagation and the undefined-conversationId guard.

* 🔁 fix: Correct title abort, post-completion refetch, and replacement ordering

Follow-up to codex review of the immediate-title fixes:

- Use a dedicated title AbortController instead of `job.abortController`. The
  latter is also aborted by `completeJob` on *successful* completion, which
  cancelled any title slower than a short response. The title is now cancelled
  only on a real user Stop or when the stream is replaced; a completed-then-
  aborted title is discarded (no save, cache cleared) rather than persisted.
- Reset (not remove) the post-completion title query: `resetQueries` refetches
  the mounted observer with a fresh retry budget, whereas `removeQueries` left it
  stuck in its error state, so the promised post-completion cycle never ran.
- Run the job-replacement check before resolving `convoReady`, and on a replaced
  stream cancel/discard the stale title so a discarded prompt can't persist a title.
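
A minimal sketch of the dedicated-controller idea, assuming a hypothetical `linkTitleAbort` helper (the controller inlines this logic rather than using a helper). `jobSignal` stands in for `job.abortController.signal`.

```javascript
// Hypothetical helper; `jobSignal` stands in for `job.abortController.signal`.
function linkTitleAbort(jobSignal) {
  const controller = new AbortController();
  const onJobAbort = () => controller.abort();

  if (jobSignal.aborted) {
    controller.abort(); // the job was already stopped before we linked
  } else {
    jobSignal.addEventListener('abort', onJobAbort, { once: true });
  }

  return {
    signal: controller.signal,
    // Cancel the title explicitly (user Stop, replaced stream, send failure).
    abort: () => controller.abort(),
    // Unlink before the job completes successfully, so the completion-time
    // abort of the job signal no longer reaches the title task.
    detach: () => jobSignal.removeEventListener('abort', onJobAbort),
  };
}
```

Calling `detach()` on the success path is what keeps a slow title alive after a short response, while a real Stop still propagates through the listener.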

* 🧷 fix: Tighten title abort ordering and endpoint-level timing resolution

Follow-up to codex review:

- Abort the title controller before resolving `convoReady` on a stopped turn, so
  the title task can't resume and persist before the later abort.
- Cancel the title and unblock its waits on ANY send failure (not just user
  aborts): a preflight/quota failure before the run exists otherwise hangs
  `_waitForRun`, deferring client disposal until the 45s title timeout.
- Resolve `titleTiming` for custom endpoints via `getCustomEndpointConfig`
  (their config lives under `endpoints.custom[]`, not `endpoints[endpoint]`).
- Derive the startup `titleGenerationTiming` via `resolveTitleTiming` for the
  agents endpoint so an endpoint-level `final` (without `endpoints.all`) is honored
  client-side instead of defaulting to immediate and burning eager gen_title polls.

* 🪢 fix: Per-agent title timing and safer abort/replacement handling

Follow-up to codex review:

- Resolve `titleTiming` from the agent's actual endpoint after initialization, so a
  per-endpoint `final` override on a custom/provider endpoint backing an (ephemeral)
  agent is honored instead of always using the `agents` endpoint's value.
- Don't preserve a locally-fetched title on a stopped (unfinished) turn: the server
  cancels and discards that title, so keeping it client-side would diverge from
  server state and leave the stopped chat titled until reload.
- On abort/replacement, only delete the cached title if it still holds THIS task's
  value — a replacement stream shares the `userId-conversationId` key and may have
  already cached its own valid title that must not be removed.
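
The "only delete if it still holds THIS task's value" rule is a compare-then-delete. A minimal sketch, assuming a synchronous `Map`-like cache; `discardOwnTitle` is a hypothetical name, and the real cache may be asynchronous, in which case the check and the delete are not atomic.

```javascript
// Hypothetical helper; assumes a synchronous Map-like cache for illustration.
function discardOwnTitle(cache, key, ownTitle) {
  if (cache.get(key) !== ownTitle) {
    // A replacement stream sharing the key already cached its own title
    // (or nothing is cached); leave the entry alone.
    return false;
  }
  cache.delete(key);
  return true;
}

// Usage: the key format follows the `userId-conversationId` scheme above.
const titleCache = new Map([['user1-convo1', 'Newer title']]);
discardOwnTitle(titleCache, 'user1-convo1', 'Stale title'); // keeps 'Newer title'
```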

* 🪞 fix: Mirror AgentClient title-config resolution for titleTiming

Per maintainer guidance, keep titleTiming resolution identical to how
`AgentClient#titleConvo` already resolves the endpoint config — `endpoints.all`
is the intended global override and the agent's actual provider endpoint is used:

- Resolve via `endpoints.all ?? endpoints[endpoint] ?? getProviderConfig(endpoint)
  .customEndpointConfig` (was using `getCustomEndpointConfig` directly). Going
  through `getProviderConfig` picks up its case-insensitive fallback for normalized
  provider names (e.g. `openrouter` → `OpenRouter`), so a custom endpoint's
  `titleTiming` is honored like its other title settings.
- Add `titleTiming` to the Azure endpoint schema `.pick()` so
  `endpoints.azureOpenAI.titleTiming` is no longer silently stripped by Zod.

Note: per-endpoint title settings being skipped when `endpoints.all` is present is
the existing, intended global-override behavior — not changed here.
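
For illustration, the precedence chain above might look like the sketch below. This is not the actual `resolveTitleTiming` implementation: `sketchResolveTitleTiming` is a hypothetical name, the `endpoints.custom[]` entries are assumed to carry a `name` field, and the case-insensitive match only approximates what `getProviderConfig` is described as doing.

```javascript
// Hypothetical sketch of the lookup order; not the real helper.
function sketchResolveTitleTiming(endpoints = {}, endpoint) {
  // Assumed shape: custom endpoints live in an array and are matched by name,
  // ignoring case (e.g. `openrouter` matches `OpenRouter`).
  const custom = (endpoints.custom ?? []).find(
    (entry) => entry.name?.toLowerCase() === String(endpoint).toLowerCase(),
  );
  // `endpoints.all`, when present, wins as the global override, even if it
  // does not set `titleTiming` itself.
  const config = endpoints.all ?? endpoints[endpoint] ?? custom;
  return config?.titleTiming ?? 'immediate';
}

// Usage: an endpoint-level `final` is honored only when `all` is absent.
sketchResolveTitleTiming({ openAI: { titleTiming: 'final' } }, 'openAI'); // 'final'
sketchResolveTitleTiming({ all: {}, openAI: { titleTiming: 'final' } }, 'openAI'); // 'immediate'
```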

* 🧪 test: Cover useTitleGeneration effect logic (integration)

Adds a deterministic white-box integration test that drives the real hook's
React effects with a controllable react-query surface, locking down the
stateful decisions that previously had no coverage:

- immediate mode fetches a queued conversation while its stream is still active
- final mode gates until the stream completes, then becomes eligible
- success applies the fetched title to the conversation caches
- a 404 while active defers (removeQueries) instead of giving up
- a 404 after completion forces a fresh fetch via resetQueries (post-completion remount)
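
The decisions these tests lock down can be summarized as two small pure functions. This is a sketch only: `shouldFetchTitle`, `nextTitleAction`, and the returned action names are hypothetical, and treating any non-404 error as terminal is an assumption rather than something this change states.

```javascript
// Hypothetical summary of the hook's gating decision.
function shouldFetchTitle({ timing, streamActive }) {
  // Immediate mode may fetch while the stream is still running;
  // final mode waits for the stream to complete.
  return timing === 'immediate' || !streamActive;
}

// Hypothetical summary of how a fetch result is handled.
function nextTitleAction({ status, streamActive }) {
  if (status === 200) {
    return 'apply'; // write the title into the conversation caches
  }
  if (status === 404) {
    // While active, drop the query and try again later (removeQueries);
    // after completion, force a fresh fetch cycle (resetQueries).
    return streamActive ? 'defer' : 'refetch';
  }
  return 'give-up'; // assumption: other errors are treated as terminal
}
```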

* feat: Stream immediate title events

* style: Format title SSE handler

* test: Preserve data-provider exports in OAuth mock

* test: Isolate OAuth route API mock

* test: Keep OAuth callback factory capture

* fix: Replay streamed title events on resume

* fix: Honor agents title timing precedence

* style: Format title timing fixes
Danny Avila 2026-06-02 16:40:57 -04:00 committed by GitHub
parent b45e4aeae5
commit 2ef7bdfbc2
GPG key ID: B5690EEEBB952194
22 changed files with 1437 additions and 52 deletions

View file

@ -82,6 +82,14 @@ class AgentClient extends BaseClient {
/** @type {AgentRun} */
this.run;
/** Resolves with the agent run once `chatCompletion` initializes it (or
* `null` if initialization fails), letting immediate-mode title generation
* await the run instead of throwing when fired before the run exists.
* @type {Promise<AgentRun | null> | null} */
this._runReady = null;
/** @type {((run: AgentRun | null) => void) | null} */
this._resolveRun = null;
const {
agentConfigs,
contentParts,
@ -1039,6 +1047,10 @@ class AgentClient extends BaseClient {
}
this.run = run;
if (this._resolveRun) {
this._resolveRun(run);
this._resolveRun = null;
}
const streamId = this.options.req?._resumableStreamId;
if (streamId && run.Graph) {
@ -1170,6 +1182,10 @@ class AgentClient extends BaseClient {
err,
);
}
if (this._resolveRun) {
this._resolveRun(this.run ?? null);
this._resolveRun = null;
}
run = null;
config = null;
memoryPromise = null;
@ -1177,14 +1193,58 @@ class AgentClient extends BaseClient {
}
/**
*
* Resolves with the agent run once it is initialized, or `null` if
* initialization fails. Lets immediate-mode title generation await the run
* instead of throwing when fired before `chatCompletion` assigns `this.run`.
* Rejects promptly if the provided signal aborts before the run is ready.
* @param {AbortSignal} [signal]
* @returns {Promise<AgentRun | null>}
*/
_waitForRun(signal) {
if (this.run) {
return Promise.resolve(this.run);
}
if (!this._runReady) {
this._runReady = new Promise((resolve) => {
this._resolveRun = resolve;
});
}
if (!signal) {
return this._runReady;
}
if (signal.aborted) {
return Promise.reject(new Error('Aborted before run initialization'));
}
return new Promise((resolve, reject) => {
const onAbort = () => reject(new Error('Aborted before run initialization'));
signal.addEventListener('abort', onAbort, { once: true });
this._runReady.then((run) => {
signal.removeEventListener('abort', onAbort);
resolve(run);
});
});
}
/**
* @param {Object} params
* @param {string} params.text
* @param {string} params.conversationId
* @param {AbortController} params.abortController
* @param {boolean} [params.immediate] When true, the title is generated as soon
* as the request is made: the run is awaited (instead of throwing) and the
* title derives from the user's input only (`contentParts` is empty).
*/
- async titleConvo({ text, abortController }) {
+ async titleConvo({ text, abortController, immediate = false }) {
if (!this.run) {
- throw new Error('Run not initialized');
+ if (!immediate) {
+ throw new Error('Run not initialized');
+ }
+ await this._waitForRun(abortController?.signal);
+ if (!this.run) {
+ logger.debug(
+ '[api/server/controllers/agents/client.js #titleConvo] Run unavailable for immediate title generation',
+ );
+ return;
+ }
}
const { handleLLMEnd, collected: collectedMetadata } = createMetadataAggregator();
const { req, agent } = this.options;
@ -1324,7 +1384,7 @@ class AgentClient extends BaseClient {
provider,
clientOptions,
inputText: text,
- contentParts: this.contentParts,
+ contentParts: immediate ? [] : this.contentParts,
titleMethod: endpointConfig?.titleMethod,
titlePrompt: endpointConfig?.titlePrompt,
titlePromptTemplate: endpointConfig?.titlePromptTemplate,

View file

@ -128,6 +128,52 @@ describe('AgentClient - titleConvo', () => {
).rejects.toThrow('Run not initialized');
});
it('waits for the run in immediate mode instead of throwing', async () => {
client.run = null;
const abortController = new AbortController();
const titlePromise = client.titleConvo({ text: 'Test', abortController, immediate: true });
// Simulate `chatCompletion` assigning the run (client.js: `this.run = run`).
client.run = mockRun;
client._resolveRun(mockRun);
await titlePromise;
expect(mockRun.generateTitle).toHaveBeenCalled();
});
it('passes empty contentParts in immediate mode (title from the user input only)', async () => {
client.contentParts = [{ type: 'text', text: 'Streaming response so far' }];
const abortController = new AbortController();
await client.titleConvo({ text: 'Hello there', abortController, immediate: true });
const call = mockRun.generateTitle.mock.calls[0][0];
expect(call.contentParts).toEqual([]);
expect(call.inputText).toBe('Hello there');
});
it('uses live contentParts in non-immediate (final) mode', async () => {
client.contentParts = [{ type: 'text', text: 'Full response' }];
const abortController = new AbortController();
await client.titleConvo({ text: 'Hello there', abortController });
const call = mockRun.generateTitle.mock.calls[0][0];
expect(call.contentParts).toEqual([{ type: 'text', text: 'Full response' }]);
});
it('rejects promptly when aborted before the run initializes in immediate mode', async () => {
client.run = null;
const abortController = new AbortController();
abortController.abort();
await expect(
client.titleConvo({ text: 'Test', abortController, immediate: true }),
).rejects.toThrow('Aborted before run initialization');
expect(mockRun.generateTitle).not.toHaveBeenCalled();
});
it('should use titlePrompt from endpoint config', async () => {
const text = 'Test conversation text';
const abortController = new AbortController();

View file

@ -4,6 +4,7 @@ const {
sendEvent,
getViolationInfo,
buildMessageFiles,
resolveTitleTiming,
GenerationJobManager,
decrementPendingRequest,
sanitizeMessageForTransmit,
@ -93,6 +94,12 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
const userId = req.user.id;
/** When to generate the conversation title. `immediate` (default) fires title
* generation in parallel with the response, from the user's first message;
* `final` defers it until the full response completes (legacy behavior).
* Resolved from the agent's actual endpoint once the client is initialized. */
let titleTiming = 'immediate';
const { allowed, pendingRequests, limit } = await checkAndIncrementPendingRequest(userId);
if (!allowed) {
const violationInfo = getViolationInfo(pendingRequests, limit);
@ -213,6 +220,13 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
client = result.client;
// Resolve title timing from the public agents endpoint first, then fall
// back to the agent's actual backing provider/custom endpoint.
titleTiming = resolveTitleTiming({
appConfig: req.config,
endpoint: [endpointOption?.endpoint, client?.options?.agent?.endpoint],
});
if (client?.sender) {
GenerationJobManager.updateMetadata(streamId, { sender: client.sender });
}
@ -243,6 +257,56 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
);
}
/** Immediate-mode title generation runs in parallel with the response, so
* the conversation row may not exist when the title resolves. `convoReady`
* resolves once the response (and thus the conversation) has been saved,
* gating the title's `saveConvo`. Declared here so both the success tail
* and the catch block can settle it and gate `disposeClient` on the title. */
let immediateTitlePromise = null;
let titleEventPromise = null;
let acceptsTitleEvents = true;
let resolveConvoReady;
const convoReady = new Promise((resolve) => {
resolveConvoReady = resolve;
});
/** Dedicated controller so a user Stop (or a replaced stream) cancels the
* in-flight title. It is kept separate from `job.abortController`, which
* `completeJob` also aborts on *successful* completion and would otherwise
* cancel a title that is merely slower than a short response. */
const titleAbortController = new AbortController();
const abortTitleOnJobAbort = () => titleAbortController.abort();
if (job.abortController.signal.aborted) {
titleAbortController.abort();
} else {
job.abortController.signal.addEventListener('abort', abortTitleOnJobAbort, { once: true });
}
const titleEligible =
addTitle && parentMessageId === Constants.NO_PARENT && isNewConvo && !req.body?.isTemporary;
const emitTitleEvent = ({ conversationId: titleConversationId, title }) => {
titleEventPromise = (async () => {
if (!acceptsTitleEvents || titleAbortController.signal.aborted) {
return;
}
const currentJob = await GenerationJobManager.getJob(streamId);
if (!currentJob || currentJob.createdAt !== jobCreatedAt) {
return;
}
if (titleAbortController.signal.aborted) {
return;
}
await GenerationJobManager.emitChunk(streamId, {
event: 'title',
data: {
conversationId: titleConversationId,
title,
},
});
})().catch((err) => {
logger.error('[ResumableAgentController] Error emitting title event', err);
});
return titleEventPromise;
};
try {
const onStart = (userMsg, respMsgId, _isNewConvo) => {
userMessage = userMsg;
@ -289,7 +353,23 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
},
};
- const response = await client.sendMessage(text, messageOptions);
+ const sendPromise = client.sendMessage(text, messageOptions);
if (titleEligible && titleTiming === 'immediate') {
immediateTitlePromise = addTitle(req, {
text,
conversationId,
client,
immediate: true,
convoReady,
signal: titleAbortController.signal,
onTitleGenerated: emitTitleEvent,
}).catch((err) => {
logger.error('[ResumableAgentController] Error in immediate title generation', err);
});
}
const response = await sendPromise;
const messageId = response.messageId;
const endpoint = endpointOption.endpoint;
@ -355,11 +435,45 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
originalCreatedAt: jobCreatedAt,
currentCreatedAt: currentJob?.createdAt,
});
// Discard the stale title from this replaced stream: cancel it and
// unblock its persistence wait without letting it save (the newer job
// owns the conversation now).
titleAbortController.abort();
job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
acceptsTitleEvents = false;
resolveConvoReady();
// Still decrement pending request since we incremented at start
await decrementPendingRequest(userId);
if (immediateTitlePromise) {
immediateTitlePromise.finally(() => {
if (client) {
disposeClient(client);
}
});
} else if (client) {
disposeClient(client);
}
return;
}
// If the user stopped this turn, cancel the title BEFORE unblocking its
// persistence wait — otherwise resolving `convoReady` lets the title task
// resume and save before the later abort runs.
if (wasAbortedBeforeComplete) {
titleAbortController.abort();
} else {
job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
}
// The conversation row now exists and this stream is authoritative; allow
// any in-flight immediate title generation to persist (saveConvo uses noUpsert).
resolveConvoReady();
acceptsTitleEvents = false;
if (titleEventPromise) {
await titleEventPromise;
}
if (!wasAbortedBeforeComplete) {
const finalEvent = {
final: true,
@ -402,7 +516,20 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
await decrementPendingRequest(userId);
}
- if (shouldGenerateTitle) {
+ if (titleTiming === 'immediate') {
// Title was fired in parallel above (if eligible); a stopped turn already
// aborted it before `resolveConvoReady`. Defer disposal until it settles
// so the run/req aren't torn down mid-generation.
if (immediateTitlePromise) {
immediateTitlePromise.finally(() => {
if (client) {
disposeClient(client);
}
});
} else if (client) {
disposeClient(client);
}
} else if (shouldGenerateTitle) {
addTitle(req, {
text,
response: { ...response },
@ -422,6 +549,15 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
}
}
} catch (error) {
// Any failure (user Stop, or a preflight/quota failure before the run is
// even created) must cancel the title and unblock its waits: the title's
// `_waitForRun` would otherwise never resolve, deferring client disposal
// until the 45s title timeout, and no title should persist for a failed turn.
titleAbortController.abort();
job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
acceptsTitleEvents = false;
resolveConvoReady();
// Check if this was an abort (not a real error)
const wasAborted = job.abortController.signal.aborted || error.message?.includes('abort');
@ -436,7 +572,14 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
await decrementPendingRequest(userId);
if (client) {
// Defer disposal until any immediate title settles (it holds the run/req).
if (immediateTitlePromise) {
immediateTitlePromise.finally(() => {
if (client) {
disposeClient(client);
}
});
} else if (client) {
disposeClient(client);
}