mirror of
https://github.com/danny-avila/LibreChat.git
synced 2026-06-09 17:31:19 +00:00
⚡ feat: Immediate Conversation Title Generation (#13395)
Some checks are pending
Docker Dev Branch Images Build / build (Dockerfile, lc-dev, node) (push) Waiting to run
Docker Dev Branch Images Build / build (Dockerfile.multi, lc-dev-api, api-build) (push) Waiting to run
GitNexus Index / index (push) Waiting to run
GitNexus Index / post-index (push) Blocked by required conditions
Docker Dev Images Build / build (Dockerfile, librechat-dev, node) (push) Waiting to run
Docker Dev Images Build / build (Dockerfile.multi, librechat-dev-api, api-build) (push) Waiting to run
Sync Locize Translations & Create Translation PR / Sync Translation Keys with Locize (push) Waiting to run
Sync Locize Translations & Create Translation PR / Create Translation PR on Version Published (push) Blocked by required conditions
Sync Helm Chart Tags / Ignore non-main push (push) Waiting to run
Sync Helm Chart Tags / Sync chart tags (push) Waiting to run
* ⚡ feat: Immediate Conversation Title Generation

Generate conversation titles as soon as the request is made (in parallel with the response, from the user's first message) as the new default, fixing the #13318 race where a transient /gen_title 404 left new chats stuck on "New Chat".

- Add per-endpoint `titleTiming` ('immediate' | 'final') to baseEndpointSchema; `endpoints.all` acts as the global default, unset = immediate. Resolve via a new `resolveTitleTiming` helper (`all` takes precedence).
- Fire title generation in parallel with `sendMessage`; `titleConvo` waits (bounded, abortable) for the agent run and titles from the user input only. Persist after the conversation row exists; defer `disposeClient` until the title settles.
- Expose `titleGenerationTiming` via startup config; `useTitleGeneration` fetches eagerly in immediate mode with a bounded 404 retry and never treats a transient 404 as final. Skip title queueing for temporary conversations.
- Supersedes #13329 while incorporating its bounded 404-retry.

* 🩹 fix: Address Copilot review findings on title timing

- Guard against an undefined conversationId in addTitle (skip + warn) so the gen_title cache key can't collide as `userId-undefined` and saveConvo is never called without a conversationId.
- Gate the title `useQueries` on `enabled` so no /gen_title request fires while unauthenticated (e.g. after logout) even if the module queue holds IDs.
- Drop the stale `conversationId` param from the titleConvo JSDoc.
- Add a regression test for the undefined-conversationId guard.

* 🧵 fix: Harden immediate-title edge cases from codex review

- Cancel in-flight immediate title generation when the request aborts: thread job.abortController.signal through addTitle so pressing Stop on a new chat neither consumes the title model nor surfaces a title for a cancelled turn.
- Preserve a locally-applied title when the final SSE event's conversation carries no title yet (built before the title was saved), so long immediate-mode responses no longer revert the chat to "New Chat" until reload.
- Guarantee one full post-completion gen_title fetch cycle before giving up, so a `final`-mode title (generated only after the stream ends) is still fetched under a global `immediate` default instead of being stranded.
- Add regression tests for the abort propagation and the undefined-conversationId guard.

* 🔁 fix: Correct title abort, post-completion refetch, and replacement ordering

Follow-up to codex review of the immediate-title fixes:

- Use a dedicated title AbortController instead of `job.abortController`. The latter is also aborted by `completeJob` on *successful* completion, which cancelled any title slower than a short response. The title is now cancelled only on a real user Stop or when the stream is replaced; a completed-then-aborted title is discarded (no save, cache cleared) rather than persisted.
- Reset (not remove) the post-completion title query: `resetQueries` refetches the mounted observer with a fresh retry budget, whereas `removeQueries` left it stuck in its error state, so the promised post-completion cycle never ran.
- Run the job-replacement check before resolving `convoReady`, and on a replaced stream cancel/discard the stale title so a discarded prompt can't persist a title.

* 🧷 fix: Tighten title abort ordering and endpoint-level timing resolution

Follow-up to codex review:

- Abort the title controller before resolving `convoReady` on a stopped turn, so the title task can't resume and persist before the later abort.
- Cancel the title and unblock its waits on ANY send failure (not just user aborts): a preflight/quota failure before the run exists otherwise hangs `_waitForRun`, deferring client disposal until the 45s title timeout.
- Resolve `titleTiming` for custom endpoints via `getCustomEndpointConfig` (their config lives under `endpoints.custom[]`, not `endpoints[endpoint]`).
- Derive the startup `titleGenerationTiming` via `resolveTitleTiming` for the agents endpoint so an endpoint-level `final` (without `endpoints.all`) is honored client-side instead of defaulting to immediate and burning eager gen_title polls.

* 🪢 fix: Per-agent title timing and safer abort/replacement handling

Follow-up to codex review:

- Resolve `titleTiming` from the agent's actual endpoint after initialization, so a per-endpoint `final` override on a custom/provider endpoint backing an (ephemeral) agent is honored instead of always using the `agents` endpoint's value.
- Don't preserve a locally-fetched title on a stopped (unfinished) turn: the server cancels and discards that title, so keeping it client-side would diverge from server state and leave the stopped chat titled until reload.
- On abort/replacement, only delete the cached title if it still holds THIS task's value — a replacement stream shares the `userId-conversationId` key and may have already cached its own valid title that must not be removed.

* 🪞 fix: Mirror AgentClient title-config resolution for titleTiming

Per maintainer guidance, keep titleTiming resolution identical to how `AgentClient#titleConvo` already resolves the endpoint config — `endpoints.all` is the intended global override and the agent's actual provider endpoint is used:

- Resolve via `endpoints.all ?? endpoints[endpoint] ?? getProviderConfig(endpoint).customEndpointConfig` (was using `getCustomEndpointConfig` directly). Going through `getProviderConfig` picks up its case-insensitive fallback for normalized provider names (e.g. `openrouter` → `OpenRouter`), so a custom endpoint's `titleTiming` is honored like its other title settings.
- Add `titleTiming` to the Azure endpoint schema `.pick()` so `endpoints.azureOpenAI.titleTiming` is no longer silently stripped by Zod.

Note: per-endpoint title settings being skipped when `endpoints.all` is present is the existing, intended global-override behavior — not changed here.

* 🧪 test: Cover useTitleGeneration effect logic (integration)

Adds a deterministic white-box integration test that drives the real hook's React effects with a controllable react-query surface, locking down the stateful decisions that previously had no coverage:

- immediate mode fetches a queued conversation while its stream is still active
- final mode gates until the stream completes, then becomes eligible
- success applies the fetched title to the conversation caches
- a 404 while active defers (removeQueries) instead of giving up
- a 404 after completion forces a fresh fetch via resetQueries (post-completion remount)

* feat: Stream immediate title events
* style: Format title SSE handler
* test: Preserve data-provider exports in OAuth mock
* test: Isolate OAuth route API mock
* test: Keep OAuth callback factory capture
* fix: Replay streamed title events on resume
* fix: Honor agents title timing precedence
* style: Format title timing fixes
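The resolution order described in the commit message (`endpoints.all`, then the endpoint's own entry, then a custom endpoint entry, with `immediate` when nothing is set) can be sketched roughly as follows. This is an illustration only, not the actual `resolveTitleTiming` helper: the function name `sketchResolveTitleTiming`, the flat `endpoints` object, and the case-insensitive name match against `endpoints.custom[]` are assumptions made for the example.

```javascript
/**
 * Hypothetical stand-in for the title-timing resolution described above.
 * `endpoints` is assumed to look like the `endpoints` section of the app config.
 */
function sketchResolveTitleTiming(endpoints, endpoint) {
  // Assumption: custom endpoints live under `endpoints.custom[]` and are
  // matched by name, ignoring case (e.g. `openrouter` vs `OpenRouter`).
  const custom = (endpoints.custom ?? []).find(
    (entry) => entry.name?.toLowerCase() === String(endpoint).toLowerCase(),
  );

  // Same selection order as quoted in the commit message: a present
  // `endpoints.all` wins outright, even if it does not set `titleTiming`.
  const config = endpoints.all ?? endpoints[endpoint] ?? custom;

  // Unset (or anything other than 'final') falls back to 'immediate'.
  return config?.titleTiming === 'final' ? 'final' : 'immediate';
}
```

Under these assumptions, an `endpoints.all` block without `titleTiming` still yields `immediate` even when the specific endpoint asks for `final`, which matches the global-override note above.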
This commit is contained in:
parent
b45e4aeae5
commit
2ef7bdfbc2
22 changed files with 1437 additions and 52 deletions
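The client-side hook changes are not part of the hunks shown here, so the following is only a rough sketch of what a "bounded 404 retry" can look like. The function name, the `status` property on the error, and the attempt/delay defaults are all hypothetical and not taken from `useTitleGeneration`; the sketch also covers only the bounded part, not the hook's post-completion refetch.

```javascript
/**
 * Hypothetical bounded retry: a 404 is treated as "title not ready yet"
 * until the attempt budget runs out; any other error is rethrown at once.
 */
async function fetchTitleWithBoundedRetry(fetchTitle, { maxAttempts = 3, delayMs = 500 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fetchTitle();
    } catch (err) {
      const transient = err?.status === 404 && attempt < maxAttempts;
      if (!transient) {
        throw err;
      }
      // Wait before the next attempt so the server has time to cache the title.
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```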
@@ -82,6 +82,14 @@ class AgentClient extends BaseClient {
     /** @type {AgentRun} */
     this.run;

+    /** Resolves with the agent run once `chatCompletion` initializes it (or
+     * `null` if initialization fails), letting immediate-mode title generation
+     * await the run instead of throwing when fired before the run exists.
+     * @type {Promise<AgentRun | null> | null} */
+    this._runReady = null;
+    /** @type {((run: AgentRun | null) => void) | null} */
+    this._resolveRun = null;
+
     const {
       agentConfigs,
       contentParts,
@@ -1039,6 +1047,10 @@ class AgentClient extends BaseClient {
       }

       this.run = run;
+      if (this._resolveRun) {
+        this._resolveRun(run);
+        this._resolveRun = null;
+      }

       const streamId = this.options.req?._resumableStreamId;
       if (streamId && run.Graph) {
@@ -1170,6 +1182,10 @@ class AgentClient extends BaseClient {
           err,
         );
       }
+      if (this._resolveRun) {
+        this._resolveRun(this.run ?? null);
+        this._resolveRun = null;
+      }
       run = null;
       config = null;
       memoryPromise = null;
@@ -1177,14 +1193,58 @@ class AgentClient extends BaseClient {
   }

   /**
-   *
+   * Resolves with the agent run once it is initialized, or `null` if
+   * initialization fails. Lets immediate-mode title generation await the run
+   * instead of throwing when fired before `chatCompletion` assigns `this.run`.
+   * Rejects promptly if the provided signal aborts before the run is ready.
+   * @param {AbortSignal} [signal]
+   * @returns {Promise<AgentRun | null>}
+   */
+  _waitForRun(signal) {
+    if (this.run) {
+      return Promise.resolve(this.run);
+    }
+    if (!this._runReady) {
+      this._runReady = new Promise((resolve) => {
+        this._resolveRun = resolve;
+      });
+    }
+    if (!signal) {
+      return this._runReady;
+    }
+    if (signal.aborted) {
+      return Promise.reject(new Error('Aborted before run initialization'));
+    }
+    return new Promise((resolve, reject) => {
+      const onAbort = () => reject(new Error('Aborted before run initialization'));
+      signal.addEventListener('abort', onAbort, { once: true });
+      this._runReady.then((run) => {
+        signal.removeEventListener('abort', onAbort);
+        resolve(run);
+      });
+    });
+  }
+
+  /**
    * @param {Object} params
    * @param {string} params.text
-   * @param {string} params.conversationId
+   * @param {AbortController} params.abortController
+   * @param {boolean} [params.immediate] When true, the title is generated as soon
+   * as the request is made — the run is awaited (instead of throwing) and the
+   * title derives from the user's input only (`contentParts` is empty).
    */
-  async titleConvo({ text, abortController }) {
+  async titleConvo({ text, abortController, immediate = false }) {
     if (!this.run) {
-      throw new Error('Run not initialized');
+      if (!immediate) {
+        throw new Error('Run not initialized');
+      }
+      await this._waitForRun(abortController?.signal);
+      if (!this.run) {
+        logger.debug(
+          '[api/server/controllers/agents/client.js #titleConvo] Run unavailable for immediate title generation',
+        );
+        return;
+      }
     }
     const { handleLLMEnd, collected: collectedMetadata } = createMetadataAggregator();
     const { req, agent } = this.options;
@@ -1324,7 +1384,7 @@ class AgentClient extends BaseClient {
         provider,
         clientOptions,
         inputText: text,
-        contentParts: this.contentParts,
+        contentParts: immediate ? [] : this.contentParts,
         titleMethod: endpointConfig?.titleMethod,
         titlePrompt: endpointConfig?.titlePrompt,
         titlePromptTemplate: endpointConfig?.titlePromptTemplate,

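The abortable wait that `_waitForRun` adds can be tried in isolation. The sketch below restates the same pattern with hypothetical names (`RunGate`, `setRun`, `wait`) that do not exist in the codebase; it assumes a runtime with a global `AbortController` (Node 15+ or a browser).

```javascript
/** Hypothetical, simplified model of the `_runReady` / `_resolveRun` pair. */
class RunGate {
  constructor() {
    this.run = null;
    this.ready = new Promise((resolve) => {
      this.resolveRun = resolve;
    });
  }

  /** Called once the run exists (or with `null` if initialization failed). */
  setRun(run) {
    this.run = run;
    this.resolveRun(run);
  }

  /** Resolves with the run, or rejects as soon as `signal` aborts. */
  wait(signal) {
    if (this.run) {
      return Promise.resolve(this.run);
    }
    if (!signal) {
      return this.ready;
    }
    if (signal.aborted) {
      return Promise.reject(new Error('Aborted before run initialization'));
    }
    return new Promise((resolve, reject) => {
      const onAbort = () => reject(new Error('Aborted before run initialization'));
      signal.addEventListener('abort', onAbort, { once: true });
      this.ready.then((run) => {
        // Detach the listener so a later abort has nothing left to reject.
        signal.removeEventListener('abort', onAbort);
        resolve(run);
      });
    });
  }
}
```

A caller that fires before the run exists simply awaits `gate.wait(controller.signal)`; whichever happens first, `setRun` or `controller.abort()`, settles the promise.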
@@ -128,6 +128,52 @@ describe('AgentClient - titleConvo', () => {
       ).rejects.toThrow('Run not initialized');
     });

+    it('waits for the run in immediate mode instead of throwing', async () => {
+      client.run = null;
+      const abortController = new AbortController();
+
+      const titlePromise = client.titleConvo({ text: 'Test', abortController, immediate: true });
+
+      // Simulate `chatCompletion` assigning the run (client.js: `this.run = run`).
+      client.run = mockRun;
+      client._resolveRun(mockRun);
+
+      await titlePromise;
+      expect(mockRun.generateTitle).toHaveBeenCalled();
+    });
+
+    it('passes empty contentParts in immediate mode (title from the user input only)', async () => {
+      client.contentParts = [{ type: 'text', text: 'Streaming response so far' }];
+      const abortController = new AbortController();
+
+      await client.titleConvo({ text: 'Hello there', abortController, immediate: true });
+
+      const call = mockRun.generateTitle.mock.calls[0][0];
+      expect(call.contentParts).toEqual([]);
+      expect(call.inputText).toBe('Hello there');
+    });
+
+    it('uses live contentParts in non-immediate (final) mode', async () => {
+      client.contentParts = [{ type: 'text', text: 'Full response' }];
+      const abortController = new AbortController();
+
+      await client.titleConvo({ text: 'Hello there', abortController });
+
+      const call = mockRun.generateTitle.mock.calls[0][0];
+      expect(call.contentParts).toEqual([{ type: 'text', text: 'Full response' }]);
+    });
+
+    it('rejects promptly when aborted before the run initializes in immediate mode', async () => {
+      client.run = null;
+      const abortController = new AbortController();
+      abortController.abort();
+
+      await expect(
+        client.titleConvo({ text: 'Test', abortController, immediate: true }),
+      ).rejects.toThrow('Aborted before run initialization');
+      expect(mockRun.generateTitle).not.toHaveBeenCalled();
+    });
+
     it('should use titlePrompt from endpoint config', async () => {
       const text = 'Test conversation text';
       const abortController = new AbortController();

@@ -4,6 +4,7 @@ const {
   sendEvent,
   getViolationInfo,
   buildMessageFiles,
+  resolveTitleTiming,
   GenerationJobManager,
   decrementPendingRequest,
   sanitizeMessageForTransmit,
@@ -93,6 +94,12 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

   const userId = req.user.id;

+  /** When to generate the conversation title. `immediate` (default) fires title
+   * generation in parallel with the response, from the user's first message;
+   * `final` defers it until the full response completes (legacy behavior).
+   * Resolved from the agent's actual endpoint once the client is initialized. */
+  let titleTiming = 'immediate';
+
   const { allowed, pendingRequests, limit } = await checkAndIncrementPendingRequest(userId);
   if (!allowed) {
     const violationInfo = getViolationInfo(pendingRequests, limit);
@@ -213,6 +220,13 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

     client = result.client;

+    // Resolve title timing from the public agents endpoint first, then fall
+    // back to the agent's actual backing provider/custom endpoint.
+    titleTiming = resolveTitleTiming({
+      appConfig: req.config,
+      endpoint: [endpointOption?.endpoint, client?.options?.agent?.endpoint],
+    });
+
     if (client?.sender) {
       GenerationJobManager.updateMetadata(streamId, { sender: client.sender });
     }
@@ -243,6 +257,56 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
       );
     }

+    /** Immediate-mode title generation runs in parallel with the response, so
+     * the conversation row may not exist when the title resolves. `convoReady`
+     * resolves once the response (and thus the conversation) has been saved,
+     * gating the title's `saveConvo`. Declared here so both the success tail
+     * and the catch block can settle it and gate `disposeClient` on the title. */
+    let immediateTitlePromise = null;
+    let titleEventPromise = null;
+    let acceptsTitleEvents = true;
+    let resolveConvoReady;
+    const convoReady = new Promise((resolve) => {
+      resolveConvoReady = resolve;
+    });
+    /** Dedicated controller so a user Stop (or a replaced stream) cancels the
+     * in-flight title — kept separate from `job.abortController`, which
+     * `completeJob` also aborts on *successful* completion and would otherwise
+     * cancel a title that is merely slower than a short response. */
+    const titleAbortController = new AbortController();
+    const abortTitleOnJobAbort = () => titleAbortController.abort();
+    if (job.abortController.signal.aborted) {
+      titleAbortController.abort();
+    } else {
+      job.abortController.signal.addEventListener('abort', abortTitleOnJobAbort, { once: true });
+    }
+    const titleEligible =
+      addTitle && parentMessageId === Constants.NO_PARENT && isNewConvo && !req.body?.isTemporary;
+    const emitTitleEvent = ({ conversationId: titleConversationId, title }) => {
+      titleEventPromise = (async () => {
+        if (!acceptsTitleEvents || titleAbortController.signal.aborted) {
+          return;
+        }
+        const currentJob = await GenerationJobManager.getJob(streamId);
+        if (!currentJob || currentJob.createdAt !== jobCreatedAt) {
+          return;
+        }
+        if (titleAbortController.signal.aborted) {
+          return;
+        }
+        await GenerationJobManager.emitChunk(streamId, {
+          event: 'title',
+          data: {
+            conversationId: titleConversationId,
+            title,
+          },
+        });
+      })().catch((err) => {
+        logger.error('[ResumableAgentController] Error emitting title event', err);
+      });
+      return titleEventPromise;
+    };
+
     try {
       const onStart = (userMsg, respMsgId, _isNewConvo) => {
         userMessage = userMsg;
@@ -289,7 +353,23 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
         },
       };

-      const response = await client.sendMessage(text, messageOptions);
+      const sendPromise = client.sendMessage(text, messageOptions);
+
+      if (titleEligible && titleTiming === 'immediate') {
+        immediateTitlePromise = addTitle(req, {
+          text,
+          conversationId,
+          client,
+          immediate: true,
+          convoReady,
+          signal: titleAbortController.signal,
+          onTitleGenerated: emitTitleEvent,
+        }).catch((err) => {
+          logger.error('[ResumableAgentController] Error in immediate title generation', err);
+        });
+      }
+
+      const response = await sendPromise;

       const messageId = response.messageId;
       const endpoint = endpointOption.endpoint;
@@ -355,11 +435,45 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
           originalCreatedAt: jobCreatedAt,
           currentCreatedAt: currentJob?.createdAt,
         });
+        // Discard the stale title from this replaced stream: cancel it and
+        // unblock its persistence wait without letting it save (the newer job
+        // owns the conversation now).
+        titleAbortController.abort();
+        job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+        acceptsTitleEvents = false;
+        resolveConvoReady();
         // Still decrement pending request since we incremented at start
         await decrementPendingRequest(userId);
+        if (immediateTitlePromise) {
+          immediateTitlePromise.finally(() => {
+            if (client) {
+              disposeClient(client);
+            }
+          });
+        } else if (client) {
+          disposeClient(client);
+        }
         return;
       }

+      // If the user stopped this turn, cancel the title BEFORE unblocking its
+      // persistence wait — otherwise resolving `convoReady` lets the title task
+      // resume and save before the later abort runs.
+      if (wasAbortedBeforeComplete) {
+        titleAbortController.abort();
+      } else {
+        job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+      }
+
+      // The conversation row now exists and this stream is authoritative; allow
+      // any in-flight immediate title generation to persist (saveConvo uses noUpsert).
+      resolveConvoReady();
+      acceptsTitleEvents = false;
+
+      if (titleEventPromise) {
+        await titleEventPromise;
+      }
+
       if (!wasAbortedBeforeComplete) {
         const finalEvent = {
           final: true,
@@ -402,7 +516,20 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
         await decrementPendingRequest(userId);
       }

-      if (shouldGenerateTitle) {
+      if (titleTiming === 'immediate') {
+        // Title was fired in parallel above (if eligible); a stopped turn already
+        // aborted it before `resolveConvoReady`. Defer disposal until it settles
+        // so the run/req aren't torn down mid-generation.
+        if (immediateTitlePromise) {
+          immediateTitlePromise.finally(() => {
+            if (client) {
+              disposeClient(client);
+            }
+          });
+        } else if (client) {
+          disposeClient(client);
+        }
+      } else if (shouldGenerateTitle) {
         addTitle(req, {
           text,
           response: { ...response },
@@ -422,6 +549,15 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit
         }
       }
     } catch (error) {
+      // Any failure (user Stop, or a preflight/quota failure before the run is
+      // even created) must cancel the title and unblock its waits: the title's
+      // `_waitForRun` would otherwise never resolve, deferring client disposal
+      // until the 45s title timeout, and no title should persist for a failed turn.
+      titleAbortController.abort();
+      job.abortController.signal.removeEventListener('abort', abortTitleOnJobAbort);
+      acceptsTitleEvents = false;
+      resolveConvoReady();
+
       // Check if this was an abort (not a real error)
       const wasAborted = job.abortController.signal.aborted || error.message?.includes('abort');

@@ -436,7 +572,14 @@ const ResumableAgentController = async (req, res, next, initializeClient, addTit

       await decrementPendingRequest(userId);

-      if (client) {
+      // Defer disposal until any immediate title settles (it holds the run/req).
+      if (immediateTitlePromise) {
+        immediateTitlePromise.finally(() => {
+          if (client) {
+            disposeClient(client);
+          }
+        });
+      } else if (client) {
         disposeClient(client);
       }

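The ordering the controller relies on (start the title in parallel, gate its save on `convoReady`, and abort before releasing the gate on a stopped or failed turn) can be reduced to a small standalone sketch. Everything here is hypothetical scaffolding: `runTurn` and its `generateTitle`, `sendMessage`, `saveTitle`, and `stopped` parameters are stand-ins invented for the example, not functions from the codebase.

```javascript
/**
 * Hypothetical reduction of the controller flow: returns the saved title,
 * or `null` when the title was cancelled or failed.
 */
async function runTurn({ generateTitle, sendMessage, saveTitle, stopped = false }) {
  const titleAbort = new AbortController();
  let resolveConvoReady;
  const convoReady = new Promise((resolve) => {
    resolveConvoReady = resolve;
  });

  // Title generation starts right away, but may only persist once the
  // conversation exists and the title has not been cancelled.
  const titleTask = (async () => {
    const title = await generateTitle(titleAbort.signal);
    await convoReady;
    if (titleAbort.signal.aborted) {
      return null;
    }
    await saveTitle(title);
    return title;
  })().catch(() => null);

  try {
    await sendMessage();
    if (stopped) {
      // Abort BEFORE releasing the gate so the title task cannot save first.
      titleAbort.abort();
    }
  } catch (err) {
    // Any send failure cancels the title as well (error handling omitted here).
    titleAbort.abort();
  } finally {
    resolveConvoReady();
  }

  return titleTask;
}
```

In this sketch a completed turn persists the title even if it finishes after the response, while a stopped or failed turn never calls `saveTitle`.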