Mirror of https://github.com/danny-avila/LibreChat.git (synced 2026-05-13 16:07:30 +00:00)
* 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12855)

  Different providers report `usage_metadata.input_tokens` with different semantics:

  - Anthropic / Bedrock: `input_tokens` EXCLUDES cache; cache reads/writes arrive separately and must be added to get the total prompt size.
  - Gemini / OpenAI: `input_tokens` ALREADY INCLUDES cached tokens (Google's `promptTokenCount`, OpenAI's `prompt_tokens`). Their `input_token_details.cache_*` counts are subsets of `input_tokens`.

  `recordCollectedUsage` treated both schemes as additive, so for cache-hit requests on Gemini/OpenAI it added cache tokens on top of an `input_tokens` value that already contained them, overcharging users in proportion to the cache hit rate (e.g., a ~67% cache hit rate ≈ a 1.67x overcharge). This matches the issue reporter's GCP billing comparison.

  Adds a small `splitUsage` helper that classifies the provider by model name and computes `inputOnly` (the non-cached portion) plus the all-inclusive `totalInput`, used for both the spend math and the returned `input_tokens` summary; a sketch of the two accounting schemes follows below. The helper defaults to additive semantics (the historical behavior), so unknown providers are unaffected.

  Updates the existing OpenAI-shaped tests that previously asserted the buggy additive math, and adds Gemini regression tests using the exact numbers from the issue report (input=11125, cache_read=7441 → input=3684). Anthropic / Bedrock paths remain bit-identical to before.

* 🔧 refactor: Classify Cache-Token Semantics by Provider, Not Model Name

  Follows up the previous commit. Replaces a model-name regex (`gemini|gpt|o[1-9]|chatgpt`) with an explicit `Providers` enum lookup keyed off the `usage.provider` field: `UsageMetadata.provider` already exists in `IJobStore.ts` but was never populated.

  - `callbacks.js#ModelEndHandler` now attaches `usage.provider` from `agentContext.provider` alongside `usage.model`.
  - `usage.ts` uses a `SUBSET_PROVIDERS` set (`openAI`, `azureOpenAI`, `google`, `vertexai`, `xai`, `deepseek`, `openrouter`, `moonshot`) backed by the canonical `Providers` enum from `librechat-data-provider`; see the second sketch below.
  - `xai`, `deepseek`, `openrouter`, and `moonshot` extend `ChatOpenAI`, so they inherit subset semantics (verified in node_modules).
  - Defaults to additive when `usage.provider` is missing, so the title flow (which doesn't propagate the provider) and any usage entries recorded before this PR keep their existing behavior.

  Tests: switch fixtures from model-name signaling to an explicit `provider` field, and add a Vertex AI case and a "missing provider" fallback case.
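A minimal TypeScript sketch of the two accounting schemes described above. Field and helper names (`cache_read`, `cache_creation`, `inputOnly`, `totalInput`) are assumptions modeled on the commit message, not the actual LibreChat implementation:

```ts
/**
 * Sketch only: shapes and names are assumptions based on the commit
 * description; the real splitUsage in LibreChat may differ.
 */
interface UsageMetadata {
  input_tokens: number;
  input_token_details?: {
    cache_read?: number;
    cache_creation?: number;
  };
}

function splitUsage(usage: UsageMetadata, isSubsetProvider: boolean) {
  const cacheRead = usage.input_token_details?.cache_read ?? 0;
  const cacheWrite = usage.input_token_details?.cache_creation ?? 0;

  if (isSubsetProvider) {
    // Gemini/OpenAI style: input_tokens already contains the cached
    // tokens, so subtract them to isolate the non-cached portion.
    return {
      inputOnly: usage.input_tokens - cacheRead - cacheWrite,
      totalInput: usage.input_tokens,
    };
  }

  // Anthropic/Bedrock style (the historical default): cache counts
  // arrive separately and are added on top for the total prompt size.
  return {
    inputOnly: usage.input_tokens,
    totalInput: usage.input_tokens + cacheRead + cacheWrite,
  };
}

// Regression numbers from the issue report: 11125 prompt tokens,
// 7441 of them read from cache.
const gemini = splitUsage(
  { input_tokens: 11125, input_token_details: { cache_read: 7441 } },
  true,
);
console.log(gemini); // { inputOnly: 3684, totalInput: 11125 }

// The old additive math billed 11125 + 7441 = 18566 prompt tokens,
// i.e. 1 + 7441/11125 ≈ 1.67x the true cost at a ~67% hit rate.
```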
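And a sketch of the follow-up's provider-keyed classification. Plain string values stand in for the canonical `Providers` enum from `librechat-data-provider` to keep the example self-contained, and the helper name `usesSubsetSemantics` is hypothetical:

```ts
/** Providers whose input_tokens already include cached tokens. */
const SUBSET_PROVIDERS = new Set<string>([
  'openAI',
  'azureOpenAI',
  'google',
  'vertexai',
  'xai',
  'deepseek',
  'openrouter',
  'moonshot',
]);

/**
 * Additive semantics remain the default: when usage.provider is
 * missing (the title flow, or usage entries recorded before this PR),
 * classification falls through to the historical behavior.
 */
function usesSubsetSemantics(provider?: string): boolean {
  return provider !== undefined && SUBSET_PROVIDERS.has(provider);
}

console.log(usesSubsetSemantics('google'));    // true  → subtract cache
console.log(usesSubsetSemantics('anthropic')); // false → add cache
console.log(usesSubsetSemantics(undefined));   // false → additive default
```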
Directory listing:

- `__tests__/`
- `agents/`
- `assistants/`
- `auth/`
- `AuthController.js`
- `AuthController.spec.js`
- `Balance.js`
- `EndpointController.js`
- `FavoritesController.js`
- `FavoritesController.spec.js`
- `mcp.js`
- `ModelController.js`
- `PermissionsController.js`
- `PluginController.js`
- `PluginController.spec.js`
- `SkillStatesController.js`
- `tools.js`
- `TwoFactorController.js`
- `UserController.js`
- `UserController.spec.js`