Mirror of https://github.com/danny-avila/LibreChat.git (synced 2026-05-13 16:07:30 +00:00)
* 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12855)

  Different providers report `usage_metadata.input_tokens` with different semantics:

  - Anthropic / Bedrock: `input_tokens` EXCLUDES cache; cache reads/writes arrive separately and must be added to get the total prompt size.
  - Gemini / OpenAI: `input_tokens` ALREADY INCLUDES cached tokens (Google's `promptTokenCount`, OpenAI's `prompt_tokens`). Their `input_token_details.cache_*` counts are subsets of `input_tokens`.

  `recordCollectedUsage` treated both schemes as additive, so for cache-hit requests on Gemini/OpenAI it added cache tokens on top of an `input_tokens` value that already contained them, overcharging users in proportion to the cache hit rate (e.g., a ~67% cache hit rate ≈ a 1.67x overcharge). This matches the issue reporter's GCP billing comparison.

  Adds a small `splitUsage` helper that classifies the provider by model name and computes `inputOnly` (the non-cached portion) plus the all-inclusive `totalInput`, used for both the spend math and the returned `input_tokens` summary; a sketch of the two accounting schemes follows below. The helper defaults to additive semantics (the historical behavior), so unknown providers are unaffected.

  Updates the existing OpenAI-shaped tests that previously asserted the buggy additive math, and adds Gemini regression tests using the exact numbers from the issue report (input=11125, cache_read=7441 → input=3684). Anthropic / Bedrock paths remain bit-identical to before.

* 🔧 refactor: Classify Cache-Token Semantics by Provider, Not Model Name

  Follows up the previous commit. Replaces a model-name regex (`gemini|gpt|o[1-9]|chatgpt`) with an explicit `Providers` enum lookup keyed off the `usage.provider` field: `UsageMetadata.provider` already exists in `IJobStore.ts` but was never populated.

  - `callbacks.js#ModelEndHandler` now attaches `usage.provider` from `agentContext.provider` alongside `usage.model`.
  - `usage.ts` uses a `SUBSET_PROVIDERS` set (`openAI`, `azureOpenAI`, `google`, `vertexai`, `xai`, `deepseek`, `openrouter`, `moonshot`) backed by the canonical `Providers` enum from `librechat-data-provider`; see the second sketch below.
  - `xai`, `deepseek`, `openrouter`, and `moonshot` extend `ChatOpenAI`, so they inherit subset semantics (verified in node_modules).
  - Defaults to additive when `usage.provider` is missing, so the title flow (which doesn't propagate the provider) and any usage entries recorded before this PR keep their existing behavior.

  Tests: switch fixtures from model-name signaling to an explicit `provider` field, and add a Vertex AI case and a "missing provider" fallback case.
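A minimal TypeScript sketch of the two accounting schemes described above. Field and helper names (`cache_read`, `cache_creation`, `inputOnly`, `totalInput`) are assumptions modeled on the commit message, not the actual LibreChat implementation:

```ts
/**
 * Sketch only: shapes and names are assumptions based on the commit
 * description; the real splitUsage in LibreChat may differ.
 */
interface UsageMetadata {
  input_tokens: number;
  input_token_details?: {
    cache_read?: number;
    cache_creation?: number;
  };
}

function splitUsage(usage: UsageMetadata, isSubsetProvider: boolean) {
  const cacheRead = usage.input_token_details?.cache_read ?? 0;
  const cacheWrite = usage.input_token_details?.cache_creation ?? 0;

  if (isSubsetProvider) {
    // Gemini/OpenAI style: input_tokens already contains the cached
    // tokens, so subtract them to isolate the non-cached portion.
    return {
      inputOnly: usage.input_tokens - cacheRead - cacheWrite,
      totalInput: usage.input_tokens,
    };
  }

  // Anthropic/Bedrock style (the historical default): cache counts
  // arrive separately and are added on top for the total prompt size.
  return {
    inputOnly: usage.input_tokens,
    totalInput: usage.input_tokens + cacheRead + cacheWrite,
  };
}

// Regression numbers from the issue report: 11125 prompt tokens,
// 7441 of them read from cache.
const gemini = splitUsage(
  { input_tokens: 11125, input_token_details: { cache_read: 7441 } },
  true,
);
console.log(gemini); // { inputOnly: 3684, totalInput: 11125 }

// The old additive math billed 11125 + 7441 = 18566 prompt tokens,
// i.e. 1 + 7441/11125 ≈ 1.67x the true cost at a ~67% hit rate.
```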
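And a sketch of the follow-up's provider-keyed classification. Plain string values stand in for the canonical `Providers` enum from `librechat-data-provider` to keep the example self-contained, and the helper name `usesSubsetSemantics` is hypothetical:

```ts
/** Providers whose input_tokens already include cached tokens. */
const SUBSET_PROVIDERS = new Set<string>([
  'openAI',
  'azureOpenAI',
  'google',
  'vertexai',
  'xai',
  'deepseek',
  'openrouter',
  'moonshot',
]);

/**
 * Additive semantics remain the default: when usage.provider is
 * missing (the title flow, or usage entries recorded before this PR),
 * classification falls through to the historical behavior.
 */
function usesSubsetSemantics(provider?: string): boolean {
  return provider !== undefined && SUBSET_PROVIDERS.has(provider);
}

console.log(usesSubsetSemantics('google'));    // true  → subtract cache
console.log(usesSubsetSemantics('anthropic')); // false → add cache
console.log(usesSubsetSemantics(undefined));   // false → additive default
```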
Directory listing:

- `__tests__/`
- `agents/`
- `assistants/`
- `auth/`
- `AuthController.js`
- `AuthController.spec.js`
- `Balance.js`
- `EndpointController.js`
- `FavoritesController.js`
- `FavoritesController.spec.js`
- `mcp.js`
- `ModelController.js`
- `PermissionsController.js`
- `PluginController.js`
- `PluginController.spec.js`
- `SkillStatesController.js`
- `tools.js`
- `TwoFactorController.js`
- `UserController.js`
- `UserController.spec.js`