LibreChat/api/server/controllers
Danny Avila 89bf2ab7b4
💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12868)
* 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12855)

Different providers report `usage_metadata.input_tokens` with different
semantics:

  - Anthropic / Bedrock: `input_tokens` EXCLUDES cache; cache reads/writes
    arrive separately and must be added to get the total prompt size.
  - Gemini / OpenAI: `input_tokens` ALREADY INCLUDES cached tokens
    (Google's `promptTokenCount`, OpenAI's `prompt_tokens`). Their
    `input_token_details.cache_*` are subsets of `input_tokens`.

`recordCollectedUsage` treated both schemes as additive, so for cache-hit
requests on Gemini/OpenAI it added cache tokens on top of an
`input_tokens` value that already contained them — overcharging users by
the cache_hit_rate (e.g., ~67% cache hit ≈ 1.67x overcharge). This
matches the issue reporter's GCP billing comparison.

Adds a small `splitUsage` helper that classifies the provider by model
name and computes `inputOnly` (the non-cached portion) plus the
all-inclusive `totalInput` for both the spend math and the returned
`input_tokens` summary. The helper defaults to additive semantics (the
historical behavior) so unknown providers are unaffected.

Updates existing OpenAI-shaped tests that previously asserted the buggy
additive math, and adds Gemini regression tests using the exact numbers
from the issue report (input=11125, cache_read=7441 → input=3684).

Anthropic / Bedrock paths remain bit-identical to before.

* 🔧 refactor: Classify Cache-Token Semantics by Provider, Not Model Name

Follows up the previous commit. Replaces a model-name regex
(`gemini|gpt|o[1-9]|chatgpt`) with an explicit `Providers` enum lookup
keyed off the `usage.provider` field — `UsageMetadata.provider` already
exists in `IJobStore.ts` but was never being populated.

  - `callbacks.js#ModelEndHandler` now attaches `usage.provider` from
    `agentContext.provider` alongside `usage.model`.
  - `usage.ts` uses a `SUBSET_PROVIDERS` set (`openAI`, `azureOpenAI`,
    `google`, `vertexai`, `xai`, `deepseek`, `openrouter`, `moonshot`)
    backed by the canonical `Providers` enum from
    `librechat-data-provider`.
  - `xai`, `deepseek`, `openrouter`, `moonshot` extend `ChatOpenAI` so
    they inherit subset semantics (verified in node_modules).
  - Defaults to additive when `usage.provider` is missing, so the title
    flow (which doesn't propagate provider) and any pre-this-PR usage
    entries keep their existing behavior.

Tests: switch fixtures from model-name signaling to explicit `provider`
field, plus a Vertex AI case and a "missing provider" fallback case.
2026-04-29 08:36:00 +09:00
..
__tests__ 🧰 refactor: Unify code-execution tools (#12767) 2026-04-25 04:02:01 -04:00
agents 💎 fix: Stop Double-Counting Cache Tokens for Gemini/OpenAI in Usage Spend (#12868) 2026-04-29 08:36:00 +09:00
assistants 💰 fix: Lazy-Initialize Balance Record at Check Time for Overrides (#12474) 2026-03-30 22:51:07 -04:00
auth 🛡️ fix: Add Origin Binding to Admin OAuth Exchange Codes (#12469) 2026-03-30 16:54:00 -04:00
AuthController.js 🔏 fix: Remove Federated Tokens from OpenID Refresh Response (#12264) 2026-03-16 09:23:46 -04:00
AuthController.spec.js 🔏 fix: Remove Federated Tokens from OpenID Refresh Response (#12264) 2026-03-16 09:23:46 -04:00
Balance.js 📦 refactor: Consolidate DB models, encapsulating Mongoose usage in data-schemas (#11830) 2026-03-21 14:28:53 -04:00
EndpointController.js refactor: Integrate Capabilities into Agent File Uploads and Tool Handling (#5048) 2024-12-19 13:04:48 -05:00
FavoritesController.js 📌 feat: Add Pin Support for Model Specs (#11219) 2026-04-09 18:37:25 -04:00
FavoritesController.spec.js 📌 feat: Add Pin Support for Model Specs (#11219) 2026-04-09 18:37:25 -04:00
mcp.js 🏗️ feat: 3-Tier MCP Server Architecture with Config-Source Lazy Init (#12435) 2026-03-28 10:36:43 -04:00
ModelController.js 🏗️ refactor: Remove Redundant Caching, Migrate Config Services to TypeScript (#12466) 2026-03-30 16:49:48 -04:00
PermissionsController.js 🧹 chore: Move direct model usage from PermissionsController to data-schemas 2026-03-21 15:20:15 -04:00
PluginController.js 🏗️ refactor: Remove Redundant Caching, Migrate Config Services to TypeScript (#12466) 2026-03-30 16:49:48 -04:00
PluginController.spec.js 🏗️ refactor: Remove Redundant Caching, Migrate Config Services to TypeScript (#12466) 2026-03-30 16:49:48 -04:00
SkillStatesController.js 🎚️ feat: Per-User Skill Active/Inactive Toggle with Ownership-Aware Defaults (#12692) 2026-04-25 04:02:00 -04:00
tools.js 🛂 fix: Skip Inherited / Mark Skill Files Read-Only in Code-Env Pipeline (#12866) 2026-04-29 08:26:25 +09:00
TwoFactorController.js 🔑 fix: Require OTP Verification for 2FA Re-Enrollment and Backup Code Regeneration (#12223) 2026-03-14 01:51:31 -04:00
UserController.js 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema (#12613) 2026-04-25 04:01:59 -04:00
UserController.spec.js 🧬 feat: Scaffold Skills CRUD with ACL Sharing and File Schema (#12613) 2026-04-25 04:01:59 -04:00