A maintained data contract · not a calculator
When you push real traffic, you don't fail on cost; you fail on 429 Too Many Requests, and which limit binds first (RPM vs ITPM vs OTPM vs TPM) decides how you re-architect. No incumbent pricing dataset publishes that number. This one does: versioned, freshness-stamped, CORS-open, built for agents and CI to fetch and cite.

    # which rate-limit dimension 429s first, per model, per org tier
    curl -s https://llmcapplanner.vercel.app/v1/rate-limits.json

    # pricing + limits combined (byte-stable alias: /snapshot.json)
    curl -s https://llmcapplanner.vercel.app/v1/models.json

No key. No signup. No build step. Access-Control-Allow-Origin: * — fetch it straight from a browser, an agent, or a CI step.
/snapshot.json is a byte-identical, stable alias of /v1/models.json.
models.dev and the LiteLLM catalog cover pricing + context window only. Neither carries per-org/per-tier rate-limit ceilings. The provider docs state them as prose, per page, undated. This dataset is the single citable, machine-fetchable JSON of which limit binds first and at what number — re-verified by hand against the official provider docs whenever a model launches or a limit changes.
Anthropic publishes ITPM/OTPM per model; OpenAI enforces RPM/TPM per tier. The dataset preserves both shapes faithfully rather than flattening them into a misleading single "TPM":
| provider | tier | binding dimensions (snapshot 2026-05-15) |
|---|---|---|
| anthropic | t1 → t4 | rpm, itpm, otpm per model (opus-4-7 / sonnet-4-6 / haiku-4-5) |
| openai | t1 → t5 | rpm, tpm per tier |
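
To see why a single flattened "TPM" would mislead, here is a minimal Python sketch that turns the same workload into per-minute demand under each provider's shape. The function name and the `shape` labels are hypothetical, not part of the dataset, and the sketch assumes an OpenAI-style TPM counts input plus output tokens.

```python
from __future__ import annotations


def demand_per_minute(shape: str, rpm: int, in_tok: int, out_tok: int) -> dict[str, int]:
    """Per-minute demand, expressed in the dimensions a provider enforces.

    `shape` is a hypothetical label for this sketch, not a dataset field.
    """
    if shape == "itpm_otpm":
        # Anthropic-style: input and output tokens are limited separately.
        return {"rpm": rpm, "itpm": rpm * in_tok, "otpm": rpm * out_tok}
    if shape == "tpm":
        # OpenAI-style. ASSUMPTION: TPM counts input plus output tokens.
        return {"rpm": rpm, "tpm": rpm * (in_tok + out_tok)}
    raise ValueError(f"unknown shape: {shape!r}")


# Same workload, two shapes: 600 requests/min at 2,000 tokens in and 500 out.
anthropic_demand = demand_per_minute("itpm_otpm", 600, 2000, 500)
openai_demand = demand_per_minute("tpm", 600, 2000, 500)

print(anthropic_demand)  # {'rpm': 600, 'itpm': 1200000, 'otpm': 300000}
print(openai_demand)     # {'rpm': 600, 'tpm': 1500000}
```

The two results are not interchangeable: a workload can sit comfortably under a combined token ceiling while still exceeding a separate input-token ceiling, which is the distinction the dataset keeps.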
Every payload carries schema_version, last_verified, freshness_policy, and the official sources array it was verified against. You always know exactly how fresh the numbers are — the opposite of a hard-coded table that silently rots.
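
A CI step could use those fields as a freshness gate. The sketch below is one possible approach, not a documented client. It assumes `last_verified` is an ISO date such as 2026-05-15 (the exact format is not confirmed here), and the 30-day threshold, the helper names, and the sample payload are all hypothetical.

```python
import json
import urllib.request
from datetime import date

RATE_LIMITS_URL = "https://llmcapplanner.vercel.app/v1/rate-limits.json"


def fetch_payload(url: str = RATE_LIMITS_URL) -> dict:
    """Fetch and decode one payload. No key or signup is needed."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def snapshot_age_days(payload: dict, today: date) -> int:
    """Days since `last_verified`.

    ASSUMPTION: the value starts with an ISO date (YYYY-MM-DD).
    """
    verified = date.fromisoformat(payload["last_verified"][:10])
    return (today - verified).days


def is_fresh(payload: dict, today: date, max_age_days: int = 30) -> bool:
    """True when the snapshot is no older than `max_age_days` (hypothetical threshold)."""
    return 0 <= snapshot_age_days(payload, today) <= max_age_days


# Hypothetical payload fragment for illustration; only the field name comes from the text.
sample = {"last_verified": "2026-05-15"}
print(snapshot_age_days(sample, date(2026, 6, 1)))  # 17
print(is_fresh(sample, date(2026, 6, 1)))           # True
print(is_fresh(sample, date(2026, 7, 1)))           # False
```

In a pipeline, `is_fresh(fetch_payload(), date.today())` could fail the build when the numbers are older than the team is willing to cite.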
An MCP (Model Context Protocol) stdio server wraps the same dated snapshot so an agent can answer "what tier do I need for 600 rpm of claude-sonnet-4-6 at 2k in / 500 out, and what 429s first?" with real arithmetic instead of a hallucinated guess. Offline, deterministic, every response stamped with snapshot_version.
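
The arithmetic behind that question can be sketched in a few lines of Python. This illustrates the idea and is not the MCP server's implementation. The helper names are hypothetical, and the tier limits are made-up placeholders rather than real provider numbers, which should be read from the dataset.

```python
from __future__ import annotations

# PLACEHOLDER limits for illustration only. These are NOT real provider numbers.
PLACEHOLDER_TIERS: dict[str, dict[str, int]] = {
    "t1": {"rpm": 100, "itpm": 100_000, "otpm": 20_000},
    "t2": {"rpm": 1_000, "itpm": 500_000, "otpm": 100_000},
    "t3": {"rpm": 2_000, "itpm": 1_000_000, "otpm": 400_000},
    "t4": {"rpm": 4_000, "itpm": 2_000_000, "otpm": 800_000},
}


def binding_dimension(demand: dict[str, int], limits: dict[str, int]) -> str:
    """The dimension with the highest demand/limit ratio, i.e. the first to 429."""
    return max(demand, key=lambda dim: demand[dim] / limits[dim])


def lowest_fitting_tier(demand: dict[str, int], tiers: dict[str, dict[str, int]]) -> str | None:
    """First tier (in insertion order) whose every limit covers the demand, else None."""
    for name, limits in tiers.items():
        if all(demand[dim] <= limits[dim] for dim in demand):
            return name
    return None


# 600 rpm at 2k tokens in / 500 out: itpm = 600 * 2000, otpm = 600 * 500.
demand = {"rpm": 600, "itpm": 600 * 2000, "otpm": 600 * 500}

print(lowest_fitting_tier(demand, PLACEHOLDER_TIERS))        # t4
print(binding_dimension(demand, PLACEHOLDER_TIERS["t3"]))    # itpm
```

With these placeholder limits, input tokens per minute is what rules out t3, even though the request rate alone would fit at t2.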
mcp/ exposes one tool: llm_capacity_plan(provider, model, tier, rpm, in_tok, out_tok).
dist/index.js is committed, so it runs straight from a clone with no build step.