A maintained data contract · not a calculator

The per-org/per-tier rate-limit ceiling for Claude & GPT — as dated, machine-fetchable JSON.

When you push real traffic you don't fail on cost — you fail on 429 Too Many Requests, and which limit binds first (RPM vs ITPM vs OTPM vs TPM) decides how you re-architect. No incumbent pricing dataset publishes that number. This one does — versioned, freshness-stamped, CORS-open, built for agents and CI to fetch and cite.

The one curl that proves it

# which rate-limit dimension 429s first, per model, per org tier
curl -s https://llmcapplanner.vercel.app/v1/rate-limits.json

# pricing + limits combined (byte-stable alias: /snapshot.json)
curl -s https://llmcapplanner.vercel.app/v1/models.json

No key. No signup. No build step. Access-Control-Allow-Origin: * — fetch it straight from a browser, an agent, or a CI step.

Endpoints

GET /v1/rate-limits.json	First-class dataset: per-org/per-tier rate-limit ceilings only. No pricing noise. The narrowly-unowned surface.
GET /v1/models.json	Combined dataset: the same rate-limit ceilings plus pricing per model. /snapshot.json is a byte-identical stable alias.

Why this is the only place you can get it

models.dev and the LiteLLM catalog cover pricing + context window only. Neither carries per-org/per-tier rate-limit ceilings. The provider docs state them as prose, per page, undated. This dataset is the single citable, machine-fetchable JSON of which limit binds first and at what number — re-verified by hand against the official provider docs whenever a model launches or a limit changes.

schema_version	1.0
last_verified	2026-05-15
freshness_policy	✓ embedded
versioned	prior schema stays reachable

Shape of the data

Anthropic publishes RPM/ITPM/OTPM per model; OpenAI enforces RPM/TPM per tier. The dataset preserves both shapes faithfully rather than flattening them into a misleading single "TPM":

provider	tier	binding dimensions (snapshot 2026-05-15)
anthropic	`t1` → `t4`	`rpm`, `itpm`, `otpm` per model (opus-4-7 / sonnet-4-6 / haiku-4-5)
openai	`t1` → `t5`	`rpm`, `tpm` per tier
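
To make "which limit binds first" concrete, here is a minimal Python sketch of the arithmetic the two shapes imply. It is an illustration under stated assumptions, not the dataset's or the MCP server's actual code: the Workload class, the function names, and every limit value are hypothetical, and it assumes OpenAI's TPM counts input plus output tokens while Anthropic's ITPM and OTPM count them separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    rpm: int      # requests per minute
    in_tok: int   # input tokens per request
    out_tok: int  # output tokens per request


def demand(provider: str, w: Workload) -> dict[str, int]:
    """Per-minute demand, expressed in the dimensions each provider enforces."""
    if provider == "anthropic":
        return {"rpm": w.rpm, "itpm": w.rpm * w.in_tok, "otpm": w.rpm * w.out_tok}
    if provider == "openai":
        # Assumption: TPM is charged on input + output tokens combined.
        return {"rpm": w.rpm, "tpm": w.rpm * (w.in_tok + w.out_tok)}
    raise ValueError(f"unknown provider: {provider}")


def binding_dimension(provider: str, w: Workload, limits: dict[str, int]) -> tuple[str, float]:
    """Return the dimension with the highest demand/limit ratio, and that ratio."""
    ratios = {dim: need / limits[dim] for dim, need in demand(provider, w).items()}
    dim = max(ratios, key=ratios.get)
    return dim, ratios[dim]


# PLACEHOLDER limits for illustration only; real values come from the dataset.
example_limits = {"rpm": 1_000, "itpm": 450_000, "otpm": 90_000}
dim, ratio = binding_dimension(
    "anthropic", Workload(rpm=600, in_tok=2_000, out_tok=500), example_limits
)
print(dim, round(ratio, 2))  # otpm 3.33
```

A ratio above 1.0 means the workload exceeds that ceiling. The dimension with the largest ratio is the one that would return 429 first, which is why a single flattened "TPM" would hide the answer for Anthropic.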

Every payload carries schema_version, last_verified, freshness_policy, and the official sources array it was verified against. You always know exactly how fresh the numbers are — the opposite of a hard-coded table that silently rots.
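
A CI step could use those fields to fail a build when the snapshot is too old. The sketch below is an assumption-laden example, not an official client. It assumes the four fields sit at the top level of the JSON under exactly the names given above and that last_verified is an ISO date such as 2026-05-15. The function names and the sample payload values are hypothetical.

```python
import json
import urllib.request
from datetime import date

URL = "https://llmcapplanner.vercel.app/v1/rate-limits.json"
# Assumption: these metadata keys appear at the top level of the payload.
REQUIRED = ("schema_version", "last_verified", "freshness_policy", "sources")


def fetch(url: str = URL) -> dict:
    """Download and decode the dataset (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def snapshot_age_days(payload: dict, today: date) -> int:
    """Check the metadata fields exist and return days since last_verified."""
    missing = [key for key in REQUIRED if key not in payload]
    if missing:
        raise KeyError(f"payload is missing metadata: {missing}")
    verified = date.fromisoformat(str(payload["last_verified"]))
    return (today - verified).days


# Offline demonstration with a hand-written stand-in payload (placeholder values).
sample = {
    "schema_version": "1.0",
    "last_verified": "2026-05-15",
    "freshness_policy": "placeholder",
    "sources": [],
}
print(snapshot_age_days(sample, date(2026, 6, 14)))  # 30
```

In a pipeline, snapshot_age_days(fetch(), date.today()) could be compared against whatever staleness threshold the team chooses; the dataset itself does not prescribe one here.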

For AI agents — MCP server

An MCP (Model Context Protocol) stdio server wraps the same dated snapshot so an agent can answer "what tier do I need for 600 rpm of claude-sonnet-4-6 at 2k in / 500 out, and what 429s first?" with real arithmetic instead of a hallucinated guess. Offline, deterministic, every response stamped with snapshot_version.

github.com/SolvoHQ/llmcapplanner → mcp/ — one tool: llm_capacity_plan(provider, model, tier, rpm, in_tok, out_tok)
Compiled dist/index.js is committed — runs straight from a clone, no build step.
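
For readers wiring the server up by hand, the following Python sketch builds the JSON-RPC message an MCP client would send to call the tool. The envelope (jsonrpc, id, method "tools/call", params.name, params.arguments) follows the MCP specification, and the argument names are taken from the signature above. The argument values and their exact formats (for example "t2" as a tier string) are assumptions, as is the helper name tools_call_request.

```python
import json


def tools_call_request(request_id: int, arguments: dict) -> str:
    """Serialize one MCP `tools/call` request as a single-line JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "llm_capacity_plan", "arguments": arguments},
    }
    # The stdio transport is newline-delimited, so the message must fit on one line.
    return json.dumps(message, separators=(",", ":"))


# Illustrative values based on the question quoted above; formats are assumed.
args = {
    "provider": "anthropic",
    "model": "claude-sonnet-4-6",
    "tier": "t2",  # hypothetical starting tier to evaluate
    "rpm": 600,
    "in_tok": 2000,
    "out_tok": 500,
}
line = tools_call_request(2, args)
print(line)
```

An MCP client library normally handles this framing, along with the initialize handshake that has to precede any tools/call; the sketch only shows what ends up on the server's stdin.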