## §SELF — miscsubjects (paste without context)

**Principle:** Self-explaining payload — no external context required. This _self block describes what you are reading and where to look next.

**This widget:** `article_bundle` — **LLM article bundle**
Paste-ready package: body + claims + sources + voxels + provenance + manifest + constitution.
- **article slug:** `oip-what-is-rate-limiting`
- **contains:** body, claims, sources, voxels, provenance, question graph, constitution, llm_manifest
- **how to use:** Paste entire block into Grok/GPT/Gemini. Section §SELF explains the system.
- **read:** https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/bundle?format=markdown

### Logical proof (verify each step)
1. Articles are voxel graphs of tiered claims, not prose blobs. → https://miscsubjects.com/api/articles/constitution
2. Claims link to hash-chained sources via source_ids. → https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/sources
3. Ask reads topology; ingest/claim append to ledger. → https://miscsubjects.com/api/protocol
4. Models queue growth: populate → collaborate → repair → reflex. → https://miscsubjects.com/api/protocol/grow
5. Graph proves its own shape (reflex) and $/claim (yield). → https://miscsubjects.com/graph.html?layer=reflex
6. Full feature index + _explain on every API response. → https://miscsubjects.com/api/articles/system-map

### Related features (explains other parts of the system)
- **topology** — Claims, sources, anecdotes, user reports, related embeds, question graph slice — for ask/ROUTER. · https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/topology
- **voxels** — Claims as atoms, sources as edges (supported_by, posted_by). Per-claim provenance. · https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/voxels
- **ask** — Answer only from topology; creates question_node with gaps and ingest_hint. · https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/prompts
- **ingest** — Parse pasted evidence → source ledger + claims + evidence_ingest node.
- **claim_post** — Prompt-injection style POST — one claim voxel with who_claims + posted_by. · https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/voxels
- **llm_manifest** — Machine-readable read/write contract for external LLMs. · https://miscsubjects.com/api/articles/llm-manifest

### Full index
- JSON: https://miscsubjects.com/api/articles/system-map
- Markdown: https://miscsubjects.com/api/articles/system-map?format=markdown

*Not medical advice. Tier-honest. Cite claim/source ids.*

---

# miscsubjects article bundle

> Paste this entire block into Grok, GPT, or Gemini. They can READ the ledger below and RETURN evidence via ingest (see § LLM manifest).

## Article
- **slug:** `oip-what-is-rate-limiting`
- **title:** What Is Rate Limiting
- **url:** https://miscsubjects.com/a/oip-what-is-rate-limiting
- **register:** oip_protocol
- **updated:** 2026-07-04T19:01:10.019Z
- **tags:** oip, protocol

## Body

# Rate Limiting

Rate limiting is a mechanism that controls how many requests a client can send to a system within a specific time window. It is a guardrail, not a suggestion. It prevents a single actor from consuming disproportionate resources, destabilizing a service, or drowning out every other user. At its core, rate limiting is the enforcement of a budget: you get N operations per T seconds, and the system enforces that boundary without negotiation.

## Why It Matters

Every shared resource faces the same problem: demand can exceed supply. Without rate limiting, a single misconfigured client, a malicious actor, or a viral event can exhaust compute, bandwidth, or connection pools. The service collapses. Everyone loses.

Rate limiting is fairness made mechanical. It replaces the chaos of first-come-first-served with an explicit, predictable contract. It tells every client: here is your share, here is the window, and here is what happens when you exceed it. No ambiguity. No exceptions for "important" users unless the contract explicitly says so.

Beyond protection, rate limiting is an observable boundary. It surfaces capacity constraints. It forces system designers to declare what they can handle. A system without rate limits is a system that has not yet thought about its own limits. That is not robustness. That is hope.

## How It Works

Rate limiting operates on three variables: the **identifier**, the **budget**, and the **window**.

The identifier answers: who is being limited? It could be an IP address, an API key, a user ID, a session token, or a combination. The system must resolve the identifier deterministically on every request.

The budget answers: how many requests are allowed? This is a count. It could be 60 requests, 5,000 requests, or 1 request. The budget is fixed per window.

The window answers: in what time period? This is the reset interval. It could be one second, one minute, or one hour. When the window resets, the budget replenishes.
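The three variables can be made concrete with a small sketch. The names `Limit`, `resolve_identifier`, `limit_for`, the `X-API-Key` header, and the budget values are assumptions chosen for illustration, not part of any standard.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Limit:
    """A budget of `budget` requests per `window_s` seconds."""
    budget: int
    window_s: int


def resolve_identifier(headers: Mapping[str, str], remote_ip: str) -> str:
    """Return a stable key for the caller.

    Prefers an API key (hypothetical `X-API-Key` header, looked up
    case-sensitively here) and falls back to the remote IP address.
    """
    api_key = headers.get("X-API-Key")
    if api_key:
        return f"key:{api_key}"
    return f"ip:{remote_ip}"


# Hypothetical limits: anonymous callers get a smaller budget.
LIMITS = {
    "key": Limit(budget=5000, window_s=3600),
    "ip": Limit(budget=60, window_s=3600),
}


def limit_for(identifier: str) -> Limit:
    """Pick the limit from the identifier's prefix ("key" or "ip")."""
    return LIMITS[identifier.split(":", 1)[0]]
```

Prefixing the identifier with its kind keeps API keys and IP addresses from colliding in the same store, and lets the limit be chosen from the identifier alone.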

Here is the sequence for a typical token bucket implementation, a widely used and pedagogically clean model:

1. **Extract identifier** from the incoming request (API key, IP, token).
2. **Look up the bucket** for that identifier in a fast store (Redis, an in-memory map, a D1 row).
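3. **Check the current tokens** in the bucket. If at least one token is available, decrement by 1 and allow the request. Otherwise, reject the request with a 429 status.
4. **Replenish tokens** at a fixed rate, capped at the bucket's capacity. For example, a bucket with capacity 100 and a refill rate of 10 tokens per second starts full, drains as requests arrive, and refills continuously. In practice the refill is usually computed on each lookup from the time elapsed since the last request, before the check in step 3, rather than by a background timer.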
5. **Return headers** telling the client their remaining budget, the reset time, and the limit. This is not optional. It is part of the contract.
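The steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production limiter: the class and field names are assumptions, the refill is computed lazily on each call, and `X-RateLimit-Reset` is left out to keep the example short. Because tokens accumulate fractionally, the check requires at least one whole token.

```python
import math
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Bucket:
    tokens: float    # tokens currently available
    updated: float   # clock reading at the last refill


@dataclass
class TokenBucketLimiter:
    capacity: int                                  # maximum burst size
    refill_rate: float                             # tokens added per second
    clock: Callable[[], float] = time.monotonic    # injectable for testing
    buckets: Dict[str, Bucket] = field(default_factory=dict)

    def allow(self, identifier: str) -> Tuple[bool, Dict[str, str]]:
        now = self.clock()
        # Step 2: look up the bucket; a new identifier starts full.
        b = self.buckets.setdefault(identifier, Bucket(self.capacity, now))
        # Step 4, done lazily: credit tokens earned since the last call.
        elapsed = now - b.updated
        b.tokens = min(self.capacity, b.tokens + elapsed * self.refill_rate)
        b.updated = now
        # Step 3: spend one whole token if one is available.
        allowed = b.tokens >= 1
        if allowed:
            b.tokens -= 1
        # Step 5: tell the client where it stands.
        headers = {
            "X-RateLimit-Limit": str(self.capacity),
            "X-RateLimit-Remaining": str(int(b.tokens)),
        }
        if not allowed:
            # Seconds until one whole token has accumulated.
            wait = (1 - b.tokens) / self.refill_rate
            headers["Retry-After"] = str(math.ceil(wait))
        return allowed, headers
```

A caller would map `allowed == False` to a 429 response carrying the returned headers. In a multi-process deployment the bucket state would live in a shared store and the refill-check-decrement sequence would need to be atomic.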

Other algorithms exist. **Fixed window** divides time into discrete buckets (e.g., every hour) and counts requests per bucket. It is simple but vulnerable to bursts at window boundaries: a client can spend a full budget at the end of one window and another at the start of the next, sending up to twice the limit in a short span. **Sliding window** tracks the exact timestamps of recent requests and rejects if too many fall within the trailing window. It is accurate but more expensive to compute and store. **Leaky bucket** smooths traffic by allowing requests to exit at a fixed rate, enforcing uniform flow rather than burst-then-stop.
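The difference between fixed and sliding windows at a boundary is easy to see in code. The sketch below uses hypothetical class names and an explicit `now` argument so the timing is reproducible; the fixed-window counter never evicts old windows, which a real implementation would need to do.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Tuple


class FixedWindowLimiter:
    def __init__(self, limit: int, window_s: int) -> None:
        self.limit, self.window_s = limit, window_s
        self.counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, identifier: str, now: float) -> bool:
        window = int(now // self.window_s)   # index of the current window
        key = (identifier, window)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


class SlidingWindowLogLimiter:
    def __init__(self, limit: int, window_s: int) -> None:
        self.limit, self.window_s = limit, window_s
        self.log: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, identifier: str, now: float) -> bool:
        stamps = self.log[identifier]
        # Drop timestamps that have left the trailing window.
        while stamps and stamps[0] <= now - self.window_s:
            stamps.popleft()
        if len(stamps) >= self.limit:
            return False
        stamps.append(now)
        return True


def count_allowed(limiter, times) -> int:
    """Replay the request times and count how many are allowed."""
    return sum(limiter.allow("client", t) for t in times)


# Five requests just before the boundary at t=60 and five just after it.
times = [59.0 + i * 0.1 for i in range(5)] + [60.0 + i * 0.1 for i in range(5)]
fixed_allowed = count_allowed(FixedWindowLimiter(limit=5, window_s=60), times)
sliding_allowed = count_allowed(SlidingWindowLogLimiter(limit=5, window_s=60), times)
print(fixed_allowed, sliding_allowed)  # prints "10 5"
```

With a limit of 5 per 60 seconds, the fixed window lets all ten requests through because they straddle two windows, while the sliding log admits only the first five.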

Token bucket is the default choice for most APIs because it allows controlled bursts while enforcing a long-term average. It is the right balance between protection and usability.

## The Contract

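The 429 status code is defined in **RFC 6585**, which also allows the response to carry a `Retry-After` header. The `X-RateLimit-*` headers are not defined by any RFC; they are a widely adopted convention, and an IETF working group draft proposes standardized `RateLimit` fields. A well-behaved rate-limited system should return the following on every response: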

| Header | Meaning |
|--------|---------|
| `X-RateLimit-Limit` | The maximum number of requests allowed per window. |
| `X-RateLimit-Remaining` | The number of requests remaining in the current window. |
| `X-RateLimit-Reset` | The Unix timestamp when the current window resets. |
| `Retry-After` | When a 429 is returned, how long the client should wait before retrying, as a number of seconds or an HTTP date. |

When a client exceeds the limit, the server should respond with a 429 and these headers:

```
HTTP/1.1 429 Too Many Requests
Retry-After: 3600
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712345678
```
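On the server side, building that response is mostly bookkeeping. The following is a minimal sketch assuming a fixed-window limiter that already knows the limit, the remaining budget, and the reset time; the function name and signature are hypothetical.

```python
import math
from typing import Dict, Tuple


def rate_limit_response(limit: int, remaining: int, reset_epoch: int,
                        now: float, rejected: bool) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a request that was allowed or rejected."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(remaining, 0)),
        "X-RateLimit-Reset": str(reset_epoch),
    }
    if not rejected:
        return 200, headers
    # Retry-After accompanies only the 429: whole seconds until the reset.
    headers["Retry-After"] = str(max(0, math.ceil(reset_epoch - now)))
    return 429, headers


# Reproduces the example above: one hour before the window resets.
status, headers = rate_limit_response(
    limit=100, remaining=0, reset_epoch=1712345678,
    now=1712345678 - 3600, rejected=True,
)
```

Deriving `Retry-After` from the same reset time as `X-RateLimit-Reset` keeps the two headers from contradicting each other.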

The client is expected to read these headers and adapt. Good clients back off. Bad clients get banned. The contract is not a negotiation. It is a declaration of the server's boundary, and the client obeys or is disconnected.
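A client that adapts might compute its pause as follows. This is a sketch using only the Python standard library; the function name and the one-second fallback are assumptions, and header lookup is case-sensitive here, so a real client would normalize header names first.

```python
import time
from datetime import timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional

DEFAULT_WAIT_S = 1.0  # hypothetical fallback when the server gives no hint


def seconds_to_wait(headers: Mapping[str, str], now: Optional[float] = None) -> float:
    """Work out how long to pause after a 429, preferring Retry-After."""
    now = time.time() if now is None else now
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        return float(retry_after)                  # delay in seconds
    if retry_after:
        try:
            when = parsedate_to_datetime(retry_after)   # HTTP date form
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            return max(0.0, when.timestamp() - now)
        except (TypeError, ValueError):
            pass                                   # unparseable, fall through
    reset = headers.get("X-RateLimit-Reset", "").strip()
    if reset.isdigit():
        return max(0.0, int(reset) - now)          # Unix timestamp of the reset
    return DEFAULT_WAIT_S
```

Adding a small random jitter to the returned value before sleeping helps keep many clients from retrying at the same instant.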

The contract also has a social dimension. A rate limit should be documented before it is enforced. Changing a limit without notice is a breaking change. The limit is part of the API's public surface, not a hidden internal detail.

## Real Examples

**GitHub REST API** — Unauthenticated requests are limited to 60 per hour per IP. Authenticated requests with a personal access token are limited to 5,000 per hour. GitHub App installations start at 5,000 per hour and scale with repository and user count, up to 12,500 per hour; installations on GitHub Enterprise Cloud organizations get 15,000 per hour. GitHub returns `X-RateLimit-Limit`, `X-RateLimit-Remaining`, `X-RateLimit-Used`, `X-RateLimit-Reset`, and `X-RateLimit-Resource` on every response. Exceeding the primary limit returns 403 or 429 with `X-RateLimit-Remaining` at 0; responses for secondary limits may also carry a `Retry-After` header.

**Twitter (X) API v2** — The Essential tier allowed 100 requests per 15 minutes for most endpoints, and the Elevated tier allowed 300 per 15 minutes. These access tiers were later replaced by Free, Basic, Pro, and Enterprise plans with their own per-endpoint limits. Each endpoint has its own distinct bucket. The API returns `x-rate-limit-limit`, `x-rate-limit-remaining`, and `x-rate-limit-reset`.

**OpenAI API** — Rate limits are tiered by organization level. GPT-4 endpoints may allow 200 requests per minute for Tier 1, while DALL-E image generation may allow 5 images per minute. Limits are per-model and per-endpoint. The API returns headers including `x-ratelimit-limit-requests`, `x-ratelimit-remaining-requests`, and `x-ratelimit-reset-requests`.

**Cloudflare** — Rate limiting rules are part of Cloudflare's WAF and can count requests by IP, cookie, header, or JA3 fingerprint. It supports fixed window and sliding window. When triggered, a rule can block, challenge, or log. The threshold and window are configurable per rule.

**Redis as a rate limit store** — Redis `INCR` with `EXPIRE` is a common backend for fixed-window counters. Redis Lua scripts can atomically check-and-decrement for token bucket. Redis is a popular choice because it is fast, has atomic operations, and expires window keys automatically through TTLs.
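The `INCR` plus `EXPIRE` pattern can be sketched against any client exposing `incr(name)` and `expire(name, time)`, which is the shape of those two methods in redis-py. The `CounterStore` protocol, the key format, and the `FakeStore` stand-in are assumptions so the example runs without a Redis server. Note that issuing `INCR` and `EXPIRE` as two separate calls is not atomic; a Lua script or transaction would close that gap.

```python
from typing import Dict, Protocol


class CounterStore(Protocol):
    """The two operations this sketch needs from the backing store."""
    def incr(self, name: str) -> int: ...
    def expire(self, name: str, time: int) -> bool: ...


def allow_fixed_window(store: CounterStore, identifier: str,
                       limit: int, window_s: int, now: float) -> bool:
    """Count one request in the current window; True if within the limit."""
    window = int(now // window_s)
    key = f"ratelimit:{identifier}:{window}"   # hypothetical key format
    count = store.incr(key)                    # creates the key at 1 if absent
    if count == 1:
        # First hit in this window: let the store delete the key later.
        store.expire(key, window_s)
    return count <= limit


class FakeStore:
    """In-memory stand-in for demonstration. It ignores the TTL."""
    def __init__(self) -> None:
        self.data: Dict[str, int] = {}

    def incr(self, name: str) -> int:
        self.data[name] = self.data.get(name, 0) + 1
        return self.data[name]

    def expire(self, name: str, time: int) -> bool:
        return name in self.data


store = FakeStore()
results = [allow_fixed_window(store, "key:abc", 3, 60, now=10.0) for _ in range(4)]
```

Including the window index in the key means each window gets a fresh counter, and the TTL only has to clean up keys that are no longer read.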

## Common Mistakes

**Mistake 1: No rate limit at all.** Every public API without rate limits is a denial-of-service attack waiting to happen. It does not matter if you are small. A single `curl` loop in a shell script can overwhelm a naive endpoint.

**Mistake 2: Only rate limiting by IP.** IP-based limits are trivial to bypass. Residential proxies rotate IPs. NAT means multiple legitimate users share an IP. Rate limits must be tied to identity, not just network location.

**Mistake 3: Returning 403 instead of 429.** A 403 says "the server understood the request and refuses it," which gives the client no reason to retry. A 429 says "you are temporarily blocked, try again later." Clients treat these differently. Using 403 for rate limit exhaustion breaks retry logic.

**Mistake 4: Missing `Retry-After` on 429.** If the client does not know when to retry, it will guess. Guessing means retry storms, thundering herds, and cascading failures. RFC 6585 makes `Retry-After` optional on a 429, but a well-designed contract treats it as mandatory.

**Mistake 5: Not documenting the limits.** A rate limit that is not documented is a landmine. Developers discover it in production when their integration breaks. Document the limit, the window, the headers, and the error format in the API reference.

**Mistake 6: One global limit for all endpoints.** A search endpoint can cost orders of magnitude more than a metadata endpoint. They should not share the same bucket. GitHub and OpenAI both use per-endpoint or per-resource limits for this reason.

**Mistake 7: Counting requests but not counting cost.** A GraphQL query that returns 10,000 nested objects is not one request. It is one expensive request. Advanced rate limiting weights requests by computational cost, not just count.
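Mistakes 6 and 7 share a remedy: charge each request a cost instead of a flat count of one. The sketch below is a token bucket that spends a variable number of tokens. The `ENDPOINT_COST` table and the `graphql_cost` formula (one point per 100 nodes) are invented for illustration and are not taken from any specific API.

```python
from dataclasses import dataclass


@dataclass
class CostBucket:
    capacity: float      # maximum points that can be banked
    refill_rate: float   # points restored per second
    tokens: float        # points currently available
    updated: float       # time of the last refill

    def try_spend(self, cost: float, now: float) -> bool:
        """Refill lazily, then spend `cost` points if the bucket can afford it."""
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated = now
        if cost > self.tokens:
            return False
        self.tokens -= cost
        return True


# Hypothetical per-endpoint costs: search is far more expensive than metadata.
ENDPOINT_COST = {"/metadata": 1, "/search": 20}


def graphql_cost(node_count: int, nodes_per_point: int = 100) -> int:
    """Hypothetical cost model: one point per 100 requested nodes, minimum 1."""
    return max(1, -(-node_count // nodes_per_point))   # ceiling division


bucket = CostBucket(capacity=100, refill_rate=10, tokens=100, updated=0.0)
search_ok = bucket.try_spend(ENDPOINT_COST["/search"], now=0.0)   # leaves 80
big_query_now = bucket.try_spend(graphql_cost(10_000), now=0.0)   # costs 100
big_query_later = bucket.try_spend(graphql_cost(10_000), now=2.0) # refilled to 100
```

Under this model the 10,000-node query is rejected while the bucket holds 80 points and accepted two seconds later once the bucket has refilled, so expensive requests are throttled harder than cheap ones without a separate limiter per endpoint.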

## Connection to OIP

Rate limiting is not an incidental feature. It is a structural requirement of any open, deterministic, auditable system. The OIP philosophy demands that every interaction have a visible contract, that every boundary be explicit, and that every enforcement be inspectable.

Rate limiting embodies all three.

**Open:** The limit is public. The headers are public. The documentation is public. There are no hidden quotas or backroom deals. Every participant knows the rules before they play.

**Deterministic:** The same identifier, at the same time, with the same budget, produces the same result. The algorithm is specified. The headers follow a documented convention. There is no discretion, no favoritism, no "it depends on how the server feels."

**Auditable:** Every rate limit event can be logged. Every 429 can be recorded. The ledger of who was limited, when, and why, is a permanent record. It can be replayed. It can be audited. It can be disputed.

A system without rate limits cannot be audited because it has no enforced boundary. A system with hidden limits cannot be open because the contract is secret. Rate limiting, done correctly, is the intersection of operational necessity and architectural integrity. It is what makes a shared system possible.

## Connection to the Grain Philosophy

This protocol is part of the [Object Invocation Protocol](/a/philosophy), a living system of self-describing voxels that serves the Grain philosophy. The OIP is the interface. The philosophy is the core.


## Claims (0)


## Voxel graph (0 atoms · 0 edges)
- full graph: https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/voxels

## Article constitution

- full: https://miscsubjects.com/api/articles/constitution

## Source ledger (0)
- chain valid: yes · head: `genesis`

## Provenance (0 model passes)
- chain valid: yes · head: `genesis`


## Question graph
- questions: 0 · evidence ingests: 0

## LLM manifest — how to communicate with this ledger

- system map: https://miscsubjects.com/api/articles/system-map?format=markdown
- topology (ranked): https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/topology
- ingest: POST https://miscsubjects.com/api/protocol/ingest
- claim: POST https://miscsubjects.com/api/protocol/claim

### Quick actions for this article
- **Read live:** https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/topology
- **Ask (API):** POST https://miscsubjects.com/api/protocol/ask `{"slug":"oip-what-is-rate-limiting","question":"..."}`
- **Ingest your findings:** POST https://miscsubjects.com/api/protocol/ingest or text `ingest oip-what-is-rate-limiting|your evidence`
- **Post one claim:** POST https://miscsubjects.com/api/protocol/claim or text `claim oip-what-is-rate-limiting|tier|assertion`
- **iMessage ask:** `oip-what-is-rate-limiting|your question`
- **System map:** https://miscsubjects.com/api/articles/system-map?format=markdown


---

## §SELF — miscsubjects (paste without context)

**Principle:** Self-explaining payload — no external context required. This _self block describes what you are reading and where to look next.

**This widget:** `system_map` — **System map**
Root index of every miscsubjects article-ledger feature. Start here if you have zero context.
- **article slug:** `oip-what-is-rate-limiting`
- **contains:** body, claims, sources, voxels, provenance, question graph, constitution, llm_manifest
- **how to use:** Root index of every miscsubjects article-ledger feature. Start here if you have zero context.
- **read:** https://miscsubjects.com/api/articles/system-map

### Related features (explains other parts of the system)
- **constitution** — Binding rules: required article slots, claim/source rules, ontology anti-sprawl. · https://miscsubjects.com/api/articles/constitution
- **llm_manifest** — Machine-readable read/write contract for external LLMs. · https://miscsubjects.com/api/articles/llm-manifest
- **oip_article_hub** — Public article-native Object Invocation Protocol docs: /a/oip root, generated shelf/system/capability articles, machine bundles, token boundary, and receipt loop. · https://miscsubjects.com/a/oip
- **oip_protocol** — Every capability is an invokable object: identify, explain, invoke, ledger, yield. · https://miscsubjects.com/a/oip
- **bundle** — Paste-ready package: body + claims + sources + voxels + provenance + manifest + constitution. · https://miscsubjects.com/api/articles/oip-what-is-rate-limiting/bundle?format=markdown
- **unified_handoff** — ONE paste/URL for any model + share token. Same self-explaining pattern as article bundle, but whole build. · https://miscsubjects.com/api/handoff?format=markdown
