Killing KV in the hot path

How I cut Cloudflare KV from 6-9 operations per request to zero, and what I replaced it with.


TrackerSync is a free, serverless Fitbit → Garmin migration tool I run on Cloudflare’s free tier. In February 2026, every normal API request was spending 6 to 9 KV operations before doing any actual work. At even modest traffic that meant burning through the free-tier KV ceiling in hours. This is the story of moving rate-limiting to a Durable Object, metadata to D1, and shrinking KV’s role to “graceful degradation only.” Result: zero KV ops on the happy path.

How I noticed

The first warning was a 429 from my own site, on my own laptop, after I’d run two test conversions in a row. That shouldn’t be possible — the rate limits are per client, and I’m exactly one client. I opened the Cloudflare dashboard expecting a misconfigured threshold. The KV ops counter is what made me close my laptop and go for a walk.

At the traffic TrackerSync was getting (a few hundred conversions a day), the KV daily-operations chart was already at 80% of the free-tier ceiling and trending up. I’d been telling myself the chart looked like that because I was testing. The chart did not care what I told myself.

The shape of the problem

TrackerSync’s hot path looks deceptively simple:

client → /api/upload → /api/validate → /api/convert → /api/download

Underneath, each of those endpoints was hitting Cloudflare KV multiple times before it even touched the user’s file:

Endpoint           KV ops/request (before)   KV ops/request (after)
/api/upload        6 – 9                     0
/api/validate      6 – 9                     0
/api/convert       7 – 10                    0
/api/download/*    6 – 9                     0
/api/usage/*       3                         0

The 6-9 wasn’t one careless KV call repeated; it was four legitimate concerns layered on top of each other:

  1. Suspicious-client check in global middleware. KV read.
  2. Blocked-client check. KV read.
  3. Multi-tier rate-limiter. Up to 3 KV reads (per-IP, per-fingerprint, per-cookie).
  4. Health probe churn. A background “is KV alive?” probe that wrote, read, and deleted a sentinel key — running often enough to dominate the budget at low traffic.

Each one made sense on its own. Stacked, they were a free-tier killer.

Why KV was the wrong tool

KV is wonderful for what it’s designed for: globally-replicated, eventually-consistent reads. The cache-style reads are practically free in latency terms. But “rate limit” and “did this IP misbehave?” have two requirements KV doesn’t meet well: atomic counter updates and read-your-writes consistency.

Once you say it out loud — “I need atomic counters with read-your-writes consistency” — the answer is obvious: Durable Objects, with D1 as the durable record. The reason I hadn’t said it out loud six months earlier is the reason I hadn’t said it out loud six months earlier. KV had been the thing I reached for when I was building fast, and it kept working in the way KV always keeps working: just well enough that the bill comes later.

The new architecture

[Diagram] Before: every request touches KV 6-9 times. Nodes: Client; Worker (middleware + handler); KV: suspicious; KV: blocked; KV: ratelimit ×3; KV: health probe; Job.

[Diagram] After: D1 + DO on the hot path, KV only for degraded mode. Nodes: Client; Worker (slim middleware); DO: AtomicLimiter; D1: metadata + counters; KV: fallback only; Job.

Three swaps did the work:

1. Rate-limiting moves into a Durable Object

AtomicRateLimiter is a Durable Object keyed by client signal (fingerprint + IP-hash + cookie blend). Inside the DO, increments are serialized — no race conditions, no double-spends. The DO’s storage API is read-after-write consistent by construction, so a newly-blocked client stays blocked on the next request even if it lands in a different isolate.

The middleware became three lines:

const limiter = env.RATE_LIMIT_DO.idFromName(signal);
const stub = env.RATE_LIMIT_DO.get(limiter);
const decision = await stub.fetch(request);
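For readers who want to picture the other side of that stub.fetch call, here is a minimal sketch of what the counter logic inside such a Durable Object could look like. It is not TrackerSync's actual code: the fixed-window policy, the default limit of 60 per minute, and every name in it (WindowRecord, Decision, decide, MinimalStorage, LimiterSketch) are assumptions made for illustration. MinimalStorage mirrors only the get/put subset of the Durable Object storage API so the sketch type-checks on its own.

```typescript
// Illustrative sketch only: all names here are hypothetical, and the
// fixed-window policy is an assumption, not the article's confirmed design.

interface WindowRecord {
  windowStart: number; // ms timestamp when the current window opened
  count: number;       // requests admitted in this window
}

interface Decision {
  allowed: boolean;
  remaining: number;
  record: WindowRecord; // state to persist for the next request
}

// Pure decision step: given the stored window and the current time, admit or reject.
function decide(
  prev: WindowRecord | undefined,
  now: number,
  limit: number,
  windowMs: number,
): Decision {
  let base: WindowRecord;
  if (prev === undefined || now - prev.windowStart >= windowMs) {
    base = { windowStart: now, count: 0 }; // no window yet, or the old one expired
  } else {
    base = prev;
  }
  if (base.count >= limit) {
    return { allowed: false, remaining: 0, record: base };
  }
  const record: WindowRecord = { windowStart: base.windowStart, count: base.count + 1 };
  return { allowed: true, remaining: limit - record.count, record };
}

// The subset of the Durable Object storage API this sketch relies on.
interface MinimalStorage {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}

class LimiterSketch {
  private storage: MinimalStorage;

  constructor(storage: MinimalStorage) {
    this.storage = storage;
  }

  // Read, decide, write. The claim that these steps are not interleaved with
  // other requests to the same object rests on Durable Object semantics,
  // not on anything in this class.
  async check(now: number, limit = 60, windowMs = 60_000): Promise<Decision> {
    const prev = await this.storage.get<WindowRecord>("window");
    const decision = decide(prev, now, limit, windowMs);
    await this.storage.put("window", decision.record);
    return decision;
  }
}
```

Keeping decide pure means the admit/reject rule can be unit-tested without a Workers runtime; only the thin check wrapper touches storage.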

2. Transient metadata moves from KV to D1

Uploads and conversions both leave a metadata breadcrumb that downstream endpoints need to read. These had been KV entries with a TTL. They’re now D1 rows:

CREATE TABLE uploads (
  upload_id   TEXT PRIMARY KEY,
  metadata    TEXT,
  created_at  INTEGER,
  expires_at  INTEGER
);
CREATE INDEX idx_uploads_expires_at ON uploads(expires_at);

Expired rows are treated as missing on read and lazily deleted during maintenance. The TTLs (3600s for uploads, 7200s for conversions) didn’t change; only the backing store did.
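A sketch of how "expired means missing" might look against the uploads table above. The helper names (liveMetadata, putUpload, getUpload, purgeExpired) are hypothetical, D1Like is a locally declared subset of the D1 binding (prepare, bind, first, run), and storing created_at and expires_at as Unix seconds is an assumption; the schema only says INTEGER.

```typescript
// Illustrative sketch only: helper names are hypothetical, and timestamps are
// assumed to be Unix seconds.

// Local subset of the D1 binding so the sketch type-checks standalone.
interface D1Like {
  prepare(sql: string): {
    bind(...values: unknown[]): {
      first<T>(): Promise<T | null>;
      run(): Promise<unknown>;
    };
  };
}

const UPLOAD_TTL_SECONDS = 3600; // upload TTL quoted in the article

interface UploadRow {
  metadata: string;
  expires_at: number;
}

// A row that is absent or past its expiry is reported as missing.
function liveMetadata(row: UploadRow | null, nowSec: number): string | null {
  if (row === null || row.expires_at <= nowSec) return null;
  return row.metadata;
}

// Write the breadcrumb with its expiry computed up front.
async function putUpload(db: D1Like, uploadId: string, metadata: string, nowSec: number): Promise<void> {
  await db
    .prepare("INSERT INTO uploads (upload_id, metadata, created_at, expires_at) VALUES (?, ?, ?, ?)")
    .bind(uploadId, metadata, nowSec, nowSec + UPLOAD_TTL_SECONDS)
    .run();
}

// Primary-key lookup; expiry is enforced on read, not by the database.
async function getUpload(db: D1Like, uploadId: string, nowSec: number): Promise<string | null> {
  const row = await db
    .prepare("SELECT metadata, expires_at FROM uploads WHERE upload_id = ?")
    .bind(uploadId)
    .first<UploadRow>();
  return liveMetadata(row, nowSec);
}

// Lazy cleanup for a maintenance pass; uses idx_uploads_expires_at.
async function purgeExpired(db: D1Like, nowSec: number): Promise<void> {
  await db.prepare("DELETE FROM uploads WHERE expires_at <= ?").bind(nowSec).run();
}
```

The read path never deletes, so a request costs one indexed lookup; the range delete is left to whatever maintenance job exists.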

3. Health probes leave the hot path

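The “is KV alive?” check used to run on every request as a write-read-delete cycle on a sentinel key. That was three KV ops per request just to know whether KV was sad. In the new arrangement the probe runs off the request path, so the only KV traffic left is the fallback path and status pings.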

The shape of the result

The first 24 hours after deployment told a clear story:

KV ops/day
┌────────────────────────────────────────────────────┐
│ before  ████████████████████████████████ ~310,000 │
│ after   ▏           <500 (fallback + status pings) │
└────────────────────────────────────────────────────┘
                                       (~99.8% reduction)

D1 query volume went up, but D1 row reads on the free tier have a much larger budget than KV ops, and most reads hit the uploads / conversions indexes. The new bottleneck — if there is one — is D1 row-reads, which I now have a clean signal on.

The DO didn’t blow up. That was the part I was most nervous about: Durable Objects serialize requests by key, so a misconfigured signal blend could turn a single noisy client into a global queue. The blend (fingerprint_hash ⊕ ip_subnet) keeps cardinality high enough that no single DO instance carries a meaningful slice of traffic. I had a half-written rollback PR open in another tab for the first 48 hours. I closed it without reading it again.
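To make the cardinality point concrete, here is one way a key like that could be derived. This is a guess at the shape, not the real blend: it reads ⊕ loosely as "combined with", handles IPv4 only, assumes a /24 subnet, and the function names ipv4Subnet24 and signalKey are hypothetical.

```typescript
// Illustrative sketch only: the /24 granularity, the ":" separator and both
// function names are assumptions.

// Collapse an IPv4 address to its /24 subnet, e.g. 203.0.113.57 -> 203.0.113.0/24.
function ipv4Subnet24(ip: string): string {
  const parts = ip.split(".");
  if (parts.length !== 4) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  return `${parts[0]}.${parts[1]}.${parts[2]}.0/24`;
}

// Combine the fingerprint hash with the subnet. Two clients behind the same
// subnet still get different keys as long as their fingerprints differ.
function signalKey(fingerprintHash: string, ip: string): string {
  return `${fingerprintHash}:${ipv4Subnet24(ip)}`;
}
```

A key built this way would be passed to idFromName, so each distinct fingerprint and subnet pair gets its own Durable Object instance and its own serialized queue.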

What this cost me

Three things to be honest about.

Migration risk. Moving the rate-limit decision to a new storage backend is the kind of change that, done badly, lets every request through for fifteen minutes. I shipped it behind a flag with KV as a fallback for the first week, watched the metrics, and only removed the flag once D1 + DO had carried a full traffic cycle without complaint.

A new failure mode. The DO is now on the critical path. If a Durable Object namespace has trouble, every request slows down or fails. The intelligent-fallback path covers it, but the latency of falling back is a real thing I now have to alert on.
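A fallback of this kind can be sketched as a race between the primary call and a timer. The helper below is hypothetical (withFallback is not a name from TrackerSync), and the onFallback callback stands in for whatever metric or alert hook is actually wired up.

```typescript
// Illustrative sketch only: withFallback and onFallback are hypothetical names.

// Resolve with `primary` unless it rejects or takes longer than `timeoutMs`;
// in either case report the reason and resolve with `fallback` instead.
async function withFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
  timeoutMs: number,
  onFallback: (reason: string) => void = () => {},
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("timeout")), timeoutMs);
  });
  try {
    return await Promise.race([primary(), timeout]);
  } catch (err) {
    // This is the event worth counting and alerting on.
    onFallback(err instanceof Error ? err.message : String(err));
    return await fallback();
  } finally {
    clearTimeout(timer); // do not leave the timer pending after a fast primary
  }
}
```

In the worker, primary would be the Durable Object call and fallback the degraded KV path; the timeout value is the knob that trades added latency against how often the fallback fires.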

Single-region writes. D1 is single-region (closest replica reads, single writer). At low scale this is fine; at higher scale I’ll need to think about read replicas or sharding. That’s a future problem, not today’s.

The takeaway

KV is a great primitive for the workloads it was designed for: globally-cached configuration, session-shaped reads, read-heavy lookups where eventual consistency is fine. The mistake I made — and I think it’s a common one — was reaching for KV because it was the first key-value store at hand, not because it fit.

Rate limiting wants atomicity. Metadata wants transactional reads with a TTL. Both have better homes on Cloudflare’s stack than KV. The hot path is cleaner, the free-tier headroom is back, and — quieter benefit — the worker is now easier to reason about, because each storage layer has one job.


Project: TrackerSync — free Fitbit → Garmin migration on Cloudflare’s free tier.