I built naps.sh in about a week. It's an AI-powered idea validation tool. You describe a business idea, an agent runs market research across six phases (including keyword analysis, Reddit pain discovery, competitor mapping, funding signals, and market sizing), and produces a scored report with a GO/NO-GO verdict.

The whole thing runs on Cloudflare Workers. SvelteKit frontend, D1 for relational data, R2 for file storage, Queues for background jobs. And one Durable Object class that became the center of everything.

This was my first time using Durable Objects.

Why Durable Objects

Midway through day two, I had a working AI chat using the Vercel AI SDK's streamText. Standard request-response. It worked. But I kept thinking about what happens when the user closes the tab mid-validation. Or opens a second tab. Or comes back an hour later. The validation runs take a few minutes. That's a long time to keep a stateless HTTP connection alive.

I'd been curious about Durable Objects for a while. Never had a clear use case. This felt like one.

Each validation session maps to one DO instance. The DO owns the conversation state, streams responses over WebSocket, persists messages to its SQLite storage, and handles multi-tab coordination. One actor per session, isolated, stateful, long-lived.

The mental model clicked fast. A Durable Object is the session. Not a service that manages sessions. The session itself.
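To make that mapping concrete, here's a sketch of the per-session routing in the worker. The interfaces are simplified stand-ins for the real Cloudflare binding types, and `routeToSession` is my name for illustration, not an actual export:

```typescript
// Simplified stand-ins for Cloudflare's DurableObjectNamespace/stub types.
interface SessionStub {
  fetch(req: Request): Promise<Response>;
}

interface SessionNamespace {
  idFromName(name: string): { toString(): string };
  get(id: { toString(): string }): SessionStub;
}

// idFromName is deterministic: the same sessionId always yields the same ID,
// so every tab for a session lands on the same instance. The DO *is* the session.
export function routeToSession(
  ns: SessionNamespace,
  sessionId: string,
  req: Request,
): Promise<Response> {
  const id = ns.idFromName(sessionId);
  return ns.get(id).fetch(req);
}
```

In the real worker, `ns` would be the Durable Object binding from `env`, and the stub call crosses into the instance's single-threaded context.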

What worked

Per-session isolation is the thing I keep coming back to. Every validation gets its own little world. Its own SQLite database, its own WebSocket connections. No shared state between different users' validations. When I needed to add multi-tab support, the DO already knew about all its connected sockets, so it was just... natural.

WebSocket Hibernation is nice too. The DO goes to sleep when no client is connected and wakes up when one reconnects. I'm not paying for idle compute while a user reads their report.

And SQLite in the DO is great. I have three tables (messages, metadata, and a stream buffer), reads and writes are fast, and the data lives right next to the compute that needs it. No network hop to a database.
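The three tables might look something like this; the column choices beyond the table names are my guesses for illustration, not the actual schema:

```typescript
// DDL for the three tables described above (columns are illustrative).
export const SCHEMA = [
  `CREATE TABLE IF NOT EXISTS messages (
     id INTEGER PRIMARY KEY AUTOINCREMENT,
     role TEXT NOT NULL,
     content TEXT NOT NULL,
     created_at INTEGER NOT NULL
   )`,
  `CREATE TABLE IF NOT EXISTS metadata (
     key TEXT PRIMARY KEY,
     value TEXT
   )`,
  // Append-only buffer so an interrupted stream can be replayed after an
  // eviction; cleared once the partial text is folded into messages.
  `CREATE TABLE IF NOT EXISTS stream_buffer (
     seq INTEGER PRIMARY KEY AUTOINCREMENT,
     chunk TEXT NOT NULL
   )`,
];

// In the DO this would be ctx.storage.sql.exec(stmt) for each statement;
// the exec callback here keeps the sketch runnable anywhere.
export function initSchema(exec: (sql: string) => void) {
  for (const stmt of SCHEMA) exec(stmt);
}
```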

What didn't

The dev experience is rough

There's no hot reload for Durable Objects with SvelteKit. Every change means re-running the full build pipeline. I ended up using concurrently to run vite build --watch alongside wrangler dev, but intermediate build states cause failures: the DurableObject import from cloudflare:workers breaks during partial rebuilds.

Not terrible. But compared to the instant feedback loop you get with regular SvelteKit development, it's a step down that you feel every time.
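For reference, the setup described above looks roughly like this in package.json; the script names and exact flags are my reconstruction, not the project's actual config:

```json
{
  "scripts": {
    "dev:build": "vite build --watch",
    "dev:wrangler": "wrangler dev",
    "dev": "concurrently -k \"npm:dev:build\" \"npm:dev:wrangler\""
  }
}
```

The -k flag kills the sibling process when either one exits, which keeps the two watchers from drifting out of sync after a crash.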

SvelteKit doesn't export DO classes

This was the first real friction. SvelteKit's Cloudflare adapter produces a worker entry point, but it doesn't know about your Durable Object classes. They don't get exported. Wrangler can't find them. Deploy fails.

This is a known issue, open since June 2021. Almost five years, 35+ comments, no official fix. A community PR tried to solve it in 2023 but was never merged. The most promising path is @cloudflare/vite-plugin (GA since April 2025), which gives full control over the worker entry point, but SvelteKit hasn't integrated with it yet.

I had to write a patch-worker.ts script that runs after each build, injecting the DO class exports into the generated worker entry. It works, but it's a build step I wish I didn't need. Every workaround in that issue thread is some variation of the same thing.
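The core of such a patch step can be sketched as a string transform over the generated entry. This is illustrative, not the actual patch-worker.ts; the class names and paths are placeholders:

```typescript
// Append re-exports of the DO classes to the adapter's generated worker
// entry so wrangler can find them. doClasses maps class name -> module path.
export function patchWorkerEntry(
  entrySource: string,
  doClasses: Record<string, string>,
): string {
  const exports = Object.entries(doClasses)
    .map(([cls, path]) => `export { ${cls} } from ${JSON.stringify(path)};`)
    .join("\n");
  // Idempotent: a re-run over an already-patched file changes nothing.
  if (entrySource.includes(exports)) return entrySource;
  return `${entrySource}\n\n// injected by patch-worker.ts\n${exports}\n`;
}
```

In practice this runs after every build, which is exactly why the watch-mode rebuild failures above sting: the patch has to be reapplied each time the entry is regenerated.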

Preview deployments don't work with DOs

Found this when trying to test a deployment on a preview URL. Durable Objects aren't available in preview environments. You have to deploy to production to test them.

So every time you want to verify DO behavior in a deployed environment, you're pushing to prod. For a platform that's supposed to help you move fast, this is annoying.

The split storage problem

This one crept up on me. The DO stores conversation messages in its SQLite. But session metadata (title, status, user ownership, credits) lives in D1, because that data needs to be queryable across sessions for list pages, admin views, auth checks.

Two databases. Two sources of truth for what is essentially one entity.

It works, but the seams show. When the validation finishes, I need to update status in both the DO and D1. When the list page loads, I can't ask the DOs for their state. I need D1 to have it already.

The fan-out trap

This is the split storage problem taken to its logical conclusion. I built a /validations page that shows all your past sessions with their current status. My first instinct was to fetch each DO to get its state.

Twenty sessions means forty DO stub calls (status + report status) blocking server-side rendering. That's not going to work.

The solution is obvious in hindsight: push state changes to D1 so the list page queries one table. But it means you're constantly syncing state from the DO to D1, and you have to be disciplined about it. Every status transition in the DO needs a corresponding D1 write.

You can't query across DOs. That's by design. If your UI needs to show state from many instances at once, you need a denormalized read store somewhere else. I wish I'd thought about this on day one instead of day five.
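The discipline described above amounts to a write-through helper: every status transition writes locally first, then fans out to the shared read store. A sketch with stand-in store interfaces (the names and shapes are mine):

```typescript
// Stand-in for an async key-value write; in reality these would be the
// DO's SQLite metadata table and a D1 UPDATE, respectively.
interface KvStore {
  set(key: string, value: string): Promise<void>;
}

export type Status = "pending" | "running" | "complete" | "failed";

export async function setStatus(
  sessionId: string,
  status: Status,
  doMeta: KvStore,    // the DO's own metadata table: source of truth
  readStore: KvStore, // denormalized row in D1, keyed by session
): Promise<void> {
  // Local write first: the live session always trusts its own storage.
  await doMeta.set("status", status);
  // Then sync to D1 so list pages never have to fan out to the DOs.
  await readStore.set(sessionId, status);
}
```

Funneling every transition through one helper like this is the only way I found to stay disciplined about the dual write.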

The stream buffer pattern

Cloudflare can redeploy your worker at any time. When that happens, all in-memory state in your DO is gone. The SQLite survives, but your local variables (the streaming response, the accumulated chunks, the abort controller) vanish.

For a validation that takes several minutes, this is a real problem. A user could be watching their report stream in and suddenly... nothing.

I built a layered recovery for this:

  1. While streaming, every chunk gets accumulated in memory (for fast replay to reconnecting tabs) and appended to a stream_buffer table in SQLite. A stream_active flag is set in metadata before streaming starts.

  2. If a client reconnects and finds stream_active is true but there's no in-memory state (meaning a deploy happened), the DO replays all buffered chunks from SQLite and appends a message: "Research was interrupted by a server update."

  3. When a new message arrives, the DO checks for an interrupted stream first, extracts any partial text from the buffer, saves it to the messages table, and clears the buffer.

The validation doesn't resume; it just recovers gracefully. The user gets everything that was generated before the deploy and can continue from where it stopped. I'm happy with how this turned out.
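The recovery in steps 2 and 3 can be sketched as one function over the buffered state. The field names here are stand-ins for the real tables, not the actual schema:

```typescript
// In-memory stand-in for the DO's SQLite state after a deploy wiped
// everything that lived in local variables.
interface StreamState {
  streamActive: boolean; // metadata flag set before streaming starts
  buffer: string[];      // rows of stream_buffer, in insertion order
  messages: string[];    // messages table (content only, for the sketch)
}

// Called on reconnect or on the next incoming message. Returns the
// recovered partial text, or null if there was nothing to recover.
export function recoverInterruptedStream(state: StreamState): string | null {
  if (!state.streamActive) return null;
  // The flag is set but in-memory state is gone: a deploy interrupted us.
  // Fold the buffered chunks into the messages table and reset.
  const partial = state.buffer.join("");
  if (partial.length > 0) state.messages.push(partial);
  state.buffer = [];
  state.streamActive = false;
  return partial;
}
```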

The existential question

Around day five, I stopped and asked myself: is Durable Objects the right abstraction for what I'm building?

The answer was yes. Per-session AI agents with WebSocket streaming and persistent state, that's the textbook DO use case. I wouldn't choose anything else for this specific problem.

But the DO class grew into a 400+ line god object handling WebSockets, AI streaming, SQL operations, session lifecycle, multi-tab coordination, and report triggering. I should have been more intentional about the boundaries earlier.

And I should have thought harder upfront about what data lives in the DO versus what lives in a shared database. The split storage problem doesn't go away with better code, only with better planning.

If you're starting with DOs

The one thing I'd say: figure out your read patterns before you write any code. The moment you need a list view or an aggregate query across instances, you need a separate data store. This is the thing that will bite you if you don't plan for it.

Build for deploy interruptions from day one. Your DO will be evicted. It might happen mid-operation. If you're doing anything long-running, you need a recovery strategy.

And accept that the developer experience is behind what you're used to with modern frameworks. Budget time for build pipeline work. I spent more time on patch-worker.ts and dev server configuration than I'd like to admit.

I'd use Durable Objects again. The constraints are real but so is the model. Once it clicks, it clicks.