The shape
When it’s the right fit
- Multi-step commerce flows: carts, checkouts, orders, shipments
- Long-lived agent interactions: an agent’s task progresses through `assigned → accepted → working → delivered → rated`
- Regulated flows: every state change is also a durable audit event
- Mandate lifecycles: `active → paused → revoked`
- Anything where “what state are we in?” matters to correctness
Define states explicitly
- Invalid transitions (`cart → shipped`) throw immediately
- Visualizable: xstate has a visualizer that renders the diagram
- Serializable: store `state` as a column; restore by loading it
- Exhaustive: you can’t forget to handle a state
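A minimal sketch of an explicit state table in plain TypeScript, as a hand-rolled stand-in for an xstate machine. Only `cart`, `awaiting_payment`, `fulfilling`, and `shipped` appear on this page; the remaining states, the event names, and `applyOrderEvent` are assumptions for illustration.

```typescript
// Assumed order states; adjust to your own flow.
type OrderState =
  | "cart"
  | "awaiting_payment"
  | "paid"
  | "fulfilling"
  | "shipped"
  | "delivered"
  | "cancelled";

// Assumed event names.
type OrderEvent =
  | "CHECKOUT"
  | "PAYMENT_SUCCEEDED"
  | "START_FULFILLMENT"
  | "SHIP"
  | "DELIVER"
  | "CANCEL";

// Record<OrderState, ...> requires an entry for every state, so adding a
// state without listing its transitions fails to compile.
const ORDER_TRANSITIONS: Record<OrderState, Partial<Record<OrderEvent, OrderState>>> = {
  cart: { CHECKOUT: "awaiting_payment", CANCEL: "cancelled" },
  awaiting_payment: { PAYMENT_SUCCEEDED: "paid", CANCEL: "cancelled" },
  paid: { START_FULFILLMENT: "fulfilling", CANCEL: "cancelled" },
  fulfilling: { SHIP: "shipped" },
  shipped: { DELIVER: "delivered" },
  delivered: {},
  cancelled: {},
};

// Returns the next state, or throws if the event is not allowed in the current state.
function applyOrderEvent(current: OrderState, event: OrderEvent): OrderState {
  const next = ORDER_TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${current} + ${event}`);
  }
  return next;
}
```

Because `OrderState` is a plain string union, the current state can be written to a `state` column as-is and passed back into `applyOrderEvent` after loading.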
Wire webhooks to transitions
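One way to wire this up, sketched under assumptions: a lookup table translates each webhook type into an FSM event, and the handler applies that event to the order's current state. The webhook type names, the payload shape, and the in-memory `orderStates` object (standing in for your database) are all hypothetical.

```typescript
type OrderState = "awaiting_payment" | "paid" | "fulfilling" | "shipped" | "delivered";
type OrderEvent = "PAYMENT_SUCCEEDED" | "START_FULFILLMENT" | "SHIP" | "DELIVER";

const TRANSITIONS: Record<OrderState, Partial<Record<OrderEvent, OrderState>>> = {
  awaiting_payment: { PAYMENT_SUCCEEDED: "paid" },
  paid: { START_FULFILLMENT: "fulfilling" },
  fulfilling: { SHIP: "shipped" },
  shipped: { DELIVER: "delivered" },
  delivered: {},
};

// Hypothetical webhook type names; substitute your provider's.
const WEBHOOK_TO_EVENT: Record<string, OrderEvent | undefined> = {
  "payment.succeeded": "PAYMENT_SUCCEEDED",
  "fulfillment.started": "START_FULFILLMENT",
  "shipment.created": "SHIP",
  "shipment.delivered": "DELIVER",
};

interface WebhookPayload {
  type: string;
  orderId: string;
}

// In-memory stand-in for the orders table.
const orderStates: Record<string, OrderState | undefined> = {
  ord_1: "awaiting_payment",
};

// Maps the webhook to an FSM event and applies it. Unknown webhook types are
// ignored; a known type that is invalid for the current state throws.
function handleWebhook(payload: WebhookPayload): OrderState | "ignored" {
  const fsmEvent = WEBHOOK_TO_EVENT[payload.type];
  if (fsmEvent === undefined) return "ignored";

  const current = orderStates[payload.orderId];
  if (current === undefined) throw new Error(`Unknown order: ${payload.orderId}`);

  const next = TRANSITIONS[current][fsmEvent];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${current} + ${fsmEvent}`);
  }
  orderStates[payload.orderId] = next;
  return next;
}
```

Keeping the webhook-to-event mapping in its own table means the handler never sets a state directly; it can only request a transition that the table allows.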
Audit trail for free
Every transition is a row in `order_events`. That’s your audit log. For regulated flows, this becomes the compliance record.
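A sketch of what that could look like, with an in-memory array standing in for the `order_events` table. The field names (`orderId`, `fromState`, `toState`, `event`, `occurredAt`) and the state and event names are assumptions, not a prescribed schema.

```typescript
type OrderState = "cart" | "awaiting_payment" | "paid";
type OrderEvent = "CHECKOUT" | "PAYMENT_SUCCEEDED";

const TRANSITIONS: Record<OrderState, Partial<Record<OrderEvent, OrderState>>> = {
  cart: { CHECKOUT: "awaiting_payment" },
  awaiting_payment: { PAYMENT_SUCCEEDED: "paid" },
  paid: {},
};

// Assumed shape of one order_events row.
interface OrderEventRow {
  orderId: string;
  fromState: OrderState;
  toState: OrderState;
  event: OrderEvent;
  occurredAt: string; // ISO 8601 timestamp
}

// In-memory stand-in for the order_events table (append-only).
const orderEvents: OrderEventRow[] = [];

// Applies the event and appends one audit row per successful transition.
// Rejected transitions throw before anything is recorded.
function transitionWithAudit(
  orderId: string,
  current: OrderState,
  event: OrderEvent,
  now: Date = new Date()
): OrderState {
  const next = TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${current} + ${event}`);
  }
  orderEvents.push({
    orderId,
    fromState: current,
    toState: next,
    event,
    occurredAt: now.toISOString(),
  });
  return next;
}
```

With a real database, the state update and the event insert would normally share one transaction so the log cannot drift from the stored state.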
Side effects as enqueued jobs
Don’t execute side effects inline; enqueue them instead. The state transition itself is the durable signal:
- Side effects run asynchronously; state transition is durable
- Each side effect has its own retry / DLQ
- New side effects don’t require changing the transition logic
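A sketch of the enqueue-on-transition idea, with an array standing in for a real queue. The job names, the `SIDE_EFFECTS` table, and `enqueueSideEffects` are hypothetical; the point is that adding a side effect means adding a table entry, not touching the transition logic.

```typescript
type OrderState = "awaiting_payment" | "paid" | "shipped";

interface QueuedJob {
  kind: string;
  orderId: string;
}

// Stand-in for a real queue client; retries and dead-lettering would live there.
const jobQueue: QueuedJob[] = [];

// Jobs to enqueue when an order enters a given state (assumed job names).
const SIDE_EFFECTS: Partial<Record<OrderState, string[]>> = {
  paid: ["send_receipt", "reserve_inventory"],
  shipped: ["send_tracking_email"],
};

// Call after the new state has been durably written.
// Returns the number of jobs enqueued.
function enqueueSideEffects(orderId: string, enteredState: OrderState): number {
  const kinds = SIDE_EFFECTS[enteredState] ?? [];
  for (const kind of kinds) {
    jobQueue.push({ kind, orderId });
  }
  return kinds.length;
}
```

Each job is independent, so a failing `send_receipt` can retry without blocking `reserve_inventory` or rolling back the state change.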
Compose with agent state
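The agent task lifecycle listed earlier (`assigned → accepted → working → delivered → rated`) fits the same table-driven shape. A minimal sketch; the event names and helper functions are assumptions.

```typescript
type AgentTaskState = "assigned" | "accepted" | "working" | "delivered" | "rated";
type AgentTaskEvent = "ACCEPT" | "START_WORK" | "DELIVER" | "RATE";

const AGENT_TASK_TRANSITIONS: Record<
  AgentTaskState,
  Partial<Record<AgentTaskEvent, AgentTaskState>>
> = {
  assigned: { ACCEPT: "accepted" },
  accepted: { START_WORK: "working" },
  working: { DELIVER: "delivered" },
  delivered: { RATE: "rated" },
  rated: {},
};

// Returns the next task state, or throws if the event is not allowed.
function applyAgentTaskEvent(current: AgentTaskState, event: AgentTaskEvent): AgentTaskState {
  const next = AGENT_TASK_TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`Invalid agent task transition: ${current} + ${event}`);
  }
  return next;
}

// Applies a sequence of events in order, starting from the given state.
function runAgentTask(start: AgentTaskState, events: AgentTaskEvent[]): AgentTaskState {
  let state = start;
  for (const event of events) {
    state = applyAgentTaskEvent(state, event);
  }
  return state;
}
```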
If you’re building agents, every agent is itself an FSM, so an agent’s task state can be modeled the same way as an order’s.
Handling concurrency
Two webhooks for the same order can arrive simultaneously (common after a burst). Use optimistic locking: a `version` column on your table, checked and incremented on every update, ensures that a stale writer’s update is rejected instead of overwriting a concurrent transition.
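The version check can be sketched as a compare-and-set. This in-memory version only illustrates the logic; in production the guarantee comes from the database applying the conditional `UPDATE` atomically. The `orders` table name, the row shape, and `compareAndSetState` are assumptions.

```typescript
interface OrderRow {
  id: string;
  state: string;
  version: number;
}

// In-memory stand-in for the orders table.
const ordersTable: Record<string, OrderRow | undefined> = {
  ord_1: { id: "ord_1", state: "awaiting_payment", version: 1 },
};

// Mirrors a conditional update such as:
//   UPDATE orders SET state = :next, version = version + 1
//   WHERE id = :id AND version = :expected
// Returns true if the row was updated, false if the version had already moved on.
function compareAndSetState(id: string, expectedVersion: number, nextState: string): boolean {
  const row = ordersTable[id];
  if (row === undefined || row.version !== expectedVersion) {
    return false;
  }
  ordersTable[id] = { id, state: nextState, version: row.version + 1 };
  return true;
}
```

When two handlers both read version 1, only the first write succeeds. The loser should reload the row and re-check whether its transition is still valid from the new state before retrying.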
Testing
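A sketch of an exhaustive transition check: loop over every (state, event) pair and compare the machine's behavior against an expectation table written independently of the machine. The states, events, and helper names are assumptions; in practice the returned failure list would feed your test runner's assertion.

```typescript
type OrderState = "cart" | "awaiting_payment" | "paid" | "cancelled";
type OrderEvent = "CHECKOUT" | "PAYMENT_SUCCEEDED" | "CANCEL";

const STATES: OrderState[] = ["cart", "awaiting_payment", "paid", "cancelled"];
const EVENTS: OrderEvent[] = ["CHECKOUT", "PAYMENT_SUCCEEDED", "CANCEL"];

const TRANSITIONS: Record<OrderState, Partial<Record<OrderEvent, OrderState>>> = {
  cart: { CHECKOUT: "awaiting_payment", CANCEL: "cancelled" },
  awaiting_payment: { PAYMENT_SUCCEEDED: "paid", CANCEL: "cancelled" },
  paid: {},
  cancelled: {},
};

function applyOrderEvent(current: OrderState, event: OrderEvent): OrderState {
  const next = TRANSITIONS[current][event];
  if (next === undefined) {
    throw new Error(`Invalid transition: ${current} + ${event}`);
  }
  return next;
}

// Expected outcomes keyed by "state:EVENT". Any pair not listed must be rejected.
const EXPECTED: Record<string, OrderState | undefined> = {
  "cart:CHECKOUT": "awaiting_payment",
  "cart:CANCEL": "cancelled",
  "awaiting_payment:PAYMENT_SUCCEEDED": "paid",
  "awaiting_payment:CANCEL": "cancelled",
};

// Checks all state/event pairs and returns a description of each mismatch.
function checkAllTransitions(): string[] {
  const failures: string[] = [];
  for (const state of STATES) {
    for (const event of EVENTS) {
      const key = `${state}:${event}`;
      let actual: OrderState | "rejected" = "rejected";
      try {
        actual = applyOrderEvent(state, event);
      } catch {
        // Leave actual as "rejected".
      }
      const expected = EXPECTED[key] ?? "rejected";
      if (actual !== expected) {
        failures.push(`${key}: expected ${expected}, got ${actual}`);
      }
    }
  }
  return failures;
}
```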
FSMs are testable by enumerating transitions: for each (state, event) pair, assert either the expected next state or a rejection.
Downsides
- Upfront design cost — FSM has to be right; refactoring states later is painful
- Over-engineering risk — a 3-state CRUD object doesn’t need this
- Library dependency — xstate (Node) or similar is an additional dependency; some teams prefer rolling their own state tables
Recommended libraries
| Language | Library |
|---|---|
| JS / TS | xstate |
| Python | transitions |
| Go | looplab/fsm |
| Ruby | aasm |
| Java / Kotlin | Spring Statemachine |
Observability
- State transition histogram per day / per object type
- Time-in-state: long-lived `awaiting_payment` = checkout abandonment; long `fulfilling` = shipping issue
- Invalid-transition-attempted counter: high count = webhook-handler bug
- State distribution: snapshot of where your orders are right now
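Time-in-state can be derived from the transition log. A minimal sketch for a single order, assuming each row records the state entered and a millisecond timestamp (the `TransitionRow` shape and `timeInStateMs` are hypothetical):

```typescript
interface TransitionRow {
  toState: string;
  occurredAtMs: number;
}

// Sums how long the order spent in each state. The last state is counted
// up to nowMs, since the order is still in it.
function timeInStateMs(rows: TransitionRow[], nowMs: number): Record<string, number> {
  const sorted = rows.slice().sort((a, b) => a.occurredAtMs - b.occurredAtMs);
  const totals: Record<string, number> = {};
  sorted.forEach((row, i) => {
    const following = sorted[i + 1];
    const end = following !== undefined ? following.occurredAtMs : nowMs;
    totals[row.toState] = (totals[row.toState] ?? 0) + (end - row.occurredAtMs);
  });
  return totals;
}
```

Aggregating these totals across orders gives the per-state duration distribution that exposes a stuck `awaiting_payment` or `fulfilling` stage.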
See also
- Event-driven — for simple cases where FSM is overkill
- Queue-backed — the infra layer this pattern builds on
- Core concepts: Transfers — Sly’s own state machines (transfers, streams, quotes)
