First Step Function
Workflow.
Initial setup
AWS Console or CLI: create state machine with Amazon States Language definition. Standard JSON or YAML format.
Two execution types: Standard (long-running, billed per state transition) and Express (high-throughput, billed by duration).
Standard for orchestration of human-paced workflows; Express for high-frequency event-driven workflows.
Simple workflow
Sequence of Lambda invocations: Task states with Resource pointing at the function ARN. Each completes; output flows to the next.
Choice state for branching. Conditional logic based on previous output.
Map state for fan-out. Iterate over an array; each element processed in parallel.
Error handling
Catch blocks for specific error types. Retry, fall back, or terminate based on the error.
Retry policies: exponential backoff with max attempts. Built-in; declarative.
Fallback states: catch errors and route to a different state. Useful for graceful degradation.
Observability
CloudWatch logs per state machine execution. Trace through state transitions.
X-Ray integration. Distributed tracing across Lambda invocations within a state machine.
Per-execution UI: visual graph of execution path; click states to inspect inputs and outputs.
Operating Step Functions
IaC for state machine definitions: Terraform, CDK, or SAM. Don't click-build production.
Version control state machine definitions. Reviewed via PR.
Test executions in development environment before promoting. Step Functions supports TestState API for state-level testing.