Prefect Application Services
Service Types
Prefect services are organized into four main types based on their architecture and operational patterns:
Web Services
Web services handle HTTP requests and are written with FastAPI. They route requests based on URL and provide API endpoints for client interactions. These services are stateless and designed for high availability.
Scaling: Can be horizontally scaled. Multiple replicas can run simultaneously since they are stateless.
Background Services
Background services defer work by passing messages through Redis and offloading processing to worker processes. They use ARQ for task queuing and are designed to handle time-consuming operations asynchronously without blocking API responses. Examples include operations like deletions that take a long time, sending emails from automations after delays, or processing events during flow runs.
Scaling: Can be horizontally scaled. Multiple workers can process tasks from the same Redis queues concurrently.
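The deferral pattern described above can be sketched in a single process. This is only an illustration: `asyncio.Queue` stands in for the Redis queue and ARQ, and `delete_flow`, `worker`, and `handle_requests` are hypothetical names, not part of the Prefect codebase.

```python
import asyncio


async def delete_flow(flow_id: str, done: list) -> None:
    # Stand-in for a slow operation, such as deleting a flow and its children.
    await asyncio.sleep(0.01)
    done.append(flow_id)


async def worker(queue: asyncio.Queue, done: list) -> None:
    # Each worker pulls jobs from the shared queue until it is cancelled.
    while True:
        flow_id = await queue.get()
        try:
            await delete_flow(flow_id, done)
        finally:
            queue.task_done()


async def handle_requests(flow_ids: list, n_workers: int = 2) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    done: list = []
    workers = [asyncio.create_task(worker(queue, done)) for _ in range(n_workers)]
    # The "API" side only enqueues work and does not wait for it to finish.
    for flow_id in flow_ids:
        queue.put_nowait(flow_id)
    # Waiting for the backlog to drain is for demonstration only.
    await queue.join()
    for task in workers:
        task.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(done)
```

Because every worker reads from the same queue, adding workers increases throughput without changing the enqueueing side.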
Consumer Services
Consumer services monitor specific topics through Redis and react to messages. When events are published (like "prefect events"), multiple consumers can react simultaneously. These services provide event-driven processing and enable real-time reactions to system changes.
Scaling: Can be horizontally scaled. Multiple consumers can process messages from the same Redis topics concurrently.
Loop Services
Loop services operate on intervals, performing actions like "do X every Y seconds." They poll databases and take actions based on conditions. For example, the scheduler monitors for deployments with active schedules and adds pending flow runs to the database when found.
Scaling: Must run as singletons. Only one replica should be active to prevent conflicts like duplicate scheduled runs or competing cleanup operations.
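A minimal sketch of the "do X every Y seconds" pattern, assuming an asyncio-based loop. The names `run_loop` and `make_scheduler_tick`, and the dictionary shape used for deployments, are hypothetical and chosen only for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def run_loop(
    action: Callable[[], Awaitable[None]],
    interval_seconds: float,
    max_iterations: Optional[int] = None,
) -> int:
    """Run `action` every `interval_seconds` and return the iteration count."""
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        await action()
        iterations += 1
        await asyncio.sleep(interval_seconds)
    return iterations


def make_scheduler_tick(deployments: list, pending_runs: list):
    # One polling pass: add a pending run for every deployment whose
    # schedule is active. A real scheduler would query the database.
    async def tick() -> None:
        for deployment in deployments:
            if deployment["schedule_active"]:
                pending_runs.append(deployment["name"])

    return tick
```

Two copies of this loop running against the same store would each append the same run, which is the kind of conflict the singleton requirement avoids.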
Operational Considerations
Critical Path Services
Some services are on the critical path for system functionality:
- auth: All API requests require authentication. Downtime blocks all system access.
- orion: Core orchestration service. Downtime prevents flow execution and API operations.
- ui: User interface access. Downtime prevents web-based management.
Service Dependencies
- All web services depend on auth for request authentication
- Automations require events, triggers, and actions working together
- Flow execution requires orion, scheduler, and related background services
- Event processing requires events, events-background, ladler, and partman
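The dependencies listed above can be captured in a small lookup to answer "what breaks if this service is down?". This is a sketch for reasoning about outages, not an existing tool. The capability labels and the `impacted_capabilities` helper are hypothetical, and the unnamed "related background services" are left out because the list does not identify them.

```python
# Capability -> services it needs, taken from the dependency list.
DEPENDENCIES = {
    "automations": {"events", "triggers", "actions"},
    "flow execution": {"orion", "scheduler"},
    "event processing": {"events", "events-background", "ladler", "partman"},
}


def impacted_capabilities(down_service: str) -> list:
    """Return the capabilities that depend on `down_service`, sorted by name."""
    return sorted(
        capability
        for capability, services in DEPENDENCIES.items()
        if down_service in services
    )
```

For example, `impacted_capabilities("events")` reports both automations and event processing, since events appears in both dependency sets.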
Resource Patterns
- Web services are CPU-bound during request processing
- Background services vary by task type (CPU for processing, I/O for database operations)
- Loop services are typically low-resource but may spike during polling operations
- Consumer services are I/O-bound, processing Redis message queues
Web and APIs
ui (web service)
ui is a web service responsible for displaying and presenting the web user interface frontend.
It is one of the two core Docker images used for the Kubernetes cluster.
People and Teams
auth (web service)
auth is a web service responsible for authentication and permission lookup. It also handles SCIM and user management actions.
The performance of auth is critical: all requests to nebula or orion pass through auth first to authenticate the request and populate actor permissions.
nebula (web service)
nebula is a user management API that controls access to accounts and workspaces.
Authorization is performed on a per-request basis based on information from auth.
Each route is associated with one or more required account-level or user-level permissions.
Using FastAPI dependencies, each request verifies that the actor has the correct permissions to call a route.
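The per-route permission check can be sketched in plain Python. This version deliberately avoids importing FastAPI so it runs on the standard library alone. In a FastAPI application, a callable like the one returned here would typically be attached to a route with `Depends`. The `Actor` shape, `require_permissions`, and the permission strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Actor:
    # Hypothetical shape of the actor that auth populates for each request.
    id: str
    permissions: set = field(default_factory=set)


def require_permissions(*required: str):
    """Build a check that rejects actors lacking any required permission."""

    def check(actor: Actor) -> Actor:
        missing = set(required) - actor.permissions
        if missing:
            raise PermissionError(f"missing permissions: {sorted(missing)}")
        return actor

    return check


# A route would declare the permissions it needs once, up front.
can_manage_workspace = require_permissions("read:workspace", "write:workspace")
```

Declaring the requirement next to the route keeps the authorization rule visible and lets the same check be reused across routes.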
For orion, all requests are made against a single workspace and will be of the form <endpoint>/accounts/<account_id>/workspaces/<workspace_id>/<orion-endpoint>.
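To make the workspace-scoped URL form concrete, here is a small sketch that splits the path portion (everything after `<endpoint>`) into its parts. The `split_workspace_path` helper and the sample path are illustrative assumptions, not an actual Prefect utility.

```python
import re
from typing import Optional

# Matches /accounts/<account_id>/workspaces/<workspace_id>/<orion-endpoint>
WORKSPACE_PATH = re.compile(
    r"^/accounts/(?P<account_id>[^/]+)"
    r"/workspaces/(?P<workspace_id>[^/]+)"
    r"/(?P<orion_endpoint>.+)$"
)


def split_workspace_path(path: str) -> Optional[dict]:
    """Return the account, workspace, and orion endpoint, or None if no match."""
    match = WORKSPACE_PATH.match(path)
    return match.groupdict() if match else None
```

A path that lacks the workspace segment returns `None`, which reflects the rule that every orion request targets exactly one workspace.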
Orchestration
logs (web service)
logs is a web service responsible for writing and reading flow and task run logs.
It is separate from the orion service, so high volumes of logs do not impact orchestration.
Conversely, orchestration can continue to function while logs are batch written asynchronously.
orchestration-ui-reads (web service)
orchestration-ui-reads is a web service that handles UI-specific read operations for orchestration data.
orion (web service)
orion is the core orchestration and scheduling service.
Additionally, orion is one of the two main base Docker images used for the Kubernetes pods.
It comprises all of the core Prefect orchestration and scheduling.
There are four core components of the orion service that are segmented using Istio and VirtualServices to improve efficiency and throughput.
flow-run-reads
flow-run-reads is a notional service (a VirtualService) that solely handles the flow-run-reads responsibility for the orion service.
It is a core service that is configured as part of the Kubernetes and Istio routing policy. It uses the orion base Docker image, with service-specific (flow-run-reads) parameters passed to it.
flow-run-writes
flow-run-writes, like flow-run-reads, is a notional service (a VirtualService) that solely handles the flow-run-writes responsibility for the orion service.
It is a core service that is configured as part of the Kubernetes and Istio routing policy. It uses the orion base Docker image, with service-specific (flow-run-writes) parameters passed to it.
task-run-reads
task-run-reads is a notional service (a VirtualService) that solely handles the task-run-reads responsibility for the orion service.
It is a core service that is configured as part of the Kubernetes and Istio routing policy. It uses the orion base Docker image, with service-specific (task-run-reads) parameters passed to it.
task-run-writes
task-run-writes is a notional service (a VirtualService) that solely handles the task-run-writes responsibility for the orion service.
It is a core service that is configured as part of the Kubernetes and Istio routing policy. It uses the orion base Docker image, with service-specific (task-run-writes) parameters passed to it.
task-run-websockets (web service)
task-run-websockets is a web service that manages WebSocket connections for real-time task run updates.
work-pool-reads (web service)
work-pool-reads is a web service that handles read operations for work pool data.
Observability
events (web service)
events is the API service responsible for creating events external to the Prefect API.
For example, a task emits an event as it changes state; the events service receives that event and acts on it.
Additionally, events is one of the core services required for automations, the others being triggers and actions.
Emitted events are consumed by triggers to evaluate automation trigger conditions.
A flow run is orchestrated server-side, and is not an external event. A flow run that transitions from a Scheduled state to a Pending state emits an internal event to the Prefect API via the orion service.
events-webhooks (web service)
events-webhooks is an API web service, specifically for creating event webhooks in the UI under the Event Webhooks header.
events-websockets (web service)
events-websockets maintains and manages the WebSocket implementation for streaming events. It serves the same purpose as events-webhooks but is implemented specifically for WebSockets.
Background Processing
Orchestration
expiration-processor (loop service)
expiration-processor is a loop service responsible for identifying flow runs whose expires_at date has passed and soft-deleting them and their related objects.
expiration-setter (loop service)
expiration-setter is a loop service responsible for identifying flow runs that do not have an expiration set and setting them based on the run_retention_days of the flow's workspace's terms.
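The two expiration services can be sketched together as a pair of passes over flow runs. This is a simplified illustration under stated assumptions: flow runs are plain dictionaries, the retention period is counted from a hypothetical `created_at` field (the source does not say which timestamp is used), and soft deletion is modeled by setting `deleted_at`.

```python
from datetime import datetime, timedelta, timezone


def set_expirations(flow_runs: list, run_retention_days: int) -> None:
    # expiration-setter: fill in expires_at where it is missing.
    for run in flow_runs:
        if run.get("expires_at") is None:
            run["expires_at"] = run["created_at"] + timedelta(days=run_retention_days)


def expire_due(flow_runs: list, now: datetime) -> list:
    # expiration-processor: soft-delete runs whose expires_at has passed.
    expired = []
    for run in flow_runs:
        expires_at = run.get("expires_at")
        if expires_at is not None and expires_at <= now and run.get("deleted_at") is None:
            run["deleted_at"] = now
            expired.append(run["id"])
    return expired
```

Splitting the work this way means the processor only compares one column against the current time, while the retention policy lives entirely in the setter.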
foreman (background service)
foreman is a background service that coordinates and manages various orchestration tasks.
mark-late-runs (loop service)
mark-late-runs is a loop service responsible for querying runs in a Scheduled state that should have already started and marking them as Late.
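One pass of this check might look like the following sketch. The dictionary fields `state` and `scheduled_start` are hypothetical stand-ins for the real flow run columns, and any grace period the real service may apply is not modeled.

```python
from datetime import datetime, timezone


def mark_late_runs(runs: list, now: datetime) -> list:
    """Mark Scheduled runs whose start time has passed as Late; return their ids."""
    late = []
    for run in runs:
        if run["state"] == "Scheduled" and run["scheduled_start"] < now:
            run["state"] = "Late"
            late.append(run["id"])
    return late
```

Runs already in another state, or scheduled for the future, are left untouched, so repeating the pass is harmless.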
nebula-background (background service)
nebula-background is a background worker process that retrieves administrative tasks from Redis, similar to orion-background.
orion-background (background service)
orion-background services are background worker processes that retrieve scheduling and orchestration tasks from Redis.
For performance reasons, a server task can be enqueued in Redis and deferred to a later time. Execution is then submitted to orion-background services asynchronously.
One such example is deleting a flow. This can be an expensive and time-consuming operation, because all child elements of the flow, such as flow runs, task runs, and logs, must also be deleted.
These services allow a flow to be marked as deleted right away while the deletion itself is performed asynchronously.
The background processing is split into multiple services based on task priority: orion-background-fast, orion-background-medium, and orion-background-slow.
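As a rough illustration of priority-based splitting, the sketch below maps a task name to one of the three services. Only the three service names come from the text; the `TASK_PRIORITY` table, the task names in it, and the default of "medium" are assumptions made for the example.

```python
# Hypothetical assignment of task names to priorities.
TASK_PRIORITY = {
    "send_notification": "fast",
    "delete_flow": "slow",
}


def queue_for(task_name: str, default: str = "medium") -> str:
    """Return the orion-background service that should handle `task_name`."""
    priority = TASK_PRIORITY.get(task_name, default)
    return f"orion-background-{priority}"
```

Keeping slow tasks on their own service prevents a long deletion from delaying work that is expected to finish quickly.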
reaper-man (loop service)
reaper-man is a loop service responsible for removing soft-deleted rows.
It is often not feasible to synchronously delete data on request (prompting the necessity of background services).
Instead, some tables have an additional column deleted_at, which is set to the timestamp at which we want to delete rows.
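The soft-delete sweep can be sketched with the standard library's `sqlite3` module standing in for the real database. The table name `flow_run`, the `reap` helper, and the use of ISO 8601 strings for timestamps are assumptions for this example.

```python
import sqlite3


def reap(conn: sqlite3.Connection, now: str) -> int:
    """Hard-delete rows whose deleted_at timestamp is at or before `now`."""
    cursor = conn.execute(
        "DELETE FROM flow_run WHERE deleted_at IS NOT NULL AND deleted_at <= ?",
        (now,),
    )
    conn.commit()
    # rowcount reports how many rows the DELETE removed.
    return cursor.rowcount
```

Rows with no deleted_at value, or with one in the future, survive the sweep, so a request handler only has to set the column and return.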
scheduler (loop service)
scheduler is a loop service responsible for querying Deployments and creating scheduled flow runs based on the deployment.
task-run-recorder (background service)
task-run-recorder is a background service that records and persists task run information.
Observability
actions (consumer service)
actions is one of the services responsible for performing a task as the result of an automation being triggered by triggers.
actions consumes from Redis the tasks that require an action, such as "Schedule a Deployment" or "Disable a work-queue".
events-background (background service)
events-background is a background service that processes events asynchronously.
ladler (consumer service)
ladler is an event-driven service that moves event messages from Redis into the Postgres Event Database for long-term storage.
partman (loop service)
partman is a loop service responsible for maintaining partitions in the events database. It manages:
- Ensuring daily partitions of the events table exist for the next week.
- Cleaning up partitions older than 30 days.
- Asynchronous index creation on the events table
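The first two responsibilities in the list above reduce to date arithmetic, sketched here. The helper names are hypothetical, partitions are represented only by their dates, and "the next week" is assumed to mean today through seven days ahead, inclusive.

```python
from datetime import date, timedelta


def partitions_to_create(today: date, existing: set, days_ahead: int = 7) -> list:
    """Return the daily partition dates that should exist but do not yet."""
    wanted = [today + timedelta(days=offset) for offset in range(days_ahead + 1)]
    return [day for day in wanted if day not in existing]


def partitions_to_drop(today: date, existing: set, retention_days: int = 30) -> list:
    """Return the partition dates older than the retention window."""
    cutoff = today - timedelta(days=retention_days)
    return sorted(day for day in existing if day < cutoff)
```

Both functions are pure, so a loop service can call them on every pass and apply only the difference.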
triggers (consumer service)
triggers comprises two separate services: triggers-reactive and triggers-proactive.
triggers-reactive manages and acts on reactive events and automations, such as "If this flow run is marked Crashed, then take some action".
It is one of the core services required for automations, the others being events and actions.
triggers-proactive manages and acts on proactive events and automations, such as "If a work-queue is unhealthy for 5 minutes, then take some action".
It is one of the core services required for automations, the others being events and actions.
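The difference between the two trigger styles can be shown with a small sketch: a reactive trigger fires when a matching event arrives, while a proactive trigger fires when a condition has held for a full time window. The function names, the event dictionary shape, and the event name used in the usage note are illustrative assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional


def reactive_fires(event: dict, expected_event: str) -> bool:
    # Reactive: fire as soon as a matching event is seen.
    return event["event"] == expected_event


def proactive_fires(
    unhealthy_since: Optional[datetime],
    now: datetime,
    within: timedelta = timedelta(minutes=5),
) -> bool:
    # Proactive: fire only once the condition has persisted for the window.
    return unhealthy_since is not None and now - unhealthy_since >= within
```

For instance, `reactive_fires({"event": "prefect.flow-run.Crashed"}, "prefect.flow-run.Crashed")` fires immediately, whereas `proactive_fires` needs the passage of time and therefore has to be evaluated periodically, even when no new event arrives.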