Infrastructure Services

Self Managed Prefect requires the following infrastructure to be in place.

Load Balancer

A load balancer provides high traffic throughput, multi-zone availability, and SSL termination, and integrates with an ingress controller.
Either a Layer 4 or a Layer 7 load balancer can be used. For Prefect to function, the load balancer must provide SSL termination and host-based path routing to the Traefik service mesh.
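
To make the host-based routing requirement concrete, here is a minimal Python sketch of the decision made for each incoming request. The hostnames and backend labels (`api.example.com`, `app.example.com`, `api-backend`, `ui-backend`) are hypothetical placeholders, not names taken from the Prefect installation; in practice this logic lives in the load balancer and ingress configuration rather than in application code.

```python
from typing import Optional

# Hypothetical host-to-backend table; substitute your own FQDNs.
ROUTES = {
    "api.example.com": "api-backend",
    "app.example.com": "ui-backend",
}


def pick_backend(host_header: str) -> Optional[str]:
    """Return the backend label for a request's Host header, or None if unknown."""
    # Hostnames are case-insensitive and the header may carry a port (e.g. ":443").
    hostname = host_header.strip().lower().split(":", 1)[0]
    return ROUTES.get(hostname)
```

Requests whose Host header matches neither name return `None`, which a real ingress would typically answer with a 404 or a default backend.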

Bringing your own service mesh? That's OK, as long as you route traffic for the API and UI to the appropriate ingress resources with Traefik.

Why Traefik rather than Istio or another service mesh? We understand that you might have different requirements in your environment. Traefik handles authorization and service routing with a much lighter footprint, and it does not require additional knowledge or support.

DNS

Two distinct Fully Qualified Domain Names are required for Self Managed Prefect.

  • api.<domain> - Used for API interactions from workers and clients.
  • app.<domain> - Used to access the UI from a web browser.

These domains will be used as the Common Names and/or Subject Alternative Names for SSL certificates, in addition to the CLOUD_UI_URL and CLOUD_API_URL values.

Using your existing DNS provider, create one DNS record each for the api and app subdomains to route traffic to the ingress.
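
Once the records exist, a quick resolution check can confirm that both names point somewhere. The following is a minimal sketch using only the Python standard library; `example.com` is a placeholder domain and the function names are illustrative.

```python
import socket


def required_hostnames(domain: str) -> list[str]:
    """Build the two FQDNs described above for a given base domain."""
    return [f"api.{domain}", f"app.{domain}"]


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses a hostname resolves to (empty if none)."""
    try:
        infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry ends with a sockaddr tuple whose first element is the address.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute your own domain.
    for name in required_hostnames("example.com"):
        print(name, resolve(name) or "does not resolve")
```

Both names should resolve to the address of the load balancer or ingress; an empty result usually means the record is missing or has not propagated yet.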

SSL Termination

SSL certificates are required to secure client interactions with the API, even in a self-managed scenario.

Many guides, Certificate Authorities, and vendors are available for issuing and signing certificates, so this section describes only the requirements.

Certificate Management

Certificates can be requested through a provider such as DigiCert or Let's Encrypt. The Prefect API should be served with a publicly signed SSL certificate.
A self-signed certificate can be used; however, because the Prefect Client (the open-source Prefect package) ships with the certifi package, a self-signed certificate will not be trusted by default. To ensure maximum availability and support, a publicly signed certificate is recommended.

Certificate validity periods should be set in accordance with internal security policies.
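
Whatever validity period your policy sets, it helps to monitor how long the deployed certificate has left. The sketch below uses only the Python standard library; the function names are illustrative, and the optional `cafile` argument is shown as one way to verify against a private CA bundle when a self-signed certificate is in use.

```python
import socket
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Days remaining before a certificate's notAfter timestamp.

    `not_after` uses the format returned by SSLSocket.getpeercert(),
    for example "Jan  5 09:34:43 2018 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / 86400


def fetch_not_after(hostname: str, port: int = 443, cafile: Optional[str] = None) -> str:
    """Connect with certificate verification and return the notAfter field.

    With `cafile` set, only that CA bundle is trusted; otherwise the
    system default trust store is used.
    """
    context = ssl.create_default_context(cafile=cafile)
    with socket.create_connection((hostname, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.getpeercert()["notAfter"]
```

For example, `days_until_expiry(fetch_not_after("api.example.com"))` (with your own hostname substituted) returns the remaining days, and the connection raises `ssl.SSLCertVerificationError` if the certificate is not trusted.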

Kubernetes Cluster

Kubernetes is the core execution platform required for Prefect. The suggested node size is 8 vCPU and 16 GiB of memory per node, with a minimum of 3 nodes.
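
As a rough aid for capacity planning, the suggestion above can be expressed as a small check. This is only a sketch: the `Node` type and `sizing_problems` function are hypothetical helpers, and the node inventory would come from your own tooling.

```python
from dataclasses import dataclass

# Suggested values from the text above.
MIN_NODES = 3
MIN_VCPU_PER_NODE = 8
MIN_MEMORY_GIB_PER_NODE = 16


@dataclass(frozen=True)
class Node:
    name: str
    vcpu: int
    memory_gib: int


def sizing_problems(nodes: list[Node]) -> list[str]:
    """Return human-readable reasons a node pool falls short of the suggestion."""
    problems = []
    if len(nodes) < MIN_NODES:
        problems.append(f"only {len(nodes)} node(s); at least {MIN_NODES} suggested")
    for node in nodes:
        if node.vcpu < MIN_VCPU_PER_NODE:
            problems.append(f"{node.name}: {node.vcpu} vCPU < {MIN_VCPU_PER_NODE}")
        if node.memory_gib < MIN_MEMORY_GIB_PER_NODE:
            problems.append(
                f"{node.name}: {node.memory_gib} GiB < {MIN_MEMORY_GIB_PER_NODE} GiB"
            )
    return problems
```

A pool that meets the suggestion (three nodes at 8 vCPU and 16 GiB each, for 24 vCPU and 48 GiB in total) returns an empty list.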

Istio

Istio is responsible for managing the networking service mesh and auth routes so that traffic is routed properly. This includes traffic from the ingress to the nebula service, auth service, ui service, and server service.

This is configured and deployed as part of the Prefect installation through Helm. No additional configuration should be necessary.

Redis

Redis is used to manage arq (background worker) tasks. Several instances are in place for the various services, facilitating asynchronous execution of operations that are not time sensitive.

Cache

The cache table is used primarily for data that rarely changes and would incur a significant performance cost if retrieved from the database each time.
The auth service makes heavy use of the cache table to determine which users are actively logged in and which require re-authentication.
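
The general pattern described here is often called cache-aside: read from the cache first, and fall back to the database only on a miss. The sketch below illustrates that pattern with an in-memory stand-in for Redis; it is not the implementation used by the auth service, and the class and function names are hypothetical.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for a Redis cache with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped so the caller reloads from the database.
            del self._entries[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._entries[key] = (self._clock() + ttl_seconds, value)


def get_or_load(
    cache: TTLCache, key: str, load: Callable[[], Any], ttl_seconds: float
) -> Any:
    """Cache-aside read: return the cached value, or load it once and cache it."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load()  # The expensive database lookup happens only on a miss.
    cache.set(key, value, ttl_seconds)
    return value
```

The expiry time bounds how stale a cached answer can be, which matters for session-style data where a user may need to re-authenticate.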

Work-Pools

A dedicated instance for push-pool background work. This instance manages background tasks for push work-pools, keeping that load separate from the core orchestration shared by all users.

Triggers

A dedicated instance responsible for Automation triggers and background tasks.

PostgreSQL

PostgreSQL (or one of the various cloud services that offer it) is used to manage state for all of the application's metadata. Three databases are used to maintain the application:

Database   Purpose
Events     Logs and Actions
Nebula     Authorization, RBAC, Workspaces, Accounts
Server     Flows, Tasks, Deployments

Events DB

Stores events data. Its primary purpose is to provide a performant datastore for recent events data, with most queries returning in roughly 200ms or less.

Nebula DB

The Nebula database stores information about users, accounts, teams, etc. that is used to control access to server data. Both the nebula and auth services need access to the Nebula database.

Server DB

The server database stores information about flows, flow runs, task runs, etc.
The server database is accessed by the server service, as well as many other background services. This data is separated into “workspaces” using a workspace_id column.
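
To illustrate what separating data by a workspace_id column looks like in practice, here is a small sketch that uses an in-memory SQLite database as a stand-in for PostgreSQL. The `flow_run` table, its columns other than workspace_id, and the sample rows are hypothetical; the real server schema is not shown in this document.

```python
import sqlite3

# Hypothetical, simplified schema standing in for the server database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flow_run ("
    "id INTEGER PRIMARY KEY, workspace_id TEXT NOT NULL, name TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO flow_run (workspace_id, name) VALUES (?, ?)",
    [("ws-a", "etl-1"), ("ws-a", "etl-2"), ("ws-b", "report-1")],
)


def flow_run_names(connection: sqlite3.Connection, workspace_id: str) -> list[str]:
    """Return flow run names visible to a single workspace."""
    rows = connection.execute(
        "SELECT name FROM flow_run WHERE workspace_id = ? ORDER BY name",
        (workspace_id,),
    )
    return [name for (name,) in rows]
```

Every query filters on workspace_id, so rows belonging to one workspace never appear in results for another even though they share a table.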