Kubernetes on AWS - Advanced Options
Here are some advanced options for deploying Metaflow on AWS Kubernetes.
Remote State Backends for Terraform
By default, Terraform stores the state of the AWS resources it manages in local tfstate files.
If you plan to maintain the minimal stack for any significant period of time, we strongly recommend storing these state files in cloud storage (e.g. Amazon S3) instead.
Some reasons include:
- More than one person needs to administer the stack (using Terraform). Everyone should work off a single copy of tfstate.
- You wish to mitigate the risk of data loss on your local disk.
For more details, see the Terraform documentation.
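As a rough illustration of what moving state to S3 involves, the sketch below renders an S3 backend block in Terraform's JSON configuration syntax. The bucket name, key, and region are placeholders, and the helper `s3_backend_config` is a hypothetical name introduced here; none of them come from the Metaflow templates.

```python
import json


def s3_backend_config(bucket: str, key: str, region: str) -> str:
    """Render a Terraform S3 backend block in Terraform's JSON syntax."""
    config = {
        "terraform": {
            "backend": {
                "s3": {
                    "bucket": bucket,  # S3 bucket that will hold the state
                    "key": key,        # object path of the tfstate file in the bucket
                    "region": region,  # AWS region of the bucket
                }
            }
        }
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Placeholder values (assumptions); substitute your own bucket, key, and region.
    print(s3_backend_config(
        "my-metaflow-tfstate",
        "metaflow/terraform.tfstate",
        "us-west-2",
    ))
```

One way to use the output is to save it as `backend.tf.json` next to the templates and re-run `terraform init` so that Terraform picks up the new backend. The bucket has to exist beforehand.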
Deploying Multiple Metaflow Stacks
If you want to run more than one instance of this stack, you can use Terraform workspaces. Each workspace keeps its own state, so the same templates can be applied once per workspace.
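The workspace flow can be sketched as a small wrapper around the Terraform CLI. This is a minimal sketch, not part of the templates: the workspace name `staging` and the helpers `workspace_commands` and `deploy_stack` are hypothetical, and the script only prints the commands unless `dry_run` is turned off.

```python
import subprocess


def workspace_commands(name: str) -> list[list[str]]:
    """Commands that create a workspace, switch to it, and deploy into it."""
    return [
        # `terraform workspace new` creates the workspace and selects it.
        ["terraform", "workspace", "new", name],
        # `terraform apply` then runs against that workspace's own state.
        ["terraform", "apply"],
    ]


def deploy_stack(name: str, dry_run: bool = True) -> None:
    """Print the commands, or run them in order when dry_run is False."""
    for cmd in workspace_commands(name):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    deploy_stack("staging")  # hypothetical workspace name
```

To return to an existing workspace later, `terraform workspace select <name>` switches back to it. Whether resource names must also be varied per workspace depends on the templates and is not covered by this sketch.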
Authenticated Public Endpoints for Metaflow Services
The deployment approach taken by the Terraform templates minimizes the publicly accessible surface area. Only the EKS Kubernetes API is publicly available. This allows authorized users (through the secure Kubernetes API) to:
- Inspect the cluster's workloads.
- Create, read, update, and delete Kubernetes objects (e.g. submit job pods).
However, this deployment style does not include publicly accessible endpoints for the web services running within the EKS cluster. For the purpose of this sample deployment template, users must use the Kubernetes API to set up port-forwarding in order to access these services from their workstations.
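Port-forwarding through the Kubernetes API is done with `kubectl port-forward`. The sketch below only assembles that command; the target `service/metadata-service`, the port numbers, and the `default` namespace are assumptions for illustration, so check the actual service names in your cluster (for example with `kubectl get services`).

```python
def port_forward_command(
    target: str,
    local_port: int,
    remote_port: int,
    namespace: str = "default",
) -> list[str]:
    """Build a `kubectl port-forward` invocation for a pod, deployment, or service."""
    return [
        "kubectl", "port-forward",
        "--namespace", namespace,
        target,                         # e.g. "service/<name>" or "deployment/<name>"
        f"{local_port}:{remote_port}",  # workstation port : port inside the cluster
    ]


if __name__ == "__main__":
    # Hypothetical target and ports; replace with values from your own cluster.
    cmd = port_forward_command("service/metadata-service", 8080, 8080)
    print(" ".join(cmd))
```

While the printed command is running in a terminal, the service is reachable on `localhost` at the chosen local port. The forward ends when the command is interrupted.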
For a friendlier user experience, the services can be exposed through public endpoints, with access authenticated and authorized using technologies like:
- OIDC (OpenID Connect)
- JWTs (JSON Web Tokens)
- Identity-as-a-service providers (e.g. Auth0)
Please talk to us for more information about this topic.
EKS Workload Identities
In the Metaflow stack generated by these Terraform templates, all Metaflow workloads running within EKS access AWS resources as a specific service account identity. We use IRSA (IAM Roles for Service Accounts), which works as follows:
- Metaflow task pods run as a designated Kubernetes Service Account (KSA).
- The KSA is annotated with the ARN of an IAM role.
- A pod running as the KSA assumes that IAM role when accessing AWS resources.
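The pieces IRSA relies on can be sketched as two small documents: the annotated service account, and the trust policy on the IAM role that lets that service account assume it. All names, the account ID, and the OIDC provider ID below are placeholders, and the helper functions are hypothetical; the templates may structure these resources differently.

```python
import json


def service_account_manifest(name: str, namespace: str, role_arn: str) -> dict:
    """Kubernetes ServiceAccount annotated with the IAM role it maps to."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Annotation that EKS reads to associate the KSA with an IAM role.
            "annotations": {"eks.amazonaws.com/role-arn": role_arn},
        },
    }


def trust_policy(provider_arn: str, issuer: str, name: str, namespace: str) -> dict:
    """IAM trust policy allowing only this KSA to assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Federated": provider_arn},
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {"StringEquals": {
                # `issuer` is the cluster's OIDC issuer URL without "https://".
                f"{issuer}:sub": f"system:serviceaccount:{namespace}:{name}",
            }},
        }],
    }


if __name__ == "__main__":
    # Placeholder identifiers (assumptions), not values from the templates.
    issuer = "oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE"
    provider_arn = f"arn:aws:iam::111122223333:oidc-provider/{issuer}"
    role_arn = "arn:aws:iam::111122223333:role/metaflow-task-role"
    print(json.dumps(service_account_manifest("metaflow-ksa", "default", role_arn), indent=2))
    print(json.dumps(trust_policy(provider_arn, issuer, "metaflow-ksa", "default"), indent=2))
```

The `sub` condition is what ties the role to one specific service account in one namespace, so pods running under other service accounts cannot assume it.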
For finer-grained control and end-to-end identity, please talk to us.