Kubernetes on Google Cloud - Advanced Options
Here are some advanced options for deploying Metaflow on Google Cloud.
Remote State Backends for Terraform
By default, Terraform stores the state of GCP resources in local tfstate files.
If you plan to maintain the minimal stack for any significant period of time, it is highly recommended that these state files be stored in cloud storage (e.g. Google Cloud Storage) instead.
Some reasons include:
- More than one person needs to administer the stack (using Terraform). Everyone should work off a single copy of tfstate.
- You wish to mitigate the risk of data loss on your local disk.
For more details, see the Terraform docs.
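As a rough illustration of moving state to Google Cloud Storage, the sketch below writes a `backend.tf` file containing a `gcs` backend block. The bucket name and prefix are hypothetical placeholders, not values from these templates, and the script overwrites any `backend.tf` already present in the current directory.

```shell
#!/bin/sh
set -eu

# Hypothetical values: replace with a bucket you own and a prefix of your choice.
TFSTATE_BUCKET="${TFSTATE_BUCKET:-my-metaflow-tfstate}"
TFSTATE_PREFIX="${TFSTATE_PREFIX:-metaflow/minimal-stack}"

# Write a backend block that tells Terraform to keep its state in GCS.
cat > backend.tf <<EOF
terraform {
  backend "gcs" {
    bucket = "${TFSTATE_BUCKET}"
    prefix = "${TFSTATE_PREFIX}"
  }
}
EOF

echo "Wrote backend.tf pointing at gs://${TFSTATE_BUCKET}/${TFSTATE_PREFIX}"
```

The bucket must already exist before Terraform can use it. After the file is in place, running `terraform init -migrate-state` in the same directory should prompt Terraform to copy the existing local state into the bucket. Enabling object versioning on the bucket is a common safeguard against accidental state loss.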
Deploying Multiple Metaflow Stacks
If you want to run more than one instance of this stack, you can use Terraform workspaces.
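A minimal sketch of the workspace approach follows. The stack names `dev` and `prod` are hypothetical, and the script only prints the commands unless `DRY_RUN=0` is set. It assumes the templates derive resource names from the workspace so that stacks do not collide; any `-var-file` or `-target` arguments your deployment needs are omitted.

```shell
#!/bin/sh
set -eu

# Hypothetical stack names; each one becomes a Terraform workspace.
STACKS="${STACKS:-dev prod}"
# Commands are only printed by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

deploy_stacks() {
  for stack in $STACKS; do
    # Switch to the stack's workspace, creating it on first use.
    run terraform workspace select "$stack" || run terraform workspace new "$stack"
    # Each workspace keeps its own state, so this apply only touches that stack.
    run terraform apply
  done
}

deploy_stacks
```

`terraform workspace list` shows the existing workspaces and marks the one currently selected, which is worth checking before any apply or destroy.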
Authenticated Public Endpoints for Metaflow Services
The deployment approach taken by the Terraform templates minimizes publicly accessible surface area. Only the GKE Kubernetes API is available publicly. This allows authorized users (through the secure Kubernetes API) to:
- Inspect the cluster's workloads
- CRUD Kubernetes objects (e.g. submit job pods)
However, this deployment style does not include publicly accessible endpoints for the web services running within the GKE cluster. For the purpose of this sample deployment template, users must use the Kubernetes API to set up port-forwarding in order to access these services from their workstations.
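The port-forwarding step described above might look like the following sketch. The deployment name `metadata-service`, the ports, and the `default` namespace are assumptions for illustration; check `kubectl get deployments` for the names in your cluster. The command is only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical namespace; adjust to wherever the services are deployed.
NAMESPACE="${NAMESPACE:-default}"
# The command is only printed by default; set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

# forward <deployment> <local-port> <remote-port>
forward() {
  set -- kubectl port-forward --namespace "$NAMESPACE" "deployment/$1" "$2:$3"
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    # Runs in the foreground until interrupted; use one terminal per service.
    "$@"
  fi
}

# Hypothetical deployment name and ports.
forward metadata-service 8080 8080
```

While the forward is active, the service is reachable on the chosen local port of the workstation (here `localhost:8080`). The tunnel closes as soon as the command is interrupted.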
For a more friendly user experience, publicly accessible endpoints can be authenticated and authorized using technologies like:
Please talk to us for more information about this topic.
GKE Workload Identities
In the Metaflow stack generated by these Terraform templates, all Metaflow workloads running within GKE access GCP resources as a specific service account identity. We use GKE Workload Identity, which in short works as follows:
- Metaflow task pods run as a certain Kubernetes Service Account (KSA).
- The KSA is annotated with a link to a Google Service Account (GSA).
- A pod running as the KSA assumes the identity of the GSA when accessing GCP resources.
For finer-grained control, it is possible to map separate identities to each workload and Metaflow task. This can be extended further: for example, user X's Metaflow runs may assume KSA_1 (mapping to GSA_1), whilst user Y's Metaflow runs may assume KSA_2 (mapping to GSA_2).
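The KSA-to-GSA mapping can be sketched with standard `kubectl` and `gcloud` commands, as below. The project ID, namespace, and account names (`ksa-1`, `gsa-1`, and so on) are hypothetical. The sketch assumes the GSAs already exist and the KSAs do not, and it only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical values: project, namespace, and "<KSA>:<GSA>" pairs.
PROJECT_ID="${PROJECT_ID:-my-project}"
NAMESPACE="${NAMESPACE:-default}"
MAPPINGS="${MAPPINGS:-ksa-1:gsa-1 ksa-2:gsa-2}"
# Commands are only printed by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

bind_identities() {
  for pair in $MAPPINGS; do
    ksa="${pair%%:*}"
    gsa_email="${pair#*:}@${PROJECT_ID}.iam.gserviceaccount.com"

    # 1. Create the KSA that the task pods will run as.
    run kubectl create serviceaccount "$ksa" --namespace "$NAMESPACE"

    # 2. Allow that KSA to act as the GSA through Workload Identity.
    run gcloud iam service-accounts add-iam-policy-binding "$gsa_email" \
      --role roles/iam.workloadIdentityUser \
      --member "serviceAccount:${PROJECT_ID}.svc.id.goog[${NAMESPACE}/${ksa}]"

    # 3. Annotate the KSA with the GSA it maps to.
    run kubectl annotate serviceaccount "$ksa" --namespace "$NAMESPACE" \
      "iam.gke.io/gcp-service-account=${gsa_email}"
  done
}

bind_identities
```

Each GSA then needs only the GCP permissions its users' workloads require. How a given Metaflow run selects its KSA depends on your Metaflow configuration and is not shown here.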