Kubeflow Orchestrator
How to orchestrate pipelines with Kubeflow
The Kubeflow orchestrator is an orchestrator flavor provided with the ZenML kubeflow
integration that uses Kubeflow Pipelines to run your pipelines.
This component is only meant to be used within the context of a remote ZenML deployment scenario. Usage with a local ZenML deployment may lead to unexpected behavior!
When to use it
You should use the Kubeflow orchestrator if:
- you’re looking for a proven production-grade orchestrator.
- you’re looking for a UI in which you can track your pipeline runs.
- you’re already using Kubernetes or are not afraid of setting up and maintaining a Kubernetes cluster.
- you’re willing to deploy and maintain Kubeflow Pipelines on your cluster.
How to deploy it
The Kubeflow orchestrator supports two different modes: local and remote. If you want to run the orchestrator on a local Kubernetes cluster running on your machine, no additional infrastructure setup is necessary.
If you want to run your pipelines on a remote cluster instead, you’ll need to set up a Kubernetes cluster and deploy Kubeflow Pipelines:
AWS

- Have an existing AWS EKS cluster set up.
- Make sure you have the AWS CLI set up.
- Download and install kubectl and configure it to talk to your EKS cluster (a sample command is shown after this list).
- Install Kubeflow Pipelines onto your cluster.
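For EKS, the kubeconfig update typically looks like this (region and cluster name are placeholders):

```shell
# Adds/updates the EKS cluster context in your local kubeconfig
aws eks --region <AWS_REGION> update-kubeconfig --name <CLUSTER_NAME>
```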
GCP

- Have an existing GCP GKE cluster set up.
- Make sure you have the Google Cloud CLI set up first.
- Download and install kubectl and configure it to talk to your GKE cluster (a sample command is shown after this list).
- Install Kubeflow Pipelines onto your cluster.
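For GKE, fetching credentials typically looks like this (cluster name, zone, and project are placeholders):

```shell
# Fetches GKE credentials and adds a context to your local kubeconfig
gcloud container clusters get-credentials <CLUSTER_NAME> --zone <ZONE> --project <PROJECT_ID>
```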
Azure

- Have an existing AKS cluster set up.
- Make sure you have the az CLI set up first.
- Download and install kubectl and configure it to talk to your AKS cluster (a sample command is shown after this list).
- Install Kubeflow Pipelines onto your cluster.
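For AKS, fetching credentials typically looks like this (resource group and cluster name are placeholders):

```shell
# Merges the AKS cluster credentials into your local kubeconfig
az aks get-credentials --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME>
```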
Since Kubernetes v1.19, AKS has shifted to containerd. However, the workflow controller installed with the Kubeflow installation has Docker set as the default runtime. In order to make your pipelines work, you have to change the value to one of the options listed here, preferably k8sapi. This change has to be made by editing the containerRuntimeExecutor property of the ConfigMap corresponding to the workflow controller. Run the following commands to first find out which ConfigMap to change and then to edit it to reflect your new value.
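A sketch of those commands, assuming Kubeflow Pipelines was installed into the kubeflow namespace and the workflow controller's ConfigMap is named workflow-controller-configmap (both may differ in your deployment):

```shell
# List the ConfigMaps in the Kubeflow namespace to find the workflow controller's ConfigMap
kubectl get configmap -n kubeflow

# Open the ConfigMap in an editor and set containerRuntimeExecutor to k8sapi
kubectl edit configmap workflow-controller-configmap -n kubeflow
```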
If one or more of the deployments are not in the Running
state, try increasing the number of nodes in your cluster.
If you’re installing Kubeflow Pipelines manually, make sure the Kubernetes service is called exactly ml-pipeline
. This is a requirement for ZenML to connect to your Kubeflow Pipelines deployment.
How to use it
To use the Kubeflow orchestrator, we need:
- The ZenML kubeflow integration installed. If you haven’t done so, run:
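```shell
# Installs the Kubeflow integration (and its requirements) for ZenML
zenml integration install kubeflow
```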
Local
When using the Kubeflow orchestrator locally, you’ll additionally need:
- K3D installed to spin up a local Kubernetes cluster.
- A local container registry as part of your stack.
The local Kubeflow Pipelines deployment requires more than 2 GB of RAM, so if you’re using Docker Desktop make sure to update the resource limits in the preferences.
We can then register the orchestrator and use it in our active stack:
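(The commands below are a sketch; names in angle brackets are placeholders, and the exact set of stack components depends on your setup.)

```shell
# Register the Kubeflow orchestrator
zenml orchestrator register <ORCHESTRATOR_NAME> --flavor=kubeflow

# Add the orchestrator to a stack together with the other required components
# (e.g. the local container registry) and activate it
zenml stack register <STACK_NAME> \
    -o <ORCHESTRATOR_NAME> \
    -a <ARTIFACT_STORE_NAME> \
    -c <CONTAINER_REGISTRY_NAME> \
    --set
```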
Remote

When using the Kubeflow orchestrator with a remote cluster, you’ll additionally need:
- A remote ZenML server deployed to the cloud. See the deployment guide for more information.
- Kubeflow Pipelines deployed on a remote cluster. See the deployment section for more information.
- The name of your Kubernetes context which points to your remote cluster. Run kubectl config get-contexts to see a list of available contexts.
- A remote artifact store as part of your stack.
- A remote container registry as part of your stack.
We can then register the orchestrator and use it in our active stack:
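(The commands below are a sketch; names in angle brackets are placeholders, and the kubernetes_context must be one of the contexts listed by kubectl config get-contexts.)

```shell
# Register the Kubeflow orchestrator and point it at your remote cluster
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=kubeflow \
    --kubernetes_context=<KUBERNETES_CONTEXT>

# Add the orchestrator, remote artifact store, and remote container registry
# to a stack and activate it
zenml stack register <STACK_NAME> \
    -o <ORCHESTRATOR_NAME> \
    -a <REMOTE_ARTIFACT_STORE_NAME> \
    -c <REMOTE_CONTAINER_REGISTRY_NAME> \
    --set
```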
ZenML will build a Docker image called <CONTAINER_REGISTRY_URI>/zenml:<PIPELINE_NAME>
which includes your code and use it to run your pipeline steps in Kubeflow. Check out this page if you want to learn more about how ZenML builds these images and how you can customize them.
Once the orchestrator is part of the active stack, we need to run zenml stack up
before running any pipelines. This command:
- forwards a port, so you can view the Kubeflow UI in your browser.
- (in the local case) uses K3D to provision a Kubernetes cluster on your machine and deploys Kubeflow Pipelines on it.
You can now run any ZenML pipeline using the Kubeflow orchestrator:
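(The filename below is a placeholder for whichever Python file defines and calls your pipeline.)

```shell
python file_that_runs_a_zenml_pipeline.py
```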
Additional configuration
For additional configuration of the Kubeflow orchestrator, you can pass KubeflowOrchestratorSettings
which allows you to configure (among others) the following attributes:
- client_args: Arguments to pass when initializing the KFP client.
- user_namespace: The user namespace to use when creating experiments and runs.
- pod_settings: Node selectors, affinity, and tolerations to apply to the Kubernetes Pods running your pipeline. These can be either specified using the Kubernetes model objects or as dictionaries.
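For example, a sketch of how these settings might be constructed and attached to a pipeline (the import path and the "orchestrator.kubeflow" settings key may vary slightly between ZenML versions; all values are placeholders):

```python
from kubernetes.client.models import V1Toleration

from zenml import pipeline
from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import (
    KubeflowOrchestratorSettings,
)

kubeflow_settings = KubeflowOrchestratorSettings(
    client_args={},                 # extra keyword arguments for the KFP client
    user_namespace="my-namespace",  # placeholder namespace
    pod_settings={
        # node selectors given as a plain dictionary ...
        "node_selectors": {"cloud.google.com/gke-nodepool": "ml-pool"},
        # ... and tolerations given as Kubernetes model objects
        "tolerations": [
            V1Toleration(key="gpu", operator="Equal", value="true", effect="NoSchedule")
        ],
    },
)


@pipeline(settings={"orchestrator.kubeflow": kubeflow_settings})
def my_pipeline() -> None:
    ...
```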
Check out the API docs for a full list of available attributes and this docs page for more information on how to specify settings.
Enabling CUDA for GPU-backed hardware
Note that if you wish to use this orchestrator to run steps on a GPU, you will need to follow the instructions on this page to ensure that it works. This requires some additional settings customization and is essential for CUDA to be enabled so the GPU can deliver its full acceleration.
Important Note for Multi-Tenancy Deployments
Kubeflow has a notion of multi-tenancy built into its deployment. Kubeflow’s multi-user isolation simplifies user operations because each user only views and edits the Kubeflow components and model artifacts defined in their configuration.
Using the ZenML Kubeflow orchestrator on a multi-tenant deployment without any settings will result in an error.
In order to get it to work, we need to leverage the KubeflowOrchestratorSettings referenced above: set the namespace option and pass in the right authentication credentials to the Kubeflow Pipelines client.
First, when registering your Kubeflow orchestrator, please make sure to include the kubeflow_hostname parameter. The kubeflow_hostname must end with the /pipeline suffix.
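For example (the URL below is a placeholder; note the /pipeline suffix):

```shell
# Register the orchestrator against a multi-tenant Kubeflow Pipelines deployment
zenml orchestrator register <ORCHESTRATOR_NAME> \
    --flavor=kubeflow \
    --kubeflow_hostname=https://mykubeflow.example.com/pipeline
```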
Then, make sure that you pass the right settings before triggering a pipeline run. The following snippet will prove useful:
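(A sketch; the credentials and namespace are placeholders, and the settings key may vary slightly between ZenML versions.)

```python
from zenml import pipeline
from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import (
    KubeflowOrchestratorSettings,
)

kubeflow_settings = KubeflowOrchestratorSettings(
    client_username="admin",          # placeholder Kubeflow user
    client_password="abc123",         # placeholder password
    user_namespace="namespace_name",  # the namespace configured for that user
)


@pipeline(settings={"orchestrator.kubeflow": kubeflow_settings})
def my_pipeline() -> None:
    ...
```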
Note that the above is also currently not tested on all Kubeflow versions, so there might be further bugs with older Kubeflow versions. In this case, please reach out to us on Slack.
Using secrets in settings
The above example encoded the username and password in plain-text as settings. You can also set them as secrets.
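(A sketch using the ZenML secret CLI; the secret and key names are placeholders, and the exact secret CLI may differ between ZenML versions.)

```shell
# Store the Kubeflow credentials as a ZenML secret
zenml secret create kubeflow_secret \
    --username=admin \
    --password=abc123
```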
And then you can use them in code:
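(This assumes ZenML's secret reference syntax, {{secret_name.key}}, which is resolved at runtime; the names below match the secret created above.)

```python
from zenml.integrations.kubeflow.flavors.kubeflow_orchestrator_flavor import (
    KubeflowOrchestratorSettings,
)

kubeflow_settings = KubeflowOrchestratorSettings(
    client_username="{{kubeflow_secret.username}}",  # resolved from the secret at runtime
    client_password="{{kubeflow_secret.password}}",  # resolved from the secret at runtime
    user_namespace="namespace_name",
)
```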
See full documentation of using secrets within ZenML here.
A concrete example of using the Kubeflow orchestrator can be found here.
For more information and a full list of configurable attributes of the Kubeflow orchestrator, check out the API Docs.