When to use it?
Model deployers are optional components in the ZenML stack. They are used to deploy machine learning models to a target environment, either a development (local) one or a production (Kubernetes) one, and are mainly used for real-time inference use cases. With model deployers and other stack components, you can build pipelines that are continuously trained and deployed to production.
How model deployers slot into the stack
Here is an architecture diagram that shows how model deployers fit into the overall story of a remote stack.
Model Deployer Flavors
ZenML comes with a local MLflow model deployer, which is a simple model deployer that deploys models to a local MLflow server. Additional model deployers that can be used to deploy models on production environments are provided by integrations:
| Model Deployer | Flavor | Integration | Notes |
|---|---|---|---|
| MLflow | mlflow | mlflow | Deploys ML models locally |
| BentoML | bentoml | bentoml | Builds and deploys ML models locally or for production-grade environments (cloud, K8s) |
| Seldon Core | seldon | seldon | Built on top of Kubernetes to deploy models in production-grade environments |
| KServe | kserve | kserve | Kubernetes-based model deployment framework |
| Custom Implementation | custom | | Extend the Model Deployer abstraction and provide your own implementation |
The role that a Model Deployer plays in a ZenML Stack
- Holds all the stack-related configuration attributes required to interact with the remote model serving tool, service, or platform (e.g. hostnames, URLs, references to credentials, other client-related configuration parameters). Examples of configuring the MLflow and Seldon Core Model Deployers and registering them as Stack components are shown after this list.
- Implements the continuous deployment logic necessary to deploy models in a way that updates an existing model server that is already serving a previous version of the same model, instead of creating a new model server for every new model version. Every model server that the Model Deployer provisions externally to deploy a model is represented internally as a Service object that may be accessed for visibility and control over a single model deployment. This functionality can be consumed directly from ZenML pipeline steps, but it can also be used outside the pipeline to deploy ad-hoc models. An example of using the Seldon Core Model Deployer to deploy a model inside a ZenML pipeline step is sketched after this list.
- Acts as a registry for all Services that represent remote model servers. External model deployment servers can be listed and filtered using a variety of criteria, such as the name of the model or the names of the pipeline and step that were used to deploy the model. The Service objects returned by the Model Deployer can be used to interact with the remote model server, e.g. to get the operational status of a model server, the prediction URI that it exposes, or to stop or delete a model server, as shown in the last example after this list.
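To illustrate the first point, here is a minimal sketch of registering the local MLflow Model Deployer and a Seldon Core Model Deployer with the ZenML CLI and adding one of them to a stack. The Seldon Core attributes (kubernetes_context, kubernetes_namespace, base_url) and all placeholder values are illustrative and may differ between ZenML versions and setups:

```shell
# Local MLflow model deployer: install the integration and register the flavor.
zenml integration install mlflow -y
zenml model-deployer register mlflow_deployer --flavor=mlflow

# Seldon Core model deployer: points at a remote Kubernetes cluster.
# Replace the placeholder values with your own cluster details.
zenml integration install seldon -y
zenml model-deployer register seldon_deployer --flavor=seldon \
    --kubernetes_context=my-k8s-context \
    --kubernetes_namespace=zenml-workloads \
    --base_url=http://my-seldon-ingress-host

# Register a stack that uses one of the deployers alongside the default
# orchestrator and artifact store, and activate it.
zenml stack register my_stack -o default -a default -d mlflow_deployer --set
```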
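To illustrate the second point, here is a rough sketch of a continuous deployment pipeline that hands a freshly trained model to the Seldon Core Model Deployer from inside a pipeline step. The training step is a hypothetical placeholder, and the exact import paths and parameters of seldon_model_deployer_step and SeldonDeploymentConfig may vary between ZenML versions:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from zenml import pipeline, step
from zenml.integrations.seldon.services import SeldonDeploymentConfig
from zenml.integrations.seldon.steps import seldon_model_deployer_step


@step
def train_model() -> LogisticRegression:
    # Stand-in training step; any step that produces a model artifact works.
    X, y = load_iris(return_X_y=True)
    return LogisticRegression(max_iter=200).fit(X, y)


@pipeline
def continuous_deployment_pipeline():
    model = train_model()
    # Hand the trained model over to the Seldon Core model deployer registered
    # in the active stack. If a server for "my-model" already exists, it is
    # updated in place instead of a new one being created.
    seldon_model_deployer_step(
        model=model,
        service_config=SeldonDeploymentConfig(
            model_name="my-model",
            replicas=1,
            implementation="SKLEARN_SERVER",
        ),
    )


if __name__ == "__main__":
    continuous_deployment_pipeline()
```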
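To illustrate the third point, here is a sketch of listing and controlling model servers through the Model Deployer's Service registry. The find_model_server arguments and the Service attributes used below follow the base Model Deployer and Service interfaces, but attribute names may differ slightly between ZenML versions:

```python
from zenml.client import Client

# Get the model deployer registered in the currently active stack.
model_deployer = Client().active_stack.model_deployer

# List model servers deployed by a given pipeline step for a given model.
services = model_deployer.find_model_server(
    pipeline_name="continuous_deployment_pipeline",
    pipeline_step_name="seldon_model_deployer_step",
    model_name="my-model",
)

if services:
    service = services[0]
    # Operational status and the endpoint exposed for inference requests.
    print(service.is_running)
    print(service.prediction_url)
    # Stop the model server again when it is no longer needed.
    service.stop(timeout=60)
```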
Custom pre-processing and post-processing
Pre-processing is the process of transforming the data before it is passed to the machine learning model. Post-processing is the process of transforming the data after it is returned from the machine learning model and before it is returned to the user. Both pre- and post-processing are essential to the model deployment process, since most models require their input in a specific format and their raw output usually needs to be transformed before it is returned to the user. ZenML allows you to define your own pre- and post-processing in two ways:
- At the pipeline level, by defining custom steps before and after the predict step in the ZenML pipeline (see the sketch at the end of this section).
- At the model deployment tool level, by defining custom predict, pre-processing, and post-processing functions that are wrapped in a Docker container and executed on the model deployment server.
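As a sketch of the first (pipeline-level) option, the example below wraps a predict step with explicit pre-processing and post-processing steps. All step names are hypothetical, and the dummy in-step "model" stands in for a call to the deployed model server's prediction endpoint:

```python
import numpy as np
from zenml import pipeline, step


@step
def load_data() -> np.ndarray:
    # Stand-in for a real data ingestion step.
    return np.random.rand(8, 4)


@step
def preprocess(raw: np.ndarray) -> np.ndarray:
    # Transform the raw input into the format the model expects,
    # e.g. scale every feature into the [0, 1] range.
    return (raw - raw.min(axis=0)) / (raw.max(axis=0) - raw.min(axis=0) + 1e-9)


@step
def predict(data: np.ndarray) -> np.ndarray:
    # In a real pipeline this step would send `data` to the deployed model
    # server; a trivial dummy "model" stands in here.
    return data.mean(axis=1)


@step
def postprocess(predictions: np.ndarray) -> list:
    # Map raw model outputs to the labels returned to the user.
    return ["positive" if p > 0.5 else "negative" for p in predictions]


@pipeline
def inference_pipeline():
    raw = load_data()
    data = preprocess(raw)
    predictions = predict(data)
    postprocess(predictions)


if __name__ == "__main__":
    inference_pipeline()
```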