How to deploy models to Kubernetes with Seldon Core
The target cluster can be set with the --kubernetes-context command line argument when registering the model deployer. This Kubernetes context needs to point to the Kubernetes cluster where the Seldon Core model servers will be deployed. If the context is not explicitly supplied, it defaults to the locally active context.
The name of a ZenML secret can be passed with the --secret argument to the CLI command used to register the model deployer. We've already done the latter; all that is left now is to configure the s3-store ZenML secret specified before as a Seldon Model Deployer configuration attribute, with the credentials needed by Seldon Core to access the artifact store.
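As a rough illustration, registering the model deployer with both arguments could look something like the sketch below; the component name and context value are placeholders, and the exact flag spelling should be checked against zenml model-deployer register --help for your ZenML version.

```shell
# Illustrative only: register a Seldon Core model deployer that targets a
# specific Kubernetes context and reads artifact store credentials from the
# s3-store secret. Names and flag spellings are placeholders to adapt.
zenml model-deployer register seldon_deployer \
    --flavor=seldon \
    --kubernetes-context=my-k8s-context \
    --secret=s3-store
```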
The Seldon Core integration provides built-in secret schemas that can be used to configure credentials for the three main types of Artifact Stores supported by ZenML: you can use seldon_s3 for AWS S3, seldon_gs for GCS and seldon_az for Azure. To read more about secrets, secret schemas and how they are used in ZenML, please refer to the Secrets Manager documentation.
The following is an example of registering an S3 secret with the Seldon Core model deployer:
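The sketch below is illustrative only: it assumes the seldon_s3 secret schema and a Secrets Manager in the active stack, and the exact CLI syntax and key names depend on the ZenML version in use, so treat all values as placeholders.

```shell
# Illustrative sketch: register an s3-store secret using the seldon_s3 secret
# schema. The key names are placeholders; the exact fields are defined by the
# seldon_s3 schema and the CLI syntax may differ between ZenML versions.
zenml secrets-manager secret register -s seldon_s3 s3-store \
    --rclone_config_s3_type="s3" \
    --rclone_config_s3_provider="aws" \
    --rclone_config_s3_access_key_id="<AWS_ACCESS_KEY_ID>" \
    --rclone_config_s3_secret_access_key="<AWS_SECRET_ACCESS_KEY>"
```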
Within the SeldonDeploymentConfig you can configure the following (a configuration sketch follows the list):
- model_name: the name of the model in the Seldon Core cluster and in ZenML.
- replicas: the number of replicas with which to deploy the model.
- implementation: the type of Seldon inference server to use for the model. The implementation type can be one of the following: TENSORFLOW_SERVER, SKLEARN_SERVER, XGBOOST_SERVER, custom.
- resources: the resources to be allocated to the model. This can be configured by passing a dictionary with the requests and limits keys; each of these takes a dictionary with cpu and memory keys, whose values are strings specifying the amount of CPU and memory to allocate to the model.
The custom_predict function should take the model and the input data as arguments and return the output data. ZenML will take care of loading the model into memory, starting the seldon-core-microservice that will be responsible for serving the model, and running the predict function.
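A minimal sketch of what such a function could look like follows; the exact signature expected by your ZenML version may differ, and the model's predict call is a placeholder for whatever inference API your framework exposes.

```python
# Illustrative sketch of a custom prediction function. model.predict() stands
# in for whatever inference call your framework actually provides.
from typing import Any, Dict, List, Union

Array_Like = Union[List[Any], Dict[str, Any]]


def custom_predict(model: Any, request: Array_Like) -> Array_Like:
    """Turn an inference request into predictions using the loaded model."""
    # ZenML has already loaded the model and started the
    # seldon-core-microservice; this function only runs the prediction.
    predictions = model.predict(request)
    return predictions
```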
The path to this custom predict function can then be passed to the custom deployment parameters.
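The reference is typically a dotted module path to the function, as sketched below; the parameter name and the plain dict container are assumptions for illustration, not the exact ZenML API.

```python
# Illustrative sketch only: the custom predict function is referenced by its
# dotted module path. The dict below stands in for whatever custom deployment
# parameter object your ZenML version expects.
custom_deploy_parameters = {
    # <package>.<module>.<function> pointing at the custom_predict defined above
    "predict_function": "steps.custom_predict_function.custom_predict",
}
```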