When would you want to use it?
Running ZenML pipelines with the local Artifact Store is usually sufficient if you just want to evaluate ZenML or get started quickly without incurring the trouble and the cost of employing cloud storage services in your stack. However, the local Artifact Store becomes insufficient or unsuitable if you have more elaborate needs for your project:
- if you want to share your pipeline run results with other team members or stakeholders inside or outside your organization
- if you have other components in your stack that are running remotely (e.g. a Kubeflow or Kubernetes Orchestrator running in a public cloud)
- if you outgrow what your local machine can offer in terms of storage space and need to use some form of private or public storage service that is shared with others
- if you are running pipelines at scale and need an Artifact Store that can handle the demands of production grade MLOps
How do you deploy it?
The S3 Artifact Store flavor is provided by the S3 ZenML integration. You need to install it on your local machine to be able to register an S3 Artifact Store and add it to your stack.
The Artifact Store is pointed at an S3 bucket through a URI of the form s3://bucket-name. Please read the documentation relevant to the S3 service that you are using on how to create an S3 bucket. For example, see the AWS S3 documentation.
With the URI to your S3 bucket known, registering an S3 Artifact Store and using it in a stack can be done as follows:
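A minimal sketch of these steps with the ZenML CLI is shown below. The component name s3_store, the stack name custom_stack, and the use of the default orchestrator are placeholder assumptions rather than values from this guide. Depending on your ZenML version, the stack may need further components.

```shell
# Placeholder values (assumptions for illustration); substitute your own.
BUCKET_URI="s3://bucket-name"
STORE_NAME="s3_store"
STACK_NAME="custom_stack"

# Install the S3 integration on the local machine.
zenml integration install s3 -y

# Register the Artifact Store and point it at the bucket.
zenml artifact-store register "$STORE_NAME" -f s3 --path="$BUCKET_URI"

# Register a stack that uses the new Artifact Store and make it the active stack.
# The "default" orchestrator is an assumption; add any other components your stack needs.
zenml stack register "$STACK_NAME" -o default -a "$STORE_NAME" --set
```

Keeping the bucket URI and component names in variables makes it easier to reuse the same commands for several environments.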
Authentication Methods
Integrating and using an S3 compatible Artifact Store in your pipelines is not possible without employing some form of authentication. ZenML currently provides three options for configuring S3 credentials, the recommended one being to use a Secrets Manager in your stack to store the sensitive information in a secure location. The three options are Implicit Authentication, Explicit Credentials, and Secrets Manager (Recommended).
Implicit Authentication
This method uses the implicit AWS authentication available in the environment where the ZenML code is running. On your local machine, this is the quickest way to configure an S3 Artifact Store. You don’t need to supply credentials explicitly when you register the S3 Artifact Store, as it leverages the local credentials and configuration that the AWS CLI stores on your local machine. However, you will need to install and set up the AWS CLI on your machine as a prerequisite, as covered in the AWS CLI documentation, before you register the S3 Artifact Store.
The implicit authentication method needs to be coordinated with other stack components that are highly dependent on the Artifact Store and need to interact with it directly to function. If these components are not running on your machine, they do not have access to the local AWS CLI configuration and will encounter authentication failures while trying to access the S3 Artifact Store:
- Orchestrators need to access the Artifact Store to manage pipeline artifacts
- Step Operators need to access the Artifact Store to manage step level artifacts
- Model Deployers need to access the Artifact Store to load served models
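If these remote components run in AWS, they can be given access through IAM roles instead of local credentials:
- on EC2 instances, see the IAM Roles for Amazon EC2 guide
- on EKS clusters, see the Amazon EKS cluster IAM role guide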
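For the implicit method on a local machine, it can help to confirm that the AWS CLI resolves working credentials before registering the Artifact Store. The commands below are a sketch that assumes the AWS CLI is installed and configured. The bucket URI is a placeholder.

```shell
# Placeholder bucket URI (assumption for illustration); substitute your own.
BUCKET_URI="s3://bucket-name"

# Show which credentials, profile and region the AWS CLI resolves locally.
aws configure list

# Confirm that the resolved credentials are valid.
aws sts get-caller-identity

# Confirm that the credentials can list the bucket the Artifact Store will use.
aws s3 ls "$BUCKET_URI"
```

If these commands succeed locally but a remote Orchestrator, Step Operator or Model Deployer still fails to reach the bucket, the likely cause is that the remote environment has no access to the same local configuration.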
Advanced Configuration
The S3 Artifact Store accepts a range of advanced configuration options that can be used to further customize how ZenML connects to the S3 storage service that you are using. These are accessible via the client_kwargs, config_kwargs and s3_additional_kwargs configuration attributes and are passed transparently to the underlying S3Fs library:
- client_kwargs: arguments that will be transparently passed to the botocore client. You can use it to configure parameters like endpoint_url and region_name when connecting to an S3 compatible endpoint (e.g. Minio).
- config_kwargs: advanced parameters passed to botocore.client.Config.
- s3_additional_kwargs: advanced parameters that are used when calling S3 API, typically used for things like ServerSideEncryption and ACL.
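As an illustration of the attributes above, the sketch below registers an Artifact Store against an S3 compatible endpoint. The endpoint URL, region, bucket name and component name are hypothetical. The sketch assumes that each attribute is passed as a JSON string through a CLI option of the same name, which may differ across ZenML versions.

```shell
# Hypothetical endpoint, region and bucket (assumptions for illustration).
BUCKET_URI="s3://minio-bucket"
CLIENT_KWARGS='{"endpoint_url": "http://minio.cluster.local:9000", "region_name": "us-east-1"}'
CONFIG_KWARGS='{"signature_version": "s3v4"}'

# client_kwargs is forwarded to the botocore client, config_kwargs to botocore.client.Config.
zenml artifact-store register minio_store -f s3 \
    --path="$BUCKET_URI" \
    --client_kwargs="$CLIENT_KWARGS" \
    --config_kwargs="$CONFIG_KWARGS"
```

Single quotes around the JSON values keep the inner double quotes intact when the shell expands the variables.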