- the Artifact Store is a type of Stack Component that needs to be registered as part of your ZenML Stack.
- the objects circulated through your pipelines are serialized and stored in the Artifact Store using Materializers. Materializers implement the logic required to serialize and deserialize artifact contents, to store them in the Artifact Store, and to retrieve them from it.
- you can access the artifacts produced by your pipeline runs from the Artifact Store using the post-execution workflow API.
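To make the role of a Materializer more concrete, here is a minimal conceptual sketch using only the Python standard library. The class `JsonDictMaterializer`, its `save`/`load` methods, and the file name `data.json` are hypothetical and do not reproduce ZenML's Materializer base class. A local temporary directory stands in for the Artifact Store.

```python
import json
import os
import tempfile
from typing import Any, Dict


class JsonDictMaterializer:
    """Hypothetical stand-in for a Materializer that handles plain dicts."""

    def __init__(self, artifact_uri: str) -> None:
        # Directory inside the (stand-in) Artifact Store reserved for this artifact.
        self.artifact_uri = artifact_uri

    def save(self, data: Dict[str, Any]) -> None:
        """Serialize `data` to JSON and store it under the artifact URI."""
        os.makedirs(self.artifact_uri, exist_ok=True)
        with open(os.path.join(self.artifact_uri, "data.json"), "w") as f:
            json.dump(data, f)

    def load(self) -> Dict[str, Any]:
        """Read the stored JSON back and deserialize it into a dict."""
        with open(os.path.join(self.artifact_uri, "data.json")) as f:
            return json.load(f)


# A temporary directory plays the role of the Artifact Store root.
store_root = tempfile.mkdtemp()
materializer = JsonDictMaterializer(os.path.join(store_root, "step_output"))
materializer.save({"name": "example"})
loaded = materializer.load()
```

The point of the sketch is the round trip: whatever a step returns is written under an artifact URI and can later be reconstructed from that same URI.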
When to use it
The Artifact Store is a mandatory component in the ZenML stack. It is used to store all artifacts produced by pipeline runs, and you are required to configure it in all of your stacks.
Artifact Store Flavors
Out of the box, ZenML comes with a local artifact store already part of the default stack that stores artifacts on your local filesystem. Additional Artifact Stores are provided by integrations:
Artifact Store | Flavor | Integration | URI Schema(s) | Notes |
---|---|---|---|---|
Local | local | built-in | None | This is the default Artifact Store. It stores artifacts on your local filesystem. Should be used only for running ZenML locally. |
Amazon S3 | s3 | s3 | s3:// | Uses AWS S3 as an object store backend |
Google Cloud Storage | gcp | gcp | gs:// | Uses Google Cloud Storage as an object store backend |
Azure | azure | azure | abfs://, az:// | Uses Azure Blob Storage as an object store backend |
Custom Implementation | custom | | custom | Extend the Artifact Store abstraction and provide your own implementation |
Every Artifact Store has a path attribute that must be configured when it is registered with ZenML. This is a URI pointing to the root path where all objects are stored in the Artifact Store. It must use a URI schema that is supported by the Artifact Store flavor. For example, the S3 Artifact Store needs a URI that contains the s3:// schema.
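As an illustration of the schema requirement, the following sketch checks a candidate path against the URI schemas listed in the flavor table. The helper `path_matches_flavor`, the `SUPPORTED_SCHEMAS` mapping, and the bucket and container names are hypothetical. The sketch only mirrors the table and is not how ZenML itself validates paths.

```python
from urllib.parse import urlparse

# URI schemas per flavor, copied from the flavor table above.
SUPPORTED_SCHEMAS = {
    "s3": {"s3"},
    "gcp": {"gs"},
    "azure": {"abfs", "az"},
}


def path_matches_flavor(flavor: str, path: str) -> bool:
    """Return True if `path` uses a URI schema listed for `flavor`."""
    if flavor == "local":
        # The local flavor lists no URI schema, so accept plain filesystem paths.
        return "://" not in path
    # Unknown flavors have no listed schemas and therefore never match.
    return urlparse(path).scheme in SUPPORTED_SCHEMAS.get(flavor, set())


s3_ok = path_matches_flavor("s3", "s3://my-bucket/zenml")      # hypothetical bucket
s3_wrong = path_matches_flavor("s3", "gs://my-bucket/zenml")   # GCS schema on an S3 store
azure_ok = path_matches_flavor("azure", "az://my-container/zenml")
local_ok = path_matches_flavor("local", "/tmp/zenml")
```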
How to use it
The Artifact Store provides low-level object storage services for other ZenML mechanisms. When you develop ZenML pipelines, you normally don’t even have to be aware of its existence or interact with it directly. ZenML provides higher-level APIs that can be used as an alternative to store and access artifacts:
- return one or more objects from your pipeline steps to have them automatically saved in the active Artifact Store as pipeline artifacts.
- use the post-execution workflow API to retrieve pipeline artifacts from the active Artifact Store after a pipeline run is complete.
You will probably need to interact with the low-level Artifact Store API directly:
- if you implement custom Materializers for your artifact data types
- if you want to store custom objects in the Artifact Store
The Artifact Store API
All ZenML Artifact Stores implement the same IO API that resembles a standard file system. This allows you to access and manipulate the objects stored in the Artifact Store in the same manner you would normally handle files on your computer and independently of the particular type of Artifact Store that is configured in your ZenML stack. Accessing the low-level Artifact Store API can be done through the following Python modules:
- zenml.io.fileio provides low-level utilities for manipulating Artifact Store objects (e.g. open, copy, rename, remove, mkdir). These functions work seamlessly across Artifact Store types. They have the same signature as the Artifact Store abstraction methods (in fact, they are one and the same under the hood).
- zenml.utils.io_utils includes some higher-level helper utilities that make it easier to find and transfer objects between the Artifact Store and the local filesystem or memory.
You can use the Repository singleton to retrieve the root path of the active Artifact Store and then use it as a base path for artifact URIs.
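The idea of using the root path as a base for artifact URIs can be sketched as follows. The helper `build_artifact_uri` and the example root `s3://my-bucket/zenml` are hypothetical. In a real stack the root would come from the active Artifact Store (assumed here to be available as `Repository().active_stack.artifact_store.path`, which may differ between ZenML versions).

```python
import posixpath


def build_artifact_uri(root_path: str, *parts: str) -> str:
    """Join path segments onto the Artifact Store root path.

    posixpath is used so the result always contains forward slashes,
    which is what remote URIs such as s3:// or gs:// expect.
    """
    return posixpath.join(root_path, *parts)


# Hypothetical root path; with ZenML this would be read from the active stack.
root_path = "s3://my-bucket/zenml"

artifact_dir = build_artifact_uri(root_path, "artifacts", "examples")
artifact_uri = build_artifact_uri(artifact_dir, "test.txt")
```

Keeping every URI under the root path means the same code keeps working when the stack switches to a different Artifact Store flavor.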
Typical operations performed through the Artifact Store API include:
- creating folders, writing and reading data directly to/from an artifact store object
- using a temporary local file/folder to serialize and copy in-memory objects to/from the artifact store (heavily used in Materializers to transfer information between the Artifact Store and external libraries that don’t support writing/reading directly to/from the artifact store backend)
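Both operations can be sketched with the standard library alone. In this sketch a local temporary directory stands in for the Artifact Store root, and `os`, the built-in `open`, and `shutil` stand in for `zenml.io.fileio`. The comments name the `fileio` functions that would take their place in a ZenML setup. The file names and the dummy payload are illustrative only.

```python
import json
import os
import shutil
import tempfile

# Stand-in for the root path of the active Artifact Store.
root_path = tempfile.mkdtemp()
artifact_dir = os.path.join(root_path, "artifacts", "examples")

# 1. Create a folder, then write and read data directly.
os.makedirs(artifact_dir, exist_ok=True)   # with ZenML: fileio.makedirs(artifact_dir)
text_uri = os.path.join(artifact_dir, "test.txt")
with open(text_uri, "w") as f:             # with ZenML: fileio.open(text_uri, "w")
    f.write("example artifact")
with open(text_uri, "r") as f:             # with ZenML: fileio.open(text_uri, "r")
    contents = f.read()

# 2. Serialize to a temporary local file, then copy it into the store.
#    json.dump plays the role of an external library that can only
#    write to the local filesystem.
json_uri = os.path.join(artifact_dir, "test.json")
with tempfile.TemporaryDirectory() as tmp_dir:
    local_path = os.path.join(tmp_dir, "test.json")
    with open(local_path, "w") as f:
        json.dump({"example": "artifact"}, f)
    shutil.copy(local_path, json_uri)      # with ZenML: fileio.copy(local_path, json_uri)

# The temporary folder is gone now, but the copy in the store remains.
with open(json_uri) as f:
    restored = json.load(f)
```

The temporary-file route costs one extra copy, but it lets any library that only understands local paths work with every Artifact Store flavor.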