🛕 Artifact Logging#

Composer supports uploading artifacts, such as checkpoints and profiling traces, directly to third-party experiment trackers (e.g. Weights & Biases) and cloud storage backends (e.g. AWS S3).

What is an artifact?#

An artifact is a file generated during training. Checkpoints, profiling traces, and log files are the most common examples of artifacts. An artifact must be a single, local file. Collections of files can be combined into a single tarball, and the file may live in a temporary folder, since logger destinations copy or upload it when it is logged.

Each artifact must have a name, which is independent of the artifact's local filepath. A remote backend that logs an artifact is responsible for storing and organizing the file by the artifact's name. An artifact with the same name should overwrite a previous artifact with that name. It is recommended that artifact names include file extensions.
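For example, a folder of evaluation outputs could be combined into one tarball and logged under a single name. The snippet below is a minimal sketch: the folder path, tarball path, and artifact name are invented for illustration, and trainer is assumed to be an already-constructed Trainer (the file_artifact() call is covered under Logging custom artifacts below).

import tarfile

from composer.loggers import LogLevel

# Combine a folder of files into a single tarball, since each artifact
# must be a single local file. These paths are illustrative.
with tarfile.open('/tmp/eval_outputs.tar.gz', 'w:gz') as tar:
    tar.add('/path/to/eval_outputs', arcname='eval_outputs')

# Log the tarball under a name that includes its file extension.
# Here, trainer is an already-constructed Trainer.
trainer.logger.file_artifact(
    log_level=LogLevel.FIT,
    artifact_name='eval_outputs.tar.gz',
    file_path='/tmp/eval_outputs.tar.gz',
)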

How are artifacts generated?#

In Composer, individual classes, such as algorithms, callbacks, loggers, and profiler trace handlers, can generate artifacts.

Once an artifact file has been written to disk, the class should call file_artifact(), and the centralized Logger will then pass the filepath and artifact name to all LoggerDestinations, which are ultimately responsible for uploading and storing artifacts (more on that below).
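For example, a custom callback might write a small file at the end of each epoch and hand it off via file_artifact(). The following is a minimal sketch, not a built-in Composer callback; the class name, file contents, and artifact name are invented for illustration.

import os
import tempfile

from composer.core import Callback, State
from composer.loggers import Logger, LogLevel


class MetricsDumpCallback(Callback):
    """Hypothetical callback that writes a text file each epoch and logs it as an artifact."""

    def epoch_end(self, state: State, logger: Logger) -> None:
        # Write the file to a temporary folder; the logged copy is what persists.
        file_path = os.path.join(tempfile.mkdtemp(), 'metrics.txt')
        with open(file_path, 'w') as f:
            f.write(f'epoch: {state.timestamp.epoch.value}\n')

        # The centralized Logger passes the filepath and artifact name to all
        # LoggerDestinations, which store the file under that name.
        logger.file_artifact(
            log_level=LogLevel.EPOCH,
            artifact_name=f'metrics/ep{state.timestamp.epoch.value}-metrics.txt',
            file_path=file_path,
        )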

Below are some examples of the classes that generate artifacts and the types of artifacts they generate. For each class, see the linked API Reference for additional documentation.

| Type          | Class Name                 | Description of Generated Artifacts   |
|---------------|----------------------------|---------------------------------------|
| Callback      | CheckpointSaver            | Training checkpoint files             |
| Callback      | ExportForInferenceCallback | Trained models in inference formats   |
| Callback      | MLPerfCallback             | MLPerf submission files               |
| Logger        | FileLogger                 | Log files                             |
| Logger        | TensorboardLogger          | Tensorboard TF event files            |
| Trace Handler | JSONTraceHandler           | Profiler trace files                  |

Logging custom artifacts#

It is also possible to log custom artifacts outside of an algorithm or callback. For example:

from composer import Trainer
from composer.loggers import LogLevel

# Construct the trainer
trainer = Trainer(...)

# Log a custom artifact, such as a configuration YAML
trainer.logger.file_artifact(
    log_level=LogLevel.FIT,
    artifact_name='hparams.yaml',
    file_path='/path/to/hparams.yaml',
)

# Train!
trainer.fit()

How are artifacts uploaded?#

To store artifacts, the loggers argument to the Trainer constructor must include a LoggerDestination that implements the log_file_artifact() method.

See also

The built-in WandBLogger and ObjectStoreLogger implement this method – see the examples below.

The centralized Composer Logger will invoke this method for all LoggerDestinations. If no LoggerDestination implements this method, then artifacts will not be stored remotely.
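For instance, more than one storing destination can be passed at once, and each will receive every logged artifact. The snippet below is a sketch that combines the two built-in destinations from the Examples section; the bucket name is a placeholder.

from composer import Trainer
from composer.loggers import ObjectStoreLogger, WandBLogger
from composer.utils.object_store import S3ObjectStore

# Both destinations implement log_file_artifact(), so every artifact is
# uploaded to W&B and to the S3 bucket.
trainer = Trainer(
    ...,
    loggers=[
        WandBLogger(log_artifacts=True),
        ObjectStoreLogger(
            object_store_cls=S3ObjectStore,
            object_store_kwargs={'bucket': 'my-bucket-name'},  # placeholder bucket
        ),
    ],
)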

Because LoggerDestinations can both generate and store artifacts, there is a potential for a circular dependency. As such, it is important that any logger that generates artifacts (e.g. the TensorboardLogger) does not also attempt to store artifacts. Otherwise, you could run into an infinite loop!

Where can I store artifacts?#

Composer includes two built-in LoggerDestinations to store artifacts:

  • The WandBLogger can upload Composer training artifacts as W & B Artifacts, which are associated with the corresponding W & B project.

  • The ObjectStoreLogger can upload Composer training artifacts to any cloud storage backend or remote filesystem. We include integrations for AWS S3 and SFTP (see the examples below), and you can write your own integration for a custom backend.

Why should I use artifact logging instead of uploading artifacts manually?#

Artifact logging in Composer is optimized for efficiency. File uploads happen in background threads or processes, so the training loop is never blocked on network I/O. In other words, training can proceed to the next batch while the previous checkpoint uploads in the background.

Examples#

Below are some examples of how to configure Composer to log artifacts to various backends:

Weights & Biases Artifacts#

See also

The WandBLogger API Reference.

from composer.loggers import WandBLogger
from composer import Trainer

# Configure the logger
logger = WandBLogger(
    log_artifacts=True,  # enable artifact logging
)

# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()

S3 Objects#

To log artifacts to an S3 bucket, we'll need to configure the ObjectStoreLogger with the S3ObjectStore backend.

See also

The ObjectStoreLogger and S3ObjectStore API Reference.

from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import S3ObjectStore
from composer import Trainer

# Configure the logger
logger = ObjectStoreLogger(
    object_store_cls=S3ObjectStore,
    object_store_kwargs={
        # Keyword arguments for the S3ObjectStore constructor.
        # See the API reference for all available arguments
        'bucket': 'my-bucket-name',
    },
)

# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()

SFTP Filesystem#

Similar to the S3 example above, we can log artifacts to a remote SFTP filesystem.

See also

The ObjectStoreLogger and SFTPObjectStore API Reference.

from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import SFTPObjectStore
from composer import Trainer

# Configure the logger
logger = ObjectStoreLogger(
    object_store_cls=SFTPObjectStore,
    object_store_kwargs={
        # Keyword arguments for the SFTPObjectStore constructor.
        # See the API reference for all available arguments
        'host': 'sftp_server.example.com',
    },
)

# Define the trainer
trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()