Custom Models and Datasets

The MosaicML Trainer can easily be extended to use your own models and datasets. We walk through two ways to get started experimenting with our algorithms on your own machine learning projects.

See also

The composer.functional API can be used to call the efficiency methods directly inside your own training loop. The Trainer described here adds only a minimal level of overhead in exchange for composability and configuration management.
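For example, a single method can be applied to your model ahead of your usual training loop (a minimal sketch; apply_blurpool is one such functional method, though the available set depends on your Composer version):

import torch
import composer.functional as cf

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3, stride=2),
    torch.nn.ReLU(),
)

# Swap eligible strided convolutions for BlurPool equivalents in-place;
# the rest of the training loop is unchanged.
cf.apply_blurpool(model)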

Models

Models provided to the Trainer must implement the minimal interface defined in BaseMosaicModel:

class BaseMosaicModel(torch.nn.Module, ABC):

    def forward(self, batch: Batch) -> Tensors:
        """Computes the forward pass given a batch of data."""

    def loss(self, outputs: Any, batch: Batch) -> Tensors:
        """Given the outputs from forward and the batch, returns the loss."""

    def metrics(self, train: bool = False) -> Metrics:
        """Returns a collection of `torchmetrics`."""

    def validate(self, batch: Batch) -> Tuple[Any, Any]:
        """Runs validation and returns a tuple of results that are
        then passed to self.metrics."""
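To make the interface concrete, a direct implementation for a tiny classifier might look like the following sketch (illustrative only; in practice, the task-specific base classes described below handle this boilerplate for you):

from typing import Any, Tuple

import torch
import torch.nn.functional as F
from torchmetrics import Accuracy

from composer.models import BaseMosaicModel

class TinyClassifier(BaseMosaicModel):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(28 * 28, 10)

    def forward(self, batch):
        # For vision tasks, batch is an (input, target) tuple (see the note below).
        inputs, _ = batch
        return self.net(torch.flatten(inputs, start_dim=1))

    def loss(self, outputs, batch):
        _, targets = batch
        return F.cross_entropy(outputs, targets)

    def metrics(self, train: bool = False):
        # Newer torchmetrics versions require Accuracy(task="multiclass", num_classes=10).
        return Accuracy()

    def validate(self, batch) -> Tuple[Any, Any]:
        # Return (outputs, targets); the trainer passes these to self.metrics.
        _, targets = batch
        return self.forward(batch), targets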

Note

The Batch is the data returned from your dataloader. Since our algorithms need to know the structure of Batch in order to apply themselves (e.g., augmentations must access the inputs), we currently support two types of Batch: a tuple of (input, target) tensors, and a Dict[str, Tensor], typically used for NLP applications.
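Concretely, the two supported batch structures look like this (illustrative tensors only):

import torch

# Vision-style batch: a tuple of (input, target) tensors.
image_batch = (torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,)))

# NLP-style batch: a Dict[str, Tensor], as produced by most tokenizers.
text_batch = {
    "input_ids": torch.randint(0, 30000, (16, 128)),
    "attention_mask": torch.ones(16, 128, dtype=torch.long),
}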

For convenience, we’ve provided a few base classes that are task-specific:

  • Classification: MosaicClassifier. Uses cross entropy loss and torchmetrics.Accuracy.

  • Transformers: MosaicTransformer. For use with HuggingFace Transformers.

  • Segmentation: UNet. Uses a combined Dice and cross entropy loss.

In this tutorial, we start with a simple image classification model:

import torch
import composer

class SimpleModel(composer.models.MosaicClassifier):
    def __init__(self, num_hidden: int, num_classes: int):
        # A minimal model: flatten MNIST images, then two linear layers.
        module = torch.nn.Sequential(
            torch.nn.Flatten(start_dim=1),
            torch.nn.Linear(28 * 28, num_hidden),
            torch.nn.Linear(num_hidden, num_classes),
        )
        super().__init__(module=module)
        self.num_classes = num_classes
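You can sanity-check the model on a dummy MNIST-shaped batch (this assumes, per the note above, that the classifier's forward unpacks an (input, target) tuple):

model = SimpleModel(num_hidden=128, num_classes=10)
dummy_batch = (torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))
logits = model(dummy_batch)
print(logits.shape)  # expected: torch.Size([8, 10])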

Datasets

Provide the trainer with your torch.utils.data.Dataset by configuring a DataloaderSpec for both the train and validation datasets. Here, we create DataloaderSpecs for the MNIST dataset:

from composer import DataloaderSpec
from torchvision import datasets, transforms

train_dataloader_spec = DataloaderSpec(
    dataset=datasets.MNIST('/datasets/', train=True, transform=transforms.ToTensor(), download=True),
    drop_last=False,
    shuffle=True,
)

eval_dataloader_spec = DataloaderSpec(
    dataset=datasets.MNIST('/datasets/', train=False, transform=transforms.ToTensor()),
    drop_last=False,
    shuffle=False,
)

Trainer init

Now that your dataset and model are ready, you can initialize the Trainer and train your model with our algorithms.

from composer import Trainer
from composer.algorithms import LabelSmoothing, CutOut

trainer = Trainer(
    model=SimpleModel(num_hidden=128, num_classes=10),
    train_dataloader_spec=train_dataloader_spec,
    eval_dataloader_spec=eval_dataloader_spec,
    max_epochs=3,
    train_batch_size=256,
    eval_batch_size=256,
    algorithms=[
        CutOut(n_holes=1, length=10),
        LabelSmoothing(alpha=0.1),
    ]
)

trainer.fit()

Trainer with YAHP

Integrating your models and datasets with yahp.hparams enables configuration via YAML files or command-line flags. We recommend this approach for experiments and large-scale runs to ensure reproducibility.

First, create Hparams dataclasses for both your model and your dataset:

from dataclasses import dataclass

import torchvision
import yahp as hp
from torchvision import transforms

from composer import DataloaderSpec, datasets, models

@dataclass
class MyModelHparams(models.ModelHparams):

    num_hidden: int = hp.optional(doc="num hidden features", default=128)
    num_classes: int = hp.optional(doc="num of classes", default=10)

    def initialize_object(self):
        return SimpleModel(
            num_hidden=self.num_hidden,
            num_classes=self.num_classes
        )

@dataclass
class MNISTHparams(datasets.DatasetHparams):
    is_train: bool = hp.required("whether to load the training or validation dataset")
    datadir: str = hp.required("data directory")
    download: bool = hp.required("whether to download the dataset, if needed")
    drop_last: bool = hp.optional("Whether to drop the last samples for the last batch", default=True)
    shuffle: bool = hp.optional("Whether to shuffle the dataset for each epoch", default=True)

    def initialize_object(self) -> DataloaderSpec:
        transform = transforms.Compose([transforms.ToTensor()])
        dataset = torchvision.datasets.MNIST(
            self.datadir,
            train=self.is_train,
            download=self.download,
            transform=transform,
        )
        return DataloaderSpec(
            dataset=dataset,
            drop_last=self.drop_last,
            shuffle=self.shuffle,
        )
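
Before registering these with the trainer, you can sanity-check an hparams class by constructing it directly and building its DataloaderSpec (hypothetical values):

train_spec = MNISTHparams(
    is_train=True,
    datadir='/datasets/',
    download=True,
).initialize_object()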

Then, we can register them with the trainer:

from composer.trainer import TrainerHparams

TrainerHparams.register_class(
    field='model',
    register_class=MyModelHparams,
    class_key='my_model'
)

dataset_args = {
    'register_class': MNISTHparams,
    'class_key': 'my_mnist',
}
TrainerHparams.register_class(
    field='train_dataset',
    **dataset_args
)
TrainerHparams.register_class(
    field='val_dataset',
    **dataset_args
)

Your registered dataset and model are now available for invocation, either in a YAML file:

model:
  my_model:
    num_classes: 10
    num_hidden: 128

or via the command line, e.g.

python examples/run_mosaic_trainer.py -f my_config.yaml --model my_model --num_classes 10 --num_hidden 128
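
If you drive training from your own script instead, the same configuration can be loaded programmatically (a sketch following the yahp Hparams pattern; exact entry points may differ across Composer versions):

from composer.trainer import Trainer, TrainerHparams

# Parse the YAML config into a validated TrainerHparams instance.
hparams = TrainerHparams.create('my_config.yaml')

# Build the trainer from the configuration and run training.
trainer = Trainer.create_from_hparams(hparams)
trainer.fit()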