composer.datasets.cifar10

Functions

dataclass

Returns the same class as was passed in, with dunder methods added based on the fields defined in the class.
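
As a quick illustration of what the decorator does, the sketch below assumes this entry is the standard-library dataclasses.dataclass re-exported by the module (its description matches the standard-library docstring). The Point class is a hypothetical example and is not part of Composer.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:  # hypothetical example class, not part of Composer
    x: int
    y: int = 0


p = Point(1)

# __init__, __repr__, and __eq__ were generated from the declared fields.
print(p)
print(p == Point(1, 0))

# The declared fields can be inspected at runtime.
print([f.name for f in fields(Point)])
```

The decorated class is still Point itself; only the generated methods are added to it.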

Classes

CIFAR10

CIFAR10 Dataset.

DataLoader

Protocol for custom DataLoaders compatible with torch.utils.data.DataLoader.

SyntheticBatchPairDataset

Emulates a dataset of provided size and shape.

Hparams

These classes are used with yahp for YAML-based configuration.

CIFAR10DatasetHparams

Defines an instance of the CIFAR-10 dataset for image classification.

DataloaderHparams

Hyperparameters to initialize a Dataloader.

DatasetHparams

Abstract base class for hyperparameters to initialize a dataset.

SyntheticHparamsMixin

Synthetic dataset parameter mixin for DatasetHparams.
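
To make the idea of YAML-based configuration concrete, here is a minimal sketch of how a mapping of keys (such as a YAML parser would produce) can populate a dataclass-style hparams object. ExampleCIFAR10Hparams and from_mapping are hypothetical stand-ins written for illustration. They mirror a few fields from the CIFAR10DatasetHparams signature on this page, but they are not the Composer class or the yahp API, and the datadir value is an arbitrary placeholder path.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class ExampleCIFAR10Hparams:
    """Hypothetical stand-in; NOT the Composer class."""
    use_synthetic: bool = False
    is_train: bool = True
    drop_last: bool = True
    shuffle: bool = True
    datadir: Optional[str] = None
    download: bool = True


def from_mapping(data: Dict[str, Any]) -> ExampleCIFAR10Hparams:
    """Build the stand-in from parsed config, rejecting unknown keys."""
    known = {f.name for f in fields(ExampleCIFAR10Hparams)}
    unknown = set(data) - known
    if unknown:
        raise ValueError(f"Unknown hparams keys: {sorted(unknown)}")
    return ExampleCIFAR10Hparams(**data)


# Equivalent to a YAML document with the keys is_train, shuffle, and datadir.
parsed = {"is_train": False, "shuffle": False, "datadir": "/tmp/cifar10"}
hparams = from_mapping(parsed)
print(hparams)
```

Keys that are left out keep their defaults, so a validation config only needs to override the fields that differ from training.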

class composer.datasets.cifar10.CIFAR10DatasetHparams(use_synthetic=False, synthetic_num_unique_samples=100, synthetic_device='cpu', synthetic_memory_format=MemoryFormat.CONTIGUOUS_FORMAT, is_train=True, drop_last=True, shuffle=True, datadir=None, download=True)

Bases: composer.datasets.hparams.DatasetHparams, composer.datasets.hparams.SyntheticHparamsMixin

Defines an instance of the CIFAR-10 dataset for image classification.

Parameters
  • use_synthetic (bool, optional) – Whether to use synthetic data. (Default: False)

  • synthetic_num_unique_samples (int, optional) – The number of unique samples to allocate memory for. Ignored if use_synthetic is False. (Default: 100)

  • synthetic_device (str, optional) – The device on which to store the sample pool. Set to cuda to store samples on the GPU and eliminate PCIe traffic from the dataloader. Set to cpu to move data between host memory and the device on every batch. Ignored if use_synthetic is False. (Default: cpu)

  • synthetic_memory_format (MemoryFormat, optional) – The MemoryFormat to use. Ignored if use_synthetic is False. (Default: CONTIGUOUS_FORMAT)

  • datadir (str) – The path to the data directory.

  • is_train (bool) – Whether to load the training data (the default) or validation data.

  • drop_last (bool) – If the number of samples is not divisible by the batch size, whether to drop the last batch (the default) or pad the last batch with zeros.

  • shuffle (bool) – Whether to shuffle the dataset. (Default: True)

  • download (bool) – Whether to download the dataset, if needed.
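
The effect of drop_last on the number of batches per epoch can be checked with a little arithmetic. The sketch below is plain Python for a single process, not Composer code: num_batches is a hypothetical helper, the batch size of 128 is an arbitrary choice, and 50,000 is the size of the standard CIFAR-10 training split.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Batches per epoch (hypothetical helper, not part of Composer)."""
    if drop_last:
        # The trailing partial batch is discarded.
        return num_samples // batch_size
    # The trailing partial batch is kept as an extra batch.
    return math.ceil(num_samples / batch_size)


train_samples = 50_000  # standard CIFAR-10 training split

print(num_batches(train_samples, 128, drop_last=True))   # 390
print(num_batches(train_samples, 128, drop_last=False))  # 391
print(train_samples % 128)  # 80 samples fall into the partial batch
```

With drop_last enabled, those 80 leftover samples are skipped in that epoch; with shuffling enabled, a different set is skipped each time.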

initialize_object(batch_size, dataloader_hparams)

Creates a DataLoader or DataSpec for this dataset.

Parameters
  • batch_size (int) – The size of the batch the dataloader should yield. This batch size is device-specific and already incorporates the world size.

  • dataloader_hparams (DataloaderHparams) – The dataset-independent hparams for the dataloader.

Returns
  • DataLoader or DataSpec – The dataloader, or, if the dataloader yields batches of custom types, a DataSpec.
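
The note that batch_size is device-specific can be illustrated with a short calculation. The sketch below assumes that a global batch size is split evenly across devices, which is one reading of "already incorporates the world size". per_device_batch_size is a hypothetical helper, not a Composer function, and the numbers are arbitrary.

```python
def per_device_batch_size(global_batch_size: int, world_size: int) -> int:
    """Hypothetical helper: split a global batch evenly across devices."""
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if global_batch_size % world_size != 0:
        raise ValueError("global batch size must be divisible by world_size")
    return global_batch_size // world_size


# A global batch of 1024 samples on 8 devices gives 128 samples per device.
print(per_device_batch_size(1024, world_size=8))  # 128
```

Under this assumption, the value passed to initialize_object would be the per-device result (128 here), not the global total.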