composer.datasets

Modules

composer.datasets.ade20k

ADE20K semantic segmentation and scene parsing dataset.

composer.datasets.brats

BraTS (Brain Tumor Segmentation) dataset.

composer.datasets.c4

C4 (Colossal Cleaned CommonCrawl) dataset.

composer.datasets.cifar

CIFAR image classification dataset.

composer.datasets.coco

COCO (Common Objects in Context) dataset.

composer.datasets.dataloader

Common settings across both the training and eval datasets.

composer.datasets.dataset_registry

Mapping between dataset names and corresponding HParams classes.

composer.datasets.evaluator

Specifies an instance of an Evaluator, which wraps a dataloader to include metrics that apply to a specific dataset.

composer.datasets.glue

GLUE (General Language Understanding Evaluation) dataset (Wang et al., 2019).

composer.datasets.hparams

Dataset Hyperparameter classes.

composer.datasets.imagenet

ImageNet classification dataset.

composer.datasets.lm_datasets

Generic dataset class for self-supervised training of autoregressive and masked language models.

composer.datasets.mnist

MNIST image classification dataset.

composer.datasets.synthetic

Synthetic datasets used for testing, profiling, and debugging.

composer.datasets.utils

Utility and helper functions for datasets.

composer.datasets.webdataset

Natively supported datasets.

Modules in the datasets namespace define utilities and mechanisms for creating dataloaders from the given hyperparameters. Two of the important classes in this module are described below:

Functions

get_dataset_registry

Returns a mapping between different supported datasets and their HParams classes that create an instance of the dataset.
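The registry idea described above can be sketched as a plain dictionary from dataset names to hyperparameter classes. This is a minimal illustration only: `ToyDatasetHparams`, `ToyMNISTHparams`, `ToyCIFAR10Hparams`, `get_toy_dataset_registry`, and the `datadir` field are hypothetical names invented for this example and are not the Composer API.

```python
from dataclasses import dataclass
from typing import Dict, Type


@dataclass
class ToyDatasetHparams:
    """Hypothetical stand-in for a dataset hyperparameter base class."""

    datadir: str = "/tmp/data"

    def initialize_object(self) -> str:
        # A real hparams class would build a dataset or dataloader here;
        # this sketch only returns a description of what would be built.
        return f"{type(self).__name__}(datadir={self.datadir})"


@dataclass
class ToyMNISTHparams(ToyDatasetHparams):
    """Hypothetical hparams for an MNIST-like dataset."""


@dataclass
class ToyCIFAR10Hparams(ToyDatasetHparams):
    """Hypothetical hparams for a CIFAR-10-like dataset."""


def get_toy_dataset_registry() -> Dict[str, Type[ToyDatasetHparams]]:
    """Return a mapping from dataset name to the class that configures it."""
    return {"mnist": ToyMNISTHparams, "cifar10": ToyCIFAR10Hparams}


# Look up a class by name, configure it, then create the object it describes.
registry = get_toy_dataset_registry()
hparams = registry["mnist"](datadir="/data/mnist")
description = hparams.initialize_object()
```

Keeping the lookup keyed by a string is what lets a configuration file select a dataset by name without importing the class directly.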

Classes

MemoryFormat

Enum class to represent different memory formats.

SyntheticBatchPairDataset

Emulates a dataset of provided size and shape.

SyntheticDataLabelType

Defines the class label type of the synthetic data.

SyntheticDataType

Defines the distribution of the synthetic data.

SyntheticPILDataset

Similar to SyntheticBatchPairDataset, but yields samples of type Image and supports dataset transformations.

WrappedDataLoader

A wrapper around a dataloader.
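To illustrate two of the ideas in the class list above (a dataset that emulates data of a given size and shape, and a wrapper that delegates to an inner loader), here is a small standard-library sketch. `ToySyntheticDataset` and `ToyWrappedLoader` are hypothetical names, and their constructor arguments are assumptions; the real Composer classes operate on tensors and may behave differently.

```python
import random
from typing import Any, Iterator, Sequence, Tuple


class ToySyntheticDataset:
    """Hypothetical dataset yielding random nested lists of a fixed shape."""

    def __init__(self, total_size: int, data_shape: Sequence[int],
                 num_classes: int, seed: int = 0) -> None:
        self.total_size = total_size
        self.data_shape = tuple(data_shape)
        self.num_classes = num_classes
        self.seed = seed

    def __len__(self) -> int:
        return self.total_size

    def _random_nested(self, rng: random.Random, shape: Tuple[int, ...]) -> Any:
        # Recursively build nested lists; a scalar is drawn at the innermost level.
        if not shape:
            return rng.gauss(0.0, 1.0)
        return [self._random_nested(rng, shape[1:]) for _ in range(shape[0])]

    def __getitem__(self, index: int) -> Tuple[Any, int]:
        if not 0 <= index < self.total_size:
            raise IndexError(index)
        # Seed per index so the same sample is returned on every access.
        rng = random.Random(self.seed * 1_000_003 + index)
        data = self._random_nested(rng, self.data_shape)
        label = rng.randrange(self.num_classes)
        return data, label


class ToyWrappedLoader:
    """Hypothetical wrapper that forwards iteration and attributes to a loader."""

    def __init__(self, loader: Any) -> None:
        self.loader = loader

    def __len__(self) -> int:
        return len(self.loader)

    def __iter__(self) -> Iterator[Any]:
        return iter(self.loader)

    def __getattr__(self, name: str) -> Any:
        # Only called when normal lookup fails; guard against recursion
        # if `loader` itself has not been set yet.
        if name == "loader":
            raise AttributeError(name)
        return getattr(self.loader, name)


dataset = ToySyntheticDataset(total_size=4, data_shape=(2, 3), num_classes=5)
wrapped = ToyWrappedLoader(dataset)
samples = list(wrapped)  # four (data, label) pairs
```

Subclasses of such a wrapper can override `__iter__` to transform batches while leaving every other attribute of the inner loader reachable.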

Hparams

These classes are used with yahp for YAML-based configuration.

ADE20kDatasetHparams

Defines an instance of the ADE20k dataset for semantic segmentation from a local disk.

ADE20kWebDatasetHparams

Defines an instance of the ADE20k dataset for semantic segmentation from a remote blob store.

BratsDatasetHparams

Defines an instance of the BraTS dataset for image segmentation.

C4DatasetHparams

Builds a DataSpec for the C4 (Colossal Cleaned CommonCrawl) dataset.

CIFAR100WebDatasetHparams

Defines an instance of the CIFAR-100 WebDataset for image classification.

CIFAR10DatasetHparams

Defines an instance of the CIFAR-10 dataset for image classification from a local disk.

CIFAR10WebDatasetHparams

Defines an instance of the CIFAR-10 WebDataset for image classification.

CIFAR20WebDatasetHparams

Defines an instance of the CIFAR-20 WebDataset for image classification.

COCODatasetHparams

Defines an instance of the COCO Dataset.

DataLoaderHparams

Hyperparameters to initialize a torch.utils.data.DataLoader.

DatasetHparams

Abstract base class for hyperparameters to initialize a dataset.

EvaluatorHparams

Hyperparameters for the Evaluator.

GLUEHparams

Sets up a generic GLUE dataset loader.

Imagenet1kWebDatasetHparams

Defines an instance of the ImageNet-1k WebDataset for image classification.

ImagenetDatasetHparams

Defines an instance of the ImageNet dataset for image classification.

LMDatasetHparams

Defines a generic dataset class for self-supervised training of autoregressive and masked language models.

MNISTDatasetHparams

Defines an instance of the MNIST dataset for image classification.

MNISTWebDatasetHparams

Defines an instance of the MNIST WebDataset for image classification.

SyntheticHparamsMixin

Synthetic dataset parameter mixin for DatasetHparams.

TinyImagenet200WebDatasetHparams

Defines an instance of the TinyImagenet-200 WebDataset for image classification.

WebDatasetHparams

Abstract base class for hyperparameters to initialize a webdataset.
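As a rough sketch of how a hyperparameter class such as DataLoaderHparams might turn configuration values into arguments for `torch.utils.data.DataLoader`, the example below builds a keyword dictionary from a plain mapping of the kind a YAML parser would produce. `ToyDataLoaderHparams`, `from_dict`, and `to_dataloader_kwargs` are hypothetical names, and the field set and defaults are assumptions. The keyword names themselves (`batch_size`, `num_workers`, `pin_memory`, `timeout`, `prefetch_factor`, `persistent_workers`) are real `DataLoader` parameters. The sketch does not import PyTorch or yahp.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class ToyDataLoaderHparams:
    """Hypothetical hyperparameters for constructing a PyTorch DataLoader."""

    num_workers: int = 8
    prefetch_factor: int = 2
    persistent_workers: bool = True
    pin_memory: bool = True
    timeout: float = 0.0

    @classmethod
    def from_dict(cls, config: Dict[str, Any]) -> "ToyDataLoaderHparams":
        # Reject unknown keys early so a typo in the config file is not ignored.
        known = {f.name for f in fields(cls)}
        unknown = set(config) - known
        if unknown:
            raise ValueError(f"Unknown dataloader options: {sorted(unknown)}")
        return cls(**config)

    def to_dataloader_kwargs(self, batch_size: int) -> Dict[str, Any]:
        kwargs: Dict[str, Any] = {
            "batch_size": batch_size,
            "num_workers": self.num_workers,
            "pin_memory": self.pin_memory,
            "timeout": self.timeout,
        }
        # These two options only apply when worker processes are used.
        if self.num_workers > 0:
            kwargs["prefetch_factor"] = self.prefetch_factor
            kwargs["persistent_workers"] = self.persistent_workers
        return kwargs


# A mapping like this could come from a parsed YAML file.
config = {"num_workers": 4, "pin_memory": False}
hparams = ToyDataLoaderHparams.from_dict(config)
loader_kwargs = hparams.to_dataloader_kwargs(batch_size=32)
```

With PyTorch installed, the resulting dictionary could be passed as `DataLoader(dataset, **loader_kwargs)`.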