composer.datasets

Natively supported datasets.

Functions

build_ade20k_dataloader

Builds an ADE20k dataloader.

build_cifar10_dataloader

Builds a CIFAR-10 dataloader with default transforms.

build_ffcv_imagenet_dataloader

Builds an FFCV ImageNet dataloader.

build_imagenet_dataloader

Builds an ImageNet dataloader.

build_lm_dataloader

Builds a dataloader for a generic language modeling dataset.

build_mnist_dataloader

Builds an MNIST dataloader.

build_synthetic_ade20k_dataloader

Builds a synthetic ADE20k dataloader.

build_synthetic_cifar10_dataloader

Builds a synthetic CIFAR-10 dataloader for debugging or profiling.

build_synthetic_imagenet_dataloader

Builds a synthetic ImageNet dataloader.

build_synthetic_lm_dataloader

Builds a synthetic dataloader for a generic language modeling dataset.

build_synthetic_mnist_dataloader

Builds a synthetic MNIST dataloader.

Classes

ADE20k

PyTorch Dataset for ADE20k.

COCODetection

PyTorch Dataset for the COCO dataset.

PytTrain

PytVal

StreamingADE20k

Implementation of the ADE20k dataset using StreamingDataset.

StreamingC4

Implementation of the C4 (Colossal Cleaned Common Crawl) dataset using StreamingDataset V1.

StreamingCIFAR10

Implementation of the CIFAR-10 dataset using StreamingDataset.

StreamingCOCO

Implementation of the COCO dataset using StreamingDataset.

StreamingImageNet1k

Implementation of the ImageNet1k dataset using StreamingDataset.

SyntheticBatchPairDataset

Emulates a dataset of provided size and shape.

SyntheticDataLabelType

Defines the class label type of the synthetic data.

SyntheticDataType

Defines the distribution of the synthetic data.

SyntheticPILDataset

Similar to SyntheticBatchPairDataset, but yields samples of type PIL.Image.Image and supports dataset transformations.
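
The synthetic dataset classes listed above emulate a dataset of a given size and shape, with a configurable data distribution. The idea can be illustrated with a minimal standard-library sketch. This is not Composer's implementation or API: the names ToyDataType and ToySyntheticDataset and all of their parameters are hypothetical, chosen only for illustration, and samples are flat Python lists rather than tensors.

```python
import random
from enum import Enum
from math import prod


class ToyDataType(Enum):
    """Hypothetical enum selecting the distribution of the synthetic data."""
    GAUSSIAN = "gaussian"
    UNIFORM = "uniform"


class ToySyntheticDataset:
    """Hypothetical map-style dataset that fabricates (sample, label) pairs.

    Each sample is a flat list with prod(data_shape) values; each label is an
    integer class index in [0, num_classes).
    """

    def __init__(self, total_size, data_shape, num_classes,
                 data_type=ToyDataType.GAUSSIAN, seed=0):
        self.total_size = total_size
        self.data_shape = tuple(data_shape)
        self.num_classes = num_classes
        self.data_type = data_type
        self.seed = seed

    def __len__(self):
        return self.total_size

    def __getitem__(self, index):
        if not 0 <= index < self.total_size:
            raise IndexError(index)
        # Derive a per-index generator so the same index always yields
        # the same sample, regardless of access order.
        rng = random.Random(self.seed * 1_000_003 + index)
        num_values = prod(self.data_shape)
        if self.data_type is ToyDataType.GAUSSIAN:
            values = [rng.gauss(0.0, 1.0) for _ in range(num_values)]
        else:
            values = [rng.uniform(-1.0, 1.0) for _ in range(num_values)]
        label = rng.randrange(self.num_classes)
        return values, label


# Emulate 8 samples shaped like small 3x4x4 images with 10 classes.
dataset = ToySyntheticDataset(total_size=8, data_shape=(3, 4, 4), num_classes=10)
sample, label = dataset[0]
```

Because no data is read from disk, a dataset of this kind is useful for debugging or profiling a training loop independently of real data loading.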