composer.datasets

Natively supported datasets.

Functions

build_ade20k_dataloader

Builds an ADE20k dataloader.

build_cifar10_dataloader

Builds a CIFAR-10 dataloader with default transforms.

build_ffcv_cifar10_dataloader

Builds an FFCV CIFAR-10 dataloader.

build_ffcv_imagenet_dataloader

Builds an FFCV ImageNet dataloader.

build_imagenet_dataloader

Builds an ImageNet dataloader.

build_lm_dataloader

Builds a dataloader for a generic language modeling dataset.

build_mnist_dataloader

Builds an MNIST dataloader.

build_streaming_ade20k_dataloader

Builds an ADE20k streaming dataset.

build_streaming_c4_dataloader

Builds a DataSpec for the StreamingC4 (Colossal Clean Crawled Corpus) dataset.

build_streaming_cifar10_dataloader

Builds a streaming CIFAR-10 dataset.

build_streaming_imagenet1k_dataloader

Builds an ImageNet-1k streaming dataset.

build_synthetic_ade20k_dataloader

Builds a synthetic ADE20k dataloader.

build_synthetic_cifar10_dataloader

Builds a synthetic CIFAR-10 dataset for debugging or profiling.

build_synthetic_imagenet_dataloader

Builds a synthetic ImageNet dataloader.

build_synthetic_mnist_dataloader

Builds a synthetic MNIST dataset.

Classes

ADE20k

PyTorch Dataset for ADE20k.

PytTrain

PytVal

SyntheticBatchPairDataset

Emulates a dataset of provided size and shape.

SyntheticDataLabelType

Defines the class label type of the synthetic data.

SyntheticDataType

Defines the distribution of the synthetic data.

SyntheticPILDataset

Similar to SyntheticBatchPairDataset, but yields samples of type PIL Image and supports dataset transformations.
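
To illustrate what "emulates a dataset of provided size and shape" can mean in practice, here is a minimal, standard-library-only sketch of a synthetic dataset. It is not Composer's SyntheticBatchPairDataset: the class name `TinySyntheticDataset` and the parameter names `total_dataset_size`, `data_shape`, `num_classes`, and `seed` are assumptions chosen for illustration, and samples are returned as flat Python lists rather than tensors.

```python
import random
from typing import List, Sequence, Tuple


class TinySyntheticDataset:
    """Hypothetical stand-in for a synthetic dataset (not Composer's class)."""

    def __init__(self, total_dataset_size: int, data_shape: Sequence[int],
                 num_classes: int, seed: int = 0) -> None:
        self.total_dataset_size = total_dataset_size
        self.data_shape = tuple(data_shape)
        self.num_classes = num_classes
        self.seed = seed

    def __len__(self) -> int:
        return self.total_dataset_size

    def __getitem__(self, index: int) -> Tuple[List[float], int]:
        if not 0 <= index < self.total_dataset_size:
            raise IndexError(index)
        # Seed per index so the same index always yields the same sample.
        rng = random.Random(self.seed * 1_000_003 + index)
        # Number of elements implied by the requested shape.
        numel = 1
        for dim in self.data_shape:
            numel *= dim
        # Flattened Gaussian values; a tensor library would reshape these.
        data = [rng.gauss(0.0, 1.0) for _ in range(numel)]
        # Integer class label drawn uniformly from [0, num_classes).
        label = rng.randrange(self.num_classes)
        return data, label


dataset = TinySyntheticDataset(total_dataset_size=8, data_shape=(3, 4, 4), num_classes=10)
sample, label = dataset[0]
```

Because no files are read, a dataset like this is convenient for debugging or profiling a training loop without any input-pipeline cost, which is the use case the synthetic builders above describe.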