composer.datasets.ade20k

Functions

pil_image_collate

Constructs a BatchPair from datasets that yield samples of type PIL.Image.Image.

Classes

ADE20k

PyTorch Dataset for ADE20k.

DataSpec

Specifications for operating and training on data.

Dataset

An abstract class representing a Dataset.

NormalizationFn

Normalizes input data and removes the background class from target data if desired.

PadToSize

Pad an image to a specified size.

PhotometricDistoration

Applies a combination of brightness, contrast, saturation, and hue jitters with random intensity.

RandomCropPair

Crop the image and target at a randomly sampled position.

RandomHFlipPair

Flip the image and target horizontally with a specified probability.

RandomResizePair

Resize the image and target to base_size scaled by a randomly sampled value.

SyntheticBatchPairDataset

Emulates a dataset of provided size and shape.

Hparams

These classes are used with yahp for YAML-based configuration.

ADE20kDatasetHparams

Defines an instance of the ADE20k dataset for semantic segmentation.

DatasetHparams

Abstract base class for hyperparameters to initialize a dataset.

SyntheticHparamsMixin

Synthetic dataset parameter mixin for DatasetHparams.

Attributes

  • IMAGENET_CHANNEL_MEAN

  • IMAGENET_CHANNEL_STD


class composer.datasets.ade20k.ADE20k(datadir, split='train', both_transforms=None, image_transforms=None, target_transforms=None)[source]#

Bases: torch.utils.data.dataset.Dataset

PyTorch Dataset for ADE20k.

Parameters
  • datadir (str) – the path to the ADE20k folder.

  • split (str) – the dataset split to use, either ‘train’, ‘val’, or ‘test’. Default is ‘train’.

  • both_transforms (Module) – transformations to apply to the image and target simultaneously. Default is None.

  • image_transforms (Module) – transformations to apply to the image only. Default is None.

  • target_transforms (Module) – transformations to apply to the target only. Default is None.

class composer.datasets.ade20k.ADE20kDatasetHparams(use_synthetic=False, synthetic_num_unique_samples=100, synthetic_device='cpu', synthetic_memory_format=MemoryFormat.CONTIGUOUS_FORMAT, is_train=True, drop_last=True, shuffle=True, datadir=None, split='train', base_size=512, min_resize_scale=0.5, max_resize_scale=2.0, final_size=512, ignore_background=True)[source]#

Bases: composer.datasets.hparams.DatasetHparams, composer.datasets.hparams.SyntheticHparamsMixin

Defines an instance of the ADE20k dataset for semantic segmentation.

Parameters
  • use_synthetic (bool, optional) – Whether to use synthetic data. (Default: False)

  • synthetic_num_unique_samples (int, optional) – The number of unique samples to allocate memory for. Ignored if use_synthetic is False. (Default: 100)

  • synthetic_device (str, optional) – The device on which to store the sample pool. Set to cuda to store samples on the GPU and eliminate PCI-e transfers from the dataloader. Set to cpu to move data between host memory and the device on every batch. Ignored if use_synthetic is False. (Default: cpu)

  • synthetic_memory_format – The MemoryFormat to use. Ignored if use_synthetic is False. (Default: CONTIGUOUS_FORMAT)

  • datadir (str) – The path to the data directory.

  • is_train (bool) – Whether to load the training data (the default) or validation data.

  • drop_last (bool) – If the number of samples is not divisible by the batch size, whether to drop the last batch (the default) or pad the last batch with zeros.

  • shuffle (bool) – Whether to shuffle the dataset. Default is True.

  • split (str) – the dataset split to use, either ‘train’, ‘val’, or ‘test’. Default is ‘train’.

  • base_size (int) – the initial size of the image and target before other augmentations. Default is 512.

  • min_resize_scale (float) – the minimum value by which the samples can be rescaled. Default is 0.5.

  • max_resize_scale (float) – the maximum value by which the samples can be rescaled. Default is 2.0.

  • final_size (int) – the final size of the image and target. Default is 512.

  • ignore_background (bool) – if True, ignore the background class when calculating the training loss. Default is True.
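Since this class is a yahp hparams dataclass, these fields would typically be populated from YAML. The fragment below is a hypothetical sketch: the key names mirror the constructor signature above, but the exact nesting and the /path/to/ade20k placeholder depend on the surrounding trainer configuration.

```yaml
# Hypothetical yahp config fragment for ADE20kDatasetHparams.
# Field names follow the constructor signature; nesting is illustrative.
train_dataset:
  ade20k:
    datadir: /path/to/ade20k
    split: train
    base_size: 512
    min_resize_scale: 0.5
    max_resize_scale: 2.0
    final_size: 512
    ignore_background: true
    use_synthetic: false
```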

initialize_object(batch_size, dataloader_hparams)[source]#

Creates a DataLoader or DataSpec for this dataset.

Parameters
  • batch_size (int) – The size of the batch the dataloader should yield. This batch size is device-specific and already incorporates the world size.

  • dataloader_hparams (DataloaderHparams) – The dataset-independent hparams for the dataloader.

Returns

DataLoader or DataSpec – The dataloader, or a DataSpec if the dataloader yields batches of custom types.

validate()[source]#

Validate that the hparams are of the correct types. Recurses through sub-hparams.

Raises

TypeError – If any fields are of an incorrect type.

class composer.datasets.ade20k.PadToSize(size, fill=0)[source]#

Bases: torch.nn.modules.module.Module

Pad an image to a specified size.

Parameters
  • size (Tuple[int, int]) – the size (height x width) of the image after padding.

  • fill (Union[int, Tuple[int, int, int]]) – the value to use for the padded pixels. Default is 0.
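A minimal sketch of the padding arithmetic the transform implies. The helper name padding_amounts and the centered-placement assumption are illustrative, not the library's API:

```python
def padding_amounts(image_size, target_size):
    """Compute (left, top, right, bottom) padding that centers an image
    of image_size (height, width) inside target_size (height, width)."""
    h, w = image_size
    th, tw = target_size
    pad_h = max(th - h, 0)  # never crop: pad only when the image is smaller
    pad_w = max(tw - w, 0)
    top = pad_h // 2
    left = pad_w // 2
    # Any odd remainder goes on the bottom/right edges.
    return (left, top, pad_w - left, pad_h - top)
```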

class composer.datasets.ade20k.PhotometricDistoration(brightness, contrast, saturation, hue)[source]#

Bases: torch.nn.modules.module.Module

Applies a combination of brightness, contrast, saturation, and hue jitters with random intensity.

This is a less severe form of PyTorch’s ColorJitter, used by the mmsegmentation library here: https://github.com/open-mmlab/mmsegmentation/blob/master/mmseg/datasets/pipelines/transforms.py#L837

Parameters
  • brightness (float) – max and min jitter intensity for brightness.

  • contrast (float) – max and min jitter intensity for contrast.

  • saturation (float) – max and min jitter intensity for saturation.

  • hue (float) – max and min jitter intensity for hue.
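A sketch of how such jitter factors are commonly sampled (ColorJitter-style): each parameter x bounds a multiplicative factor drawn from [1 - x, 1 + x], while hue is an additive shift in [-hue, hue]. The helper name sample_jitter_factors is an assumption for illustration:

```python
import random

def sample_jitter_factors(brightness, contrast, saturation, hue):
    """Draw one random intensity per property, torchvision-jitter style.
    Multiplicative factors are clamped at 0 so they stay non-negative."""
    b = random.uniform(max(0.0, 1 - brightness), 1 + brightness)
    c = random.uniform(max(0.0, 1 - contrast), 1 + contrast)
    s = random.uniform(max(0.0, 1 - saturation), 1 + saturation)
    h = random.uniform(-hue, hue)  # hue is an additive shift, not a factor
    return b, c, s, h
```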

class composer.datasets.ade20k.RandomCropPair(crop_size)[source]#

Bases: torch.nn.modules.module.Module

Crop the image and target at a randomly sampled position.

Parameters

crop_size (Tuple[int, int]) – the size (height x width) of the crop.
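The key point of a paired crop is that image and target must use the same randomly sampled position. A minimal sketch of that sampling (the helper name sample_crop_box and the clipping behavior are assumptions):

```python
import random

def sample_crop_box(image_size, crop_size):
    """Pick one top-left corner uniformly at random; the same box is then
    applied to both image and target so they stay aligned.
    Sizes are (height, width); the crop is clipped to the image bounds."""
    h, w = image_size
    ch, cw = crop_size
    top = random.randint(0, max(h - ch, 0))
    left = random.randint(0, max(w - cw, 0))
    return top, left, min(ch, h), min(cw, w)
```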

class composer.datasets.ade20k.RandomHFlipPair(probability=0.5)[source]#

Bases: torch.nn.modules.module.Module

Flip the image and target horizontally with a specified probability.

Parameters

probability (float) – the probability of flipping the image and target. Default is 0.5.
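As with the paired crop, one coin flip must govern both image and target. A sketch over plain row-lists, with the helper name random_hflip_pair assumed for illustration:

```python
import random

def random_hflip_pair(image, target, probability=0.5):
    """Flip image and target together with the given probability, so a
    flipped image never ends up paired with an unflipped segmentation map.
    Inputs are rows of pixels (lists of lists)."""
    if random.random() < probability:
        image = [row[::-1] for row in image]
        target = [row[::-1] for row in target]
    return image, target
```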

class composer.datasets.ade20k.RandomResizePair(min_scale, max_scale, base_size=None)[source]#

Bases: torch.nn.modules.module.Module

Resize the image and target to base_size scaled by a randomly sampled value.

Parameters
  • min_scale (float) – the minimum value by which the samples can be rescaled.

  • max_scale (float) – the maximum value by which the samples can be rescaled.

  • base_size (Tuple[int, int]) – a specified base size (height x width) to scale to get the resized dimensions. When this is None (default), the input image size is used.
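A sketch of the size computation described above: one scale is sampled per call and applied to base_size (or to the input size when base_size is None). The helper name random_resize_dims and the truncation to int are assumptions:

```python
import random

def random_resize_dims(image_size, min_scale, max_scale, base_size=None):
    """Sample a single scale in [min_scale, max_scale] and apply it to
    base_size; image_size and base_size are (height, width) tuples."""
    base_h, base_w = base_size if base_size is not None else image_size
    scale = random.uniform(min_scale, max_scale)
    # Both dimensions share one scale so the aspect ratio is preserved.
    return int(base_h * scale), int(base_w * scale)
```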