imagenet

ImageNet classification streaming dataset.

ImageNet is the most widely used dataset for image classification algorithms. Refer to the ImageNet 2012 Classification Dataset for more details.

Classes

StreamingImageNet1k

Implementation of the ImageNet1k dataset using StreamingDataset.

class composer.datasets.imagenet.StreamingImageNet1k(remote, local, split, shuffle, resize_size=-1, crop_size=224, batch_size=None)

Bases: composer.datasets.streaming.dataset.StreamingDataset, torchvision.datasets.vision.VisionDataset

Implementation of the ImageNet1k dataset using StreamingDataset.

Parameters
  • remote (str) – Remote directory (S3 or local filesystem) where dataset is stored.

  • local (str) – Local filesystem directory where dataset is cached during operation.

  • split (str) – The dataset split to use, either ‘train’ or ‘val’.

  • shuffle (bool) – Whether to shuffle the samples in this dataset.

  • resize_size (int, optional) – The size to resize images to. Use -1 to skip resizing. Default: -1.

  • crop_size (int) – The crop size to use. Default: 224.

  • batch_size (Optional[int]) – Hint of the batch size that will be used on each device’s DataLoader. Default: None.
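
As a rough illustration of how these parameters fit together, the sketch below assembles keyword arguments for the two splits. The helper name `imagenet_kwargs`, the bucket path, the cache directory, and the choice of resizing validation images to 256 before a 224 crop are all assumptions made for the example; none of them come from the Composer API.

```python
from typing import Any, Dict, Optional


def imagenet_kwargs(split: str, remote: str, local: str,
                    batch_size: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: build keyword arguments matching the documented signature."""
    if split not in ("train", "val"):
        raise ValueError(f"split must be 'train' or 'val', got {split!r}")
    is_train = split == "train"
    return {
        "remote": remote,
        "local": local,
        "split": split,
        # Assumption: shuffle only the training split.
        "shuffle": is_train,
        # Assumption: no resize for training, resize to 256 for validation.
        "resize_size": -1 if is_train else 256,
        "crop_size": 224,
        "batch_size": batch_size,
    }


# Hypothetical remote and local paths, for illustration only.
kwargs = imagenet_kwargs("val", remote="s3://my-bucket/imagenet1k",
                         local="/tmp/imagenet1k", batch_size=64)
print(kwargs["shuffle"], kwargs["resize_size"])  # False 256
```

With Composer installed and a real dataset location, the resulting dictionary could presumably be passed on as `StreamingImageNet1k(**kwargs)`.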

decode_class(data)

Decode the sample class.

Parameters

data (bytes) – The raw bytes.

Returns

np.int64 – The class encoded by the bytes.
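
To make the idea of decoding a class label from raw bytes concrete, here is a minimal standard-library sketch. It assumes the label is stored as a single 8-byte little-endian signed integer, which this page does not state, and it returns a plain Python `int` rather than `np.int64`. The function name `decode_class_sketch` is hypothetical and is not the library's implementation.

```python
import struct


def decode_class_sketch(data: bytes) -> int:
    """Read the first 8 bytes of `data` as a little-endian signed 64-bit integer."""
    if len(data) < 8:
        raise ValueError("expected at least 8 bytes for an int64 label")
    # "<q" means little-endian signed 64-bit; the byte order is an assumption.
    (label,) = struct.unpack_from("<q", data, 0)
    return label


# Round trip: encode a label the same way, then decode it.
encoded = struct.pack("<q", 417)
print(decode_class_sketch(encoded))  # 417
```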

decode_image(data)

Decode the sample image.

Parameters

data (bytes) – The raw bytes.

Returns

Image – PIL image encoded by the bytes.