composer.datasets.imagenet_hparams

ImageNet classification dataset hyperparameters.

ImageNet is one of the most widely used datasets for image classification algorithms. Refer to the ImageNet 2012 Classification Dataset for more details.

Hparams

These classes are used with yahp for YAML-based configuration.

ImagenetDatasetHparams

Defines an instance of the ImageNet dataset for image classification.

StreamingImageNet1kHparams

DatasetHparams for creating an instance of StreamingImageNet1k.

class composer.datasets.imagenet_hparams.ImagenetDatasetHparams(use_synthetic=False, synthetic_num_unique_samples=100, synthetic_device='cpu', synthetic_memory_format=MemoryFormat.CONTIGUOUS_FORMAT, drop_last=True, shuffle=True, resize_size=-1, crop_size=224, use_ffcv=False, ffcv_cpu_only=False, ffcv_dir='/tmp', ffcv_dest='imagenet_train.ffcv', ffcv_write_dataset=False, is_train=True, datadir=None)

Bases: composer.datasets.dataset_hparams.DatasetHparams, composer.datasets.synthetic_hparams.SyntheticHparamsMixin

Defines an instance of the ImageNet dataset for image classification.

Parameters
  • resize_size (int, optional) – The resize size to use. Use -1 to not resize. Default: -1.

  • crop_size (int) – The crop size to use. Default: 224.

  • use_ffcv (bool) – Whether to use FFCV dataloaders. Default: False.

  • ffcv_dir (str) – A directory containing train/val <file>.ffcv files. If these files don't exist and ffcv_write_dataset is True, train/val <file>.ffcv files will be created in this directory. Default: "/tmp".

  • ffcv_dest (str) – The <file>.ffcv file that holds the dataset samples. Default: "imagenet_train.ffcv".

  • ffcv_write_dataset (bool) – Whether to create the dataset in FFCV format (<file>.ffcv) if it doesn't exist. Default: False.

  • datadir (str) – The path to the data directory.

  • is_train (bool) – Whether to load the training data or validation data. Default: True.
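The way these parameters fit together can be illustrated with a small stand-in. The sketch below is not the Composer class: `ImagenetConfigSketch`, its helper methods, and the `/data/imagenet` path are hypothetical names used only for illustration. It mirrors the documented fields and defaults, treats -1 as the "do not resize" sentinel, and assumes the FFCV file is located by joining ffcv_dir and ffcv_dest.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImagenetConfigSketch:
    """Hypothetical stand-in mirroring the documented fields; not the Composer class."""

    resize_size: int = -1
    crop_size: int = 224
    use_ffcv: bool = False
    ffcv_dir: str = "/tmp"
    ffcv_dest: str = "imagenet_train.ffcv"
    ffcv_write_dataset: bool = False
    is_train: bool = True
    datadir: Optional[str] = None

    def should_resize(self) -> bool:
        # -1 is the documented sentinel meaning "do not resize".
        return self.resize_size != -1

    def ffcv_path(self) -> str:
        # Assumption: the .ffcv file sits directly inside ffcv_dir.
        return os.path.join(self.ffcv_dir, self.ffcv_dest)


# Training split with all documented defaults (the data path is illustrative).
train_cfg = ImagenetConfigSketch(datadir="/data/imagenet")

# Validation split that resizes before cropping and reads a separate FFCV file.
val_cfg = ImagenetConfigSketch(
    datadir="/data/imagenet",
    is_train=False,
    resize_size=256,
    use_ffcv=True,
    ffcv_dest="imagenet_val.ffcv",
)
```

With the real class, the same keyword arguments would be passed to ImagenetDatasetHparams, or supplied as keys in a yahp YAML file.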

class composer.datasets.imagenet_hparams.StreamingImageNet1kHparams(drop_last=True, shuffle=True, remote='s3://mosaicml-internal-dataset-imagenet1k/mds/1/', local='/tmp/mds-cache/mds-imagenet1k/', split='train', resize_size=-1, crop_size=224)

Bases: composer.datasets.dataset_hparams.DatasetHparams

DatasetHparams for creating an instance of StreamingImageNet1k.

Parameters
  • remote (str) – Remote directory (S3 or local filesystem) where the dataset is stored. Default: 's3://mosaicml-internal-dataset-imagenet1k/mds/1/'.

  • local (str) – Local filesystem directory where the dataset is cached during operation. Default: '/tmp/mds-cache/mds-imagenet1k/'.

  • split (str) – The dataset split to use, either 'train' or 'val'. Default: 'train'.

  • resize_size (int, optional) – The resize size to use. Use -1 to not resize. Default: -1.

  • crop_size (int) – The crop size to use. Default: 224.
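As a rough illustration of the constraints described above, the following sketch mirrors the documented fields in a plain dataclass. It is a hypothetical stand-in, not the Composer implementation: the class name `StreamingImageNet1kConfigSketch`, the split check in `__post_init__`, the `remote_is_s3` helper, and the `/datasets/imagenet1k-mds/` path are all assumptions made for the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class StreamingImageNet1kConfigSketch:
    """Hypothetical stand-in mirroring the documented fields; not the Composer class."""

    remote: str = "s3://mosaicml-internal-dataset-imagenet1k/mds/1/"
    local: str = "/tmp/mds-cache/mds-imagenet1k/"
    split: str = "train"
    resize_size: int = -1
    crop_size: int = 224

    def __post_init__(self) -> None:
        # The documentation lists 'train' and 'val' as the only splits.
        if self.split not in ("train", "val"):
            raise ValueError(f"split must be 'train' or 'val', got {self.split!r}")

    def remote_is_s3(self) -> bool:
        # The remote may be an S3 URI or a plain filesystem path.
        return urlparse(self.remote).scheme == "s3"


# Documented defaults: stream the training split from S3 into the local cache.
default_cfg = StreamingImageNet1kConfigSketch()

# Validation split read from a copy on the local filesystem (illustrative path).
local_val_cfg = StreamingImageNet1kConfigSketch(
    remote="/datasets/imagenet1k-mds/",
    split="val",
)
```

Rejecting an unknown split at construction time is a design choice of this sketch; it surfaces a misconfigured YAML value before any data is downloaded.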