composer.datasets.coco

COCO (Common Objects in Context) dataset.

COCO is a large-scale object detection, segmentation, and captioning dataset. See the COCO dataset website for more details.

Classes

COCODetection

PyTorch Dataset for the COCO dataset.

Hparams

These classes are used with yahp for YAML-based configuration.

COCODatasetHparams

Defines an instance of the COCO Dataset.

class composer.datasets.coco.COCODatasetHparams(is_train=True, drop_last=True, shuffle=True, datadir=None)

Bases: composer.datasets.hparams.DatasetHparams

Defines an instance of the COCO Dataset.

Parameters
  • datadir (str) – The path to the data directory.

  • is_train (bool) – Whether to load the training data or validation data. Default: True.

  • drop_last (bool) – If the number of samples is not divisible by the batch size, whether to drop the last batch or pad the last batch with zeros. Default: True.

  • shuffle (bool) – Whether to shuffle the dataset. Default: True.
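
The effect of drop_last on the number of batches per epoch can be illustrated with a little arithmetic. The sketch below is plain Python and does not call Composer; num_batches is a hypothetical helper, and the sample count is an illustrative figure, not the real size of any COCO split.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Hypothetical helper: count the batches produced in one epoch."""
    if drop_last:
        # The trailing, incomplete batch is discarded.
        return num_samples // batch_size
    # The trailing batch is kept and padded up to a full batch.
    return math.ceil(num_samples / batch_size)


# Illustrative numbers only, not the real COCO split sizes.
print(num_batches(1000, 32, drop_last=True))   # 31
print(num_batches(1000, 32, drop_last=False))  # 32
```

With drop_last=True every batch has exactly batch_size samples, at the cost of skipping up to batch_size - 1 samples per epoch.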

initialize_object(batch_size, dataloader_hparams)

Creates a DataLoader or DataSpec for this dataset.

Parameters
  • batch_size (int) – The size of the batch the dataloader should yield. This batch size is device-specific and already incorporates the world size.

  • dataloader_hparams (DataLoaderHparams) – The dataset-independent hparams for the dataloader.

Returns

DataLoader or DataSpec – The DataLoader, or if the dataloader yields batches of custom types, a DataSpec.
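
To make "device-specific" concrete, the following sketch shows one way a global batch size might be divided across devices before being passed in as batch_size. It assumes an even split across the world size; per_device_batch_size is a hypothetical helper, and how Composer itself derives the value is not described here.

```python
def per_device_batch_size(global_batch_size: int, world_size: int) -> int:
    """Hypothetical helper: split a global batch evenly across devices."""
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if global_batch_size % world_size != 0:
        # Assumption: an uneven split is treated as a configuration error.
        raise ValueError("global batch size must be divisible by world size")
    return global_batch_size // world_size


# A global batch of 64 samples on 8 devices gives 8 samples per device.
print(per_device_batch_size(64, 8))  # 8
```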

class composer.datasets.coco.COCODetection(img_folder, annotate_file, transform=None)

Bases: torch.utils.data.dataset.Dataset

PyTorch Dataset for the COCO dataset.

Parameters
  • img_folder (str) – The path to the COCO folder.

  • annotate_file (str) – The path to a file that contains image IDs and annotations (e.g., bounding boxes and object classes).

  • transform (Module) – Transformations to apply to the image. Default: None.
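
For orientation, the sketch below shows the kind of information an annotation file in the standard COCO detection layout carries, and how boxes and class names can be grouped per image. It uses only the standard library and a tiny made-up annotation string; boxes_by_image is a hypothetical helper, and whether COCODetection performs the same grouping or box conversion internally is an assumption, not something stated above.

```python
import json
from collections import defaultdict

# A tiny, made-up annotation file in the standard COCO detection layout.
ANNOTATIONS_JSON = """
{
  "images": [{"id": 1, "file_name": "000000000001.jpg", "width": 640, "height": 480}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 18, "bbox": [10.0, 20.0, 100.0, 50.0]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [200.0, 40.0, 60.0, 120.0]}
  ],
  "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}]
}
"""


def boxes_by_image(coco: dict) -> dict:
    """Hypothetical helper: group (class name, [x1, y1, x2, y2]) by image id."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        # COCO stores each box as [x, y, width, height].
        x, y, w, h = ann["bbox"]
        grouped[ann["image_id"]].append((names[ann["category_id"]], [x, y, x + w, y + h]))
    return dict(grouped)


coco = json.loads(ANNOTATIONS_JSON)
print(boxes_by_image(coco)[1])
```

The conversion to corner coordinates is only one common choice; a detection model may expect a different box encoding.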