composer.datasets.streaming.download

Download handling for StreamingDataset.

Functions

safe_download

Safely downloads a file from remote to local.

composer.datasets.streaming.download.download(remote, local, timeout)

Download a file from remote to local.

Parameters
  • remote (str) – Remote path (S3 or local filesystem).

  • local (str) – Local path (local filesystem).

  • timeout (float) – How long to wait for the shard to download before raising an exception, in seconds.
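
As a rough illustration of the split between S3 and local-filesystem remotes, the sketch below routes a path to one of two handlers based on its URL scheme. It is not Composer's implementation: `sketch_dispatch` and its two handler arguments are hypothetical names, and treating an `s3://` prefix as the marker for S3 is an assumption.

```python
from typing import Callable
from urllib.parse import urlparse


def sketch_dispatch(remote: str,
                    local: str,
                    timeout: float,
                    from_s3: Callable[[str, str, float], None],
                    from_local: Callable[[str, str], None]) -> str:
    """Route a download to the S3 or local handler (hypothetical helper).

    Returns the name of the branch taken so the choice is easy to inspect.
    """
    # Assumption: S3 remotes are written as s3://bucket/key.
    if urlparse(remote).scheme == 's3':
        from_s3(remote, local, timeout)
        return 's3'
    # Anything else is treated as a path on the local filesystem.
    from_local(remote, local)
    return 'local'
```

Passing the handlers in as arguments keeps the sketch runnable without any cloud dependency; in practice they would correspond to functions such as `download_from_s3` and `download_from_local`.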

composer.datasets.streaming.download.download_from_local(remote, local)

Download a file from remote to local.

Parameters
  • remote (str) – Remote path (local filesystem).

  • local (str) – Local path (local filesystem).

composer.datasets.streaming.download.download_from_s3(remote, local, timeout)

Download a file from remote to local.

Parameters
  • remote (str) – Remote path (S3).

  • local (str) – Local path (local filesystem).

  • timeout (float) – How long to wait for the shard to download before raising an exception, in seconds.
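
An S3 remote path has to be broken into a bucket and an object key before any client can fetch it. The following is a minimal sketch of that step using only the standard library; `split_s3_path` is a hypothetical helper, and the `s3://bucket/key` layout is assumed rather than taken from this reference.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_path(remote: str) -> Tuple[str, str]:
    """Split 's3://bucket/path/to/key' into (bucket, key). Hypothetical helper."""
    parsed = urlparse(remote)
    if parsed.scheme != 's3':
        raise ValueError(f'Expected an s3:// path, got: {remote}')
    # netloc holds the bucket; the key is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip('/')
```

The resulting pair is the kind of input an S3 client call would take, for example the `Bucket` and `Key` arguments of boto3's `download_file`. How `timeout` is applied to the client is not specified in this reference.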

composer.datasets.streaming.download.safe_download(remote, local, timeout=60)

Safely downloads a file from remote to local.

Handles multiple threads attempting to download the same shard, and gracefully deletes stale temporary files left behind by crashed runs.

Parameters
  • remote (str) – Remote path (S3 or local filesystem).

  • local (str) – Local path (local filesystem).

  • timeout (float) – How long to wait for the shard to download before raising an exception. Default: 60 sec.
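
One common way to get this behavior is to write to a temporary file that is created exclusively and then renamed into place. The sketch below shows that pattern for a local-filesystem remote only. It is not Composer's implementation: the `.tmp` suffix, the rule that a temporary file older than `timeout` is stale, and the name `sketch_safe_download` are all assumptions made for illustration.

```python
import os
import shutil
import time


def sketch_safe_download(remote: str, local: str, timeout: float = 60) -> None:
    """Copy `remote` to `local` at most once across workers (illustrative only)."""
    if os.path.exists(local):
        return  # Already downloaded by this or another worker.

    tmp = local + '.tmp'  # Assumed naming convention for in-progress files.
    os.makedirs(os.path.dirname(local) or '.', exist_ok=True)

    # Assumption: a tmp file older than `timeout` was left by a crashed run.
    try:
        if time.time() - os.path.getmtime(tmp) > timeout:
            os.remove(tmp)
    except FileNotFoundError:
        pass  # No tmp file, or another worker removed it first.

    try:
        # O_EXCL makes creation atomic, so exactly one worker wins.
        fd = os.open(tmp, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another worker holds the tmp file; wait for the final file instead.
        deadline = time.monotonic() + timeout
        while not os.path.exists(local):
            if time.monotonic() > deadline:
                raise TimeoutError(f'Waited longer than {timeout}s for {local}')
            time.sleep(0.05)
        return

    try:
        with os.fdopen(fd, 'wb') as dst, open(remote, 'rb') as src:
            shutil.copyfileobj(src, dst)
    except BaseException:
        os.remove(tmp)  # Do not leave a partial file behind on failure.
        raise
    # The rename publishes the complete file in a single step.
    os.replace(tmp, local)
```

Because readers only ever look for `local`, they never observe a partially written shard; the file appears only after the copy has finished.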

composer.datasets.streaming.download.wait_for_download(local, timeout=60)

Block until another worker's shard download completes.

Parameters
  • local (str) – Path to file.

  • timeout (float) – How long to wait before raising an exception. Default: 60 sec.
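
To make the blocking behavior concrete, here is a small sketch that polls for a file while a second thread, standing in for another worker, creates it shortly afterwards. The polling interval, the use of `TimeoutError`, and the names `sketch_wait_for_download` and `demo` are assumptions; this reference only states that an exception is raised once `timeout` elapses.

```python
import os
import tempfile
import threading
import time


def sketch_wait_for_download(local: str, timeout: float = 60, poll: float = 0.05) -> None:
    """Poll until `local` exists, or raise TimeoutError (illustrative only)."""
    start = time.monotonic()
    while not os.path.exists(local):
        if time.monotonic() - start > timeout:
            raise TimeoutError(f'Waited longer than {timeout}s for {local}')
        time.sleep(poll)


def demo() -> bool:
    """Simulate another worker finishing a shard 0.2 s after we start waiting."""
    path = os.path.join(tempfile.mkdtemp(), 'shard.bin')
    writer = threading.Timer(0.2, lambda: open(path, 'wb').close())
    writer.start()
    sketch_wait_for_download(path, timeout=5)  # Returns once the file appears.
    writer.join()
    return os.path.exists(path)
```

Calling `demo()` blocks briefly and returns once the simulated worker has written the file. Waiting on a path that never appears raises the timeout error instead.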