composer.utils.file_helpers
Helpers for working with files.
Functions
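ensure_folder_is_empty | Ensure that the given folder is empty.
format_name_with_dist | Format format_str with the run_name, distributed variables, and extra_format_kwargs.
format_name_with_dist_and_time | Format format_str with the run_name, distributed variables, timestamp, and extra_format_kwargs.
get_file | Get a file from a local folder, URL, or object store.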
Returns whether |
Exceptions
GetFileNotFoundException | Exception if get_file() failed due to a not found error.
- exception composer.utils.file_helpers.GetFileNotFoundException
Bases: RuntimeError
Exception if get_file() failed due to a not found error.
- composer.utils.file_helpers.ensure_folder_is_empty(folder_name)
Ensure that the given folder is empty.
Hidden files and folders (those beginning with .) are ignored. Sub-folders are checked recursively.
- Parameters
folder_name – The folder to check.
- Raises
FileExistsError – If folder_name contains any non-hidden files, recursively.
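The check described above can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: check_folder_is_empty is a hypothetical name, and the only behavior assumed is what the description states (hidden entries are skipped, sub-folders are walked recursively, and FileExistsError is raised on the first visible file).

```python
import os
import tempfile


def check_folder_is_empty(folder_name):
    """Hypothetical stand-in: raise FileExistsError on any non-hidden file."""
    for root, dirs, files in os.walk(folder_name):
        # Prune hidden sub-folders in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if not d.startswith(".")]
        for name in files:
            if not name.startswith("."):
                raise FileExistsError(
                    f"{folder_name} is not empty; found {os.path.join(root, name)}"
                )


with tempfile.TemporaryDirectory() as tmp:
    # A hidden file and the contents of a hidden folder are both ignored.
    open(os.path.join(tmp, ".hidden"), "w").close()
    os.makedirs(os.path.join(tmp, ".cache"))
    open(os.path.join(tmp, ".cache", "data.bin"), "w").close()
    check_folder_is_empty(tmp)  # passes silently

    # A visible file in a nested sub-folder triggers the error.
    os.makedirs(os.path.join(tmp, "sub"))
    open(os.path.join(tmp, "sub", "log.txt"), "w").close()
    try:
        check_folder_is_empty(tmp)
        raised = False
    except FileExistsError:
        raised = True
```

Pruning dirs in place relies on os.walk running top-down, which is its default.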
- composer.utils.file_helpers.format_name_with_dist(format_str, run_name, **extra_format_kwargs)
Format format_str with the run_name, distributed variables, and extra_format_kwargs.
The following format variables are available:
Variable | Description
{run_name} | The name of the training run. See run_name.
{rank} | The global rank, as returned by get_global_rank().
{local_rank} | The local rank of the process, as returned by get_local_rank().
{world_size} | The world size, as returned by get_world_size().
{local_world_size} | The local world size, as returned by get_local_world_size().
{node_rank} | The node rank, as returned by get_node_rank().
For example, assume that the rank is 0. Then:
>>> from composer.utils import format_name_with_dist
>>> format_str = '{run_name}/rank{rank}.{extension}'
>>> format_name_with_dist(
...     format_str,
...     run_name='awesome_training_run',
...     extension='json',
... )
'awesome_training_run/rank0.json'
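To make the substitution concrete without a distributed environment, here is a minimal sketch built on str.format. It is an illustration under stated assumptions, not the library's code: format_with_dist_vars and the dist_vars dictionary are hypothetical, and the rank values are hard-coded where the real function would query the distributed helpers listed in the table.

```python
def format_with_dist_vars(format_str, run_name, dist_vars, **extra_format_kwargs):
    """Hypothetical stand-in: fill run_name, distributed variables, and extras."""
    # str.format ignores keyword arguments that the format string does not use,
    # so every variable can be passed on every call.
    return format_str.format(run_name=run_name, **dist_vars, **extra_format_kwargs)


# Assumed values for a single-process run (rank 0 of world size 1).
dist_vars = {
    "rank": 0,
    "local_rank": 0,
    "world_size": 1,
    "local_world_size": 1,
    "node_rank": 0,
}

result = format_with_dist_vars(
    "{run_name}/rank{rank}.{extension}",
    run_name="awesome_training_run",
    dist_vars=dist_vars,
    extension="json",
)
print(result)  # awesome_training_run/rank0.json
```

In this sketch, an extra keyword that reuses a built-in variable name (for example rank) raises a TypeError because the same keyword would be passed twice.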
- composer.utils.file_helpers.format_name_with_dist_and_time(format_str, run_name, timestamp, **extra_format_kwargs)
Format format_str with the run_name, distributed variables, timestamp, and extra_format_kwargs.
In addition to the variables specified via extra_format_kwargs, the following format variables are available:
Variable | Description
{run_name} | The name of the training run. See run_name.
{rank} | The global rank, as returned by get_global_rank().
{local_rank} | The local rank of the process, as returned by get_local_rank().
{world_size} | The world size, as returned by get_world_size().
{local_world_size} | The local world size, as returned by get_local_world_size().
{node_rank} | The node rank, as returned by get_node_rank().
{epoch} | The total epoch count, as returned by epoch().
{batch} | The total batch count, as returned by batch().
{batch_in_epoch} | The batch count in the current epoch, as returned by batch_in_epoch().
{sample} | The total sample count, as returned by sample().
{sample_in_epoch} | The sample count in the current epoch, as returned by sample_in_epoch().
{token} | The total token count, as returned by token().
{token_in_epoch} | The token count in the current epoch, as returned by token_in_epoch().
For example, assume that the current epoch is 0, batch is 0, and rank is 0. Then:
>>> from composer.utils import format_name_with_dist_and_time
>>> format_str = '{run_name}/ep{epoch}-ba{batch}-rank{rank}.{extension}'
>>> format_name_with_dist_and_time(
...     format_str,
...     run_name='awesome_training_run',
...     timestamp=state.timer.get_timestamp(),
...     extension='json',
... )
'awesome_training_run/ep0-ba0-rank0.json'
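The timestamp variables can be illustrated the same way. The sketch below assumes a hypothetical TimestampSketch dataclass in place of the real timestamp object, and a hypothetical format_with_dist_and_time helper that takes the rank as an explicit argument. Only the variable names come from the table above; everything else is an assumption made for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class TimestampSketch:
    """Hypothetical stand-in holding the counters named in the table above."""
    epoch: int = 0
    batch: int = 0
    batch_in_epoch: int = 0
    sample: int = 0
    sample_in_epoch: int = 0
    token: int = 0
    token_in_epoch: int = 0


def format_with_dist_and_time(format_str, run_name, rank, timestamp, **extra_format_kwargs):
    """Hypothetical stand-in: fill run_name, rank, timestamp counters, and extras."""
    return format_str.format(
        run_name=run_name,
        rank=rank,
        **asdict(timestamp),  # exposes epoch, batch, sample, token, ...
        **extra_format_kwargs,
    )


name = format_with_dist_and_time(
    "{run_name}/ep{epoch}-ba{batch}-rank{rank}.{extension}",
    run_name="awesome_training_run",
    rank=0,
    timestamp=TimestampSketch(epoch=2, batch=500),
    extension="json",
)
print(name)  # awesome_training_run/ep2-ba500-rank0.json
```

Embedding the epoch and batch in the name gives each saved artifact a distinct path as training progresses, which is one plausible reason to prefer this function over the variant without timestamp variables.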
- composer.utils.file_helpers.get_file(path, destination, object_store=None, chunk_size=1048576, progress_bar=True)
Get a file from a local folder, URL, or object store.
- Parameters
path (str) –
The path to the file to retrieve.
If object_store is specified, then path should be the object name for the file to get. Do not include the cloud provider or bucket name.
If object_store is not specified but path begins with http:// or https://, the object at this URL will be downloaded.
Otherwise, path is presumed to be a local filepath.
destination (str) –
The destination filepath.
If path is a local filepath, then a symlink to path will be created at destination. Otherwise, path will be downloaded to a file at destination.
object_store (ObjectStore, optional) –
An ObjectStore, if path is located inside an object store (e.g. AWS S3 or Google Cloud Storage). (default: None)
This ObjectStore instance will be used to retrieve the file. The path parameter should be set to the object name within the object store.
Set this parameter to None (the default) if path is a URL or a local file.
chunk_size (int, optional) – Chunk size (in bytes). Ignored if path is a local file. (default: 1048576, i.e. 1 MiB)
progress_bar (bool, optional) – Whether to show a progress bar. Ignored if path is a local file. (default: True)
- Raises
GetFileNotFoundException – If path does not exist.
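The three cases described under Parameters amount to a small dispatch on object_store and the path prefix. The following is a minimal sketch of that decision plus the local-file branch, using only the standard library. resolve_source, fetch_local, and NotFoundSketchError are hypothetical names, and the download branches are left out because they depend on the object store and network.

```python
import os
import tempfile


class NotFoundSketchError(RuntimeError):
    """Hypothetical stand-in for GetFileNotFoundException."""


def resolve_source(path, object_store=None):
    """Classify where path would be fetched from, per the rules above."""
    if object_store is not None:
        return "object_store"  # path is an object name inside the store
    if path.startswith(("http://", "https://")):
        return "url"
    return "local"


def fetch_local(path, destination):
    """Local branch only: create a symlink to path at destination."""
    if not os.path.exists(path):
        raise NotFoundSketchError(f"Local path {path} does not exist")
    os.symlink(os.path.abspath(path), destination)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "weights.bin")
    dst = os.path.join(tmp, "link.bin")
    with open(src, "w") as f:
        f.write("payload")
    fetch_local(src, dst)
    linked = os.path.islink(dst)
    with open(dst) as f:
        content = f.read()
```

Checking object_store before the URL prefix mirrors the order in which the parameter description lists the cases.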