composer.trainer.trainer_hparams#

The Hparams used to construct the Trainer.

Hparams

These classes are used with yahp for YAML-based configuration.
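
For orientation, a minimal sketch of the typical yahp flow: load the hparams from a YAML file, then build the configured object from them. The file path below is hypothetical, and the exact create() signature may vary across yahp versions.

    from composer.trainer.trainer_hparams import TrainerHparams

    # "trainer_config.yaml" is a hypothetical path; its top-level keys mirror
    # the TrainerHparams fields (model, train_dataset, max_duration, ...).
    hparams = TrainerHparams.create(f="trainer_config.yaml", cli_args=False)

    # initialize_object() builds the Trainer described by the hparams.
    trainer = hparams.initialize_object()
    trainer.fit()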

TrainerHparams

Params for instantiating the Trainer.

class composer.trainer.trainer_hparams.TrainerHparams(model, train_dataset, train_batch_size, dataloader, max_duration, datadir=None, val_dataset=None, eval_batch_size=None, evaluators=None, algorithms=<factory>, optimizer=None, schedulers=<factory>, device=<factory>, grad_accum=1, grad_clip_norm=None, validate_every_n_epochs=1, validate_every_n_batches=-1, compute_training_metrics=False, precision=Precision.AMP, scale_schedule_ratio=1.0, step_schedulers_every_batch=True, dist_timeout=15.0, ddp_sync_strategy=None, seed=None, deterministic_mode=False, loggers=<factory>, log_level='INFO', callbacks=<factory>, load_path_format=None, load_object_store=None, load_weights_only=False, load_strict_model_weights=False, load_chunk_size=1048576, load_progress_bar=True, save_folder=None, save_name_format='ep{epoch}-ba{batch}-rank{rank}', save_latest_format='latest-rank{rank}', save_overwrite=False, save_weights_only=False, save_interval='1ep', train_subset_num_batches=None, eval_subset_num_batches=None, deepspeed=None, profiler_trace_file=None, prof_event_handlers=<factory>, prof_skip_first=0, prof_wait=0, prof_warmup=1, prof_active=4, prof_repeat=1, sys_prof_cpu=True, sys_prof_memory=False, sys_prof_disk=False, sys_prof_net=False, sys_prof_stats_thread_interval_seconds=0.5, torch_profiler_trace_dir=None, torch_prof_use_gzip=False, torch_prof_record_shapes=False, torch_prof_profile_memory=True, torch_prof_with_stack=False, torch_prof_with_flops=True)[source]#

Bases: yahp.hparams.Hparams

Params for instantiating the Trainer.

See also

The documentation for the Trainer.

Parameters
  • model (ModelHparams) –

    Hparams for constructing the model to train.

    See also

    composer.models for models built into Composer.

  • train_dataset (DatasetHparams) –

    Hparams for constructing the dataset used for training.

    See also

    composer.datasets for datasets built into Composer.

  • train_batch_size (int) – The optimization batch size to use for training. This is the total batch size used to produce a gradient for each optimizer update step. (A worked example of how this interacts with grad_accum follows the parameter list.)

  • dataloader (DataLoaderHparams) – Hparams for constructing the dataloader used to load the training dataset and, if provided, the validation dataset.

  • max_duration (str) –

    The maximum duration to train, as a string (e.g. 1ep or 10ba). It will be converted to a Time object; a short parsing sketch follows the parameter list.

    See also

    Time for more details on time construction.

  • datadir (str, optional) – The data directory to use for both the training and validation datasets. If specified, it overrides both train_dataset.datadir and val_dataset.datadir. (default: None)

  • val_dataset (DatasetHparams, optional) –

    Hparams for constructing the dataset used for evaluation. (default: None)

    See also

    composer.datasets for datasets built into Composer.

  • eval_batch_size (int, optional) – The batch size to use for evaluation. Must be provided if either val_dataset or evaluators is set. (default: None)

  • evaluators (List[EvaluatorHparams], optional) –

    Hparams for constructing evaluators to be used during the eval loop. Evaluators should be used when evaluating one or more specific metrics across one or more datasets. (default: None)

    See also

    Evaluator for more details on evaluators.

  • algorithms (List[AlgorithmHparams], optional) –

    The algorithms to use during training. (default: [])

    See also

    composer.algorithms for the different algorithms built into Composer.

  • optimizer (OptimizerHparams, optional) –

    The hparams for constructing the optimizer. (default: None)

    See also

    Trainer for the default optimizer behavior when None is provided.

    See also

    composer.optim for the different optimizers built into Composer.

  • schedulers (List[SchedulerHparams], optional) –

    The learning rate schedulers. (default: [])

    See also

    Trainer for the default scheduler behavior when [] is provided.

    See also

    composer.optim.scheduler for the different schedulers built into Composer.

  • device (DeviceHparams) – Hparams for constructing the device used for training. (default: CPUDeviceHparams)

  • grad_accum (int, optional) – See Trainer.

  • grad_clip_norm (float, optional) – See Trainer.

  • validate_every_n_epochs (int, optional) – See Trainer.

  • validate_every_n_batches (int, optional) – See Trainer.

  • compute_training_metrics (bool, optional) – See Trainer.

  • precision (Precision, optional) – See Trainer.

  • scale_schedule_ratio (float, optional) – See Trainer.

  • step_schedulers_every_batch (bool, optional) – See Trainer.

  • dist_timeout (float, optional) – See Trainer.

  • ddp_sync_strategy (DDPSyncStrategy, optional) – See Trainer.

  • seed (int, optional) – See Trainer.

  • deterministic_mode (bool, optional) – See Trainer.

  • loggers (List[LoggerCallbackHparams], optional) –

    Hparams for constructing the destinations to log to. (default: [])

    See also

    composer.loggers for the different loggers built into Composer.

  • log_level (str) –

    The Python log level to use for log statements in the composer module. (default: INFO)

    See also

    The logging module in Python.

  • callbacks (List[CallbackHparams], optional) –

    Hparams for constructing the callbacks to run during training. (default: [])

    See also

    composer.callbacks for the different callbacks built into Composer.

  • load_path_format (str, optional) – See Trainer.

  • load_object_store (ObjectStoreProvider, optional) – See Trainer.

  • load_weights_only (bool, optional) – See Trainer.

  • load_strict_model_weights (bool, optional) – See Trainer.

  • load_chunk_size (int, optional) – See Trainer.

  • load_progress_bar (bool, optional) – See Trainer.

  • save_folder (str, optional) – See CheckpointSaver.

  • save_name_format (str, optional) – See CheckpointSaver. (An illustration of the names this format string produces follows the parameter list.)

  • save_latest_format (str, optional) – See CheckpointSaver.

  • save_overwrite (bool, optional) – See CheckpointSaver.

  • save_weights_only (bool, optional) – See CheckpointSaver.

  • save_interval (str, optional) – See CheckpointSaverHparams.

  • train_subset_num_batches (int, optional) – See Trainer.

  • eval_subset_num_batches (int, optional) – See Trainer.

  • deepspeed (Dict[str, JSON], optional) – If set to a dict, it is used as the DeepSpeed config for training (see Trainer for more details). If None, False is passed to the trainer's deepspeed_config parameter, signaling that DeepSpeed will not be used for training. (default: None) (A minimal config sketch follows the parameter list.)

  • profiler_trace_file (str, optional) – See Trainer.

  • prof_event_handlers (List[ProfilerEventHandlerHparams], optional) – See Trainer.

  • prof_skip_first (int, optional) – See Trainer.

  • prof_wait (int, optional) – See Trainer.

  • prof_warmup (int, optional) – See Trainer.

  • prof_active (int, optional) – See Trainer.

  • prof_repeat (int, optional) – See Trainer.

  • sys_prof_cpu (bool, optional) – See Trainer.

  • sys_prof_memory (bool, optional) – See Trainer.

  • sys_prof_disk (bool, optional) – See Trainer.

  • sys_prof_net (bool, optional) – See Trainer.

  • sys_prof_stats_thread_interval_seconds (float, optional) – See Trainer.

  • torch_profiler_trace_dir (str, optional) – See Trainer.

  • torch_prof_use_gzip (bool, optional) – See Trainer.

  • torch_prof_record_shapes (bool, optional) – See Trainer.

  • torch_prof_profile_memory (bool, optional) – See Trainer.

  • torch_prof_with_stack (bool, optional) – See Trainer.

  • torch_prof_with_flops (bool, optional) – See Trainer.
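
Examples

The following sketches illustrate a few of the parameters above; they are illustrative, not normative.

To make train_batch_size concrete, a small worked example, assuming the conventional Composer semantics in which the total optimization batch size is first split across data-parallel processes and each per-process batch is then split into grad_accum microbatches:

    # Assumed semantics: the total batch size is divided across processes,
    # then each per-process batch is split into grad_accum microbatches.
    train_batch_size = 2048  # total optimization batch size
    num_processes = 8        # e.g. 8 GPUs under distributed data parallelism
    grad_accum = 4

    per_process_batch = train_batch_size // num_processes  # 256
    microbatch_size = per_process_batch // grad_accum       # 64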
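
max_duration strings are parsed into Time objects. A short parsing sketch, assuming the composer.core.time.Time API (method names may differ slightly across Composer versions):

    from composer.core.time import Time, TimeUnit

    # Parse the same strings accepted by max_duration.
    one_epoch = Time.from_timestring("1ep")
    ten_batches = Time.from_timestring("10ba")

    assert one_epoch.unit == TimeUnit.EPOCH and one_epoch.value == 1
    assert ten_batches.unit == TimeUnit.BATCH and ten_batches.value == 10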
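
save_name_format and save_latest_format are Python format strings. The real substitution is performed by the CheckpointSaver and supports more fields than shown here; this only illustrates the shape of the resulting names.

    # Illustrative only: the CheckpointSaver performs the real substitution.
    save_name_format = "ep{epoch}-ba{batch}-rank{rank}"
    print(save_name_format.format(epoch=3, batch=1200, rank=0))
    # -> ep3-ba1200-rank0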
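
Finally, a minimal sketch of a deepspeed config dict. The "zero_optimization" key is standard DeepSpeed configuration, not Composer-specific; consult the DeepSpeed documentation for the full schema, and treat the assignment below as hypothetical usage.

    # Standard DeepSpeed config keys; see the DeepSpeed docs for the schema.
    deepspeed_config = {
        "zero_optimization": {"stage": 1},
    }

    # Hypothetical usage: set it on loaded hparams before building the Trainer.
    # hparams.deepspeed = deepspeed_config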