composer.core.evaluator#

A wrapper for a dataloader to include metrics that apply to a specific dataset.

Functions

ensure_evaluator

Ensure that evaluator is an Evaluator.

evaluate_periodically

Helper function to generate an evaluation interval callable.

Classes

Evaluator

A wrapper for a dataloader to include metrics that apply to a specific dataset.

class composer.core.evaluator.Evaluator(*, label, dataloader, metrics, subset_num_batches=None, eval_interval=None)[source]#

A wrapper for a dataloader to include metrics that apply to a specific dataset.

For example, a CrossEntropyLoss metric for NLP models.

>>> from torchmetrics.classification.accuracy import Accuracy
>>> eval_evaluator = Evaluator(
...     label="myEvaluator",
...     dataloader=eval_dataloader,
...     metrics=Accuracy()
... )
>>> trainer = Trainer(
...     model=model,
...     train_dataloader=train_dataloader,
...     eval_dataloader=eval_evaluator,
...     optimizers=optimizer,
...     max_duration="1ep",
... )
Parameters
  • label (str) – Name of the Evaluator.

  • dataloader (DataSpec | Iterable | Dict[str, Any]) – Iterable that yields batches, a DataSpec for evaluation, or a Dict of DataSpec kwargs.

  • metrics (Metric | torchmetrics.MetricCollection) – torchmetrics.Metric to log. metrics will be deep-copied to ensure that each evaluator updates only its metrics.

  • subset_num_batches (int, optional) – The maximum number of batches to use for each evaluation. Defaults to None, which means that the eval_subset_num_batches parameter from the Trainer will be used. Set to -1 to evaluate the entire dataloader.

  • eval_interval (Time | int | str | (State, Event) -> bool, optional) –

    An integer (interpreted as epochs), a time string (e.g. "1ep" or "10ba"), a Time object, or a callable. Defaults to None, which means that the eval_interval parameter from the Trainer will be used.

    If an integer (in epochs), Time string, or Time instance, the evaluator will be run with this frequency. Time strings or Time instances must have units of TimeUnit.BATCH or TimeUnit.EPOCH.

    Set to 0 to disable evaluation.

    If a callable, it should take two arguments (State, Event) and return a bool representing whether the evaluator should be invoked. The event will be either Event.BATCH_END or Event.EPOCH_END.

    When specifying eval_interval, the evaluator(s) are also run at Event.FIT_END if the interval doesn't evenly divide the training duration.
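As a sketch of the callable form of eval_interval, the function below triggers evaluation every 100 batches. The FakeTimestamp/FakeState classes and the string event names are stand-ins (assumptions for illustration) for Composer's State and Event objects, included only so the example is self-contained:

```python
from dataclasses import dataclass


# Hypothetical stand-ins for composer's State and Event, so this
# sketch runs without Composer installed. The real callable receives
# a State and an Event (BATCH_END or EPOCH_END).
@dataclass
class FakeTimestamp:
    batch: int


@dataclass
class FakeState:
    timestamp: FakeTimestamp


def eval_every_100_batches(state, event):
    """Return True when the evaluator should run: every 100 batches."""
    if event != "BATCH_END":  # real code would compare against Event.BATCH_END
        return False
    return state.timestamp.batch > 0 and state.timestamp.batch % 100 == 0
```

With the real library, such a callable would be passed as Evaluator(..., eval_interval=eval_every_100_batches), with state and event supplied by the Trainer.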

composer.core.evaluator.ensure_evaluator(evaluator, default_metrics)[source]#

Ensure that evaluator is an Evaluator.

Parameters
  • evaluator (Evaluator | DataSpec | Iterable | Dict[str, Any]) – A dataloader, DataSpec instance, dictionary of DataSpec kwargs, or existing evaluator.

  • default_metrics (Metric | torchmetrics.MetricCollection) – The metrics for the evaluator, if a dataloader was specified.

Returns

Evaluator – An evaluator.
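The pass-through-or-wrap behavior can be sketched as follows. This is a simplified reimplementation for illustration, not the library source; the minimal Evaluator stand-in and the "eval" default label are assumptions:

```python
class Evaluator:
    # Minimal stand-in mirroring the constructor documented above.
    def __init__(self, *, label, dataloader, metrics):
        self.label = label
        self.dataloader = dataloader
        self.metrics = metrics


def ensure_evaluator(evaluator, default_metrics):
    """Return the argument unchanged if it is already an Evaluator;
    otherwise wrap the dataloader in a new Evaluator with the default metrics."""
    if isinstance(evaluator, Evaluator):
        return evaluator
    return Evaluator(label="eval", dataloader=evaluator, metrics=default_metrics)
```

This lets the Trainer accept either a bare dataloader or a fully configured Evaluator for its eval_dataloader argument and normalize both to the same type internally.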

composer.core.evaluator.evaluate_periodically(eval_interval, eval_at_fit_end=True)[source]#

Helper function to generate an evaluation interval callable.

Parameters
  • eval_interval (str | Time | int) – A Time instance or time string, or integer in epochs, representing how often to evaluate. Set to 0 to disable evaluation.

  • eval_at_fit_end (bool) – Whether to evaluate at the end of training, regardless of eval_interval. Default: True

Returns

(State, Event) -> bool – A callable for the eval_interval argument of an Evaluator.
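A simplified sketch of the closure this helper returns, assuming an integer eval_interval in epochs, a plain dict in place of State, and string event names in place of Event (all assumptions made so the sketch is self-contained):

```python
def evaluate_periodically(eval_interval, eval_at_fit_end=True):
    """Return a (state, event) -> bool callable; here state is a plain dict."""

    def should_eval(state, event):
        if eval_interval == 0:  # 0 disables evaluation entirely
            return False
        if eval_at_fit_end and event == "FIT_END":
            return True  # catch-up run at the end of training
        if event == "EPOCH_END":
            return state["epoch"] % eval_interval == 0
        return False

    return should_eval
```

The eval_at_fit_end branch reflects the documented behavior of running the evaluator at Event.FIT_END when the interval does not evenly divide the training duration.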