composer.utils.collect_env
Helpers to gather system information for debugging and bug reporting.
Leverages PyTorch's torch.utils.collect_env package to gather pertinent system information.
The following information is additionally collected to facilitate Composer-specific debugging:
Composer version
Number of nodes
Host processor model name
Host processor physical core count
Number of accelerators per node
Accelerator model name
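As a rough illustration of the host-related items in the list above, the following standard-library sketch gathers a processor description and a CPU count. It is not Composer's implementation: `host_summary` is a hypothetical helper, and `os.cpu_count()` reports logical CPUs rather than the physical core count the module collects.

```python
import os
import platform


def host_summary() -> dict:
    """Return a few host facts using only the standard library."""
    return {
        # platform.processor() may be empty on some systems, so fall back.
        "processor": platform.processor() or platform.machine() or "unknown",
        # Logical CPU count; physical cores need a third-party package.
        "logical_cpu_count": os.cpu_count(),
        "platform": platform.platform(),
    }


if __name__ == "__main__":
    for key, value in host_summary().items():
        print(f"{key}: {value}")
```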
This module can be run as a standalone console script or called from within an application to generate a system environment report.
The module can be invoked by using the entrypoint alias:
$ composer_collect_env
Or manually as a standalone script:
$ python composer/utils/collect_env.py
To generate a system report from within a user application, see print_env().
A custom excepthook wrapper is also provided. It extends the original sys.excepthook() to automatically collect system information when an exception is raised.
To override the original sys.excepthook(), see configure_excepthook().
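The idea of extending sys.excepthook() can be sketched with the standard library alone. The snippet below is a minimal illustration, not Composer's handler: `make_reporting_hook` and `demo_report` are hypothetical names, and the report text is a placeholder.

```python
import io
import sys


def make_reporting_hook(original_hook, report_fn, stream=None):
    """Build an excepthook that calls the original hook, then writes a report."""

    def hook(exc_type, exc_value, exc_traceback):
        # Keep the existing behaviour (normally traceback printing) first.
        original_hook(exc_type, exc_value, exc_traceback)
        # Then append the extra diagnostic text.
        out = stream if stream is not None else sys.stderr
        out.write(report_fn() + "\n")

    return hook


def demo_report() -> str:
    # Hypothetical stand-in for a real environment report.
    return f"Python {sys.version_info.major}.{sys.version_info.minor}"
```

Installing such a hook would look like `sys.excepthook = make_reporting_hook(sys.excepthook, demo_report)`. Chaining to the previous hook keeps the normal traceback output intact.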
By default, the Composer custom excepthook automatically generates the environment report.
To disable automatic environment report generation, use the disable_env_report() helper function. Report generation can be re-enabled by using the enable_env_report() function.
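One plausible way to implement this kind of on/off switch is a module-level flag that the handler consults. The sketch below assumes that design; the names `_REPORT_ENABLED`, `disable_report`, `enable_report`, and `maybe_report` are hypothetical and do not come from Composer.

```python
_REPORT_ENABLED = True  # assumed default: reporting is on


def disable_report() -> None:
    """Turn report generation off."""
    global _REPORT_ENABLED
    _REPORT_ENABLED = False


def enable_report() -> None:
    """Turn report generation back on."""
    global _REPORT_ENABLED
    _REPORT_ENABLED = True


def maybe_report(report_fn):
    """Return the report text if reporting is enabled, else None."""
    return report_fn() if _REPORT_ENABLED else None
```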
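Functions
| configure_excepthook() | Collect and print system information when sys.excepthook() is called. |
| disable_env_report() | Disable environment report generation on exception. |
| enable_env_report() | Enable environment report generation on exception. |
| print_env() | Generate system information report. |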
- class composer.utils.collect_env.ComposerEnv(composer_version, node_world_size, host_processor_model_name, host_processor_core_count, local_world_size, accelerator_model_name, cuda_device_count)
Bases: tuple
composer.utils.collect_env.ComposerEnv
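Since the class derives from tuple and takes named fields, it behaves like a named tuple. The sketch below mirrors the field names from the signature in a hypothetical `EnvRecord` class. The field types are assumptions, and the values are taken from the sample report on this page.

```python
from typing import NamedTuple


class EnvRecord(NamedTuple):
    """Hypothetical mirror of the fields in the ComposerEnv signature."""

    composer_version: str  # types here are assumed, not documented
    node_world_size: int
    host_processor_model_name: str
    host_processor_core_count: int
    local_world_size: int
    accelerator_model_name: str
    cuda_device_count: int


record = EnvRecord(
    composer_version="0.6.0",
    node_world_size=1,
    host_processor_model_name="AMD EPYC 7502 32-Core Processor",
    host_processor_core_count=64,
    local_world_size=1,
    accelerator_model_name="NVIDIA GeForce RTX 3080",
    cuda_device_count=4,
)

# Fields are reachable by name, and the record is still a plain tuple.
print(record.composer_version, record.cuda_device_count)
```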
- composer.utils.collect_env.configure_excepthook()
Collect and print system information when sys.excepthook() is called.
The custom exception handler causes an exception message to be printed when sys.excepthook() is called. The exception message provides the user with information on the nature of the exception and directs the user to file GitHub issues as appropriate.
By default, the custom exception handler also generates an environment report that users can attach to bug reports. Environment report generation can be enabled or disabled by using the enable_env_report() and disable_env_report() helper functions, respectively.
Additionally, the custom excepthook checks whether the user is running from an IPython session and sets up the custom exception handler accordingly.
To override the default sys.excepthook() with the custom excepthook:
>>> configure_excepthook()
>>> sys.excepthook
<function _custom_exception_handler at ...>
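The IPython check matters because an IPython shell handles exceptions itself rather than going through sys.excepthook(), and exposes its own hook, set_custom_exc(), for that purpose. How Composer performs the check is not shown here. The sketch below is one common way to detect an active shell, with `in_ipython_session` as a hypothetical helper name.

```python
def in_ipython_session() -> bool:
    """Return True when an IPython shell is active in this process."""
    try:
        # IPython is optional; treat a missing package as "not in IPython".
        from IPython import get_ipython
    except ImportError:
        return False
    # get_ipython() returns None when no interactive shell has been started.
    return get_ipython() is not None


if __name__ == "__main__":
    print("IPython session:", in_ipython_session())
```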
- composer.utils.collect_env.disable_env_report()
Disable environment report generation on exception.
- composer.utils.collect_env.enable_env_report()
Enable environment report generation on exception.
- composer.utils.collect_env.get_cuda_device_count()
Get the number of CUDA devices on the system.
- composer.utils.collect_env.get_host_processor_cores()
Determine the number of physical host processor cores.
- composer.utils.collect_env.get_local_world_size()
Determine the number of accelerators per node.
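Distributed launchers such as torchrun export the per-node process count in the LOCAL_WORLD_SIZE environment variable. Whether get_local_world_size() reads that variable is an assumption. The sketch below only shows the general pattern, with `local_world_size_from_env` as a hypothetical helper.

```python
import os


def local_world_size_from_env(default: int = 1) -> int:
    """Read LOCAL_WORLD_SIZE, falling back to a default when it is unset."""
    raw = os.environ.get("LOCAL_WORLD_SIZE")
    # An unset or empty variable means a single-process (non-distributed) run.
    return int(raw) if raw else default
```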
- composer.utils.collect_env.get_torch_env()
Query Torch system environment via torch.utils.collect_env.
- composer.utils.collect_env.print_env(file=None)
Generate system information report.
Example:
from composer.utils.collect_env import print_env
print_env()
Sample Report:
---------------------------------
System Environment Report
Created: 2022-04-27 00:25:33 UTC
---------------------------------

PyTorch information
-------------------
PyTorch version: 1.9.1+cu111
Is debug build: False
CUDA used to build PyTorch: 11.1
ROCM used to build PyTorch: N/A

OS: Ubuntu 18.04.6 LTS (x86_64)
GCC version: (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Clang version: Could not collect
CMake version: version 3.10.2
Libc version: glibc-2.27

Python version: 3.8 (64-bit runtime)
Python platform: Linux-5.8.0-63-generic-x86_64-with-glibc2.27
Is CUDA available: True
CUDA runtime version: 11.1.105
GPU models and configuration:
GPU 0: NVIDIA GeForce RTX 3080
GPU 1: NVIDIA GeForce RTX 3080
GPU 2: NVIDIA GeForce RTX 3080
GPU 3: NVIDIA GeForce RTX 3080

Nvidia driver version: 470.57.02
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.0.5
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.0.5
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] pytorch-ranger==0.1.1
[pip3] torch==1.9.1+cu111
[pip3] torch-optimizer==0.1.0
[pip3] torchmetrics==0.7.3
[pip3] torchvision==0.10.1+cu111
[pip3] vit-pytorch==0.27.0
[conda] Could not collect

Composer information
--------------------
Composer version: 0.6.0
Host processor model name: AMD EPYC 7502 32-Core Processor
Host processor core count: 64
Number of nodes: 1
Accelerator model name: NVIDIA GeForce RTX 3080
Accelerators per node: 1
CUDA Device Count: 4
- Parameters
file (TextIO, optional) – File handle, sys.stdout or sys.stderr. Defaults to sys.stdout.
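The optional file-handle parameter follows a common Python pattern, sketched below with a hypothetical stand-in function, `write_report`, rather than print_env() itself. Any text stream works as the target, so an in-memory buffer can capture the output for attaching to a bug report.

```python
import io
import sys
from typing import Iterable, Optional, TextIO


def write_report(lines: Iterable[str], file: Optional[TextIO] = None) -> None:
    """Write report lines to the given handle, defaulting to sys.stdout."""
    # Resolve the default at call time so a redirected stdout is honoured.
    out = sys.stdout if file is None else file
    for line in lines:
        print(line, file=out)


# Capture the report in memory instead of printing it.
buffer = io.StringIO()
write_report(["Composer version: 0.6.0", "Number of nodes: 1"], file=buffer)
captured = buffer.getvalue()
```

Given the documented TextIO parameter, passing a similar buffer or an open file as print_env(file=...) should redirect the real report in the same way.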