Installation
#
RequirementsHere're the system requirements for MS-AMP.
- Latest version of Linux, you're highly encouraged to use Ubuntu 18.04 or later.
- Nvidia GPU(e.g. H100/A100) and compatible drivers should be installed correctly.
- Driver version can be checked by running
nvidia-smi
. - Python version 3.7 or later (which can be checked by running
python3 --version
). - Pip version 18.0 or later (which can be checked by running
python3 -m pip --version
). - CUDA version 11 or later (which can be checked by running
nvcc --version
). - PyTorch version 1.14 or later (which can be checked by running
python -c "import torch; print(torch.__version__)"
).
You can try MS-AMP in two ways: Using Docker or installing from source.
- Using Docker is a convenient way to get started with MS-AMP. You can use the pre-built Docker image to quickly set up an environment for running MS-AMP.
- On the other hand, installing from source gives you more control over the installation process and allows you to customize the installation to your needs.
#
Use DockerYou can try the latest MS-AMP Docker container with the following commands:
sudo docker run -it -d --name=msampcu122 --privileged --net=host --ipc=host --gpus=all -v /:/hostroot ghcr.io/azure/msamp:main-cuda12.2 bashsudo docker exec -it msampcu122 bash
MS-AMP is pre-installed in Docker container and you can verify it by running:
python -c 'import msamp;print(msamp.__version__)'
We also provide stable Docker images here.
#
Install from sourceWe strongly recommend using PyTorch NGC Container to avoid messing up local environment.
For example, to start PyTorch 2.1 container, run the following command:
sudo docker run -it -d --name=msamp --privileged --net=host --ipc=host --gpus=all nvcr.io/nvidia/pytorch:23.10-py3 bashsudo docker exec -it msamp bash
Then, you can clone the source from GitHub.
git clone https://github.com/Azure/MS-AMP.gitcd MS-AMPgit submodule update --init --recursive
If you want to train model with multiple GPU, you need to install MSCCL to support FP8. Please note that the compilation of MSCCL may take ~40 minutes on A100 nodes and ~7 minutes on H100 node.
cd third_party/msccl
# A100make -j src.build NVCC_GENCODE="-gencode=arch=compute_80,code=sm_80"# H100make -j src.build NVCC_GENCODE="-gencode=arch=compute_90,code=sm_90"
apt-get updateapt install build-essential devscripts debhelper fakerootmake pkg.debian.builddpkg -i build/pkg/deb/libnccl2_*.debdpkg -i build/pkg/deb/libnccl-dev_2*.deb
cd -
Then, you can install MS-AMP from source.
python3 -m pip install --upgrade pippython3 -m pip install .make postinstall
Before using MS-AMP, you need to preload msampfp8 library and it's depdencies:
NCCL_LIBRARY=/usr/lib/x86_64-linux-gnu/libnccl.so # Change as neededexport LD_PRELOAD="/usr/local/lib/libmsamp_dist.so:${NCCL_LIBRARY}:${LD_PRELOAD}"
After that, you can verify the installation by running:
python3 -c "import msamp; print(msamp.__version__)"