Releasing MS-AMP v0.4

MS-AMP Team

We are very happy to announce that MS-AMP version 0.4.0 is officially released today!

You can install and try MS-AMP by following the Getting Started tutorial.

MS-AMP 0.4.0 Release Notes

MS-AMP Improvements

  • Improve GPT-3 performance by optimizing FP8 gradient accumulation with kernel fusion
  • Support FP8 in FSDP
  • Support DeepSpeed+TE+MSAMP and add cifar10 example
  • Support MSAMP+TE+DDP
  • Update DeepSpeed to the latest version
  • Update TransformerEngine to V1.1 and flash-attn to the latest version
  • Support CUDA 12.2
  • Fix several bugs in DeepSpeed integration

MS-AMP-Examples Improvements

  • Improve documentation for data processing in GPT-3
  • Add launch script for pretraining GPT-6b7
  • Use the new TransformerEngine API in Megatron-LM

Documentation Improvements

  • Add Docker usage to the Installation page
  • Explain how to run the FSDP and DeepSpeed+TE+MSAMP examples on the "Run Examples" page

Releasing MS-AMP v0.3

MS-AMP Team

We are very happy to announce that MS-AMP version 0.3.0 is officially released today!

You can install and try MS-AMP by following the Getting Started tutorial.

MS-AMP 0.3.0 Release Notes

MS-AMP Improvements

  • Integrate the latest Transformer Engine into MS-AMP
  • Integrate with the latest Megatron-LM
  • Add a website for MS-AMP and improve the documentation
  • Add a custom DistributedDataParallel that supports FP8 and computation/communication overlap
  • Refactor code in dist_op module
  • Support unit tests (UT) for distributed testing
  • Integrate with MSCCL

MS-AMP-Examples Improvements

  • Support pretraining GPT-3 with Megatron-LM and MS-AMP
  • Provide a tool to print the per-second traffic of NVLink and InfiniBand
  • Print TFLOPS and throughput metrics in all the examples

Documentation Improvements

  • Add performance numbers to the Introduction page
  • Enhance the Usage page and the Optimization Level page
  • Add Container Images page
  • Add Developer Guide section

Releasing MS-AMP v0.2

MS-AMP Team

We are very happy to announce that MS-AMP version 0.2.0 is officially released today!

You can install and try MS-AMP by following the Getting Started tutorial.

MS-AMP 0.2.0 Release Notes

MS-AMP Improvements

  • Add the O3 optimization level to support FP8 in distributed training frameworks
  • Support ScalingTensor in functional.linear
  • Support customized attributes in FP8Linear
  • Improve performance
  • Add Dockerfiles for PyTorch 1.14 + CUDA 11.8 and PyTorch 2.1 + CUDA 12.1
  • Support PyTorch 2.1
  • Add performance results and TE results to the homepage
  • Cache TE build in pipeline

MS-AMP-Examples Improvements

Add 3 examples using MS-AMP: