ZeRO & DeepSpeed: Enabling Models with Over 100 Billion Parameters

DeepSpeed, an open-source library released by Microsoft, greatly advances the training of very large models with more than 100 billion parameters.

DeepSpeed improves four aspects of training:

  • Scale: models up to 10x bigger
  • Speed: up to 5x faster
  • Cost: 5x reduction in cost
  • Usability: low code impact

For details, see the announcement on the Microsoft Research Blog.
