Cerebras Systems has unveiled Andromeda, a 13.5 million core AI supercomputer now available for commercial and academic work. Equipped with more than 13.5 million AI-optimized compute cores (1.6 times as many cores as the largest supercomputer in the world) and fed by 18,176 cores of 3rd Gen AMD EPYC processors, Andromeda delivers near-perfect scaling via simple data parallelism across GPT-class large language models, including GPT-3, GPT-J and GPT-NeoX, unlike any known GPU-based cluster.
Andromeda is deployed in Santa Clara, California, in 16 racks at Colovore, a leading high-performance data center. Its 16 CS-2 systems, with a combined 13.5 million AI-optimized cores, are fed by 284 64-core 3rd Gen AMD EPYC processors. The SwarmX fabric, which links the MemoryX parameter storage solution to the 16 CS-2s, provides more than 96.8 terabits per second of bandwidth.
“Near-perfect scaling means that as additional CS-2s are used, training time is reduced in near-perfect proportion. This includes large language models with very large sequence lengths, a task that is impossible to achieve on GPUs. In fact, GPU-impossible work was demonstrated by one of Andromeda’s first users, who achieved near-perfect scaling on GPT-J at 2.5 billion and 25 billion parameters with long sequence lengths — an MSL of 10,240. That user attempted the same work on Polaris, a cluster of 2,000 Nvidia A100 GPUs, and the GPUs were unable to do the work because of GPU memory and memory bandwidth limitations.”
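The claim above can be made concrete with a little arithmetic: under data parallelism, doubling the number of systems should roughly halve training time. A minimal sketch of that relationship, using hypothetical numbers rather than Cerebras benchmarks:

```python
# Toy model of data-parallel scaling: training time vs. cluster size.
# base_time_hours is a hypothetical single-system figure, not a measurement.

def training_time(base_time_hours, n_systems, efficiency=1.0):
    """Time to train when the batch is split across n_systems.

    efficiency = 1.0 is perfect linear scaling; real clusters fall
    slightly short (e.g. ~0.99 for "near-perfect" scaling).
    """
    speedup = n_systems * efficiency
    return base_time_hours / speedup

base = 160.0  # hypothetical hours on one system
for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} systems: {training_time(base, n, efficiency=0.99):7.2f} h")
```

Under perfect scaling (efficiency 1.0), 16 systems cut a 160-hour run to exactly 10 hours; at 99% efficiency the result is only marginally longer.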
“Andromeda’s near-perfect scaling across the largest natural language processing models is made possible by the second-generation Cerebras Wafer Scale Engine (WSE-2), the industry’s largest and most powerful processor, and by Cerebras’ MemoryX and SwarmX technologies.
MemoryX enables even a single CS-2 to support multi-trillion parameter models. SwarmX technology links MemoryX to a cluster of CS-2s. Together these industry-leading technologies enable Cerebras’ large clusters to avoid two of the major challenges plaguing traditional clusters used for modern AI work: the complexity of parallel programming and the performance degradation of distributed computing.”
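To see why external parameter storage matters for multi-trillion parameter models, consider the raw memory such a model needs. The byte counts below are standard figures for FP16 weights with Adam-style optimizer state, not Cerebras-specific internals:

```python
# Back-of-the-envelope memory for a 1-trillion-parameter model.
params = 1_000_000_000_000

weights_fp16 = params * 2   # 2 bytes per FP16 weight
# Adam-style training typically also keeps FP32 master weights plus
# two FP32 optimizer moments: 4 + 4 + 4 = 12 extra bytes per parameter.
optimizer_state = params * 12

total_tb = (weights_fp16 + optimizer_state) / 1e12
print(f"~{total_tb:.0f} TB of parameter and optimizer state")
```

Roughly 14 TB for weights and optimizer state alone, which is far beyond any single accelerator's on-chip memory and is why a disaggregated store like MemoryX is needed.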
“The 16 CS-2s powering Andromeda run in a strictly data parallel mode, enabling simple and easy model distribution, and single-keystroke scaling from 1 to 16 CS-2s. In fact, sending AI jobs to Andromeda can be done quickly and painlessly from a Jupyter notebook, and users can switch from one model to another with a few keystrokes.
Andromeda’s 16 CS-2s were assembled in only 3 days, without any changes to the code, and immediately thereafter workloads scaled linearly across all 16 systems. And because the Cerebras WSE-2 processor, at the heart of its CS-2s, has 1,000 times more memory bandwidth than a GPU, Andromeda can harvest structured and unstructured sparsity as well as static and dynamic sparsity. These are things other hardware accelerators, including GPUs, simply can’t do. The result is that Cerebras can train models in excess of 90% sparse to state-of-the-art accuracy.”
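The "strictly data parallel" mode described above has a simple core: every system holds identical weights, processes its own shard of each batch, and the resulting gradients are averaged before a single shared weight update. A toy sketch in plain Python (no Cerebras APIs, which are not documented here), for a one-parameter squared-error objective:

```python
# Toy data parallelism: n workers, one batch split n ways,
# gradients averaged into a single shared weight update.

def local_gradient(weight, shard):
    # Stand-in for a real backward pass: mean gradient of
    # 0.5 * (weight - x)**2 over the shard, i.e. mean of (weight - x).
    return sum(weight - x for x in shard) / len(shard)

def data_parallel_step(weight, batch, n_workers, lr=0.1):
    # Assumes len(batch) is divisible by n_workers for simplicity.
    shard_size = len(batch) // n_workers
    shards = [batch[i * shard_size:(i + 1) * shard_size]
              for i in range(n_workers)]
    grads = [local_gradient(weight, s) for s in shards]  # parallel on hardware
    avg_grad = sum(grads) / n_workers                    # the all-reduce step
    return weight - lr * avg_grad

batch = [1.0, 2.0, 3.0, 4.0]
# Splitting the batch across 1, 2, or 4 workers yields the same update:
print(data_parallel_step(0.0, batch, 1), data_parallel_step(0.0, batch, 4))
```

Because the averaged gradient is identical however the batch is sharded, the model itself never changes as the cluster grows, which is what makes "single-keystroke" scaling from 1 to 16 systems possible.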
- Argonne National Laboratory: “In collaboration with Cerebras researchers, our team at Argonne has completed pioneering work on gene transformers – work that is a finalist for ACM Gordon Bell Special Prize for HPC-Based COVID-19 Research. Using GPT3-XL, we put the entire COVID-19 genome into the sequence window, and Andromeda ran our unique genetic workload with long sequence lengths (MSL of 10K) across 1, 2, 4, 8 and 16 nodes, with near-perfect linear scaling. Linear scaling is amongst the most sought-after characteristics of a big cluster, and Cerebras Andromeda’s delivered 15.87X throughput across 16 CS-2 systems, compared to a single CS-2, and a reduction in training time to match. Andromeda sets a new bar for AI accelerator performance,” said Rick Stevens, Associate Lab Director, at Argonne National Laboratory.
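The 15.87X figure quoted above corresponds to a scaling efficiency just shy of ideal; the check is one division:

```python
# Scaling efficiency implied by the quoted Argonne result:
# 15.87x throughput on 16 CS-2 systems vs. a single CS-2.
speedup = 15.87
systems = 16
efficiency = speedup / systems
print(f"{efficiency:.1%} of perfect linear scaling")  # prints "99.2% of perfect linear scaling"
```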
- JasperAI: “Jasper uses large language models to write copy for marketing, ads, books, and more. We have over 85,000 customers who use our models to generate moving content and ideas. Given our large and growing customer base, we’re exploring, testing and scaling models fit to each customer and their use cases. Creating complex new AI systems and bringing them to customers at increasing levels of granularity demands a lot from our infrastructure. We are thrilled to partner with Cerebras and leverage Andromeda’s performance and near-perfect scaling without traditional distributed computing and parallel programming pains to design and optimize our next set of models,” said Dave Rogenmoser, CEO of JasperAI.
- AMD: “AMD is investing in technology that will pave the way for pervasive AI, unlocking new levels of efficiency and agility for businesses. The combination of the Cerebras Andromeda AI supercomputer and a data pre-processing pipeline powered by AMD EPYC servers will put more capacity in the hands of researchers and support faster and deeper AI capabilities,” said Kumaran Siva, corporate vice president, Software & Systems Business Development, AMD.
- University of Cambridge: “It is extraordinary that Cerebras provided graduate students with free access to a cluster this big. Andromeda delivers 13.5 million AI cores and near-perfect linear scaling across the largest language models, without the pain of distributed compute and parallel programming. This is every ML graduate student’s dream,” said Mateo Espinosa, doctoral candidate at the University of Cambridge in the United Kingdom.
Source : Cerebras Systems