The NVIDIA GH200 Grace Hopper Superchip has made a remarkable debut in the world of artificial intelligence and machine learning, completing every data center inference test in the latest MLPerf industry benchmarks. The achievement extends NVIDIA’s record of performance leadership, which stretches back to the inception of the MLPerf benchmarks in 2018.
The GH200 Superchip pairs a Hopper GPU with a Grace CPU on a single module, giving the GPU access to more memory and bandwidth and allowing power to be shifted between the CPU and GPU to suit the workload. This design helped NVIDIA’s H100 GPUs and Grace Hopper Superchips lead across all of MLPerf’s data center tests, including inference for computer vision, speech recognition, and medical imaging.
NVIDIA GH200 Grace Hopper Superchip
In a bid to further optimize inference, NVIDIA developed TensorRT-LLM, open-source software for generative AI inference. It lets customers roughly double the inference performance of their H100 GPUs at no added cost. Running GPT-J 6B, H100 GPUs with TensorRT-LLM deliver up to an 8x speedup over prior-generation GPUs without the software, a significant leap that underlines the value NVIDIA aims to provide its customers.
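For readers who want to try it, the snippet below is a minimal sketch of what an inference call with TensorRT-LLM’s high-level Python API can look like. It assumes the tensorrt_llm package is installed and uses the GPT-J 6B checkpoint mentioned above; exact class and argument names can vary between releases, so treat this as illustrative rather than an official recipe.

```python
# Minimal sketch (not an official NVIDIA example): running GPT-J 6B through
# TensorRT-LLM's high-level LLM API. Names/arguments may differ by release.
from tensorrt_llm import LLM, SamplingParams

# Build (or load) a TensorRT engine for the model and prepare it for inference.
llm = LLM(model="EleutherAI/gpt-j-6b")

# Generate completions for a batch of prompts.
prompts = ["Explain what an MLPerf inference benchmark measures."]
outputs = llm.generate(prompts, SamplingParams(max_tokens=64))

for output in outputs:
    print(output.outputs[0].text)
```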
NVIDIA’s L4 GPUs also delivered impressive performance across the board in the latest MLPerf benchmarks. They provided up to 6x more performance than CPUs rated for nearly 5x higher power consumption. These GPUs are available from Google Cloud and many system builders, serving customers in industries ranging from consumer internet services to drug discovery.
In a further demonstration of its innovative approach, NVIDIA applied a new model compression technology to achieve up to a 4.7x performance boost running the BERT LLM on an L4 GPU. This technology showcases NVIDIA’s commitment to continually improving the performance of its products.
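NVIDIA has not publicly detailed the compression method behind that result, but the general principle, shrinking a model’s weights so inference needs less memory bandwidth and compute, can be illustrated with off-the-shelf tooling. The sketch below applies PyTorch’s dynamic INT8 quantization to a standard BERT checkpoint purely as a stand-in; it is not NVIDIA’s technique, and the model name is just an example.

```python
# Generic illustration of model compression via post-training quantization.
# This is NOT the compression technology NVIDIA used; it only demonstrates the
# idea that smaller weights (INT8 here) can reduce the cost of inference.
import torch
from transformers import AutoModelForSequenceClassification

# Load a standard full-precision BERT checkpoint (illustrative choice).
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# Replace the Linear layers' FP32 weights with dynamically quantized INT8 weights.
quantized = torch.ao.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},   # layer types to compress
    dtype=torch.qint8,   # 8-bit integer weights
)

# The compressed model is a drop-in replacement for quick CPU inference experiments.
print(quantized)
```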
The NVIDIA Jetson Orin system-on-module also posted significant gains, improving object detection performance by up to 84% over the prior round. Object detection is a common workload in edge AI and robotics, further demonstrating the versatility of NVIDIA’s technology.
The MLPerf benchmarks, backed by more than 70 organizations including Alibaba, Arm, Cisco, Google, Harvard University, Intel, Meta, Microsoft, and the University of Toronto, are a reliable measure of performance in the AI industry. NVIDIA’s exceptional performance in these benchmarks is a testament to its technological prowess and commitment to innovation.
In a move that promotes transparency and collaboration, all the software used in NVIDIA’s benchmarks is available from the MLPerf repository. This allows everyone to achieve the same world-class results, fostering a spirit of shared learning and progress in the AI industry.
In conclusion, the GH200 Grace Hopper Superchip’s showing in the MLPerf benchmarks underlines NVIDIA’s determination to push the boundaries of AI technology. With its combination of a Hopper GPU and a Grace CPU, open-source software such as TensorRT-LLM, and new model compression techniques, NVIDIA continues to lead the way in AI training and inference.
Source: NVIDIA