As well as announcing the release of its new NVIDIA CloudXR 3.0, NVIDIA has also officially introduced its new PCIe-based accelerator with 80 GB of HBM2e memory. The NVIDIA A100 PCIe 80GB card offers twice the memory capacity of the original A100 40GB, together with higher bandwidth.
The new NVIDIA A100 PCIe 80GB is based on the 7nm Ampere GA100 GPU and is equipped with 6,912 CUDA cores. It offers a bandwidth of 2039 GB/s, over 484 GB/s more than the A100 40GB launched approximately seven months ago, thanks to faster HBM2e memory with an effective speed of 3186 Mbps. The NVIDIA A100 PCIe 80GB card has been specifically designed for high-performance computing and to accelerate deep learning algorithms for AI applications.
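Those bandwidth figures are consistent with a simple calculation: peak memory bandwidth is the bus width (in bytes) times the effective per-pin data rate. A minimal sketch in Python, assuming the 5120-bit HBM2/HBM2e interface and the 40GB card's 2430 Mbps effective speed from NVIDIA's published specifications (neither figure appears in this article):

```python
# Peak memory bandwidth = bus width (bytes) * effective per-pin data rate.
BUS_WIDTH_BITS = 5120  # A100's HBM2/HBM2e memory interface (NVIDIA spec)

def peak_bandwidth_gb_s(effective_mbps: float) -> float:
    """Peak bandwidth in GB/s for a given effective pin speed in Mbps."""
    return (BUS_WIDTH_BITS / 8) * effective_mbps / 1000

a100_80gb = peak_bandwidth_gb_s(3186)  # 80GB card: 3186 Mbps effective
a100_40gb = peak_bandwidth_gb_s(2430)  # 40GB card: 2430 Mbps effective

print(round(a100_80gb), round(a100_40gb), round(a100_80gb - a100_40gb))
# → 2039 1555 484
```

The 484 GB/s delta printed at the end matches the "over 484 GB/s more" claim above.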
“The new Multi-Instance GPU (MIG) feature allows GPUs based on the NVIDIA Ampere architecture (such as NVIDIA A100) to be securely partitioned into up to seven separate GPU Instances for CUDA applications, providing multiple users with separate GPU resources for optimal GPU utilization. This feature is particularly beneficial for workloads that do not fully saturate the GPU’s compute capacity and therefore users may want to run different workloads in parallel to maximize utilization. For Cloud Service Providers (CSPs), who have multi-tenant use cases, MIG ensures one client cannot impact the work or scheduling of other clients, in addition to providing enhanced isolation for customers.”
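On an A100, MIG partitioning is driven from the `nvidia-smi` command line. A hedged sketch of the workflow on an 80GB card (the `1g.10gb` profile name is specific to the 80GB model, and these commands require admin rights on a MIG-capable driver, so they cannot run on ordinary hardware):

```
# Enable MIG mode on GPU 0 (takes effect after a GPU reset or reboot).
sudo nvidia-smi -i 0 -mig 1

# List the GPU instance profiles the driver offers on this card.
nvidia-smi mig -lgip

# Create seven 1g.10gb GPU instances, and with -C a matching compute
# instance inside each one -- seven isolated slices of the 80GB card.
sudo nvidia-smi mig -cgi 1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb,1g.10gb -C

# Verify: each MIG device is listed with its own UUID.
nvidia-smi -L
```

Each UUID from the final command can then be targeted individually, which is what gives each client its own isolated slice.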
“With MIG, each instance’s processors have separate and isolated paths through the entire memory system – the on-chip crossbar ports, L2 cache banks, memory controllers, and DRAM address busses are all assigned uniquely to an individual instance. This ensures that an individual user’s workload can run with predictable throughput and latency, with the same L2 cache allocation and DRAM bandwidth, even if other tasks are thrashing their own caches or saturating their DRAM interfaces. MIG can partition available GPU compute resources (including streaming multiprocessors or SMs, and GPU engines such as copy engines or decoders), to provide a defined quality of service (QoS) with fault isolation for different clients such as VMs, containers or processes. MIG enables multiple GPU Instances to run in parallel on a single, physical NVIDIA Ampere GPU.
“With MIG, users will be able to see and schedule jobs on their new virtual GPU Instances as if they were physical GPUs. MIG works with Linux operating systems, supports containers using Docker Engine, and supports Kubernetes and virtual machines using hypervisors such as Red Hat Virtualization and VMware vSphere.”
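Under Kubernetes, NVIDIA's device plugin (in its "mixed" MIG strategy) exposes each MIG profile as an extended resource that a pod can request like any other. A sketch of such a pod spec, assuming a `1g.10gb` instance has already been created on the node; the pod name and image tag are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mig-demo            # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda-job
    image: nvidia/cuda:11.0-base   # any CUDA-enabled image works
    command: ["nvidia-smi", "-L"]  # lists only the MIG slice this pod was given
    resources:
      limits:
        nvidia.com/mig-1g.10gb: 1  # request one 1g.10gb MIG slice (mixed strategy)
```

The scheduler treats each MIG slice as a discrete, countable resource, which is what lets one physical A100 serve up to seven such pods in parallel.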