Skip to content
Surf Wiki
Save to docs
general

From Surf Wiki (app.surf) — the open knowledge base

Ampere (microarchitecture)

Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures. It was officially announced on May 14, 2020, and is named after French mathematician and physicist André-Marie Ampère.


Launched
Nvidia
.mw-parser-output .plainlist ol,.mw-parser-output .plainlist ul{line-height:inherit;list-style:none;margin:0;padding:0}.mw-parser-output .plainlist ol li,.mw-parser-output .plainlist ul li{margin-bottom:0}TSMCSamsung
TSMC N7 (professional) Samsung 8N (consumer)
GA10x
GeForce RTX 30 series
RTX A series
A100
192 KB per SM (professional)128 KB per SM (consumer)
2 MB to 6 MB
GDDR6GDDR6XHBM2
PCIe 4.0
DirectX 12 Ultimate (Feature Level 12_2)
Direct3D 12.0
Shader Model 6.8
OpenGL 4.6
Compute Capability 8.6
Vulkan 1.4
OpenCL 3.0
.mw-parser-output .hlist dl,.mw-parser-output .hlist ol,.mw-parser-output .hlist ul{margin:0;padding:0}.mw-parser-output .hlist dd,.mw-parser-output .hlist dt,.mw-parser-output .hlist li{margin:0;display:inline}.mw-parser-output .hlist.inline,.mw-parser-output .hlist.inline dl,.mw-parser-output .hlist.inline ol,.mw-parser-output .hlist.inline ul,.mw-parser-output .hlist dl dl,.mw-parser-output .hlist dl ol,.mw-parser-output .hlist dl ul,.mw-parser-output .hlist ol dl,.mw-parser-output .hlist ol ol,.mw-parser-output .hlist ol ul,.mw-parser-output .hlist ul dl,.mw-parser-output .hlist ul ol,.mw-parser-output .hlist ul ul{display:inline}.mw-parser-output .hlist .mw-empty-li{display:none}.mw-parser-output .hlist dt::after{content:": "}.mw-parser-output .hlist dd::after,.mw-parser-output .hlist li::after{content:"\a0 · ";font-weight:bold}.mw-parser-output .hlist dd:last-child::after,.mw-parser-output .hlist dt:last-child::after,.mw-parser-output .hlist li:last-child::after{content:none}.mw-parser-output .hlist dd dd:first-child::before,.mw-parser-output .hlist dd dt:first-child::before,.mw-parser-output .hlist dd li:first-child::before,.mw-parser-output .hlist dt dd:first-child::before,.mw-parser-output .hlist dt dt:first-child::before,.mw-parser-output .hlist dt li:first-child::before,.mw-parser-output .hlist li dd:first-child::before,.mw-parser-output .hlist li dt:first-child::before,.mw-parser-output .hlist li li:first-child::before{content:" (";font-weight:normal}.mw-parser-output .hlist dd dd:last-child::after,.mw-parser-output .hlist dd dt:last-child::after,.mw-parser-output .hlist dd li:last-child::after,.mw-parser-output .hlist dt dd:last-child::after,.mw-parser-output .hlist dt dt:last-child::after,.mw-parser-output .hlist dt li:last-child::after,.mw-parser-output .hlist li dd:last-child::after,.mw-parser-output .hlist li dt:last-child::after,.mw-parser-output .hlist li li:last-child::after{content:")";font-weight:normal}.mw-parser-output .hlist ol{counter-reset:listitem}.mw-parser-output .hlist ol>li{counter-increment:listitem}.mw-parser-output .hlist ol>li::before{content:" "counter(listitem)"\a0 "}.mw-parser-output .hlist dd ol>li:first-child::before,.mw-parser-output .hlist dt ol>li:first-child::before,.mw-parser-output .hlist li ol>li:first-child::before{content:" ("counter(listitem)"\a0 "}H.264H.265
H.264H.265AV1
8-bit10-bit
NVENC
DisplayPort 1.4aHDMI 2.1
Turing (consumer) Volta (professional)
Ada Lovelace (consumer) Hopper (datacenter)
Supported

Ampere is the codename for a graphics processing unit (GPU) microarchitecture developed by Nvidia as the successor to both the Volta and Turing architectures. It was officially announced on May 14, 2020, and is named after French mathematician and physicist André-Marie Ampère.

Nvidia announced the Ampere architecture GeForce 30 series consumer GPUs at a GeForce Special Event on September 1, 2020. Nvidia announced the A100 80 GB GPU at SC20 on November 16, 2020. Mobile RTX graphics cards and the RTX 3060 based on the Ampere architecture were revealed on January 12, 2021.

Nvidia announced Ampere's successor, Hopper, at GTC 2022, and "Ampere Next Next" (Blackwell) for a 2024 release at GPU Technology Conference 2021.

Architectural improvements of the Ampere architecture include the following:

  • CUDA Compute Capability 8.0 for A100 and 8.6 for the GeForce 30 series

  • TSMC's 7 nm FinFET process for A100

  • Custom version of Samsung's 8 nm process (8N) for the GeForce 30 series

  • Third-generation Tensor Cores with FP16, bfloat16, TensorFloat-32 (TF32) and FP64 support and sparsity acceleration. The individual Tensor cores have with 256 FP16 FMA operations per clock 4x processing power (GA100 only, 2x on GA10x) compared to previous Tensor Core generations; the Tensor Core Count is reduced to one per SM.

  • Second-generation ray tracing cores; concurrent ray tracing, shading, and compute for the GeForce 30 series

  • High Bandwidth Memory 2 (HBM2) on A100 40 GB & A100 80 GB

  • GDDR6X memory for GeForce RTX 3090, RTX 3080 Ti, RTX 3080, RTX 3070 Ti

  • Double FP32 cores per SM on GA10x GPUs

  • NVLink 3.0 with a 50 Gbit/s per pair throughput

  • PCI Express 4.0 with SR-IOV support (SR-IOV is reserved only for A100)

  • Multi-instance GPU (MIG) virtualization and spatial GPU partitioning feature in A100 supporting up to seven instances

  • PureVideo feature set K hardware video decoding with AV1 hardware decoding for the GeForce 30 series and feature set J for A100

  • 5 NVDEC for A100

  • Adds new hardware-based 5-core JPEG decode (NVJPG) with YUV420, YUV422, YUV444, YUV400, RGBA. Should not be confused with Nvidia NVJPEG (GPU-accelerated library for JPEG encoding/decoding)

  • GA100

  • GA102

  • GA103

  • GA104

  • GA106

  • GA107

  • GA10B

Comparison of Compute Capability: GP100 vs GV100 vs GA100

GPU featuresNvidia Tesla P100Nvidia Tesla V100Nvidia A100
GPU codenameGP100GV100GA100
GPU architecturePascalVoltaAmpere
Compute capability6.07.08.0
Threads / warp323232
Max warps / SM646464
Max threads / SM204820482048
Max thread blocks / SM323232
Max 32-bit registers / SM655366553665536
Max registers / block655366553665536
Max registers / thread255255255
Max thread block size102410241024
FP32 cores / SM646464
Ratio of SM registers to FP32 cores102410241024
Shared Memory Size / SM64 KBConfigurable up to 96 KBConfigurable up to 164 KB

Comparison of Precision Support Matrix

Legend:

  • FPnn: floating point with nn bits
  • INTn: integer with n bits
  • INT1: binary
  • TF32: TensorFloat32
  • BF16: bfloat16

Comparison of Decode Performance

H.264 decode (1080p30)H.265 (HEVC) decode (1080p30)VP9 decode (1080p30)
162222
75157108
DieGA100GA102GA103GA104GA106GA107GA10BGA10F
826 mm2628 mm2496 mm2392 mm2276 mm2200 mm2448 mm2?
54.2B28.3B22B17.4B12B8.7B21B?
65.6 MTr/mm245.1 MTr/mm244.4 MTr/mm244.4 MTr/mm243.5 MTr/mm243.5 MTr/mm246.9 MTr/mm2?
87663221
12884604830201612
819210752768061443840256020481536
512336240192120806448
192112969648323216
512336240192120806448
N/A8460483020812
24 MB10.5 MB7.5 MB6 MB3 MB2.5 MB3 MB1.5 MB
192 KBper SM128 KB per SM192 KBper SM128 KBper SM
40 MB6 MB4 MB4 MB3 MB2 MB4 MB1 MB

The Ampere-based A100 accelerator was announced and released on May 14, 2020. The A100 features 19.5 teraflops of FP32 performance, 6912 FP32/INT32 CUDA cores, 3456 FP64 CUDA cores, 40 GB of graphics memory, and 1.6 TB/s of graphics memory bandwidth. The A100 accelerator was initially available only in the 3rd generation of DGX server, including 8 A100s. Also included in the DGX A100 is 15 TB of PCIe gen 4 NVMe storage, two 64-core AMD Rome 7742 CPUs, 1 TB of RAM, and Mellanox-powered HDR InfiniBand interconnect. The initial price for the DGX A100 was $199,000.

Comparison of accelerators used in DGX:

  • GeForce MX series

    • GeForce MX570 (mobile) (GA107)
  • GeForce 20 series

    • GeForce RTX 2050 (mobile) (GA107)
  • GeForce 30 series

    • GeForce RTX 3050 Laptop GPU (GA107)
    • GeForce RTX 3050 (GA106 or GA107)
    • GeForce RTX 3050 Ti Laptop GPU (GA107)
    • GeForce RTX 3060 Laptop GPU (GA106)
    • GeForce RTX 3060 (GA106 or GA104)
    • GeForce RTX 3060 Ti (GA104 or GA103)
    • GeForce RTX 3070 Laptop GPU (GA104)
    • GeForce RTX 3070 (GA104)
    • GeForce RTX 3070 Ti Laptop GPU (GA104)
    • GeForce RTX 3070 Ti (GA104 or GA102)
    • GeForce RTX 3080 Laptop GPU (GA104)
    • GeForce RTX 3080 (GA102)
    • GeForce RTX 3080 12 GB (GA102)
    • GeForce RTX 3080 Ti Laptop GPU (GA103)
    • GeForce RTX 3080 Ti (GA102)
    • GeForce RTX 3090 (GA102)
    • GeForce RTX 3090 Ti (GA102)
  • Nvidia Workstation GPUs (formerly Quadro)

    • RTX A1000 (mobile) (GA107)
    • RTX A2000 (mobile) (GA106)
    • RTX A2000 (GA106)
    • RTX A3000 (mobile) (GA104)
    • RTX A4000 (mobile) (GA104)
    • RTX A4000 (GA104)
    • RTX A5000 (mobile) (GA104)
    • RTX A5500 (mobile) (GA103)
    • RTX A4500 (GA102)
    • RTX A5000 (GA102)
    • RTX A5500 (GA102)
    • RTX A6000 (GA102)
    • A800 Active
  • Nvidia Data Center GPUs (formerly Tesla)

    • Nvidia A2 (GA107)
    • Nvidia A10 (GA102)
    • Nvidia A16 (4 × GA107)
    • Nvidia A30 (GA100)
    • Nvidia A40 (GA102)
    • Nvidia A100 (GA100)
    • Nvidia A100 80 GB (GA100)
    • Nvidia A100X
    • NVIDIA A30X
  • Tegra SoCs

    • AGX Orin (GA10B)
    • Orin NX (GA10B)
    • Orin Nano (GA10B)
    • T239 (Nintendo Switch 2, GA10F)
TypeGA10BGA107GA106GA104GA103GA102GA100
—N/aGeForce MX570 (mobile)—N/a—N/a—N/a—N/a—N/a
—N/aGeForce RTX 2050 (mobile)—N/a—N/a—N/a—N/a—N/a
—N/aGeForce RTX 3050 LaptopGeForce RTX 3050GeForce RTX 3050 Ti LaptopGeForce RTX 3050GeForce RTX 3060 LaptopGeForce RTX 3060GeForce RTX 3060GeForce RTX 3060 TiGeForce RTX 3070 LaptopGeForce RTX 3070GeForce RTX 3070 Ti LaptopGeForce RTX 3070 TiGeForce RTX 3080 LaptopGeForce RTX 3060 TiGeForce RTX 3080 Ti LaptopGeForce RTX 3070 TiGeForce RTX 3080GeForce RTX 3080 TiGeForce RTX 3090GeForce RTX 3090 Ti—N/a
—N/aRTX A1000 (mobile)RTX A2000 (mobile) RTX A2000RTX A3000 (mobile)RTX A4000 (mobile)RTX A4000RTX A5000 (mobile)RTX A5500 (mobile)RTX A4500RTX A5000RTX A5500RTX A6000—N/a
—N/aNvidia A2Nvidia A16—N/a—N/a—N/aNvidia A10Nvidia A40Nvidia A30Nvidia A100
AGX OrinOrin NXOrin Nano—N/a—N/a—N/a—N/a—N/a—N/a
  • List of eponyms of Nvidia GPU microarchitectures

  • List of Nvidia graphics processing units

  • Nvidia NVENC

  • Nvidia NVDEC

  • Nvidia A100 Tensor Core GPU Architecture whitepaper

  • Nvidia Ampere GA102 GPU Architecture whitepaper

  • Nvidia Ampere Architecture

  • Nvidia A100 Tensor Core GPU

  • Nvidia Ampere Architecture In-Depth

Want to explore this topic further?

Ask Mako anything about Ampere (microarchitecture) — get instant answers, deeper analysis, and related topics.

Research with Mako

Free with your Surf account

Content sourced from Wikipedia, available under CC BY-SA 4.0.

This content may have been generated or modified by AI. CloudSurf Software LLC is not responsible for the accuracy, completeness, or reliability of AI-generated content. Always verify important information from primary sources.

Report