Nvidia Hopper H100: Huge 4nm GPU designed for artificial intelligence

Nvidia Hopper H100: Connecting with Artificial Intelligence

To the general public, Nvidia is first and foremost a leading maker of graphics chips, with its GeForce series leading the market for more than a decade. But under CEO Jensen Huang, the company has always looked far beyond the gamer market and quickly positioned itself as a major player in hardware acceleration, an area where demand is growing exponentially given the enormous resource requirements of fields such as artificial intelligence and large-scale modeling (climate, traffic, etc.).

Two different approaches

For this sector, Nvidia has historically taken two different approaches. The first was to derive two variants from a single graphics architecture, one consumer and one professional, as with Ampere; the second was to create two separate architectures, each targeting a specific market, as with Volta, which was designed specifically for the acceleration sector.

Hopper belongs to this second approach. The architecture was designed for the acceleration sector, to meet expectations in areas such as AI and Nvidia's Omniverse. And to say the least, two years after the GA100 chip (Ampere architecture), Nvidia has introduced an H100 chip that looks impressive on paper. With 80 billion transistors on an area of 814 mm², it differs markedly from its predecessor, which was "limited" to 54.2 billion transistors over 828 mm². Those numbers tell the story: Nvidia has abandoned the 7nm process in favor of TSMC's 4nm process (N4 node). The chip also consumes a maximum of 700W, well above the 500W of the previous generation.
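As a rough sanity check, here is a small Python sketch, using only the figures quoted above, of what the process shrink means for transistor density:

```python
# Transistor density comparison, H100 (Hopper) vs. GA100 (Ampere),
# based solely on the figures cited in the article.
h100_transistors = 80e9       # 80 billion
h100_area_mm2 = 814
ga100_transistors = 54.2e9    # 54.2 billion
ga100_area_mm2 = 828

h100_density = h100_transistors / h100_area_mm2    # transistors per mm²
ga100_density = ga100_transistors / ga100_area_mm2

print(f"H100:  {h100_density / 1e6:.1f}M transistors/mm²")
print(f"GA100: {ga100_density / 1e6:.1f}M transistors/mm²")
print(f"Density gain: {h100_density / ga100_density:.2f}x")  # ≈ 1.50x
```

In other words, the move from 7nm to N4 packs roughly 1.5 times as many transistors into each square millimeter, which is how the H100 fits far more transistors into a slightly smaller die.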

Nvidia Transformer Engine

Equipping the chip

The chip features a PCIe 5.0 interface and is paired with up to 80GB of dedicated HBM3 memory, enough to provide 3TB/s of bandwidth. The specialized compute units that Nvidia calls accelerators have been overhauled, with fourth-generation Tensor Cores dedicated to AI that are claimed to be six times faster than those of the GA100 chip. The number of CUDA cores has risen from 6,912 to 16,896. The result, in just two years, is roughly three times the raw performance of the previous-generation accelerator.
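A quick calculation from the core counts above shows that CUDA cores alone do not account for the claimed gain:

```python
# CUDA core growth from GA100 to H100, using the counts cited above.
ga100_cuda = 6912
h100_cuda = 16896

core_ratio = h100_cuda / ga100_cuda
print(f"CUDA core growth: {core_ratio:.2f}x")  # ≈ 2.44x
```

The roughly 2.44x increase in core count falls short of the stated 3x raw-performance gain, so the remainder presumably comes from architectural and clock-speed improvements rather than core count alone.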

Transformer Engine

Nvidia has also developed a new accelerator called the Transformer Engine. It is designed to speed up the processing of AI models used for real-time translation, query interpretation, image analysis, and even health and climate research. Training neural networks that used to take days can now be done in just a few hours. That will interest many players, not least Google, whose BERT algorithm relies on this type of model to better understand user queries and provide increasingly accurate answers. As an example, Nvidia says that a job that used to take 7 days on 8,000 GPUs will now take just 20 hours with Hopper chips.
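The wall-clock speedup implied by Nvidia's example works out as follows:

```python
# Wall-clock speedup implied by the example cited above:
# 7 days on 8,000 GPUs reduced to 20 hours with Hopper chips.
old_hours = 7 * 24   # 7 days = 168 hours
new_hours = 20

speedup = old_hours / new_hours
print(f"Implied speedup: {speedup:.1f}x")  # 8.4x
```

That is an 8.4x reduction in training time for this particular workload; as with any vendor benchmark, the figure applies to the specific job Nvidia chose to showcase.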


This new GPU will be available to Nvidia partners starting in the third quarter. It can be purchased individually (in PCIe format), in DGX systems of 8 modules, or in SuperPOD cabinets of 32 modules. Up to 256 modules can be interconnected via an NVLink Switch offering 70.4 TB/s of module-to-module bandwidth. Finally, supercomputers are already on the roadmap, in particular Eos, a machine that Nvidia will use itself and offer to its partners, housing 576 DGX systems for a total of 4,608 GPUs. Eos is expected to deliver 275 petaflops of FP64 computing power, which would make it the second most powerful supercomputer in the world after Fugaku (442 petaflops). Now we just have to wait for Nvidia's announcements on the consumer side: the company is likely to unveil Ampere's successor in the coming months.
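The Eos figures quoted above are internally consistent, as a short calculation shows:

```python
# Eos configuration arithmetic, based on the figures cited above.
dgx_systems = 576
gpus_per_dgx = 8

total_gpus = dgx_systems * gpus_per_dgx
print(f"Total GPUs: {total_gpus}")  # 4608

# Implied per-GPU FP64 throughput (275 petaflops across all GPUs).
fp64_petaflops = 275
per_gpu_teraflops = fp64_petaflops * 1000 / total_gpus
print(f"≈ {per_gpu_teraflops:.0f} TFLOPS FP64 per GPU")  # ≈ 60
```

576 systems of 8 GPUs each does indeed give 4,608 GPUs, and the 275-petaflop total implies roughly 60 TFLOPS of FP64 per H100, assuming the headline figure is a simple sum across GPUs.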

Nvidia Hopper H100 Specifications

Key findings

1. Nvidia's Hopper H100 GPU, unveiled at GTC, reinforces the company's position in the hardware acceleration market for the professional sector, particularly for artificial intelligence applications. With an impressive 80 billion transistors etched on a 4nm process, the H100 is built to meet the growing demand for accelerated computing.

2. Historically, Nvidia has approached hardware acceleration with two strategies: adapting consumer graphics architectures for professional use, and creating standalone architectures designed for specific markets. Hopper follows the latter approach and is designed specifically for acceleration workloads such as artificial intelligence and Omniverse applications.

3. The H100 chip boasts significant improvements over its predecessors. It contains 80 billion transistors on an 814 mm² die manufactured using TSMC's 4nm process. With a maximum power consumption of 700W and up to 80GB of dedicated HBM3 memory, the chip offers significant performance gains.

4. The chip features fourth-generation Tensor Cores designed for artificial intelligence, delivering up to six times the performance of previous versions. The number of CUDA cores has also increased significantly, yielding three times the raw performance of the previous-generation accelerator.

5. Nvidia is introducing the Transformer Engine, designed to accelerate processing of AI-related models, including real-time translation, query interpretation, image analysis, and health and climate research. Neural network training, which used to take days, can now be done in hours.

6. Although the Hopper architecture is aimed at the professional sector, anticipation for announcements in the consumer sector remains high. With a successor to the Ampere architecture likely on the horizon, Nvidia continues to shape the landscape of accelerated computing.
