One pillar of Textarossa is to improve the energy efficiency of future servers by developing a new high-efficiency cooling system at node and system level. This innovative technology, based on two-phase cooling, is developed by InQuattro and will be fully integrated into an optimized multi-level runtime resource management system to increase resource exploitation and performance.
Two Integrated Development Vehicle (IDV) platforms will be developed to demonstrate the capability of this two-phase liquid-cooling technology.
With IDV-A, the project does not develop a new HPC node. Instead, it focuses on the development of a blade based on an HPC node already under development, in which the standard single-phase liquid cooling is replaced by this innovative two-phase liquid cooling.
To demonstrate the improvement brought by this technology, it is important to compare it with a state-of-the-art implementation of single-phase liquid cooling. Atos has therefore selected a node developed for a hybrid blade in OpenSequana. OpenSequana is a new concept in which all interfaces of BullSequanaXH3000 blades are published, so that any OEM can develop a blade embedding one or several nodes of its choice and benefit from the BullSequanaXH3000 infrastructure for administration, power and cooling.
This node embeds one host with two CPUs and four 700 W Nvidia Hopper GPUs.
The BullSequana Direct Liquid Cooling (DLC) technology is based on cold plates and water blocks inside the blades, connected to a secondary loop inside the rack. The primary loop can be supplied at up to 40°C at the rack input, which allows free cooling (no energy is spent to cool the liquid of the primary loop). This technology can efficiently cool the latest generation of GPUs, but it might reach its limit in a few years if component power consumption keeps increasing while maximum case temperatures keep decreasing.
Two-phase liquid cooling is an interesting opportunity to improve the cooling efficiency beyond the current technology.
The first challenge is to demonstrate the capability to evacuate the heat generated by several components with 700 W peak consumption. In addition, an OpenSequana blade is 100% liquid-cooled: there is no airflow, in order to maximize the blade density and achieve a cooling efficiency close to 1. In IDV-A, as a first step, only the high-consumption CPUs and GPUs will be cooled with two-phase liquid cooling, by developing new water blocks compatible with the two-phase technology. The standard DLC cold plate will be reused to cool other components such as DIMMs, disks, voltage regulators, interconnect controllers… One heat exchanger between the two-phase liquid and the secondary loop will be added inside the blade.
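To give an order of magnitude for this first challenge, the heat load of the four GPUs can be estimated with a minimal sketch like the one below. Only the GPU count and the 700 W peak figure come from the description above. The coolant temperature rise, the latent heat and the evaporated fraction are illustrative placeholders, not properties of the InQuattro liquid or of the BullSequana DLC loop, and CPU power is left out because it is not stated. The single-phase estimate uses sensible heat (Q = m · cp · ΔT) and the two-phase estimate uses latent heat (Q = m · h_fg · x).

```python
# Back-of-the-envelope coolant mass-flow estimate for the four GPUs.
# Only GPU_COUNT and GPU_PEAK_W come from the text; every other
# constant is an ASSUMPTION chosen for illustration only.

GPU_COUNT = 4
GPU_PEAK_W = 700.0

WATER_CP_J_PER_KG_K = 4186.0            # approximate specific heat of liquid water
ASSUMED_DELTA_T_K = 10.0                # ASSUMPTION: coolant temperature rise
ASSUMED_LATENT_HEAT_J_PER_KG = 150e3    # ASSUMPTION: placeholder, not the InQuattro liquid
ASSUMED_VAPOR_FRACTION = 0.5            # ASSUMPTION: share of the fluid that evaporates


def gpu_heat_load_w(count=GPU_COUNT, peak_w=GPU_PEAK_W):
    """Total peak heat to evacuate from the GPUs, in watts."""
    return count * peak_w


def single_phase_flow_kg_s(heat_w, cp=WATER_CP_J_PER_KG_K,
                           delta_t=ASSUMED_DELTA_T_K):
    """Mass flow needed when heat is absorbed as sensible heat only."""
    if cp <= 0 or delta_t <= 0:
        raise ValueError("cp and delta_t must be positive")
    return heat_w / (cp * delta_t)


def two_phase_flow_kg_s(heat_w, latent=ASSUMED_LATENT_HEAT_J_PER_KG,
                        vapor_fraction=ASSUMED_VAPOR_FRACTION):
    """Mass flow needed when heat is absorbed by partial evaporation."""
    if latent <= 0 or not 0 < vapor_fraction <= 1:
        raise ValueError("latent must be positive and vapor_fraction in (0, 1]")
    return heat_w / (latent * vapor_fraction)


if __name__ == "__main__":
    heat = gpu_heat_load_w()
    print(f"GPU peak heat load: {heat:.0f} W")
    print(f"Single-phase flow:  {single_phase_flow_kg_s(heat):.4f} kg/s")
    print(f"Two-phase flow:     {two_phase_flow_kg_s(heat):.4f} kg/s")
```

The resulting flow figures depend entirely on the placeholder values, so they should not be read as a comparison of the two technologies. The sketch only shows which quantities drive each estimate.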
The second challenge will be to remove this additional heat-exchange stage and to study how the rack could evolve to provide a secondary loop running on the InQuattro liquid.
Leading Partner: ATOS