Introduction to Intel Xeon 6
I am not going to go into much detail about the architecture of these new processors, because that is an exercise we already did at launch. Today we will focus on performance analysis, thanks to the luck of having access to a cloud-hosted server based on two Intel® Xeon® 6780E processors with 144 cores each, an impressive 288-core configuration in a single server.
We will go into detail about the different commercial options of these new processors, which workloads each series is best suited for, and what their optimal working environment is. Not all Xeon 6 models are available at launch.
Intel is distributing this new generation of processors in two main series, depending on how they are built, essentially on the number of tiles that compose them. The Xeon 6 6700 models have up to two compute tiles, only present in the P-core version with up to 86 cores, or a single tile with up to 48 P-cores or up to 144 E-cores, like the processors we have the opportunity to test today.
All of these new processors are based on the Intel 3 manufacturing process, with the I/O tiles built on an Intel 7 process. The new processors are not only much faster, up to 2.7 times faster per watt than previous generations, with a completely renewed architecture better oriented to specific applications, but they also bring important improvements in connectivity.
They support more PCI Express 5.0 lanes, but also CXL in its 2.0 variant, with up to 88 lanes available on these processors. Some of them are also tremendously scalable, with configurations of up to 8 sockets per motherboard. In our case, we were able to test a two-socket server, adding up to 288 E-cores with a consumption of up to 330W per CPU.
These are processors strongly geared toward cloud application services, unstructured databases, high-performance network processing and 5G networks, security, and analytics.
The different SKUs available
Currently we can only find the Intel Xeon 6 6700E on the market, all based on single-tile configurations starting at 64 cores and 64 processing threads. The numbers scale with configurations of 96, 112, 128 and up to 144 cores. Thermal design power ranges from 205 to 330 watts, and all of them have turbo frequencies, though never exceeding 3.3GHz.
There are a total of seven SKUs with differentiated features, which you can see in the following comparison table. Except for one of them, all support up to two processors per motherboard, and they all support DDR5 ECC memory at 6400MT/s, with a maximum capacity of 1TB configured across 8 memory channels, making a potential 512-bit bus.
Another element they share is PCI Express and CXL connectivity: a total of 88 PCI Express 5.0 or CXL 2.0 lanes. This large number of links allows us to connect these processors to ultra-high-speed network interfaces or advanced storage systems.
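As a rough illustration of what 88 lanes mean in practice, here is a minimal back-of-the-envelope sketch of the aggregate bandwidth, assuming the commonly cited figure of roughly 3.94GB/s of usable throughput per PCI Express 5.0 lane per direction.

```python
# Rough aggregate bandwidth estimate for 88 PCIe 5.0 lanes.
# Assumption: 32 GT/s per lane with 128b/130b encoding gives
# roughly 3.94 GB/s of usable throughput per lane, per direction.

GBPS_PER_GEN5_LANE = 32 * 128 / 130 / 8   # GB/s per lane, per direction (~3.94)
LANES = 88

total = LANES * GBPS_PER_GEN5_LANE
print(f"~{total:.0f} GB/s aggregate per direction across {LANES} lanes")
# Expected output: roughly 347 GB/s, before protocol overhead.
```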
Some of them, but not all, also feature Intel UPI (Ultra Path Interconnect) links, up to four of them, with speeds ranging from 16 to 24GT/s in the most advanced models. The connectivity is therefore state-of-the-art.
The dedicated accelerators introduced with the 4th Gen Xeon Scalable processors also make their way into this generation. All SKUs have them, some with four instances and some with two, but all feature the following hardware-based accelerator technologies: Intel® QuickAssist Technology (QAT), Intel® Dynamic Load Balancer (DLB), Intel® Data Streaming Accelerator (DSA), and Intel® In-Memory Analytics Accelerator (IAA). They also all feature AVX2, AES-NI, Boot Guard, VT-d, and VT-x to support any of the leading hypervisors on the market for machine virtualization.
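If you want to confirm which of these features a running Linux system actually exposes, a quick sketch along these lines can help. It only assumes the standard /proc/cpuinfo flag names and the sysfs entries that the idxd driver creates for DSA and IAA; exact paths and names may vary with kernel and driver versions.

```python
#!/usr/bin/env python3
"""Quick check of CPU features and DSA/IAA devices on a Linux host."""
from pathlib import Path

# Flags of interest: AVX2, AES-NI and VT-x (reported as 'vmx').
WANTED = {"avx2", "aes", "vmx"}

cpuinfo = Path("/proc/cpuinfo").read_text()
for line in cpuinfo.splitlines():
    if line.startswith("flags"):
        flags = set(line.split(":", 1)[1].split())
        for f in sorted(WANTED):
            print(f"{f:>5}: {'yes' if f in flags else 'no'}")
        break

# DSA and IAA devices, if the idxd driver is loaded.
dsa_root = Path("/sys/bus/dsa/devices")
if dsa_root.is_dir():
    devices = sorted(p.name for p in dsa_root.iterdir())
    print("idxd devices:", ", ".join(devices) or "none")
else:
    print("idxd driver not loaded (no /sys/bus/dsa/devices)")
```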
The complete platform
Our test server is a dream come true, and we will give you more details later, but for that dream to come true you need a complete platform that supports these new processors. These processors use the FCLGA4710 socket, a new socket that is not the FCLGA4677 used by the fourth and fifth generation Xeons, so existing boards cannot gain support through a simple BIOS update; a new platform is required, and the connectivity you end up with will depend on what the server motherboard supports.
The Intel C741, also known as Intel Ice Lake IEH, is a low-end chipset; everything important is now handled directly by the processor. Among the few things it offers we find 20 PCI Express 3.0 lanes, 20 SATA 6Gbps ports, and some USB 2.0 and USB 3.0 ports. It also takes care of supporting Ethernet connectivity, usually a Gigabit port.
Another important component of these new processors is memory, with support for high-speed DDR5, 6400MT/s in these E-core based models. It is ECC memory that we can configure in 16 modules with a maximum of 128GB per slot, which would allow a maximum system RAM capacity of up to 2TB, 1TB per processor. The memory is configured as eight 64-bit channels, 512 bits per processor.
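To put those figures in perspective, here is a minimal sketch of the theoretical peak bandwidth that eight DDR5-6400 channels per socket provide, ignoring real-world efficiency, which is always lower.

```python
# Theoretical peak DDR5 bandwidth for this configuration.
# Assumption: 8 channels per socket, 64 bits wide, 6400 MT/s.

CHANNELS_PER_SOCKET = 8
BUS_WIDTH_BYTES = 64 // 8          # 64-bit channel -> 8 bytes per transfer
TRANSFERS_PER_SEC = 6400e6         # 6400 MT/s

per_socket = CHANNELS_PER_SOCKET * BUS_WIDTH_BYTES * TRANSFERS_PER_SEC / 1e9
print(f"Per socket : {per_socket:.1f} GB/s")      # ~409.6 GB/s
print(f"Dual socket: {2 * per_socket:.1f} GB/s")  # ~819.2 GB/s
```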
Typically, we will see these processors supporting the latest generation of network interfaces, Ethernet or fiber optics, with up to 200 Gigabit bandwidth. Their purpose is precisely to provide great support for this type of high-speed interface, both for the network environment and for mass storage.
Our test server
The test server that we had the opportunity to use for a month, hosted in the Intel Developer Cloud, is based on an Intel Beechnut City test platform, a server development platform for this new generation of processors.
It has capacity for 16 memory modules per socket, supporting up to 2TB of DDR5-6400MT/s RAM in total, 1TB per processor, on a dual-socket FCLGA4710 motherboard designed to host several members of this new Xeon generation, including the one we are testing.
Our server comes equipped with two Xeon 6780E processors with 108MB of cache each, a total of 288 efficient cores, with a power consumption of 330W per processor, turbo modes of up to 3GHz, and a base frequency of 2.2GHz. It is the most powerful E-core based Xeon at the moment, and each processor costs just over 10,000 euros.
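Before running any benchmark it is worth sanity-checking that the operating system really sees both sockets and all 288 cores. The small sketch below reads the topology straight from Linux sysfs; the paths are standard, but they are still an assumption about the test environment.

```python
#!/usr/bin/env python3
"""Count sockets and online CPUs from Linux sysfs (no external tools)."""
from pathlib import Path

cpus = sorted(Path("/sys/devices/system/cpu").glob("cpu[0-9]*"))
packages = set()
online = 0
for cpu in cpus:
    pkg = cpu / "topology" / "physical_package_id"
    if pkg.exists():                 # offline CPUs may lack topology info
        packages.add(pkg.read_text().strip())
        online += 1

print(f"Sockets     : {len(packages)}")   # expected: 2
print(f"Online CPUs : {online}")          # expected: 288 (E-cores, no SMT)
```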
The most modest part of this machine is the storage, provided by a 4TB Samsung PM9A3. It is a PCI Express drive in U.2 format with read speeds close to 7000MB/s and around one million read IOPS. It is a good drive, but nothing extraordinary considering the connectivity capabilities of these processors.
The network connectivity of this Beechnut City platform is top-notch, with a Mellanox MT2892 chipset (now NVIDIA ConnectX-6 Dx) providing dual 100Gbps Ethernet interfaces. This chipset allows the configuration of a single 200Gbps Ethernet interface, which is certainly impressive, but it can also be deployed as two 100Gbps interfaces which, well configured, should offer the same aggregate performance, or load balancing and fault tolerance, while maintaining a truly impressive transfer capacity.
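To check how the two 100Gbps ports are actually negotiated, or whether a bond is presenting them as a single 200Gbps aggregate, something along these lines can be used; the sysfs layout is the standard Linux one and the expected values are assumptions based on this platform.

```python
#!/usr/bin/env python3
"""List negotiated link speeds of network interfaces from sysfs."""
from pathlib import Path

for iface in sorted(Path("/sys/class/net").iterdir()):
    try:
        speed = int((iface / "speed").read_text().strip())  # reported in Mb/s
    except (OSError, ValueError):
        continue                      # interface is down, virtual, or has no link
    if speed > 0:
        print(f"{iface.name:12s} {speed / 1000:.0f} Gb/s")

# On this platform we would expect two entries at 100 Gb/s,
# or a single bond at 200 Gb/s if link aggregation is configured.
```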
An impressive server, without a doubt, on which we were able to run various performance tests, all under a Linux environment based on CentOS Stream 9 and managed through a JupyterLab console.
Performance tests
The first thing we wanted to check is how these processors, with a thermal design power of up to 330W, behave in sustained tests in terms of energy consumption. We were able to verify, always through Intel's monitoring utilities, that the average consumption during the performance tests stays far from this TDP limit. In our tests, consumption hovers around 160W rather than the 330W of the specifications. With this large number of cores, that works out to less than 1.5W per core under load, which is really impressive. The AMD EPYC 9754 that we added to this comparison sits closer to 2W per core and offers lower performance in many of the tests, something AMD will surely correct with the new generation of EPYC processors based on Zen 5.
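Our figures come from Intel's own monitoring utilities, but a similar measurement can be reproduced with the RAPL energy counters that the Linux kernel exposes through powercap. The sketch below is an approximation under that assumption, package-level counters on both sockets divided by the core count, not the exact method we used, and it normally needs root privileges.

```python
#!/usr/bin/env python3
"""Estimate average package power per core from RAPL powercap counters."""
import time
from pathlib import Path

INTERVAL = 10    # seconds to sample while the benchmark is running
CORES = 288      # total E-cores in this dual-socket system

def read_energy_uj():
    """Sum energy_uj across the top-level RAPL package domains."""
    total = 0
    for zone in Path("/sys/class/powercap").glob("intel-rapl:*"):
        if zone.name.count(":") == 1:          # skip sub-domains like intel-rapl:0:0
            total += int((zone / "energy_uj").read_text())
    return total

start = read_energy_uj()
time.sleep(INTERVAL)
joules = (read_energy_uj() - start) / 1e6      # counters are reported in microjoules

watts = joules / INTERVAL                      # note: ignores counter wraparound
print(f"Average package power: {watts:.1f} W")
print(f"Per core             : {watts / CORES:.2f} W")
```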
In the rest of the performance tests the results are simply spectacular, especially taking into account the processor's power consumption, which is precisely the objective of this range: performance-per-watt efficiency levels of up to 2.3 times the previous generation, and superior to AMD's EPYC 8000 based on Zen 4 and Zen 4c cores.
The charts below summarize our benchmark results:
- Passmark Performance Test Linux
- Linux kernel 6.8 compilation (seconds, lower is better)
- Node.js 21.7.2 compilation (lower is better)
- LLVM 16.0 compilation (lower is better)
- OpenSSL 3.3 SHA256
- John The Ripper (WPA PSK)
- PostgreSQL 16
- RocksDB 9
- CoreMark 1.0
- QuantLib 1.32
- Blender 4.1 (lower is better)
- OSPRay 3.1
- uvg266 0.4.1
- Stockfish 16.1
- Average consumption (W)
Conclusion
In this generation, Intel is customizing the use of its Xeon processors, seeking to specialize its processor solutions to offer the best results depending on the work environment. This undoubtedly comes largely from the new tile-based architecture that allows for a much more specialized configuration.
This processor we tested is not only extremely fast in the right application environments, but it is also extremely efficient in executing the tasks for which it is intended. It allows significant energy savings while offering impressive performance results in applications that allow high levels of parallel execution.
It certainly leaves us wanting to test the performance of its siblings based on Performance cores, which will begin to reach the market after the summer. In the meantime, these processors, with up to 144 cores today and, in the future, up to 288 cores per socket, have demonstrated their full potential.
Tell us what you think in the comments!