
AMD unveils first MLPerf 4.1 results on its Instinct MI300X accelerators


MLPerf is one of the standard benchmarks for measuring hardware performance when handling the large language models (LLMs) used in artificial intelligence systems.

AMD has demonstrated what its AMD Instinct MI300X accelerators are capable of, paired with the latest version of the open-source ROCm suite to get the most out of them. On this occasion, AMD's first benchmark run has been carried out on the just-released MLPerf v4.1, using the LLaMA2-70B language model, one of the most advanced available today.

The tests were carried out in two scenarios: an Offline scenario, where throughput in tokens per second is maximized, and a Server scenario, which simulates a server environment with latency limits designed to test the system's ability to respond quickly to latency-sensitive tasks. In addition, several runs were performed with different CPU configurations, including EPYC Genoa and EPYC Turin processors.
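The difference between the two scenarios can be illustrated with a minimal sketch. All numbers, function names, and the latency-percentile rule below are illustrative assumptions, not MLPerf's actual parameters:

```python
# Minimal sketch of the two MLPerf inference scenario styles.
# Every number and name here is hypothetical, for illustration only.

def offline_throughput(total_tokens, wall_time_s):
    """Offline scenario: batch everything, report tokens per second."""
    return total_tokens / wall_time_s

def server_meets_latency(request_latencies_s, limit_s, percentile=0.99):
    """Server scenario: a run passes only if the chosen percentile
    of per-request latencies stays under the limit."""
    ordered = sorted(request_latencies_s)
    idx = min(int(len(ordered) * percentile), len(ordered) - 1)
    return ordered[idx] <= limit_s

print(offline_throughput(total_tokens=1_000_000, wall_time_s=50))  # 20000.0 tokens/s
print(server_meets_latency([0.8, 1.1, 0.9, 1.4], limit_s=1.5))     # True
```

The key design difference: Offline rewards raw batch throughput, while Server only counts a run as valid when tail latency stays within bounds, which usually forces smaller batches.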

One of the advantages of the 192 GB of HBM memory that these MI300X cards integrate, the largest HBM capacity on the market, is that large language models can be loaded directly onto a single card.
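A quick back-of-the-envelope calculation shows why this matters: at 16-bit (2-byte) precision, the weights of a 70-billion-parameter model alone occupy roughly 140 GB, which fits within 192 GB. This sketch counts weights only and ignores KV cache, activations, and framework overhead:

```python
# Rough estimate of model weight memory (weights only; KV cache,
# activations, and framework overhead are ignored for simplicity).
def weights_gb(num_params, bytes_per_param=2):  # 2 bytes = FP16/BF16
    return num_params * bytes_per_param / 1e9

llama2_70b = weights_gb(70e9)
print(llama2_70b)         # 140.0 (GB)
print(llama2_70b <= 192)  # True: fits in one MI300X's 192 GB of HBM
```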

MLPerf v4.1 with AMD MI300X and EPYC Genoa and Turin CPUs

Eight AMD Instinct MI300X cards were used in this test, which seeks to measure the performance they achieve in combination with AMD EPYC platforms in AI workloads.

Results with two 4th Generation AMD EPYC Genoa 9374F processors show performance just 2-3% below the NVIDIA DGX H100 with Intel Xeon processors, in both Server and Offline scenarios at FP8 precision.

In the test with two AMD EPYC Turin processors, still in preview, these fifth-generation CPUs manage to slightly outperform the Intel+NVIDIA solutions in Offline mode, also at FP8 precision.


AMD has also run other MLPerf v4.1 passes to compare scaling between one and eight MI300X cards. In these tests, performance multiplies by a factor of roughly 8, achieving practically perfect linear scaling.
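Scaling efficiency is usually quoted as measured speedup divided by ideal speedup. A minimal sketch follows; the throughput figures are invented for illustration and are not AMD's published numbers:

```python
# Scaling efficiency = measured speedup / ideal speedup (GPU count).
# Throughput numbers below are invented for illustration only.
def scaling_efficiency(throughput_1gpu, throughput_ngpu, n):
    speedup = throughput_ngpu / throughput_1gpu
    return speedup / n

eff = scaling_efficiency(throughput_1gpu=2500, throughput_ngpu=19600, n=8)
print(f"{eff:.1%}")  # 98.0%
```

A value near 100% means adding GPUs adds throughput almost proportionally, which is what "practically perfect linear scaling" refers to.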

In addition, they make special mention of the fact that the 192 GB of HBM memory can hold the LLaMA 2 70B model directly on a single GPU.


Dell also wanted to show the performance of its PowerEdge XE9680 systems, equipped with eight MI300X accelerators along with two Intel Xeon Platinum 8460Y+ processors.

