
AGS8200

The Edgecore AGS8200 is a cutting-edge, high-performance GPU-based server tailored to meet the demands of AI/ML applications. Designed to excel in tasks such as training large language models, automating processes, and enhancing object classification and recognition, this server offers unrivaled performance and scalability.

At the heart of the AGS8200 lie eight Intel® Habana® Gaudi® 2 processors paired with dual Intel® Xeon® Sapphire Rapids processors. Together, these components create a computing powerhouse ready to tackle a diverse range of deep learning workloads with exceptional speed and precision.

Why Choose the AGS8200?

The Power of Intel® Gaudi® 2


Intel® Gaudi® 2 key benefits

(Figure: Intel® Gaudi® 2 MLPerf benchmark results)

Features

The Edgecore AGS8200 is ideal for modern AI (Artificial Intelligence) and ML (Machine Learning) applications. Powered by Intel® Gaudi® 2 AI accelerators, the AGS8200 is suitable for LLM (Large Language Model) training and inference, allowing customers to efficiently harness the power of AI.
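
As an illustration of that workflow (not taken from this datasheet), the sketch below moves a placeholder PyTorch model onto a single Gaudi 2 device through the SynapseAI PyTorch bridge; the habana_frameworks package and the "hpu" device name are part of that stack, while the model, data, and hyperparameters are invented for the example.

    # Minimal sketch (assumption, not from this datasheet): a single-card
    # training step on a Gaudi 2 HPU via the SynapseAI PyTorch bridge.
    import torch
    import habana_frameworks.torch.core as htcore  # Habana PyTorch bridge

    device = torch.device("hpu")                    # Gaudi devices are exposed as "hpu"

    model = torch.nn.Linear(1024, 1024).to(device)  # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for _ in range(10):                             # placeholder training loop
        x = torch.randn(32, 1024, device=device)    # placeholder data
        y = torch.randn(32, 1024, device=device)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        htcore.mark_step()                          # flush the lazy-mode graph after backward
        optimizer.step()
        htcore.mark_step()                          # and again after the optimizer step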

The system is designed with eight Intel® Habana® Gaudi® 2 processors and dual Intel® Xeon® Sapphire Rapids processors. Each Gaudi® 2 processor integrates 96 GB of HBM2E memory and 24 x 100Gbps RoCEv2 RDMA NICs. These 24 x 100Gbps NICs provide all-to-all connectivity and scale-out, both internally and externally, for training, fine-tuning, and other DL processing.

Each Gaudi® 2 processor has 21 x 100Gbps non-blocking, all-to-all connectivity to other Gaudi® 2 processors within the server, allowing training across all eight Intel® Gaudi® 2 processors without requiring an external Ethernet switch.

Each AGS8200 supports 6 QSFP-DD ports for scale-out. The 400Gbps ports can be connected to 400Gbps switches, or 100Gbps switches via breakout cables, in racks and clusters of Intel® Gaudi® 2-based nodes.
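
To illustrate how those internal all-to-all links are typically consumed, the following sketch (an assumption, not part of this datasheet) runs PyTorch DistributedDataParallel across the eight Gaudi 2 cards using SynapseAI's HCCL collective-communication backend; the model, data, and launch command are placeholders, and exact module paths may vary between SynapseAI releases.

    # Minimal sketch (assumption): data-parallel training across the eight
    # Gaudi 2 cards in one AGS8200 using the SynapseAI PyTorch bridge and
    # its HCCL backend. Launch one process per card, e.g.:
    #   torchrun --nproc_per_node=8 train_ddp.py
    import torch
    import torch.distributed as dist
    import habana_frameworks.torch.core as htcore
    import habana_frameworks.torch.distributed.hccl  # registers the "hccl" backend

    dist.init_process_group(backend="hccl")  # collectives ride the internal 100Gbps links
    device = torch.device("hpu")

    model = torch.nn.Linear(1024, 1024).to(device)           # placeholder model
    model = torch.nn.parallel.DistributedDataParallel(model)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    for _ in range(10):                                       # placeholder loop; real data sharding omitted
        x = torch.randn(32, 1024, device=device)
        y = torch.randn(32, 1024, device=device)
        optimizer.zero_grad()
        loss = torch.nn.functional.mse_loss(model(x), y)
        loss.backward()
        htcore.mark_step()                                    # flush the lazy-mode graph after backward
        optimizer.step()
        htcore.mark_step()                                    # and again after the optimizer step

    dist.destroy_process_group()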

The server can be configured with 16 x HDD/SSD + 8 x NVMe or 8 x HDD/SSD + 16 x NVMe drives for storage, with a RAID HBA supporting RAID 0/1/10/5/6/50/60.

■ Featuring eight Habana® Gaudi® 2 AI training processors
■ Dual 4th Gen Intel® Xeon® scalable processors
■ Expanded networking capacity with 24 x 100Gbps RoCE ports integrated into every Gaudi® 2
■ 700 GB/second scale within the server and 2.4 TB/second scale out
■ Ease of system build or migration with the Habana® SynapseAI® software stack
■ Standardized architecture and Ethernet instead of proprietary InfiniBand and NVLink

Specifications

Form Factor
■ 8U

Compute Node
■ CPU: Sapphire Rapids, 2 Sockets
Intel® Xeon® Platinum 8454H, 32 cores / 64 threads, 82.5 MB cache, 270 W
■ PCH: Emmitsburg
■ Memory: Up to 2 TB, 16 x DDR5 memory slots per CPU
■ Operating System: Ubuntu 20.04
■ BIOS: 32MB Flash

GPU
■ 8 x OAM (Intel Habana HL-225H/C)

Input/Output
■ Front: 2 x USB 2.0/3.0, 1 x VGA, 1 x UID, 1 x PWR
■ Rear: 2 x USB 2.0/3.0, 1 x VGA, 1 x RJ-45, 1 x UID

Scale-Out Interface
■ RDMA (RoCE v2)
■ 24 x 100Gbps
■ 6 x QSFP-DD

Storage
■ Internal: 2 x M.2
■ Front: 16 x HDD/SSD + 8 x NVMe or 8 x HDD/SSD + 16 x NVMe

BMC
■ AST2600

TPM
■ TPM 2.0

CD-ROM
■ Supports external USB CD-ROM

PSU
■ System: 1+1 CRPS 2700 W redundant/hot-swappable AC/DC
■ GPU: 3+3 CRPS 3000 W redundant/hot-swappable AC/DC

Fans
■ 14+1 hot-swappable fans

Dimensions
■ 900 mm x 447 mm x 352 mm

Operating Temperature
■ 5°C-35°C

Expansion Slots
■ 1 x OCP 3.0
■ 8 x PCIe Slots

Software
■ SynapseAI: 1.13.0
■ Kernel: 5.4.0 and above
■ Python: 3.10
■ PyTorch: 2.1.0
■ TensorFlow: 2.13.1
■ Open MPI: 4.1.5
■ Libfabric: 1.16.1 and above
■ Transformers: >= 4.33.0, <4.35.0
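
A quick way to sanity-check a node against the versions above is to query the stack from Python. The snippet below is a minimal sketch assuming the SynapseAI PyTorch bridge (habana_frameworks) is installed; the exact helper names may differ between SynapseAI releases.

    # Minimal sketch (assumption): verify framework versions and Gaudi visibility
    # against the software stack listed above.
    import torch
    import habana_frameworks.torch.hpu as hthpu

    print("PyTorch:", torch.__version__)          # expected 2.1.0 with SynapseAI 1.13.0
    print("HPU available:", hthpu.is_available())
    print("HPU count:", hthpu.device_count())     # expected 8 on an AGS8200
    print("HPU name:", hthpu.get_device_name())   # e.g. "GAUDI2"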

For more information, contact us.

Related Resources

■ 2024 Product Catalogue
■ AI/ML White Paper
■ Enterprise SONiC by Broadcom®
■ EVPN Multi-Homing

Related Solutions

■ Accelerate innovation in Japan’s pharmaceutical industry with Edgecore data center switches
■ From Green Grid to Open Computing Project: A Commercial Success!
■ Build your AI data center with Edgecore total AI solution