

Aethlumis Unveils Next-Generation AI Server TG990V3, Delivering Up to 40% Improvement in Large-Scale Model Training Efficiency

2025.11.18

Shenzhen, China — November 18, 2025 — As global demand for AI computing power continues to surge, Aethlumis today announced the launch of its new flagship AI server, the TG990V3. In early-stage deployments across major internet companies, AI research institutions, and cloud service providers, the TG990V3 demonstrated significant performance breakthroughs in large-scale model training. In tests involving trillion-parameter workloads, the server achieved up to a 40% improvement in overall training efficiency, with training cycles shortened by 30%–32% compared with the previous generation.


AI Infrastructure Reaches a New Inflection Point

With model sizes expanding from billions to trillions of parameters over the past two years, the gap between algorithm advancement and compute infrastructure has become increasingly evident. Aethlumis CEO Wang Qihang emphasized during the launch event:

“The pace of large-model evolution has surpassed the speed of traditional infrastructure upgrades. The TG990V3 is designed to deliver higher training efficiency without increasing hardware cost or power consumption, enabling AI teams to iterate faster and more sustainably.”

Industry analysts note that AI server competition has shifted from raw hardware stacking to system-level architectural optimization, a direction the TG990V3 embodies.


High-Efficiency Interconnect Architecture: 95%+ Peer-to-Peer Bandwidth Utilization in 1T-Parameter Training

Equipped with eight OAM GPU modules based on the OAI 2.0 standard, the TG990V3 adopts a next-generation multi-tier interconnect topology optimized for large-scale distributed training.

In internal testing conducted by a leading internet company on a trillion-parameter model:

• GPU-to-GPU interconnect efficiency remained stable at 95–96%

• Gradient synchronization latency dropped by 27%

• Total cluster throughput improved by 21%
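The interconnect figures above matter because gradient synchronization moves enormous volumes of data at trillion-parameter scale. As a rough, back-of-envelope illustration (all figures here are generic assumptions about ring all-reduce in FP16, not Aethlumis measurements or specifications):

```python
# Illustrative estimate of data each GPU moves in one ring all-reduce
# of gradients for a 1-trillion-parameter model stored in FP16.
# These are textbook ring all-reduce figures, not TG990V3 specs.
def allreduce_bytes_per_gpu(param_count, bytes_per_param, num_gpus):
    """Ring all-reduce moves 2*(N-1)/N of the gradient buffer per GPU."""
    buffer_bytes = param_count * bytes_per_param
    return 2 * (num_gpus - 1) / num_gpus * buffer_bytes

params = 1_000_000_000_000                          # 1T parameters
gb = allreduce_bytes_per_gpu(params, 2, 8) / 1e9    # FP16, 8 GPUs
print(f"{gb:.0f} GB moved per GPU per full gradient sync")  # 3500 GB
```

At these volumes, each percentage point of peer-to-peer bandwidth utilization translates directly into wall-clock training time, which is the context for the 95–96% figure reported above.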

A technical director from the evaluating AI lab commented:

“When training models at this scale, every percentage point of communication efficiency matters. The TG990V3 maintains stable performance even as the cluster size expands, which is a major advantage.”


I/O “Golden Ratio” Design: Eliminating the Data Bottleneck in AI Training

A persistent challenge in AI training is that high-performance GPUs often remain underutilized due to I/O bottlenecks — insufficient networking bandwidth, limited storage throughput, or slow data loading pipelines.

To address this, Aethlumis introduced an 8:8:16 (GPU:NIC:NVMe) architecture, a configuration rarely seen in the industry:

• 400 Gbps of dedicated network bandwidth per GPU

• Two independent NVMe Gen4/Gen5 SSDs per GPU

• Over 60% reduction in data loading latency
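The aggregate bandwidth implied by this layout can be tallied directly from the ratios above. The per-NIC figure comes from the article; the per-SSD rate is an assumed ballpark for a Gen5 NVMe drive, included only to illustrate the arithmetic:

```python
# Aggregate I/O implied by the 8:8:16 (GPU:NIC:NVMe) layout.
# nic_gbps is from the article; ssd_gbps is an assumed ballpark
# (~7 GB/s per Gen5 NVMe SSD), not a TG990V3 specification.
gpus, nics, nvme = 8, 8, 16
nic_gbps = 400                   # dedicated network bandwidth per GPU
ssd_gbps = 56                    # assumed per-drive throughput

net_total = nics * nic_gbps      # total network bandwidth (Gbps)
storage_total = nvme * ssd_gbps  # total raw storage bandwidth (Gbps)
print(net_total, storage_total)  # 3200 896
```

The point of the one-NIC-per-GPU, two-SSDs-per-GPU ratio is that neither the network nor the storage tier becomes the limiting path while GPUs are consuming training data.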

A domestic AI startup participating in early testing noted that GPU utilization remained consistently between 94% and 97%, significantly higher than the 70%–75% range seen on their existing servers.


Cluster-Grade Reliability: MTTR Under 3 Minutes, Linear Scaling Up to 92%

Designed for long-duration, large-scale training workloads, the TG990V3 features a fully modular architecture with hot-swappable GPU, fan, power, and networking modules.

Early customer tests reported:

• Mean Time to Repair (MTTR) reduced from 10–12 minutes to under 3 minutes

• 99.95% system availability during sustained 24/7 training cycles

• 92% linear scaling efficiency in thousand-card clusters
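Linear scaling efficiency, as cited above, is conventionally computed as observed cluster throughput divided by N times single-node throughput. A minimal sketch with hypothetical numbers (chosen only to demonstrate the formula, not taken from TG990V3 test data):

```python
# Conventional definition of linear scaling efficiency:
#   efficiency = cluster throughput / (N * single-node throughput)
# The throughput numbers below are hypothetical illustrations.
def scaling_efficiency(cluster_tput, n_nodes, single_node_tput):
    return cluster_tput / (n_nodes * single_node_tput)

eff = scaling_efficiency(cluster_tput=117_760,   # samples/s, hypothetical
                         n_nodes=128,
                         single_node_tput=1_000)  # samples/s, hypothetical
print(f"{eff:.0%}")  # 92%
```

By this definition, 92% efficiency means a thousand-card cluster retains 92% of the throughput that perfectly linear scaling would predict.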

Together, these results deliver a level of reliability well suited to enterprises operating large distributed training environments.

 

Real-World Performance Metrics (from early adopters)

• 32% reduction in training time for trillion-parameter LLMs

• 60%+ improvement in data loading throughput

• 92% scaling efficiency in multi-node clusters

• 99.95% availability in long-duration tasks

Applications include:

• Large Language Model (LLM) training (LLaMA, GPT series, etc.)

• Multimodal model training (vision, audio, video, 3D)

• Enterprise AI platforms and inference clusters

• University & national-level research compute environments


Building the Next Generation of AI Infrastructure

Dr. Li Zhang, Vice President of Product at Aethlumis, concluded:

“The TG990V3 is not a simple hardware refresh. It represents a system-level optimization of the entire large-model training pipeline — including interconnect architecture, I/O subsystem, and intelligent operations. We designed it to support the next three years of accelerated growth in model scale.”

The TG990V3 is now available for enterprise-scale deployment and is already in use across several cloud platforms and AI companies.