AI expands the boundaries of what we can do
and sparks our imagination and creativity

Compute power on standby.

Phantom AI builds the "Firefly II" deep learning training platform

"Firefly II" is built around task-level time-sharing: the scheduling system responds within seconds, so every researcher gets a smooth training experience. The platform also provides a strong software layer designed specifically for AI development: a high-performance operator library (hfai.nn), a distributed training communication framework (hfreduce), and a high-capacity, high-bandwidth file system (3FS). Together these let an AI model scale out to multiple nodes for massively parallel training at high performance.
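The platform's scheduler is not described in detail here, so the following is only a minimal sketch of the general idea behind task-level time-sharing: whole training tasks take turns on a shared resource, and a suspended task later resumes from its saved progress. The names `Task`, `time_share`, and `quantum` are hypothetical and are not part of hfai.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Task:
    """A hypothetical training job; `done` stands in for its saved checkpoint."""
    name: str
    total_steps: int
    done: int = 0

    def run(self, quantum: int) -> None:
        # Resume from the saved step and train for at most `quantum` steps.
        self.done = min(self.total_steps, self.done + quantum)

    @property
    def finished(self) -> bool:
        return self.done >= self.total_steps


def time_share(tasks: list[Task], quantum: int) -> list[str]:
    """Round-robin whole tasks over one shared resource; return the run order."""
    queue = deque(tasks)
    order = []
    while queue:
        task = queue.popleft()
        task.run(quantum)
        order.append(task.name)
        if not task.finished:
            queue.append(task)  # suspended; it resumes from `done` on its next turn
    return order


if __name__ == "__main__":
    jobs = [Task("bert", 5), Task("lstm", 2)]
    print(time_share(jobs, quantum=2))
```

In this toy model the short job finishes on its first turn instead of waiting behind the long one, which is the kind of responsiveness time-sharing aims for. A real training script would need to save and reload model state at each suspension point.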

A team of NOI/ACM gold medalists continuously optimizes the core operators
LSTM operator: 20% to 6x faster
Attention operator: 30% faster

An allreduce implementation optimized for Firefly II's custom hardware
Delivers good communication performance without specialized hardware
BERT-Large trains 20% faster on 100 nodes
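For readers unfamiliar with allreduce: it leaves every worker holding the element-wise sum of all workers' buffers, which is how gradients are combined in data-parallel training. The sketch below simulates a generic ring allreduce in plain Python. It is a textbook variant chosen for illustration, not a description of how hfreduce works internally, and it assumes the buffer length is divisible by the number of workers.

```python
def ring_allreduce(buffers: list[list[float]]) -> list[list[float]]:
    """Simulate a ring allreduce (sum) over equal-length per-worker buffers."""
    n = len(buffers)
    size = len(buffers[0])
    if any(len(b) != size for b in buffers) or size % n != 0:
        raise ValueError("buffers must share a length divisible by the worker count")
    chunk = size // n
    data = [list(b) for b in buffers]  # copy so the inputs are left untouched

    def span(c: int) -> range:
        return range(c * chunk, (c + 1) * chunk)

    def exchange(chunk_of, combine) -> None:
        # Snapshot every outgoing chunk first so all sends in a step are simultaneous.
        sends = [(chunk_of(r), [data[r][i] for i in span(chunk_of(r))]) for r in range(n)]
        for r, (c, payload) in enumerate(sends):
            dst = (r + 1) % n  # each worker sends to its right-hand neighbour
            for i, v in zip(span(c), payload):
                data[dst][i] = combine(data[dst][i], v)

    # Phase 1, reduce-scatter: afterwards worker r holds the full sum of chunk (r + 1) % n.
    for step in range(n - 1):
        exchange(lambda r: (r - step) % n, lambda old, new: old + new)

    # Phase 2, allgather: circulate the completed chunks until every worker has all of them.
    for step in range(n - 1):
        exchange(lambda r: (r + 1 - step) % n, lambda old, new: new)

    return data


if __name__ == "__main__":
    print(ring_allreduce([[1, 2, 3], [10, 20, 30], [100, 200, 300]]))
```

Each worker only ever talks to its neighbour and sends one chunk per step, so the traffic per worker stays roughly constant as the number of workers grows.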

In-house distributed parallel file system
Saturates the physical high-speed network bandwidth to push the limits of performance
I/O operations: 1.8 billion per second
Read/write bandwidth: 7.0 TB/s

Cluster utilization: 96%

GPU utilization: 85%

8.0 TB/s

500 GB/s

Data based on cluster utilization in February 2022

hfai python train.py -- --nodes 1