Nvidia-backed startup Enfabrica has developed Emfasys, an Ethernet-attached memory pool designed to relieve memory bottlenecks in large-scale AI inference workloads. The system lets AI servers reach up to 18 TB of DDR5 memory over Ethernet, which the company says can cut per-token generation costs by up to 50%. Because Emfasys uses Remote Direct Memory Access (RDMA) over Ethernet, it integrates with existing infrastructure, making it an attractive option for data centers.
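Enfabrica has not published its software stack, but the general idea behind such a pool is memory tiering: hot data stays in scarce, fast local memory (GPU HBM), while colder data spills to the much larger remote pool and is fetched back on demand (over RDMA, in Emfasys's case). The sketch below illustrates that pattern with plain Python dictionaries; all names and the capacity numbers are hypothetical, not Enfabrica's design.

```python
# Illustrative tiering sketch only -- Emfasys's actual software is not public.
# A bounded "local" tier stands in for GPU HBM; an unbounded "remote" dict
# stands in for the Ethernet-attached DDR5 pool reached via RDMA.
from collections import OrderedDict

class TieredKVCache:
    """Two-tier cache: bounded fast local tier plus a larger remote tier."""

    def __init__(self, local_capacity):
        self.local = OrderedDict()     # fast tier (stand-in for HBM)
        self.remote = {}               # big tier (stand-in for the DDR5 pool)
        self.local_capacity = local_capacity
        self.remote_reads = 0          # would be RDMA reads in practice

    def put(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        if len(self.local) > self.local_capacity:
            # Evict the coldest entry to the remote pool ("over the wire").
            old_key, old_val = self.local.popitem(last=False)
            self.remote[old_key] = old_val

    def get(self, key):
        if key in self.local:
            self.local.move_to_end(key)
            return self.local[key]
        # Miss in the fast tier: fetch remotely and promote back to local.
        self.remote_reads += 1
        value = self.remote.pop(key)
        self.put(key, value)
        return value

cache = TieredKVCache(local_capacity=2)
for i in range(4):
    cache.put(i, f"kv-block-{i}")
print(sorted(cache.remote))   # [0, 1] -- the two coldest blocks spilled out
print(cache.get(0))           # kv-block-0, fetched remotely and promoted
print(cache.remote_reads)     # 1
```

The point of the pattern is that eviction targets a pool far larger than local memory, so data is displaced rather than discarded, at the cost of a network round trip on a miss.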
The Emfasys system delivers 3.2 Tb/s (400 GB/s) of throughput through its ACF SuperNIC chip and uses the CXL.mem protocol for low-latency access to the pooled memory. It supports 4-way and 8-way GPU servers over standard 400G or 800G Ethernet ports with RDMA, improving compute utilization and reducing stranded GPU memory.
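The headline figures are easy to sanity-check. The arithmetic below uses only the numbers quoted above; the "full sweep" time is an illustrative derived quantity, not a vendor spec.

```python
# Sanity-check the quoted figures: 3.2 Tb/s link and an 18 TB pool.
link_bits_per_sec = 3_200_000_000_000       # 3.2 Tb/s ACF SuperNIC throughput
bytes_per_sec = link_bits_per_sec // 8      # 8 bits per byte
gb_per_sec = bytes_per_sec / 1e9
print(gb_per_sec)                           # 400.0 -- matches the 400 GB/s spec

pool_bytes = 18 * 10**12                    # 18 TB of pooled DDR5
sweep_seconds = pool_bytes / bytes_per_sec  # time to stream the pool once
print(sweep_seconds)                        # 45.0
```

So a single 3.2 Tb/s link could, in principle, stream the entire 18 TB pool in about 45 seconds, which gives a feel for how much colder-than-HBM data the pool is sized to hold.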
Enfabrica is currently piloting Emfasys with select customers, and early results reportedly show it easing memory bottlenecks and improving efficiency in AI workloads. As a cost-efficient and flexible solution, Emfasys has the potential to change how AI infrastructure is designed and utilized.