Inside Meta’s DSF: Multi-Vendor Silicon Powering AI Networks

Meta’s AI data center network is built around a carefully engineered philosophy: vendor diversity at scale. Rather than depending on a single networking silicon provider, Meta’s Disaggregated Scheduled Fabric (DSF) intentionally mixes merchant silicon, custom ASICs, and in-house designs to optimize cost, performance, and long-term supply resilience.

At hyperscale, the network itself becomes a programmable system. DSF reflects that reality.

🧠 What DSF Is Solving

Large-scale AI training workloads generate traffic patterns that traditional Ethernet fabrics struggle with:

- Extreme east-west bandwidth
- Microburst congestion from collective operations
- Sensitivity to tail latency during synchronization

DSF addresses these problems by separating scheduling intelligence from packet forwarding, which keeps performance predictable even in clusters with tens of thousands of GPUs.
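One common way a scheduled fabric separates scheduling from forwarding is credit-based virtual output queueing: packets wait at the ingress device until the egress side grants credit, so a burst never overruns the destination port. The toy model below illustrates that idea only. The class and function names (`IngressVOQ`, `EgressScheduler`, `run_round`) and the credit counts are hypothetical, and nothing here is taken from Meta's or Broadcom's implementation.

```python
from collections import deque


class IngressVOQ:
    """Virtual output queue: holds packets at the ingress until credit arrives."""

    def __init__(self) -> None:
        self.queue: deque = deque()

    def enqueue(self, packet) -> None:
        self.queue.append(packet)

    def release(self, credits: int) -> list:
        # Send only as many packets as the egress side has granted.
        count = min(credits, len(self.queue))
        return [self.queue.popleft() for _ in range(count)]


class EgressScheduler:
    """Grants credits up to what the egress port can drain per round (assumed value)."""

    def __init__(self, credits_per_round: int) -> None:
        self.credits_per_round = credits_per_round

    def grant(self, requests: dict) -> dict:
        # Hand out credits one at a time, round-robin, to sources with demand.
        grants = {source: 0 for source in requests}
        pending = deque(s for s, wanted in requests.items() if wanted > 0)
        remaining = self.credits_per_round
        while remaining > 0 and pending:
            source = pending.popleft()
            grants[source] += 1
            remaining -= 1
            if grants[source] < requests[source]:
                pending.append(source)
        return grants


def run_round(voqs: dict, scheduler: EgressScheduler) -> list:
    """One scheduling round: request, grant, then forward the granted packets."""
    requests = {name: len(voq.queue) for name, voq in voqs.items()}
    grants = scheduler.grant(requests)
    sent = []
    for name, voq in voqs.items():
        sent.extend(voq.release(grants[name]))
    return sent


if __name__ == "__main__":
    # Three ingress leaves burst toward the same egress port at once.
    voqs = {f"leaf{i}": IngressVOQ() for i in range(3)}
    for name, voq in voqs.items():
        for seq in range(8):
            voq.enqueue((name, seq))
    scheduler = EgressScheduler(credits_per_round=4)
    print(len(run_round(voqs, scheduler)), "packets forwarded this round")
```

In this sketch the burst is absorbed in the ingress queues rather than dropped at the egress port, which is the property that makes a scheduled fabric's behavior predictable under synchronized traffic.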

🔀 Switch ASICs: The Core Fabric

Meta uses different switch silicon for different layers of its leaf-spine topology, matching each ASIC’s strengths to a specific role.

Broadcom: The DSF Backbone

Broadcom remains the dominant supplier in Meta’s scheduled fabric.

Jericho3-AI
Deployed in the Arista 7700R4 as the DSF Leaf Switch.
Designed for AI traffic with:
- Very deep buffers
- Zero-packet-loss behavior
- Deterministic congestion handling

Ramon3
Used in the Arista 7720R4 as the DSF Spine Switch.
Ramon3 aggregates multiple Jericho3-AI devices into a single, massive non-blocking fabric domain.
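Whether a leaf-spine domain is non-blocking comes down to simple arithmetic: the capacity a leaf offers toward hosts must not exceed the capacity it has toward the spine. The sketch below works through that check. The port counts and speeds are illustrative placeholders, not the specifications of the Arista platforms named above.

```python
def oversubscription(host_ports: int, host_gbps: int,
                     fabric_ports: int, fabric_gbps: int) -> float:
    """Ratio of host-facing capacity to fabric-facing capacity on one leaf."""
    return (host_ports * host_gbps) / (fabric_ports * fabric_gbps)


def is_non_blocking(host_ports: int, host_gbps: int,
                    fabric_ports: int, fabric_gbps: int) -> bool:
    """A leaf is non-blocking when the ratio is at most 1.0."""
    return oversubscription(host_ports, host_gbps, fabric_ports, fabric_gbps) <= 1.0


if __name__ == "__main__":
    # Hypothetical leaf: 16 x 800G toward hosts, 16 x 800G toward the spine.
    print(oversubscription(16, 800, 16, 800), is_non_blocking(16, 800, 16, 800))
    # Hypothetical oversubscribed leaf: 32 x 400G down, 8 x 800G up.
    print(oversubscription(32, 400, 8, 800), is_non_blocking(32, 400, 8, 800))
```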

Tomahawk5 (TH5)
Powers Meta’s self-designed Minipack3 switch.
- 51.2 Tbps switching capacity
- Optimized for power efficiency per bit
- Ideal for dense fabric deployments
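To make the 51.2 Tbps figure concrete, the short calculation below converts an aggregate switching capacity into the number of ports it can serve at a given speed. The helper name `port_count` is hypothetical, and the result is plain arithmetic rather than a statement about how Minipack3 is actually configured.

```python
def port_count(capacity_tbps: float, port_gbps: int) -> int:
    """Number of full-rate ports of a given speed that fit in the capacity."""
    capacity_gbps = round(capacity_tbps * 1000)
    return capacity_gbps // port_gbps


if __name__ == "__main__":
    for speed in (800, 400, 200, 100):
        print(f"{port_count(51.2, speed)} x {speed}GE")
```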

Cisco: Competitive Merchant Silicon

Silicon One G200
Used in the Cisco 8501 platform.
- Direct competitor to Tomahawk5
- 51.2 Tbps throughput
- Runs Meta’s internal network OS, FBOSS, demonstrating full software portability across vendors
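Software portability across vendors usually rests on a hardware abstraction layer: the control plane programs a vendor-neutral interface, and each ASIC supplies its own implementation behind it. The sketch below shows that pattern in miniature. It is written in Python for brevity, every name in it (`SwitchAsic`, `VendorAAsic`, `VendorBAsic`, `apply_routes`) is hypothetical, and it does not reproduce FBOSS's actual interfaces.

```python
from abc import ABC, abstractmethod


class SwitchAsic(ABC):
    """Hypothetical vendor-neutral interface the control plane programs against."""

    @abstractmethod
    def program_route(self, prefix: str, next_hop: str) -> None: ...

    @abstractmethod
    def routes(self) -> dict: ...


class VendorAAsic(SwitchAsic):
    """Stand-in for one vendor's SDK; keeps routes in a dict."""

    def __init__(self) -> None:
        self._table: dict = {}

    def program_route(self, prefix: str, next_hop: str) -> None:
        self._table[prefix] = next_hop

    def routes(self) -> dict:
        return dict(self._table)


class VendorBAsic(SwitchAsic):
    """Stand-in for a second vendor's SDK with a different internal layout."""

    def __init__(self) -> None:
        self._entries: list = []

    def program_route(self, prefix: str, next_hop: str) -> None:
        # Replace any existing entry for this prefix, then append the new one.
        self._entries = [e for e in self._entries if e[0] != prefix]
        self._entries.append((prefix, next_hop))

    def routes(self) -> dict:
        return dict(self._entries)


def apply_routes(asic: SwitchAsic, desired: dict) -> None:
    """Control-plane logic: identical regardless of which ASIC sits underneath."""
    for prefix, next_hop in desired.items():
        asic.program_route(prefix, next_hop)


if __name__ == "__main__":
    desired = {"10.0.0.0/24": "leaf1", "10.0.1.0/24": "leaf2"}
    for asic in (VendorAAsic(), VendorBAsic()):
        apply_routes(asic, desired)
        print(type(asic).__name__, asic.routes())
```

The point of the pattern is that `apply_routes` never changes when a new vendor is added; only a new subclass is written.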

NVIDIA: Expanding Beyond Scheduled Fabrics

Spectrum-4
Deployed in Minipack3N systems within Meta’s Non-Scheduled Fabric (NSF).
- 51.2 Tbps Ethernet switching
- Used where deterministic scheduling is less critical than raw throughput

🧩 Network Interface Controllers: The Edge of the Fabric

At the server boundary, Meta has moved away from generic NICs toward semi-custom designs.

Marvell + Meta: FBNIC

Custom 5nm ASIC (FBNIC)
Co-developed with Marvell
- Multi-host NIC supporting up to four independent hosts
- PCIe Gen5 connectivity
- Up to 4×100GE network interfaces
- Hardware offloads optimized for AI collectives and low-latency messaging
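A quick back-of-the-envelope check shows how the multi-host numbers fit together. With four hosts sharing 4×100GE, each host gets 100 Gbps of network bandwidth if ports are split evenly. PCIe Gen5 signals at 32 GT/s per lane with 128b/130b encoding. The lane count per host is not stated above, so the x4 figure below is an assumption for illustration, and the calculation ignores PCIe protocol overhead.

```python
PCIE_GEN5_GT_PER_LANE = 32.0          # raw signaling rate per lane
PCIE_ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding


def pcie_gen5_gbps(lanes: int) -> float:
    """Usable PCIe Gen5 bit rate per direction, before protocol overhead."""
    return lanes * PCIE_GEN5_GT_PER_LANE * PCIE_ENCODING_EFFICIENCY


def per_host_network_gbps(ports: int, port_gbps: int, hosts: int) -> float:
    """Network bandwidth per host, assuming an even split across hosts."""
    return ports * port_gbps / hosts


if __name__ == "__main__":
    network = per_host_network_gbps(ports=4, port_gbps=100, hosts=4)
    pcie = pcie_gen5_gbps(lanes=4)  # assumed x4 link per host
    print(f"network per host: {network:.0f} Gbps, PCIe x4: {pcie:.1f} Gbps")
```

Under these assumptions a Gen5 x4 link per host leaves headroom above a single 100GE port.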

FBNIC is foundational to Meta’s goal of making the network a first-class accelerator rather than a passive transport.

🤖 AI Accelerators with Native Networking

MTIA: Meta’s In-House AI Chip

Meta Training and Inference Accelerator (MTIA)
Integrates networking directly on the accelerator die.
- Native RoCE (RDMA over Converged Ethernet) support
- Direct participation in the Ethernet-based DSF
- Reduced CPU involvement and lower end-to-end latency
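For readers unfamiliar with RoCE, it helps to see what the encapsulation costs on the wire. RoCEv2 carries the InfiniBand transport header inside UDP (destination port 4791) over IP and Ethernet. The sketch below adds up the per-packet headers and computes payload efficiency. The payload sizes are illustrative, and the model leaves out optional extended headers, VLAN tags, and the Ethernet preamble and inter-packet gap.

```python
ROCEV2_UDP_PORT = 4791  # IANA-assigned destination port for RoCEv2

# Per-packet overhead in bytes for a basic RoCEv2 packet over IPv4.
HEADER_BYTES = {
    "ethernet": 14,
    "ipv4": 20,
    "udp": 8,
    "ib_bth": 12,       # InfiniBand Base Transport Header
    "icrc": 4,          # invariant CRC
    "ethernet_fcs": 4,  # frame check sequence
}


def overhead_bytes() -> int:
    """Total header and trailer bytes added to each payload."""
    return sum(HEADER_BYTES.values())


def wire_efficiency(payload_bytes: int) -> float:
    """Fraction of the frame that is RDMA payload."""
    return payload_bytes / (payload_bytes + overhead_bytes())


if __name__ == "__main__":
    for payload in (256, 1024, 4096):
        print(f"{payload:>5} B payload -> {wire_efficiency(payload):.3f} efficiency")
```

Larger payloads amortize the fixed overhead, which is one reason bulk collective traffic maps well onto RoCE.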

This tight coupling of compute and networking is critical for scaling training clusters efficiently.

📊 Meta AI Network Silicon Overview

| Platform | Network Role | Supplier | Chip |
|---|---|---|---|
| Arista 7700R4 | DSF Leaf | Broadcom | Jericho3-AI |
| Arista 7720R4 | DSF Spine | Broadcom | Ramon3 |
| Minipack3 | Fabric Switch | Broadcom | Tomahawk5 |
| Cisco 8501 | Fabric Switch | Cisco | Silicon One G200 |
| Minipack3N | Fabric Switch (NSF) | NVIDIA | Spectrum-4 |
| FBNIC | Multi-host NIC | Marvell & Meta | Custom 5nm ASIC |
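The overview can also be treated as data. The sketch below encodes the rows exactly as listed and groups them by supplier and role, which makes the vendor spread easy to query. The names `INVENTORY`, `suppliers_by_role`, and `platforms_for_supplier` are hypothetical helpers introduced for illustration.

```python
from collections import defaultdict

# (platform, network role, supplier, chip), copied from the overview above.
INVENTORY = [
    ("Arista 7700R4", "DSF Leaf", "Broadcom", "Jericho3-AI"),
    ("Arista 7720R4", "DSF Spine", "Broadcom", "Ramon3"),
    ("Minipack3", "Fabric Switch", "Broadcom", "Tomahawk5"),
    ("Cisco 8501", "Fabric Switch", "Cisco", "Silicon One G200"),
    ("Minipack3N", "Fabric Switch (NSF)", "NVIDIA", "Spectrum-4"),
    ("FBNIC", "Multi-host NIC", "Marvell & Meta", "Custom 5nm ASIC"),
]


def suppliers_by_role(inventory: list) -> dict:
    """Map each network role to the set of suppliers filling it."""
    result = defaultdict(set)
    for _platform, role, supplier, _chip in inventory:
        result[role].add(supplier)
    return dict(result)


def platforms_for_supplier(inventory: list, supplier: str) -> list:
    """List the platforms built on a given supplier's silicon, in table order."""
    return [platform for platform, _role, s, _chip in inventory if s == supplier]


if __name__ == "__main__":
    print(suppliers_by_role(INVENTORY))
    print(platforms_for_supplier(INVENTORY, "Broadcom"))
```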

🧭 Why This Matters

Meta’s DSF architecture demonstrates a clear trend in hyperscale AI infrastructure:

- No single vendor dependency
- Ethernet as the unifying fabric
- Custom silicon where differentiation matters
- Merchant silicon where scale and economics dominate

Rather than chasing a monolithic “perfect” solution, Meta is assembling a networked system of systems, one in which flexibility, supply chain resilience, and software control matter as much as raw bandwidth.