
Inside Meta’s DSF: Multi-Vendor Silicon Powering AI Networks


Meta’s AI data center network is built around a carefully engineered philosophy: vendor diversity at scale. Rather than depending on a single networking silicon provider, Meta’s Disaggregated Scheduled Fabric (DSF) intentionally mixes merchant silicon, custom ASICs, and in-house designs to optimize cost, performance, and long-term supply resilience.

At hyperscale, the network itself becomes a programmable system. DSF reflects that reality.

🧠 What DSF Is Solving

Large-scale AI training workloads generate traffic patterns that traditional Ethernet fabrics struggle with:

  • Extreme east-west bandwidth
  • Microburst congestion from collective operations
  • Sensitivity to tail latency during synchronization

DSF addresses this by separating scheduling intelligence from packet forwarding, which keeps performance predictable even in clusters of tens of thousands of GPUs.
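The article does not spell out how DSF schedules traffic. One common way to build a scheduled fabric is credit-based scheduling with virtual output queues (VOQs): the ingress side holds traffic until the egress side grants credit, so the egress buffer cannot be overrun. The sketch below is a toy model of that general idea, not Meta's implementation; the class names `EgressScheduler` and `IngressVOQ` and the cell-count buffer model are assumptions made for illustration.

```python
from collections import deque


class EgressScheduler:
    """Hypothetical egress-side scheduler that hands out credits.

    It never grants more cells than the egress buffer has free,
    which is what prevents drops at the destination port.
    """

    def __init__(self, buffer_cells: int):
        self.free_cells = buffer_cells

    def grant(self, requested: int) -> int:
        # Grant as much as was asked for, capped by free buffer space.
        granted = min(requested, self.free_cells)
        self.free_cells -= granted
        return granted

    def drain(self, cells: int) -> None:
        # Cells have left the egress port, so their buffer space is free again.
        self.free_cells += cells


class IngressVOQ:
    """Hypothetical ingress queue for traffic bound to one egress port."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, cell: str) -> None:
        self.queue.append(cell)

    def send(self, scheduler: EgressScheduler) -> list:
        # Forward only as many cells as the scheduler grants; the rest wait.
        credits = scheduler.grant(len(self.queue))
        return [self.queue.popleft() for _ in range(credits)]


if __name__ == "__main__":
    scheduler = EgressScheduler(buffer_cells=2)
    voq = IngressVOQ()
    for i in range(5):
        voq.enqueue(f"cell-{i}")

    print(voq.send(scheduler))  # burst is held back to the granted credit
    scheduler.drain(2)          # egress port transmits, freeing space
    print(voq.send(scheduler))  # next batch is released
```

In this model a microburst from a collective operation piles up at the ingress queue instead of overflowing the egress buffer, which is the behavior the scheduling/forwarding split is meant to provide.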

🔀 Switch ASICs: The Core Fabric

Meta uses different switch silicon for different layers of its leaf-spine topology, matching each ASIC’s strengths to a specific role.

Broadcom: The DSF Backbone

Broadcom remains the dominant supplier in Meta’s scheduled fabric.

  • Jericho3-AI
    Deployed in the Arista 7700R4 as the DSF Leaf Switch.
    Designed for AI traffic with:
    • Very deep buffers
    • Zero-packet-loss behavior
    • Deterministic congestion handling


  • Ramon3
    Used in the Arista 7720R4 as the DSF Spine Switch.
    Ramon3 aggregates multiple Jericho3-AI devices into a single, massive non-blocking fabric domain.


  • Tomahawk5 (TH5)
    Powers Meta’s self-designed Minipack3 switch.
    • 51.2 Tbps switching capacity
    • Optimized for power efficiency per bit
    • Ideal for dense fabric deployments
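To put the 51.2 Tbps figure in context, the short calculation below shows how many ports that capacity supports at several standard Ethernet speeds. The function name `port_count` is hypothetical and the speeds are illustrative; the article does not state the actual front-panel port configuration of any platform.

```python
def port_count(capacity_gbps: int, port_gbps: int) -> int:
    """Return how many full-rate ports fit within a switching capacity."""
    if port_gbps <= 0:
        raise ValueError("port speed must be positive")
    return capacity_gbps // port_gbps


CAPACITY_GBPS = 51_200  # 51.2 Tbps expressed in Gbps

if __name__ == "__main__":
    for speed in (800, 400, 200, 100):
        print(f"{speed}GE ports: {port_count(CAPACITY_GBPS, speed)}")
```

At 800GE this works out to 64 ports, and at 400GE to 128 ports.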


Cisco: Competitive Merchant Silicon

  • Silicon One G200
    Used in the Cisco 8501 platform.
    • Direct competitor to Tomahawk5
    • 51.2 Tbps throughput
    • Runs Meta’s internal network OS, FBOSS, showing that the software stack is portable across silicon vendors


NVIDIA: Expanding Beyond Scheduled Fabrics

  • Spectrum-4
    Deployed in Minipack3N systems within Meta’s Non-Scheduled Fabric (NSF).
    • 51.2 Tbps Ethernet switching
    • Used where deterministic scheduling is less critical than raw throughput

🧩 Network Interface Controllers: The Edge of the Fabric

At the server boundary, Meta has moved away from generic NICs toward semi-custom designs.

Marvell + Meta: FBNIC

  • Custom 5nm ASIC (FBNIC)
    Co-developed with Marvell
    • Multi-host NIC supporting up to four independent hosts
    • PCIe Gen5 connectivity
    • Up to 4×100GE network interfaces
    • Hardware offloads optimized for AI collectives and low-latency messaging

FBNIC is foundational to Meta’s goal of making the network a first-class accelerator rather than a passive transport.


🤖 AI Accelerators with Native Networking

MTIA: Meta’s In-House AI Chip

  • Meta Training and Inference Accelerator (MTIA)
    Integrates networking directly on the accelerator die.
    • Native RoCE (RDMA over Converged Ethernet) support
    • Direct participation in the Ethernet-based DSF
    • Reduced CPU involvement and lower end-to-end latency

This tight coupling of compute and networking is critical for scaling training clusters efficiently.

📊 Meta AI Network Silicon Overview

| Platform      | Network Role        | Supplier       | Chip             |
|---------------|---------------------|----------------|------------------|
| Arista 7700R4 | DSF Leaf            | Broadcom       | Jericho3-AI      |
| Arista 7720R4 | DSF Spine           | Broadcom       | Ramon3           |
| Minipack3     | Fabric Switch       | Broadcom       | Tomahawk5        |
| Cisco 8501    | Fabric Switch       | Cisco          | Silicon One G200 |
| Minipack3N    | Fabric Switch (NSF) | NVIDIA         | Spectrum-4       |
| FBNIC         | Multi-host NIC      | Marvell & Meta | Custom 5nm ASIC  |
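The same overview can be held as a small data structure, which makes the vendor mix easy to query. This is only an illustrative sketch: the `Platform` type and the `chips_by_supplier` helper are hypothetical names, and the rows simply repeat the overview above.

```python
from collections import defaultdict
from typing import NamedTuple


class Platform(NamedTuple):
    name: str
    role: str
    supplier: str
    chip: str


# Rows copied from the silicon overview in this article.
PLATFORMS = [
    Platform("Arista 7700R4", "DSF Leaf", "Broadcom", "Jericho3-AI"),
    Platform("Arista 7720R4", "DSF Spine", "Broadcom", "Ramon3"),
    Platform("Minipack3", "Fabric Switch", "Broadcom", "Tomahawk5"),
    Platform("Cisco 8501", "Fabric Switch", "Cisco", "Silicon One G200"),
    Platform("Minipack3N", "Fabric Switch (NSF)", "NVIDIA", "Spectrum-4"),
    Platform("FBNIC", "Multi-host NIC", "Marvell & Meta", "Custom 5nm ASIC"),
]


def chips_by_supplier(platforms):
    """Group chip names by supplier, preserving the listed order."""
    grouped = defaultdict(list)
    for platform in platforms:
        grouped[platform.supplier].append(platform.chip)
    return dict(grouped)


if __name__ == "__main__":
    for supplier, chips in chips_by_supplier(PLATFORMS).items():
        print(f"{supplier}: {', '.join(chips)}")
```

Grouping by supplier shows four distinct silicon sources across six platforms, with Broadcom supplying three of the switch ASICs.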

🧭 Why This Matters

Meta’s DSF architecture demonstrates a clear trend in hyperscale AI infrastructure:

  • No single vendor dependency
  • Ethernet as the unifying fabric
  • Custom silicon where differentiation matters
  • Merchant silicon where scale and economics dominate

Rather than chasing a monolithic “perfect” solution, Meta is assembling a networked system of systems in which flexibility, supply chain resilience, and software control matter as much as raw bandwidth.
