Skip to main content

Panmnesia Pushes CXL + NVLink/UALink for AI Superclusters

·450 words·3 mins
Panmnesia CXL NVLink UALink AI Superclusters Memory Interconnect
Table of Contents

South Korea–based Panmnesia, a company focused on CXL (Compute Express Link) memory technologies, is pushing for a unified memory and interconnect design to power the next generation of AI superclusters. The company believes that future AI workloads demand both GPU node memory sharing and fast inter-GPU networking, achieved by combining CXL with UALink/NVLink architectures.


A Technical Report on the Future of AI Infrastructure
#

Panmnesia’s CEO, Dr. Myoungsoo Jung, has released a 56-page technical report titled “Compute Can’t Handle the Truth: Why Communication Tax Prioritizes Memory and Interconnects in Modern AI Infrastructure.”

The report highlights:

  • The growth of AI models and why current compute-centric infrastructures struggle to scale.
  • The limitations of rigid GPU-centric architectures, such as communication overhead and low utilization.
  • The role of emerging interconnect and memory technologies—CXL, NVLink, UALink, and HBM—in solving these challenges.

Jung notes:

“No single fixed architecture can fully satisfy all the compute, memory, and networking demands for large-scale AI. The best solution is to integrate CXL with accelerator-focused interconnects like NVLink and UALink.”


Breaking Down the Report: Three Key Parts
#

  1. Trends in AI and Data Center Architectures
    Explains how workloads like chatbots, image generation, and video processing rely on sequence models (RNNs → LLMs) and why current infrastructures bottleneck performance.

  2. CXL Composable Architectures
    Shows how CXL 3.0—with multi-level switching, advanced routing, and memory coherence—can reshape data center memory architectures. Panmnesia has already developed a CXL 3.0–compliant prototype, tested on RAG and deep learning recommendation models (DLRMs).

  3. Beyond CXL: Hybrid Link Architectures
    Introduces “CXL over XLink”, where CXL handles memory pooling and coherence, while XLink (UALink + NVLink) delivers low-latency accelerator-to-accelerator communication.


Why Combine CXL and XLink? #

  • CXL → Expands memory capacity, provides system-wide coherence, and enables disaggregated, composable memory pools.
  • XLink (UALink + NVLink) → Optimized for direct GPU-to-GPU transfers with ultra-low latency.
  • Unified Approach (CXL over XLink) → Bridges the gap, creating:
    • Accelerator-centric clusters for rapid intra-cluster GPU communication.
    • Tiered memory architectures with local high-performance memory and scalable pooled memory.

Panmnesia Technical Report diagram
Panmnesia Technical Report diagram


Toward Scalable AI Superclusters
#

Dr. Jung envisions a scalable tiered memory hierarchy for AI superclusters:

  • Tier 1: High-performance local memory managed by XLink + coherence-centric CXL.
  • Tier 2: Composable memory pools enabled by CXL for large-scale data handling.

This hybrid model is designed to meet the unique performance needs of LLMs, inference, RAG, and recommendation systems—paving the way for future-proof AI infrastructure.


Final Thoughts
#

Panmnesia’s work underscores a growing industry realization: AI progress depends as much on memory and interconnects as it does on compute.

By unifying CXL’s composability with NVLink/UALink’s accelerator speed, Panmnesia is proposing a new blueprint for AI data centers that could define the next generation of superclusters powering large-scale AI applications.


Related

CXL NAND Flash Explained: Memory Expansion for AI and HPC
·627 words·3 mins
CXL NAND Flash Memory AI Data Center
Samsung CXL Memory Modules and HBM3E Drive AI Scalability
·733 words·4 mins
Samsung CXL HBM3E Memory Architecture AI Infrastructure Data Center DRAM Composable Infrastructure
PCIe Over Optics: Scaling AI Infrastructure Beyond Rack Limits
·747 words·4 mins
PCIe PCIe Over Optics Astera Labs CXL AI Infrastructure Data Center Networking Optical Interconnect GPU Clusters