Alibaba Cloud
Beluga: CXL-Based KV Cache Architecture Cuts TTFT by 89.6%
CXL
Memory Architecture
Alibaba Cloud
LLM
GPU