THU FASTsys

ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models

May 28, 2024

OSDI '24


---
 Reporter
  • 周与祺
 Tags
  •  Serverless
  •  LLM
 Related
  • Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
  • Sabre: Hardware-Accelerated Snapshot Compression for Serverless MicroVMs
  • Characterization of Large Language Model Development in the Datacenter
  • MinFlow: High-performance and Cost-efficient Data Passing for I/O-intensive Stateful Serverless Analytics
  • Language Modeling Is Compression
  • Efficient Memory Management for Large Language Model Serving with PagedAttention
  • Unlocking unallocated cloud capacity for long, uninterruptible workloads