Reporter
- 周与祺
Tags
Related
- Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
- Sabre: Hardware-Accelerated Snapshot Compression for Serverless MicroVMs
- Characterization of Large Language Model Development in the Datacenter
- MinFlow: High-performance and Cost-efficient Data Passing for I/O-intensive Stateful Serverless Analytics
- Language Modeling Is Compression
- Efficient Memory Management for Large Language Model Serving with PagedAttention
- Unlocking unallocated cloud capacity for long, uninterruptible workloads