CS294-252: Architectures and Systems for Warehouse-Scale Computers

Fall 2023, UC Berkeley

Location: Tuesdays, 2-4pm in 320 Soda

Course Overview: Warehouse-Scale Computers (WSCs) host hyperscale cloud services relied on by billions of daily users. While classical WSCs were built as homogeneous collections of servers and networking hardware, modern hardware scaling trends have resulted in the introduction of specialized hardware in datacenter environments (e.g., ML accelerators and ML “supercomputer pods”, SmartNICs, and GPUs). Many proposals have also been made to solve challenges like datacenter tax overheads and killer microsecond overheads with further specialization.

This graduate-level course will explore both the opportunities for deeper co-design of hardware and software to meet WSC efficiency and performance goals and the challenges of hardware specialization for the cloud systems software stack.

Prerequisites: Students must have previously taken at least one of the following graduate-level architecture/systems/VLSI courses:

  • CS252: Graduate Computer Architecture
  • CS262A: Advanced Topics in Computer Systems
  • CS268: Graduate Computer Networks
  • EECS251: Digital Design and Integrated Circuits


August 29
Intro to Warehouse-Scale Computers
Reading 1
L. Barroso, et al. The Datacenter as a Computer, Third Edition.

September 5
Datacenter-Wide Trends
Reading 1
S. Kanev, et al. Profiling a Warehouse-Scale Computer.
Reading 2
A. Sriraman, et al. Accelerometer: Understanding Acceleration Opportunities for Data Center Overheads at Hyperscale.
Reading 3
J. Dean, et al. The tail at scale. +
L. Barroso, et al. Attack of the Killer Microseconds.

September 26
Accelerators in WSCs, Pt. 2
Reading 1
N. Lazarev, et al. Dagger: efficient and fast RPCs in cloud microservices with near-memory reconfigurable NICs.
Reading 2
S. Karandikar, et al. A Hardware Accelerator for Protocol Buffers.
Reading 3
M. D. Hill, et al. Accelerator-Level Parallelism. +
R. Murty. Powering Amazon EC2: Deep dive on the AWS Nitro System.

October 3
Memory and Disaggregation, Pt. 1
Reading 1
A. Lagar-Cavilla, et al. Software-Defined Far Memory in Warehouse-Scale Computers.
Reading 2
J. Weiner, et al. TMO: transparent memory offloading in datacenters.
Reading 3
K. Zhao, et al. Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters.

October 10
Modeling and Evaluation + Sustainability (+ Project Proposal Presentations)
Reading 1
S. Karandikar, et al. FireSim: FPGA-Accelerated Cycle-Exact Scale-Out System Simulation in the Public Cloud.
Reading 2
D. Cock, et al. Enzian: an open, general, CPU/FPGA platform for systems software research.
Reading 3
B. Acun, et al. Carbon Explorer: A Holistic Framework for Designing Carbon Aware Datacenters.

October 17
Memory and Disaggregation, Pt. 2
Reading 1
P. Duraisamy, et al. Towards an Adaptable Systems Architecture for Memory Tiering at Warehouse-Scale.
Reading 2
H. Al Maruf, et al. TPP: Transparent Page Placement for CXL-Enabled Tiered-Memory.
Reading 3
H. Li, et al. Pond: CXL-Based Memory Pooling Systems for Cloud Platforms.

October 24
Accelerators in WSCs, Pt. 3 + Server Design
Reading 1
P. Ranganathan, et al. Warehouse-scale video acceleration: co-design and deployment in the wild.
Reading 2
A. Sriraman, et al. SoftSKU: optimizing server architectures for microservice diversity @scale.
Reading 3
G. Ayers, et al. Memory Hierarchy for Web Search.

October 31
RPC + Silent Data Corruption Pt. 1 + Data Analytics Pt. 1
Reading 1
K. Seemakhupt, et al. A Cloud-Scale Characterization of Remote Procedure Calls.
Reading 2
H. D. Dixit, et al. Silent Data Corruptions at Scale.
Reading 3
A. Gonzalez, et al. Profiling Hyperscale Big Data Processing.

November 7
Silent Data Corruption Pt. 2 + Data Analytics Pt. 2 + Fault Tolerance
Reading 1
P. H. Hochschild, et al. Cores that don’t count.
Reading 2
L. Wu, et al. Q100: The Architecture and Design of a Database Processing Unit.
Reading 3
Y. Zhou, et al. Carbink: Fault-Tolerant Far Memory.

November 14
Operating Systems
Reading 1
J. T. Humphries, et al. ghOSt: Fast & Flexible User-Space Delegation of Linux Scheduling.
Reading 2
J. T. Humphries, et al. A case against (most) context switches.
Reading 3
A. Belay, et al. IX: A Protected Dataplane Operating System for High Throughput and Low Latency.

November 28
Feedback-Directed Optimization + Security
Reading 1
G. Ayers, et al. AsmDB: understanding and mitigating front-end stalls in warehouse-scale computers.
Reading 2
Y. Zhang, et al. OCOLOS: Online COde Layout OptimizationS.
Reading 3
C. Delimitrou, et al. Bolt: I Know What You Did Last Summer… In The Cloud.

December 5
N/A (RRR Week)

December 12
Final Project Presentations (Finals Week)

Weekly Schedule

  • Lecture/Discussion: Tuesdays 2-4pm in 320 Soda
  • Weekly Reading Reviews: Due Mondays @ noon Pacific. See Ed for submission links.
  • Weekly Student Presenter Slides: Due Fridays @ 11:59pm. See Ed for submission details.

Assignments and Grading

The course workload will consist of the following:

  • 25% of grade: Each week, students will be required to read and provide a review of two of the week’s papers and attend and participate in the week’s discussion.
    • Can drop two weeks’ worth, no questions asked.
  • 25% of grade: Each student will lead the discussion of two papers during the semester.
  • 50% of grade: Students will complete a semester-long research project, in groups of 2 or 3, related to the course material.


Krste Asanović


Office Hours

By appointment.

Sagar Karandikar


Office Hours

By appointment.