Scaling Laws Revisited: When Data, Not Compute, is the Bottleneck

PhD Thesis Proposal

Mihir Prabhudesai
PhD Student, Robotics Institute, Carnegie Mellon University
Thursday, September 18
2:00 pm to 4:30 pm
NSH 3305
Scaling Laws Revisited: When Data, Not Compute, is the Bottleneck
Abstract:
The formula for AI progress has long appeared straightforward: more compute, more data, better models. Yet while compute is growing with better hardware and bigger clusters, data is stagnating—calling into question the very scaling laws that powered the last decade. The internet—often described as the “fossil fuel” of AI—offers only a finite reservoir of training data, raising a critical question: how can we sustain the scaling trends that underpin modern AI progress?
This thesis explores one promising direction: trading off compute for data. The central idea is to leverage additional compute to compensate for limited data. We present three simple strategies towards this goal:

  • Make the task harder. We show that making the training objective more challenging can improve the generalization ability of current models.
  • Make the supervision richer. We demonstrate that providing dense gradient feedback can enhance sample efficiency when post-training foundation models.
  • Make the tasks unsupervised. We find that large language models can improve at reasoning without access to any ground truth question–answer pairs, reducing reliance on costly supervision.
Collectively, these results suggest new ways to extend scaling laws in an era where data growth can no longer be taken for granted.

Thesis Committee Members:

Deepak Pathak, Co-Chair

Katerina Fragkiadaki, Co-Chair

Deva Ramanan

Hao Liu

Yejin Choi, Stanford University