Name: Forecasting at Scale with Efficient Deep Learning Architectures
Start: 2026-05-04T16:30:00-04:00
End: 2026-05-04T18:00:00-04:00
Location: GHC 4405

Willa Potosnak
PhD Student, Robotics Institute,
Carnegie Mellon University

Monday, May 4
4:30 pm to 6:00 pm
GHC 4405

Forecasting at Scale with Efficient Deep Learning Architectures

Abstract:

Time Series Foundation Models (TSFMs) have scaled rapidly, with publicly reported pretraining corpora growing from 1.23 billion to 1 trillion data points between 2024 and 2026, an approximately 800× increase in two years. Recent work has further supplemented real-world data with synthetic data to expose models to broader time series patterns. Yet this data-centric paradigm raises a fundamental question: must intelligent forecasting rely solely on scale, or can intentional architectural design unlock better generalization? This thesis proposes that leveraging existing data more intelligently and efficiently, rather than relying on scale alone, is key to better forecasting generalization. We pursue this through three parallel architectural themes: exploiting cross-channel structure beyond temporal patterns, enabling zero-shot generalization through structured composition, and reducing gradient and forecast variance by design. Each theme aims to enhance generalization with available data while treating computational efficiency as a core design principle.

In this thesis, we demonstrate that scale is not the only path to generalization by developing multivariate architectures that efficiently leverage cross-channel dependencies while reducing forecast error; showing that architectures can generalize beyond their training distribution in both patterns and concepts; and verifying variance-aware architectural designs that extract richer training signals from existing data, provably reducing gradient variance while lowering forecast error and improving calibration.

Within the first theme, we further propose pretraining strategies for multivariate TSFMs to investigate whether data balancing and curriculum learning can improve downstream generalization given the same pretraining corpora. Within the second theme, we propose an additional dimension of generalization, extending beyond pattern and concept generalization to horizon generalization, an important consideration for TSFMs applied across diverse tasks and domains. Overall, this work contributes new insights into advancing time series forecasting generalization through efficient architectural design.

Committee:

Artur Dubrawski, Chair
John Dolan
Barnabás Póczos
Michael W. Mahoney (University of California, Berkeley)

PhD Thesis Proposal