On-line Algorithms for Combining Language Models

Adam Kalai, Stan Chen, Avrim Blum, and Roni Rosenfeld

Conference Paper, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99), Vol. 2, pp. 745-748, March 1999

Abstract

Multiple language models are combined for many tasks in language modeling, such as domain and topic adaptation. In this work, we compare on-line algorithms from machine learning to existing algorithms for combining language models. On-line algorithms developed for this problem have parameters that are updated dynamically to adapt to a data set during evaluation. On-line analysis provides guarantees that these algorithms will perform nearly as well as the best model chosen in hindsight from a large class of models, e.g., the set of all static mixtures. We describe several on-line algorithms and present results comparing these techniques with existing language modeling combination methods on the task of domain adaptation. We demonstrate that, in some situations, on-line techniques can significantly outperform static mixtures (by over 10% in terms of perplexity) and are especially effective when the nature of the test data is unknown or changes over time.
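
To make the idea of dynamically updated combination parameters concrete, here is a minimal sketch of one simple on-line scheme: a Bayesian-style multiplicative weight update over component models. It is an illustration only and is not claimed to be any of the algorithms evaluated in the paper. The function name `online_mixture` and the input layout (one list of probabilities per model, each entry being the probability that model assigned to the word actually observed) are assumptions made for this example. With a uniform prior over N models, this particular update has total log loss at most ln N above that of the best single component model; competing with the best static mixture, as the abstract describes, requires more elaborate algorithms.

```python
import math


def online_mixture(prob_streams, prior=None):
    """Combine language models with weights updated after every word.

    prob_streams[i][t] is the probability that model i assigned to the
    word actually observed at position t (hypothetical input layout).
    Probabilities are assumed strictly positive, i.e. the models are smoothed.
    Returns the final weights and the perplexity of the on-line mixture.
    """
    n_models = len(prob_streams)
    n_words = len(prob_streams[0])
    # Start from the given prior, or uniform weights if none is supplied.
    weights = list(prior) if prior is not None else [1.0 / n_models] * n_models
    log_loss = 0.0
    for t in range(n_words):
        probs = [stream[t] for stream in prob_streams]
        # Predict with the current mixture before seeing the update.
        mix = sum(w * p for w, p in zip(weights, probs))
        log_loss -= math.log(mix)
        # Multiplicative update: models that predicted the word well gain weight.
        weights = [w * p / mix for w, p in zip(weights, probs)]
    perplexity = math.exp(log_loss / n_words)
    return weights, perplexity


if __name__ == "__main__":
    # Toy data: model 0 fits the text much better than model 1.
    in_domain = [0.5, 0.5, 0.5]
    out_of_domain = [0.1, 0.1, 0.1]
    final_weights, ppl = online_mixture([in_domain, out_of_domain])
    print(final_weights, ppl)
```

Because the weights are revised after each word, the mixture shifts toward whichever model currently fits the test data, with no held-out data needed to tune the weights in advance.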

BibTeX

@conference{Kalai-1999-16697,
author = {Adam Kalai and Stan Chen and Avrim Blum and Roni Rosenfeld},
title = {On-line Algorithms for Combining Language Models},
booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '99)},
year = {1999},
month = {March},
volume = {2},
pages = {745--748},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.