Semi-Supervised Learning of Sequence Models with Method of Moments - Robotics Institute Carnegie Mellon University

Zita Alexandra Magalhaes Marinho, Andre F. T. Martins, Shay B. Cohen, and Noah A. Smith
Conference Paper, Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '16), November 2016

Abstract

We propose a fast and scalable method for semi-supervised learning of sequence models, based on anchor words and moment matching. Our method can handle hidden Markov models with feature-based log-linear emissions. Unlike other semi-supervised methods, no decoding passes are necessary on the unlabeled data and no graph needs to be constructed—only one pass is necessary to collect moment statistics. The model parameters are estimated by solving a small quadratic program for each feature. Experiments on part-of-speech (POS) tagging for Twitter and for a low-resource language (Malagasy) show that our method can learn from very few annotated sentences.
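The two steps the abstract names, a single pass to collect moment statistics and a small quadratic program solved per item, can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a generic anchor-style setup in which each word's next-word distribution is expressed as a convex combination of the distributions of one anchor word per hidden state, and it solves the resulting simplex-constrained least-squares problem by projected gradient descent. The function names (collect_moments, project_to_simplex, solve_simplex_qp) and the choice of "next word" as the context are hypothetical, introduced only for illustration.

```python
import numpy as np


def collect_moments(sentences, vocab, contexts):
    """One pass over unlabeled sentences (assumed setup, not the paper's).

    Counts (word, next word) pairs and returns row-normalized estimates
    of p(next word | word). vocab and contexts map tokens to indices.
    """
    counts = np.zeros((len(vocab), len(contexts)))
    for sent in sentences:
        for word, nxt in zip(sent, sent[1:]):
            if word in vocab and nxt in contexts:
                counts[vocab[word], contexts[nxt]] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observations stay all-zero instead of dividing by zero.
    return counts / np.maximum(row_sums, 1.0)


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u * k > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)


def solve_simplex_qp(target, anchors, steps=500):
    """Minimize ||target - gamma @ anchors||^2 with gamma on the simplex.

    anchors has one row per hidden state (the anchor's moment vector);
    the returned gamma holds the mixing weights over states.
    """
    n_states = anchors.shape[0]
    gamma = np.full(n_states, 1.0 / n_states)
    gram = anchors @ anchors.T
    lin = anchors @ target
    # Step size 1/L, where L is the largest eigenvalue of the Gram matrix.
    step = 1.0 / max(np.linalg.eigvalsh(gram)[-1], 1e-12)
    for _ in range(steps):
        grad = gram @ gamma - lin
        gamma = project_to_simplex(gamma - step * grad)
    return gamma
```

In this sketch the unlabeled data is touched only inside collect_moments; every later step works on the small moment matrix, which is what keeps the per-item problem cheap. How anchors are chosen and how log-linear emission features enter the program are left to the paper.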

BibTeX

@conference{Marinho-2016-5622,
author = {Zita Alexandra Magalhaes Marinho and Andre F. T. Martins and Shay B. Cohen and Noah A. Smith},
title = {Semi-Supervised Learning of Sequence Models with Method of Moments},
booktitle = {Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP '16)},
year = {2016},
month = {November},
}