(Talk 1) This is joint work with Li Fei-Fei and Eric P. Xing and will be presented at the upcoming KDD 2012. We investigate the problem of predicting which images are likely to appear on the Web at a future time point, given a query word and a database of historical image streams that enables learning the photo-taking patterns of previous user images and their associated metadata. We address this Web image prediction problem at both a collective group level and an individual user level. We develop a predictive framework based on the multivariate point process, which employs a stochastic parametric model to capture the relations between image occurrence and the covariates that influence it, in a globally optimal, flexible, and scalable way. Using Flickr datasets of more than ten million images of 40 topics, our empirical results show that the proposed algorithm is more successful in predicting unseen Web images than other candidate methods, including reasoning on semantic meanings only, a state-of-the-art image retrieval method, and a generative topic model.
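To make the idea of a parametric point-process model concrete, here is a minimal sketch, not the speakers' method. It assumes a single event stream binned in time, a log-linear intensity lambda = exp(beta . x), and a Poisson likelihood per bin. The covariates, the names `intensity`, `log_likelihood`, and `fit`, and the synthetic counts are all hypothetical. Because this log-likelihood is concave in beta, plain gradient ascent reaches the global optimum, which is one sense in which such a fit can be "globally optimal".

```python
import math

def intensity(beta, x):
    """Log-linear intensity for one time bin: lambda = exp(beta . x)."""
    return math.exp(sum(b * xi for b, xi in zip(beta, x)))

def log_likelihood(beta, X, counts):
    """Poisson log-likelihood of binned counts, up to a constant in beta."""
    total = 0.0
    for x, y in zip(X, counts):
        lam = intensity(beta, x)
        total += y * math.log(lam) - lam
    return total

def fit(X, counts, lr=0.05, steps=5000):
    """Gradient ascent on the mean log-likelihood, which is concave in beta."""
    n, d = len(X), len(X[0])
    beta = [0.0] * d
    for _ in range(steps):
        grad = [0.0] * d
        for x, y in zip(X, counts):
            resid = y - intensity(beta, x)  # observed minus expected count
            for j in range(d):
                grad[j] += resid * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical covariates per time bin: [bias, covariate a, covariate b].
X = [[1.0, a, b] for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
beta_true = [0.5, 0.8, -0.4]
# Noise-free expected counts, so the maximizer is exactly beta_true.
counts = [intensity(beta_true, x) for x in X]

beta_hat = fit(X, counts)
print([round(b, 3) for b in beta_hat])
```

A multivariate version would keep one such intensity per image cluster or user; that extension is not shown here.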
(Talk 2) Research on large-scale classification has so far focused on situations involving a large number of data points and/or a large number of features, with a limited number of categories. However, this is not the case in recently published data sets, which aim to mimic the real world: in addition to the large number of data points and features, they contain a large number of categories, on the order of tens or hundreds of thousands. For example, in object recognition, the recently released ImageNet data set spans a total of 21841 image classes. Clearly, massive multi-way classification with the number of classes approaching or even surpassing human cognitive capability is an important yet unaddressed research problem, and requires out-of-the-box rethinking of classical approaches and more effective yet simple alternatives. We propose structured sparse output coding, a principled approach to massive multi-way classification, in which a sparse output coding matrix is learned to maximize both codeword separation and the accuracy of each bit predictor. Moreover, we provide an algorithm based on the concave-convex procedure for the resulting optimization problem, which solves a series of l1-regularized convex optimization problems under linear constraints, using a dual proximal gradient method. Experimental results on large-scale image categorization and text classification demonstrate the effectiveness of our proposed approach.
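As a rough illustration of how an output code is used at prediction time, the sketch below decodes a vector of bit predictions against a small ternary codebook using Hamming decoding, where a zero entry means that bit predictor ignores the class and contributes a fixed cost of 0.5. The codebook here is hand-written and hypothetical, as are the class names and the function names; the sketch does not learn the coding matrix and does not implement the concave-convex procedure or the dual proximal gradient solver described in the abstract.

```python
def hamming_cost(predicted_bits, codeword):
    """Hamming decoding cost for a ternary codeword with entries in {-1, 0, +1}.

    A zero entry means the bit predictor was not trained on this class,
    so it contributes a neutral cost of 0.5 regardless of the prediction.
    """
    cost = 0.0
    for p, c in zip(predicted_bits, codeword):
        if c == 0:
            cost += 0.5
        elif p != c:
            cost += 1.0
    return cost

def decode(predicted_bits, codebook):
    """Return the class whose codeword has the lowest decoding cost."""
    return min(codebook, key=lambda label: hamming_cost(predicted_bits, codebook[label]))

# Hypothetical sparse codebook: 4 classes, 6 bit predictors.
codebook = {
    "cat": [1, 1, 0, -1, 0, 1],
    "dog": [1, -1, 1, 0, 0, -1],
    "car": [-1, 0, 1, 1, 1, 0],
    "bus": [-1, 0, -1, 0, -1, 1],
}

# Outputs of the 6 bit predictors for two made-up test images.
print(decode([1, 1, 1, -1, -1, 1], codebook))
print(decode([-1, 1, 1, 1, 1, -1], codebook))
```

With thousands of classes, the appeal of this scheme is that the number of bit predictors can be far smaller than the number of classes, while well-separated codewords keep decoding reliable.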
Host: Eric Xing
Appointments: Bernardo Pires (email@example.com)
(Talk 1) Gunhee Kim is a PhD student advised by Prof. Eric P. Xing in the Computer Science Department of Carnegie Mellon University. Before starting his PhD in 2009, he earned a master's degree under the supervision of Prof. Martial Hebert at the Robotics Institute, CMU, and worked as a visiting researcher at CSAIL, MIT. His research interests are computer vision and machine learning for Web applications.
(Talk 2) Bin Zhao is a PhD candidate in the Machine Learning Department at CMU, advised by Prof. Eric Xing. Bin received his bachelor's and master's degrees from Tsinghua University in 2006 and 2009, respectively. His main research areas are machine learning and computer vision, with particular interest in large-scale image classification, event detection, and image segmentation.