3:00 pm - 4:00 pm
Abstract: Convolutional networks (CNNs) and recurrent networks have driven the great engineering success of deep learning in recent years. However, as academics, we still wonder whether they are indeed the ultimate models of choice. In particular, CNNs seem unable to characterize predictive uncertainty, and they depend heavily on small filters applied to small, rectangular neighborhoods. Recurrent networks, on the other hand, seem to have run into performance issues in recent years and have been unable to achieve state-of-the-art performance. Are there better designs to be explored?
This talk will present some of our recent work on exploring designs of convolutional and recurrent networks. On convolutional networks, I will talk about our work on HyperGAN, where we use a framework similar to Generative Adversarial Networks (GANs) to generate all the weights of all the filters of a convolutional network. This turns out to be useful both for predictive performance and for better estimation of the predictive (epistemic) uncertainty, which leads to better detection of outliers and adversarial examples. Going further, I will talk about PointConv, which efficiently implements CNNs on irregularly spaced 3D point clouds and removes the dependency of CNNs on rigid, grid-like neighborhoods. PointConv learns continuous convolutional filters that apply to the entire space and hence can be used to perform classification and semantic segmentation on irregular point clouds. A computational trick allows us to greatly improve efficiency and scale to significantly larger networks. Experiments show that, for semantic segmentation of realistic scenes, PointConv greatly outperforms prior work on point-based structures.
Finally, on recurrent networks, I will talk about our negative experience using LSTMs in multi-target tracking and offer intuitions on why the current memory structure is insufficient. I will propose a novel bilinear LSTM model suitable for multi-target tracking problems, drawing ideas from classic recursive least squares. I will show results on the MOT 2016 and MOT 2017 challenges in which this model significantly outperforms traditional LSTMs in terms of identity switches.
Bio: Fuxin Li is currently an assistant professor in the School of Electrical Engineering and Computer Science at Oregon State University. Before that, he held research positions at the University of Bonn and the Georgia Institute of Technology. He obtained a Ph.D. degree from the Institute of Automation, Chinese Academy of Sciences, in 2009. He has won an NSF CAREER award, (co-)won the PASCAL VOC semantic segmentation challenges from 2009 to 2012, and led a team to a 4th-place finish in the DAVIS Video Segmentation Challenge 2017. He has published more than 50 papers in computer vision, machine learning, and natural language processing. His main research interests are video object segmentation, multi-target tracking, deep networks on point clouds, uncertainty in deep learning, and human understanding of deep learning.