Loading Events

PhD Thesis Proposal

October

16
Mon
Akshara Rai Robotics Institute,
Carnegie Mellon University
Monday, October 16
12:00 pm to 1:00 pm
GHC 8102
Learning to learn from simulation: Using simulations to expedite learning on robots

Abstract:
Robot controllers, including locomotion controllers, often consist of expert-designed heuristics. These heuristics can be hard to tune, particularly in higher dimensions. It is typical to use simulation to tune or learn these parameters and test on hardware. However, controllers learned in simulation often don’t transfer to hardware due to model mismatch. This necessitates controller optimization directly on hardware. Experiments on walking robots can be expensive, due to the time involved and fear of damage to the robot. This has led to a recent interest in adapting data-efficient learning techniques to robotics. One popular method is Bayesian Optimization, a sample-efficient black-box optimization scheme. But the performance of Bayesian Optimization typically degrades in problems with higher dimensionality, including dimensionality of the scale seen in bipedal locomotion. We aim to overcome this problem by incorporating prior knowledge to reduce the  number of dimensions in a meaningful way, with a focus on bipedal locomotion. We propose two ways of doing this, hand-designed features based on knowledge of human walking and using neural networks to extract this information automatically. Our hand-designed features project the initial controller space to a 1-dimensional space, and show promise in simulation and on hardware. On the other hand, the automatically learned features can be of varying dimensions, and also lead to improvement on traditional Bayesian Optimization methods and perform competitively to our hand-designed features in simulation. Our hardware experiments are evaluated on the ATRIAS robot, while simulation experiments are done for two robots – ATRIAS and a 7-link biped model. Our results show that these feature transforms capture important aspects of walking and accelerate learning on hardware and perturbed simulation, as compared to traditional Bayesian Optimization and other optimization methods.

Thesis Committee Members:
Christopher G. Atkeson, Chair
Hartmut Geyer
Oliver Kroemer
Stefan Schaal, MPI Tübingen and USC