Abstract:
Humans are incredibly dexterous. We interact with and manipulate tools effortlessly, leveraging touch without giving it a second thought. Yet, replicating this level of dexterity in robots is a major challenge. While the robotics community, recognizing the importance of touch in fine manipulation, has developed a wide variety of tactile sensors, how best to leverage these sensors for both perception and manipulation remains unclear.
In this thesis, we seek to address how to efficiently integrate tactile sensing for both perception and dexterous manipulation. Specifically, we turn to self-supervised learning (SSL) to train general-purpose tactile representations that generalize across multiple tactile sensors, standardize usage across multiple downstream tactile tasks, and alleviate the need to collect labeled task-specific data, which is not only difficult to obtain for many downstream tasks but often impractical, as in uncalibrated force field estimation.
First, we discuss Sparsh, a family of SSL models for vision-based tactile sensors, trained on 460k+ tactile images with masking and self-distillation in pixel and latent spaces. Then we discuss Sparsh-skin, a tactile representation model for magnetic skin-like sensors, trained via self-distillation, which enables seamless use of full-hand tactile sensing in downstream tasks. In our evaluations, both Sparsh and Sparsh-skin not only outperform task- and sensor-specific end-to-end models by a large margin (~95.1% and ~56.4%, respectively), but are also data efficient for downstream task training.
Then, we note that existing work in tactile sensing often overlooks the multimodal aspects of human touch, such as vibration and heat perception. To this end, as proposed work we discuss Sparsh-X, which aims to learn a compact tactile representation by fusing image, pressure, audio, and inertial measurements from the DIGIT360 sensor. Finally, we propose novel real-world tactile adaptation methods that leverage these representations to learn robot policies for highly dexterous atomic skills, such as object manipulation with pinch-to-power grasps using full-hand tactile sensing.
Thesis Committee Members:
Michael Kaess (Chair)
Shubham Tulsiani
Guanya Shi
Mustafa Mukadam (Amazon Robotics)
Jitendra Malik (UC Berkeley & Meta)
