However, extending generative modeling to touch presents unique challenges: tactile data is scarce, noisy, and expensive to collect, and there is no large-scale paired dataset linking visual appearance with tactile response. To address these challenges, my research explores three synergistic directions.
Part I: I introduce controllable visual-tactile synthesis models that jointly generate aligned visual and tactile textures from shared latent representations, enabling explicit control over appearance and feel.
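To make the Part I idea concrete, below is a minimal sketch, assuming a PyTorch-style generator in which a single latent code and a control vector are decoded by two heads into aligned visual and tactile textures. All module names, dimensions, and the control-vector interface are illustrative assumptions, not the actual Part I architecture.

```python
# Illustrative sketch: one shared latent code decoded into aligned visual and
# tactile textures, with an explicit control vector. Names and sizes are
# assumptions for illustration only.
import torch
import torch.nn as nn

class VisuoTactileGenerator(nn.Module):
    def __init__(self, latent_dim=256, control_dim=16):
        super().__init__()
        # Shared backbone maps (latent code, control vector) to a feature map.
        self.backbone = nn.Sequential(
            nn.Linear(latent_dim + control_dim, 512 * 4 * 4),
            nn.Unflatten(1, (512, 4, 4)),
            nn.ConvTranspose2d(512, 256, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.ReLU(),
        )
        # Two heads decode the same features into aligned modalities.
        self.visual_head = nn.ConvTranspose2d(128, 3, 4, stride=2, padding=1)   # RGB texture
        self.tactile_head = nn.ConvTranspose2d(128, 1, 4, stride=2, padding=1)  # e.g. height/pressure map

    def forward(self, z, control):
        feats = self.backbone(torch.cat([z, control], dim=1))
        return torch.tanh(self.visual_head(feats)), torch.tanh(self.tactile_head(feats))

# Usage: sample a paired (visual, tactile) texture from one shared latent code.
gen = VisuoTactileGenerator()
z = torch.randn(1, 256)          # shared latent representation
control = torch.zeros(1, 16)     # explicit control over appearance and feel
rgb, tactile = gen(z, control)   # 32x32 aligned outputs in this toy setup
```

Because both heads read the same features, edits to the latent or control vector move appearance and feel together, which is the alignment property the sentence above refers to.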
Part II: I propose tactile-aware 3D generation frameworks that integrate tactile sensing into 3D diffusion pipelines, allowing models to infer physically grounded material properties from visual cues and geometry.
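The following sketch illustrates the conditioning interface implied by Part II: a diffusion denoiser that sees visual and geometric cues alongside a tactile embedding while denoising a material latent. The simple MLP denoiser, the embedding names, and the dimensions are assumptions for illustration, not the actual Part II pipeline.

```python
# Illustrative sketch: tactile-aware conditioning of a diffusion noise
# predictor over a 3D/material latent. Architecture details are assumptions.
import torch
import torch.nn as nn

class TactileConditionedDenoiser(nn.Module):
    def __init__(self, latent_dim=64, cond_dim=128):
        super().__init__()
        # Fuse visual, geometry, and tactile cues into one conditioning vector.
        self.cond_proj = nn.Linear(3 * cond_dim, cond_dim)
        self.net = nn.Sequential(
            nn.Linear(latent_dim + cond_dim + 1, 256), nn.SiLU(),
            nn.Linear(256, 256), nn.SiLU(),
            nn.Linear(256, latent_dim),  # predicted noise on the material latent
        )

    def forward(self, x_t, t, visual_feat, geom_feat, tactile_feat):
        cond = self.cond_proj(torch.cat([visual_feat, geom_feat, tactile_feat], dim=-1))
        return self.net(torch.cat([x_t, cond, t[:, None].float()], dim=-1))

# One training step: standard epsilon-prediction loss, with tactile conditioning.
model = TactileConditionedDenoiser()
x0 = torch.randn(8, 64)                         # clean material/3D latent
noise = torch.randn_like(x0)
t = torch.randint(0, 1000, (8,))
alpha_bar = torch.rand(8, 1)                    # stand-in for a real noise schedule
x_t = alpha_bar.sqrt() * x0 + (1 - alpha_bar).sqrt() * noise
cond = [torch.randn(8, 128) for _ in range(3)]  # visual, geometry, tactile embeddings
loss = nn.functional.mse_loss(model(x_t, t, *cond), noise)
```

At inference time the tactile embedding can be dropped or predicted from the visual and geometric cues, which is what allows the model to infer physically grounded material properties when only images and geometry are available.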
Part III: Building on these insights, I aim to develop scalable multimodal generation systems that leverage large vision and language foundation models and physics priors to synthesize novel materials directly from text or image input, without relying on extensive paired tactile data.
