TouchSDF: A DeepSDF Approach for 3D Shape Reconstruction Using Vision-Based Tactile Sensing

Mauro Comi1, Yijiong Lin1,2, Alex Church1, Alessio Tonioni3, Laurence Aitchison1, Nathan Lepora1,2
1University of Bristol, 2Bristol Robotics Laboratory, 3Google Zürich
TouchSDF helps robots understand and reconstruct 3D shapes using their sense of touch, both in simulation and in the real world.
[Figure] Overview of TouchSDF: (1) A robot samples the object's surface to obtain real tactile images (marker patterns), which are translated into simulated tactile images (depth maps). (2) A convolutional neural network (CNN) maps each simulated image to a set of 3D points representing the local object surface at the touch location. (3) A pre-trained DeepSDF model predicts a continuous signed distance function (SDF) representing the object shape from the point clouds collected over multiple contacts.
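
To make the pipeline concrete, a rough PyTorch sketch of stages (2) and (3) is shown below. The layer sizes, point count, and latent dimension are illustrative placeholders, not the architecture used in the paper:

```python
import torch
import torch.nn as nn

class TouchToPoints(nn.Module):
    """Hypothetical CNN for stage (2): a 1-channel tactile depth map -> N surface points."""
    def __init__(self, num_points=128):
        super().__init__()
        self.num_points = num_points
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_points * 3),
        )

    def forward(self, depth):                       # depth: (B, 1, H, W)
        return self.encoder(depth).view(-1, self.num_points, 3)

class DeepSDFDecoder(nn.Module):
    """Minimal DeepSDF-style decoder for stage (3): (latent code, xyz) -> signed distance."""
    def __init__(self, latent_dim=256, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),        # Tanh bounds the (clamped) SDF output
        )

    def forward(self, latent, xyz):                 # latent: (B, D), xyz: (B, 3)
        return self.net(torch.cat([latent, xyz], dim=-1))
```

In the DeepSDF framework the decoder is pre-trained on a collection of shapes and then kept frozen, so at test time only a per-object latent code has to be inferred.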

Abstract

Humans rely on their visual and tactile senses to develop a comprehensive 3D understanding of their physical environment. Recently, there has been a growing interest in exploring and manipulating objects using data-driven approaches that utilise high-resolution vision-based tactile sensors. However, 3D shape reconstruction using tactile sensing has lagged behind visual shape reconstruction because of limitations in existing techniques: the inability to generalise to unseen shapes, the absence of real-world testing, and the limited expressive capacity imposed by discrete representations. To address these challenges, we propose TouchSDF, a Deep Learning approach for tactile 3D shape reconstruction that leverages the rich information provided by a vision-based tactile sensor and the expressivity of the implicit neural representation DeepSDF. This combination allows TouchSDF to reconstruct smooth and continuous 3D shapes from tactile inputs in simulation and real-world settings, opening up research avenues for robust 3D-aware representations and improved multimodal perception for robot manipulation.
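
DeepSDF recovers a shape at test time by optimising a latent code so that the predicted signed distance is zero at observed surface points; in TouchSDF those points come from the tactile CNN. A minimal sketch of this auto-decoder-style inference, where the step count, learning rate, and prior weight are illustrative assumptions rather than the paper's settings:

```python
import torch

def fit_latent(decoder, surface_points, latent_dim=256, steps=500, lr=5e-3):
    """Infer a shape code by test-time optimisation (DeepSDF auto-decoder style).

    surface_points: (N, 3) points predicted from the tactile images; their
    signed distance should be ~0, so we minimise |SDF| there plus a latent prior.
    """
    latent = torch.zeros(1, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([latent], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        codes = latent.expand(surface_points.shape[0], -1)
        sdf = decoder(codes, surface_points)
        loss = sdf.abs().mean() + 1e-4 * latent.pow(2).sum()  # zero level set + prior
        loss.backward()
        opt.step()
    return latent.detach()
```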

Tactile data to contact geometry

[Figure: tactile images and the corresponding predicted local contact geometries at the touch locations]
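
In TouchSDF this mapping from tactile image to local geometry is learned by a CNN. For intuition, the classical geometric alternative is to unproject a simulated depth map through a pinhole camera model and transform it by the known sensor pose; a sketch assuming metric depth and hypothetical intrinsics fx, fy, cx, cy:

```python
import numpy as np

def depth_to_pointcloud(depth, fx, fy, cx, cy, T_world_cam):
    """Unproject a tactile depth map into world-frame 3D points.

    depth: (H, W) metric depth; fx, fy, cx, cy: pinhole intrinsics;
    T_world_cam: (4, 4) pose of the tactile camera at the touch location.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=-1).reshape(-1, 4)
    pts_cam = pts_cam[z.reshape(-1) > 0]            # assume zero depth marks no contact
    pts_world = (T_world_cam @ pts_cam.T).T[:, :3]  # camera frame -> world frame
    return pts_world
```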

Comparison with prior work

We compared our method with the touch-only reconstruction approach of Smith et al., using Earth Mover's Distance (EMD), Chamfer Distance (CD), and Surface Reconstruction Error as metrics. TouchSDF achieves better EMD and Surface Reconstruction Error, while scoring slightly worse on CD despite the better visual quality of its reconstructions.
[Figure: quantitative comparison with the touch-only baseline of Smith et al.]
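
For reference, sketches of the two point-cloud metrics follow, using common definitions with SciPy; the exact variants in the paper (e.g. squared versus unsquared distances, normalisation) may differ:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def chamfer_distance(a, b):
    """Symmetric Chamfer Distance between point sets a: (N, 3) and b: (M, 3)."""
    d = cdist(a, b)                        # (N, M) pairwise Euclidean distances
    return d.min(axis=1).mean() + d.min(axis=0).mean()

def earth_movers_distance(a, b):
    """EMD via optimal one-to-one matching (assumes equally sized point sets)."""
    d = cdist(a, b)
    rows, cols = linear_sum_assignment(d)  # Hungarian matching
    return d[rows, cols].mean()
```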

Reconstructions: Simulation

You can interact with the ground truth and reconstructed meshes below to assess the quality of our 3D shape reconstruction approach.

[Interactive 3D viewer: ground-truth mesh and TouchSDF reconstruction]
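
The meshes in the viewer are extracted from the predicted SDF. A standard recipe, sketched here assuming the decoder above plus the scikit-image and trimesh libraries, is to sample the SDF on a dense grid and run marching cubes on its zero level set:

```python
import torch
import trimesh
from skimage import measure

def extract_mesh(decoder, latent, grid_res=64, bound=1.0):
    """Sample the predicted SDF on a dense grid and extract its zero level set."""
    xs = torch.linspace(-bound, bound, grid_res)
    grid = torch.stack(torch.meshgrid(xs, xs, xs, indexing="ij"), dim=-1).reshape(-1, 3)
    with torch.no_grad():
        sdf = decoder(latent.expand(grid.shape[0], -1), grid)
    sdf = sdf.reshape(grid_res, grid_res, grid_res).numpy()
    verts, faces, _, _ = measure.marching_cubes(sdf, level=0.0)
    verts = verts / (grid_res - 1) * 2 * bound - bound   # grid indices -> coordinates
    return trimesh.Trimesh(verts, faces)
```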

Reconstructions: Real world

Our method successfully reconstructed real 3D-printed objects as well as everyday objects (a mug and a transparent jar), achieving EMD values comparable to those obtained in simulation.
[Figure: real-world reconstructions of 3D-printed and everyday objects]