Abstract: This paper introduces REPeat, a novel Real2Sim2Real framework aimed at improving bite acquisition in robot-assisted feeding for individuals on soft diets, a critical need highlighted by the high prevalence of dysphagia in conditions such as ALS, Parkinson's disease, stroke, and multiple sclerosis. Using monocular depth estimation, REPeat transforms real-world RGB images into detailed 3D models to simulate the physical dynamics of various food types, from Newtonian fluids to non-Newtonian substances and granular solids. By leveraging the Material Point Method (MPM) for accurate food physics modeling, the framework allows the exploration of pre-acquisition strategies, extending its adaptability to the rheological complexity of soft diets. The Sim2Real step employs ControlNet to generate realistic images of the simulated plates for evaluating bite-acquisition strategies. Tested across 15 diverse food plate scenarios, REPeat improves the bite acquisition success rate for most plates.
The process begins with SPANet-soft (Sec. x) giving an initial estimate of the bite acquisition success rate. If this estimate is higher than a threshold, the robot performs direct bite acquisition. Otherwise, it enters the Real2Sim2Real loop, which consists of:
SPANet-soft evaluates the result and compares it with the success rate of directly picking up food items without pre-acquisition. If the pre-acquisition action improves the bite acquisition success rate, the robot performs the pre-acquisition action first, followed by the bite acquisition action.
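The decision flow described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function `choose_strategy`, the `Decision` type, the callback `simulate_and_score` (standing in for the whole Real2Sim2Real loop plus SPANet-soft scoring), the idea of iterating over several candidate actions, and the default threshold value of 0.5 are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Decision:
    """Which pre-acquisition action (if any) to run before bite acquisition."""
    pre_action: Optional[str]   # None means: acquire the bite directly
    expected_success: float     # estimated success rate of the chosen plan


def choose_strategy(
    initial_success: float,
    candidate_actions: Sequence[str],
    simulate_and_score: Callable[[str], float],
    threshold: float = 0.5,  # hypothetical value; the source does not state it
) -> Decision:
    # Direct bite acquisition when the initial estimate is higher than the threshold.
    if initial_success > threshold:
        return Decision(None, initial_success)

    # Otherwise, score each candidate pre-acquisition action in simulation
    # (hypothetical callback) and keep it only if it beats direct acquisition.
    best_action: Optional[str] = None
    best_score = initial_success
    for action in candidate_actions:
        score = simulate_and_score(action)
        if score > best_score:
            best_action, best_score = action, score
    return Decision(best_action, best_score)
```

For instance, with an initial estimate of 0.2 and simulated scores of 0.6 for a hypothetical "push" action and 0.3 for "gather", the sketch returns "push" as the pre-acquisition action; if no candidate beats the initial estimate, it falls back to direct acquisition.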
We select 10 types of food representing a diverse range of rheological properties, covering the extremes of five properties: elasticity, plasticity, viscosity, texture, and shape.
The REPeat system is evaluated on the following 15 plates containing 10 types of food items.
Results indicate that pre-acquisition actions, on average, enhance the bite acquisition success rate by 27%.
Our setup includes a robot holding a feeding utensil and a non-slip plate. We use a camera for perception and a force/torque (F/T) sensor to determine when the pushing action should terminate. The setup is not tied to a specific robot embodiment, and it adapts to different cameras, whether mounted on a frame or on the wrist. The feeding utensil grasped by the robot has 2 DoFs, (a) a scoop-like motion and (b) a twirl-like motion, which allow easier bite acquisition.
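As a rough illustration of how an F/T reading could terminate a pushing action, the sketch below stops at the first sample whose force magnitude reaches a limit. The source does not describe the actual termination criterion, so the magnitude threshold, the function name `first_termination_index`, and the default limit of 5.0 N are assumptions made purely for this example.

```python
import math
from typing import Iterable, Optional, Tuple


def first_termination_index(
    force_readings: Iterable[Tuple[float, float, float]],
    force_limit_n: float = 5.0,  # hypothetical limit in newtons
) -> Optional[int]:
    """Return the index of the first (fx, fy, fz) sample whose magnitude
    reaches force_limit_n, or None if the limit is never reached."""
    for index, (fx, fy, fz) in enumerate(force_readings):
        magnitude = math.sqrt(fx * fx + fy * fy + fz * fz)
        if magnitude >= force_limit_n:
            return index
    return None
```

In a real controller the readings would arrive as a stream from the sensor and the push would be halted as soon as the condition is met; a list of recorded samples is used here only to keep the example self-contained.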
@inproceedings{ha2024repeat,
author = {Ha, Nayoung and Ye, Ruolin and Liu, Ziang and Sinha, Shubhangi and Bhattacharjee, Tapomayukh},
title = {REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding},
booktitle = {IROS},
year = {2024},
}