REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding

*Equal contribution
Cornell University

Abstract: This paper introduces REPeat, a novel Real2Sim2Real framework aimed at improving bite acquisition in robot-assisted feeding for individuals on soft diets, a critical need highlighted by the high prevalence of dysphagia in conditions such as ALS, Parkinson's disease, stroke, and multiple sclerosis. Utilizing monocular depth estimation, REPeat transforms real-world RGB images into detailed 3D models to simulate the physical dynamics of various food types, from Newtonian fluids to non-Newtonian substances and granular solids. By leveraging the Material Point Method (MPM) for accurate food physics modeling, the framework allows the exploration of pre-acquisition strategies, extending the adaptability to the rheological complexity of soft diets. The Sim2Real step employs ControlNet to generate realistic images of the simulated plates for evaluating bite-acquisition strategies. Tested across 15 diverse food plate scenarios, REPeat shows success rate improvements in bite acquisition for most plates.

Overview of REPeat

The process begins with SPANet-soft (Sec. x) giving an initial estimate of the bite acquisition success rate. If this estimate is higher than a threshold, the robot performs direct bite acquisition. Otherwise, it enters the Real2Sim2Real loop, which consists of:

  1. Real2Sim: Reconstructing the 3D mesh in real time with estimated depth as input.
  2. Simulation: Rolling out various pre-acquisition actions using high-fidelity MPM simulation.
  3. Sim2Real: Rendering a visually realistic picture based on the simulation result.

SPANet-soft then evaluates the rendered result, and its predicted success rate is compared with that of directly picking up the food item without pre-acquisition. If the pre-acquisition action improves the bite acquisition success rate, the robot performs the pre-acquisition action first, followed by the bite acquisition action.
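The decision logic described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `choose_plan`, `predict_success`, `simulate_and_render`, the `Plan` type, and the default threshold value are all hypothetical names and values introduced here. `predict_success` stands in for SPANet-soft, and `simulate_and_render` stands in for the combined Real2Sim, simulation, and Sim2Real steps.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class Plan:
    pre_action: Optional[str]   # None means "acquire the bite directly"
    predicted_success: float    # success estimate for the chosen plan


def choose_plan(
    image: Any,
    candidate_actions: Sequence[str],
    predict_success: Callable[[Any], float],        # stand-in for SPANet-soft (assumption)
    simulate_and_render: Callable[[Any, str], Any],  # stand-in for Real2Sim + MPM + Sim2Real (assumption)
    threshold: float = 0.5,                          # hypothetical value; the page does not state one
) -> Plan:
    # Initial estimate on the real image.
    baseline = predict_success(image)
    if baseline > threshold:
        # Confident enough: skip the Real2Sim2Real loop entirely.
        return Plan(pre_action=None, predicted_success=baseline)

    # Otherwise roll out each candidate pre-acquisition action in simulation,
    # render the outcome, and score the rendered image.
    best = Plan(pre_action=None, predicted_success=baseline)
    for action in candidate_actions:
        rendered = simulate_and_render(image, action)
        score = predict_success(rendered)
        # Keep a pre-acquisition action only if it beats the current best plan.
        if score > best.predicted_success:
            best = Plan(pre_action=action, predicted_success=score)
    return best
```

If no candidate improves on the baseline, the returned plan has `pre_action=None`, which corresponds to acquiring the bite without pre-acquisition.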

SPANet-soft

Evaluation

We select 10 types of food representing a diverse range of rheological properties, covering the extremes of five properties: elasticity, plasticity, viscosity, texture, and shape.

Food Taxonomy

The REPeat system is evaluated on the following 15 plates containing 10 types of food items.

Plate Arrangements

An Example Robot Execution Sequence

Results

Results indicate that pre-acquisition actions, on average, enhance the bite acquisition success rate by 27%.

Hardware Setup

Our setup includes a robot holding a feeding utensil and a non-slip plate. We use a camera for perception and a force/torque (F/T) sensor to determine when the pushing action terminates. The setup is not tied to a specific robot embodiment, and it adapts to different cameras, whether mounted on a frame or on the wrist. The robot grasps a feeding utensil with 2 DoFs, (a) a scoop-like motion and (b) a twirl-like motion, which allow easier bite acquisition.
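One simple way an F/T reading could end a pushing action is a threshold on the force magnitude. The page does not say which rule is actually used, so the sketch below is purely an assumption: the function name `first_contact_index`, the `(fx, fy, fz)` sample format, and the force limit are all hypothetical.

```python
import math
from typing import Iterable, Optional, Tuple


def first_contact_index(
    forces: Iterable[Tuple[float, float, float]],  # (fx, fy, fz) samples in newtons (assumed format)
    force_limit_n: float,                          # hypothetical termination threshold in newtons
) -> Optional[int]:
    """Return the index of the first sample whose force magnitude reaches
    the limit, or None if the limit is never reached."""
    for i, (fx, fy, fz) in enumerate(forces):
        magnitude = math.sqrt(fx * fx + fy * fy + fz * fz)
        if magnitude >= force_limit_n:
            return i
    return None
```

In a control loop, the push would stop at the returned sample index. A real system would likely also filter sensor noise and subtract the utensil's weight, which this sketch omits.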

Baseline vs. REPeat

BibTeX

@inproceedings{ha2024repeat,
 author    = {Ha, Nayoung and Ye, Ruolin and Liu, Ziang and Sinha, Shubhangi and Bhattacharjee, Tapomayukh},
 title     = {REPeat: A Real2Sim2Real Approach for Pre-acquisition of Soft Food Items in Robot-assisted Feeding},
 booktitle = {IROS},
 year      = {2024},
}