Naming Conventions
Session Identifiers
Sessions use the format: {OT}-{Manikin Type}-{Task}
- 10-1-5 → OT 10, Type 1, Task 5
- 4-0-2 → OT 4, Type 0, Task 2
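As a minimal sketch of how such identifiers could be split into their fields, the helper below parses the three-part format and also accepts the optional sub-part suffix (e.g. 4-0-2-1) that some sessions carry. The names `SessionId` and `parse_session_id` are hypothetical and not part of the dataset tooling.

```python
import re
from typing import NamedTuple, Optional


class SessionId(NamedTuple):
    """Hypothetical container for the fields of a session identifier."""
    ot: int
    manikin_type: int
    task: int
    part: Optional[int] = None  # sub-part suffix, e.g. the trailing 1 in 4-0-2-1


# {OT}-{Manikin Type}-{Task}, optionally followed by -{part}
_SESSION_RE = re.compile(r"^(\d+)-(\d+)-(\d+)(?:-(\d+))?$")


def parse_session_id(session: str) -> SessionId:
    """Split an identifier such as '10-1-5' into integer fields."""
    match = _SESSION_RE.match(session.strip())
    if match is None:
        raise ValueError(f"Not a valid session identifier: {session!r}")
    ot, manikin_type, task, part = match.groups()
    return SessionId(
        int(ot),
        int(manikin_type),
        int(task),
        int(part) if part is not None else None,
    )
```

For example, `parse_session_id("10-1-5")` yields OT 10, Type 1, Task 5 with no sub-part, while `parse_session_id("4-0-2-1")` additionally sets `part` to 1.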
Tasks
Tasks are numbered 1 to 15, each representing a different caregiving activity (see the overview page for the task list).
File Formats by Modality
RGB Images
- Format: JPEG files named {frame}_anonymized.jpg
- Content: Anonymized RGB images with faces blurred
- Location: RGB/OT{N}/Task{N}/Cam{N}/RGB/
Depth Maps
- Format: PNG files named {frame}.png
- Content: 16-bit depth maps corresponding to RGB frames
- Location: Depth/OT{N}/Task{N}/Cam{N}/Depth/
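Because depth maps correspond to RGB frames, the two can be paired by their shared {frame} token. The sketch below assumes that the token is identical in both file names (only the _anonymized suffix differs) and that the caller passes the matching RGB and Depth camera directories. `pair_rgb_depth` is a hypothetical helper name.

```python
from pathlib import Path
from typing import Dict, List, Tuple

RGB_SUFFIX = "_anonymized"  # suffix appended to the frame token in RGB file names


def pair_rgb_depth(rgb_dir: Path, depth_dir: Path) -> List[Tuple[Path, Path]]:
    """Return (rgb_path, depth_path) pairs for frames present in both folders."""
    # Index depth maps by frame token, i.e. the file name without ".png".
    depth_by_frame: Dict[str, Path] = {p.stem: p for p in depth_dir.glob("*.png")}

    pairs: List[Tuple[Path, Path]] = []
    for rgb_path in sorted(rgb_dir.glob(f"*{RGB_SUFFIX}.jpg")):
        # Strip the "_anonymized" suffix to recover the frame token.
        frame = rgb_path.stem[: -len(RGB_SUFFIX)]
        depth_path = depth_by_frame.get(frame)
        if depth_path is not None:  # skip RGB frames that have no depth map
            pairs.append((rgb_path, depth_path))
    return pairs
```

Frames that exist in only one of the two folders are skipped silently, which fits datasets where coverage is incomplete.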
Tactile Data
- Format: JSON files named p_{ot}-{type}-{task}.json
- Content: 44 pressure sensor readings per frame
- Location: Skin/p_ot_{N}/
Gaze Data
- Format: CSV files named {ot}-{type}-{task}_pupil.csv
- Content: 2D gaze coordinates and 3D gaze vectors with timestamps
- Location: Gaze/OT_{N}/
2D Pose Annotations
- Format: JSON files named pred.json
- Content: Keypoint coordinates for each frame
- Location: 2D\ Annotations/
3D Pose Tracking
- Format: JSON files named {ot}-{type}-{task}_pose3d.json
- Content: 3D pose tracking data for both manikin and caregiver
- Location: 3D\ Pose/OT{N}/
Note: Due to frequent and significant occlusions in real-world caregiving interactions, some 3D poses were obtained by annotating keypoints in 2D and triangulating them. The 2D labels are generally reliable, though the resulting 3D estimates may have modest noise and should be interpreted accordingly.
Action Labels
- Format: CSV files with temporal action annotations
- Content: Hierarchical labels (Task → Subtask → Component)
- Location: Action\ Labeling/OT{N}/
Third-Person Videos (GoPro)
- Format: MP4 video files
- Content: Third-person perspective camera footage
- Location: GoPro/OT{N}/
Egocentric Videos (Pupil Labs)
- Format: MP4 video files
- Content: First-person perspective from head-mounted camera
- Location: pupil/OT{N}/
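The naming patterns above can be turned into concrete paths. The sketch below covers only the modalities whose file names are fully templated by the session identifier (tactile, gaze, 3D pose). It assumes that {N} in each location is the OT number and that the backslash in "3D\ Pose" is a shell escape for a directory literally named "3D Pose". `session_files` and the `root` argument are hypothetical.

```python
from pathlib import Path
from typing import Dict


def session_files(root: Path, ot: int, manikin_type: int, task: int) -> Dict[str, Path]:
    """Build the expected per-session file paths under a dataset root."""
    sid = f"{ot}-{manikin_type}-{task}"  # e.g. "10-1-5"
    return {
        # Skin/p_ot_{N}/p_{ot}-{type}-{task}.json
        "tactile": root / "Skin" / f"p_ot_{ot}" / f"p_{sid}.json",
        # Gaze/OT_{N}/{ot}-{type}-{task}_pupil.csv
        "gaze": root / "Gaze" / f"OT_{ot}" / f"{sid}_pupil.csv",
        # 3D Pose/OT{N}/{ot}-{type}-{task}_pose3d.json (directory name is an assumption)
        "pose3d": root / "3D Pose" / f"OT{ot}" / f"{sid}_pose3d.json",
    }
```

Note that the folder prefixes differ between modalities (p_ot_{N}, OT_{N}, OT{N}), so building each path explicitly is less error-prone than sharing one template.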
Data Synchronization
Each modality has accompanying timestamp data that maps its frames to time, which enables synchronization across sensors.
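One common way to use such frame-to-time mappings is nearest-timestamp matching: for every frame of a reference modality, pick the frame of another modality whose timestamp is closest. The sketch below is an illustration under assumptions, not the dataset's official procedure. It assumes both timestamp lists are already loaded as numbers in the same unit and clock, and that `other` is sorted in ascending order. `nearest_indices` is a hypothetical name.

```python
from bisect import bisect_left
from typing import List, Sequence


def nearest_indices(reference: Sequence[float], other: Sequence[float]) -> List[int]:
    """For each reference timestamp, return the index of the closest entry in `other`.

    `other` must be sorted ascending; ties go to the earlier timestamp.
    """
    if len(other) == 0:
        raise ValueError("`other` must contain at least one timestamp")

    result: List[int] = []
    for t in reference:
        pos = bisect_left(other, t)  # first index with other[pos] >= t
        if pos == 0:
            result.append(0)
        elif pos == len(other):
            result.append(len(other) - 1)
        else:
            before, after = other[pos - 1], other[pos]
            result.append(pos - 1 if t - before <= after - t else pos)
    return result
```

In practice a maximum allowed time gap is often added so that frames without a close counterpart are dropped rather than matched to a distant one.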
Data Coverage
Important Notes
- Check for file existence before loading - not all sessions have all modalities
- Some sessions may have additional sub-parts (e.g., 4-0-2-1, 4-0-2-2)
- Preview videos are available for quick browsing without downloading full data
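Since not every session has every modality, a loader can filter its candidate paths before reading anything. The following is a minimal sketch; `existing_files` and the modality keys are hypothetical, and the candidate paths are whatever the caller has constructed for a session.

```python
from pathlib import Path
from typing import Dict


def existing_files(candidates: Dict[str, Path]) -> Dict[str, Path]:
    """Keep only the modalities whose file or directory is present on disk."""
    return {name: path for name, path in candidates.items() if path.exists()}
```

A caller would then iterate over the returned mapping only, so that missing modalities are skipped instead of raising a file-not-found error during loading.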