To supervise and benchmark such fusion, we curate two mesh-annotated datasets: STCrowd-Mesh (~10k RGB+LiDAR pedestrian frames) and LLVIP-Mesh (~15k aligned RGB-IR pairs). On STCrowd-Mesh, adding LiDAR to RGB lowers MPJPE from 86.9 mm to 75.8 mm (–11.1 mm, about 12.8%) and PA-MPJPE from 63.1 mm to 57.8 mm (–5.3 mm), confirming LiDAR’s value for absolute spatial accuracy. In low-light LLVIP-Mesh scenes, fusing IR with RGB yields consistent but smaller gains, indicating that IR contributes complementary appearance cues.
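For readers unfamiliar with the metric, MPJPE is the mean Euclidean distance between predicted and ground-truth 3D joints; PA-MPJPE is the same quantity computed after a rigid Procrustes alignment of the prediction to the ground truth. The following is a minimal sketch of the unaligned metric only, not the evaluation code used here; the function name `mpjpe` and the toy joint coordinates are illustrative assumptions.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: equal-length sequences of (x, y, z) joint positions,
    expressed in the same units (e.g. millimetres).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and equally long")
    # Average the Euclidean distance between corresponding joints.
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)

# Toy example (hypothetical values): two joints, off by 3 mm and 4 mm.
pred = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
gt = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
error_mm = mpjpe(pred, gt)
```

Because PA-MPJPE removes global rotation, translation, and scale before measuring error, it reflects articulated pose quality, whereas MPJPE is also sensitive to how the body is placed in space.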
We run the trained model offline on recorded DARPA Triage Challenge field logs from a Spot robot. The extracted mesh trajectories feed a movement-energy heuristic that flags spontaneous limb motion and thereby estimates motor alertness, a primary determinant of triage priority. These initial trials demonstrate that dense, modality-flexible pose estimation can underpin stand-off motor-alertness assessment even when real-time compute is unavailable, laying the groundwork for fully autonomous triage in austere environments.
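One plausible form of such a movement-energy heuristic is sketched below. It is an illustration under stated assumptions, not the heuristic actually used: we assume joints are given root-relative in metres, define energy as the mean squared joint speed over a clip, and use a hypothetical threshold; the names `movement_energy` and `flags_motion` and the default `threshold` value are ours.

```python
import math

def movement_energy(frames, fps):
    """Mean squared joint speed (m^2/s^2) over a clip.

    frames: list of frames, each a list of (x, y, z) joint positions.
    Assumption: positions are root-relative, so whole-body translation
    and sensor motion do not register as limb motion.
    """
    if len(frames) < 2:
        return 0.0
    total = 0.0
    for prev, cur in zip(frames, frames[1:]):
        # Per-joint speed = displacement between consecutive frames * fps.
        sq_speeds = [(math.dist(p, c) * fps) ** 2 for p, c in zip(prev, cur)]
        total += sum(sq_speeds) / len(sq_speeds)
    return total / (len(frames) - 1)

def flags_motion(frames, fps, threshold=0.01):
    """Flag a clip as containing limb motion. The threshold is hypothetical."""
    return movement_energy(frames, fps) > threshold

# Toy clips with two joints: one fully static, one with a moving joint.
static_clip = [[(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]] * 3
moving_clip = [
    [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (0.5, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (0.5, 0.0, 0.0)],
]
static_flag = flags_motion(static_clip, fps=10)
moving_flag = flags_motion(moving_clip, fps=10)
```

In practice the threshold would need calibration against annotated clips, and restricting the sum to limb joints or smoothing the trajectories first may reduce false flags from mesh jitter.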
