Visual assistants that can guide humans through complex tasks in physical environments have significant potential, yet their development is hindered by the high cost of human-in-the-loop data collection. We present BASIS (Bootstrapping Assistant modeling with Situated Interaction Simulation), a novel framework that fundamentally rethinks how visual assistants are developed and evaluated. Rather than relying on expensive human data collection, BASIS leverages simulation to bootstrap capable assistants through three interconnected stages: (1) Situated Interaction Simulation generates high-quality synthetic data through interactions between oracle assistants and simulated users; (2) Autonomous Model Development trains and continuously evaluates assistant models using this synthetic data; and (3) Real-User Validation verifies effectiveness with human users. We implement BASIS in Alexa Arena and demonstrate that our best model, despite being fine-tuned solely on synthetic data and operating under realistic perception conditions, enables real human users to achieve a 72.9% success rate, approaching the 88.6% success rate of an oracle assistant with privileged access to perfect perception. Through detailed error analysis, we identify object identification as the primary bottleneck for current visual assistants. Our approach bridges the gap between interaction simulation and real human-AI collaboration, establishing a scalable pipeline for developing assistants that can effectively guide users through complex tasks. Project website: https://colm2025-329.github.io/