CUADebug: Diagnosing and Repairing Computer-Use Agent Failures
Under review at EMNLP 2026
CUADebug studies failures in computer-use agents (CUAs) and introduces a framework for diagnosing and repairing them through a CUA-specific taxonomy, a benchmark, and a tool-augmented debugger.
Recommended citation: Weijia Zhang et al. (2026). "CUADebug: Diagnosing and Repairing Computer-Use Agent Failures." Under review at EMNLP 2026.
