CUADebug: Diagnosing and Repairing Computer-Use Agent Failures

Under review at EMNLP 2026

CUADebug studies failures in computer-use agents (CUAs) and introduces a framework for diagnosing and repairing them, built on a CUA-specific taxonomy, a benchmark, and a tool-augmented debugger.

Recommended citation: Weijia Zhang et al. (2026). "CUADebug: Diagnosing and Repairing Computer-Use Agent Failures." Under review at EMNLP 2026.