GUIAgentDebugger
Published:
GUIAgentDebugger is a self-evolving VLM-agent debugging framework for diagnosing GUI-agent failures and improving future rollouts.
Key contributions include:
- Designed a GUI-agent error taxonomy with 4 major categories and 29 subtypes, covering perception, interaction localization, task reasoning, and external system failures across mainstream OSWorld/CUA scenarios.
- Built a framework that identifies root causes from failed trajectories, distills them into reusable debugging skills, and lets agents learn from historical failures.
- Designed a dual-layer memory architecture with episodic and semantic memory.
- Added intent-aware retrieval (RAG) that injects skills from similar-intent trajectories to improve diagnosis and re-rollout accuracy.
