Claw-Eval: Toward Trustworthy Evaluation of Autonomous Agents
Submitted by Lei Li