Add evaluation results for HLE, GPQA, AIME, HMMT, SWE-Bench, and Terminal-Bench (#4) d9cb81b bigeagle SaylorTwift HF Staff committed about 5 hours ago