Sleeping | Agents | Lost-in-Thought Benchmark: Run a benchmark to see how reasoning steps affect retrieval accuracy
Sleeping | Agents | Master Key Capability Demo: Show expected accuracy boost for a math problem via steering
Running | Agents | Agentic World Model Explorer: Explore world model levels, laws, and rollouts interactively