Aamer Mihaysi

O96a

https://www.mehaisi.com/

AI & ML interests

Ethical AI, NLP & Cognitive architectures

Recent Activity

updated a Space about 5 hours ago

O96a/lenvm-token-control-demo

published a Space about 5 hours ago

O96a/lenvm-token-control-demo

updated a Space 1 day ago

O96a/lost-in-thought-benchmark

Organizations

O96a's Spaces 39

LenVM Token-Level Length Control Demo

📏

Lost-in-Thought Benchmark

🧠

Run a benchmark to see how reasoning steps affect retrieval accuracy

Sudanese Dialect Mt Stress

🏃

Master Key Capability Demo

🔑

Show expected accuracy boost for a math problem via steering

Explore world model levels, laws, and rollouts interactively

COSPLAY Skill Bank Demo

🚀

Generate baseline vs skill‑augmented LLM answer

COSPLAY Skill Bank Demo

🚀

COMPASS-Inspired Semantic Sampling for Sudanese Arabic Dialect Understanding

🎯

Number Periodicity Demo

📊

Number Periodicity Demo

📊

Number Representation Periodicity Visualizer

📊

CoT Spatial Reasoning Degradation

🧠

Show how step-by-step prompts affect visual puzzle answers

CoT Spatial Reasoning Degradation

📉

Generate spatial puzzles and compare direct vs CoT reasoning

Weak Supervision Reasoning Explorer

🔬

Explore reasoning performance under weak supervision

W-RAC Chunking Explorer

🧩

Compare text chunking methods and view cost savings

Sudanese Arabic Navigable RAG Demo

🧭

Compare Sudanese Arabic phrase retrieval methods

Interleaved Retrieval-Reasoning Benchmark

🔄

Compare Standard vs Interleaved RAG with simulated benchmarks

Agent Architecture Visualizer

🔄

Simulate and visualize AI agent loops with permissions

TESSY Reasoning Demo - Sudanese Arabic

🧠

Analyze Sudanese Arabic samples with standard vs TESSY reasoning

Sudanese Arabic SWE-AGILE Reasoning Benchmark

🧠

Run Sudanese Arabic reasoning benchmark with context strategies

AutoResearchBench Explorer

AutoResearchBench Explorer

OneManCompany Talent Market Explorer

OneManCompany Talent Market Explorer

Agentic World Model Explorer