Resources for Measure what Matters: Psychometric Evaluation of AI with Situational Judgment Tests (https://arxiv.org/abs/2510.22170)
AI & ML interests
We work with you to develop a high-impact AI strategy for your industry, refine your data foundations, and design meaningful human-AI interactions. We also empower you to develop, integrate, and test the latest AI technologies responsibly.
models 16
thoughtworks/arithmetic-sorl
Updated • 2
thoughtworks/Qwen3-Coder-Next-Eagle3
Text Generation • 0.1B • Updated • 518 • 1
thoughtworks/GLM-4.7-FP8-Eagle3
Text Generation • 0.6B • Updated • 35 • 1
thoughtworks/MiniMax-M2.5-Eagle3
Text Generation • 0.2B • Updated • 4.42k • 4
thoughtworks/GLM-4.7-Flash-Eagle3
Text Generation • 0.1B • Updated • 646 • 1
thoughtworks/Gemma-4-31B-Eagle3
Text Generation • 0.6B • Updated • 1.24k • 4
thoughtworks/arithmetic-sorl-saes
Updated
thoughtworks/DeepSeek-R1-Distill-Qwen-14B-Eagle3
Text Generation • Updated • 64
thoughtworks/DeepSeek-R1-Distill-Qwen-7B-Eagle3
Text Generation • Updated • 114
thoughtworks/Qwen2.5-7B-Instruct-Eagle3
Text Generation • Updated • 247
datasets 13
thoughtworks/psychometric_personas_responses
Updated • 194 • 1
thoughtworks/psychometric_personas
Viewer • Updated • 23.8k • 100
thoughtworks/gemma_psychometrics_personas_responses
Viewer • Updated • 5.59M • 221 • 1
thoughtworks/ablation_psychometrics_personas
Viewer • Updated • 500 • 32
thoughtworks/psychometric_human_annotations
Viewer • Updated • 55 • 19
thoughtworks/parliamentary_personas
Viewer • Updated • 2.2k • 29
thoughtworks/psychometric_SJTs
Viewer • Updated • 4k • 36
thoughtworks/psychometric_sjts_analysis
Viewer • Updated • 2.07k • 153
thoughtworks/arithmetic-sorl-data
Viewer • Updated • 1.02M • 1.32k
thoughtworks/CulturalCounterfactuals
Updated • 6