Research
Latest Posts
- 1st Place in the ARC-AGI-3 Preview Competition
We present our winning solution for the ARC-AGI-3 Agent Preview Competition.
- AlphaWrite: Inference-Time Compute Scaling for Writing
We introduce AlphaWrite, an inference-time scaling method for creative writing that uses evolutionary generation and Elo-based ranking to improve story quality.
- Self-Rewarding, Self-Improving
We demonstrate that large language models can autonomously improve by judging their own solutions without reference answers, creating a complete self-learning loop that enhances performance beyond existing benchmarks.
- LLMs for Engineering: Teaching Models to Design High-Powered Rockets
We demonstrate that while current SOTA language models struggle with iterative self-improvement in rocket engineering challenges, augmenting them with reinforcement learning unlocks superhuman design capabilities that could revolutionize physical engineering domains.
- Text to RL: Extracting High-Quality RL Questions from Text
Turning textbooks into RL questions.