Tufa Labs Research Blog
Latest Posts
Self Rewarding Self Improving
We demonstrate that large language models can autonomously improve by judging their own solutions without reference answers, creating a complete self-learning loop that enhances performance beyond existing benchmarks.
LLMs for Engineering: Teaching Models to Design High Powered Rockets
We demonstrate that while current SOTA language models struggle with iterative self-improvement in rocket engineering challenges, augmenting them with reinforcement learning unlocks superhuman design capabilities that could revolutionize physical engineering domains.
LADDER: Self-Improving LLMs Through Recursive Problem Decomposition
An in-depth analysis of a novel framework enabling language models to autonomously improve their problem-solving capabilities through recursive decomposition.
Don't Throw the Baby out with the Bathwater: How and why Deep Learning for ARC
A paper detailing our approach to the ARC Prize competition.
Text to RL: Extracting High-Quality RL Questions from text
Turning textbooks into RL questions.