Tufa Labs Research Blog

AlphaWrite: Inference time compute Scaling for Writing

June 6, 2025

We introduce AlphaWrite, an inference-time scaling method for creative writing that uses evolutionary generation and ELO-based ranking to improve story quality.

Self Rewarding Self Improving

May 12, 2025

We demonstrate that large language models can autonomously improve by judging their own solutions without reference answers, creating a complete self-learning loop that enhances performance beyond existing benchmarks.

LLMs for Engineering: Teaching Models to Design High Powered Rockets

April 24, 2025

We demonstrate that while current SOTA language models struggle with iterative self-improvement in rocket engineering challenges, augmenting them with reinforcement learning unlocks superhuman design capabilities that could revolutionize physical engineering domains.

LADDER: Self-Improving LLMs Through Recursive Problem Decomposition

March 25, 2025

An in-depth analysis of a novel framework enabling language models to autonomously improve their problem-solving capabilities through recursive decomposition.

Don't Throw the Baby out with the Bathwater: How and why Deep Learning for ARC

March 25, 2025

A paper detailing our approach to the ARC Prize competition.

Text to RL: Extracting High-Quality RL Questions from text

March 5, 2025

Turning textbooks into RL questions.
