Auto-Research Loop¶
A named automation pattern Shopify ships as Tangent, built on top of Tangle: given a pipeline with a measurable metric, an LLM-driven loop iterates over code, hyperparameters, prompts, and infrastructure choices, running real experiments until the metric improves. Credited with Shopify's search-throughput jump (800 → 4,200 QPS) and with distillations into liquid-ai models for narrow tasks.
What it is / isn't¶
Not just hyperparameter search. The loop can rewrite code (pipeline logic, kernel choices, prompts) and read its own experiment logs. Key enablers: reproducible workflows (Tangle), cheap experiment cloning, and a well-posed reward — usually an existing production metric. Limits (per Parakhin):
- Needs a well-defined reward signal — "taste" problems stay with humans
- Needs a cheap simulator for the inner loop; expensive sims bottleneck progress
- Doesn't replace researcher intuition; democratizes it by letting non-researchers run experiments
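The mechanics above can be sketched as a minimal propose-run-compare loop. Everything here is hypothetical: `run_experiment` stands in for a real (cheap) pipeline run, and `propose_change` stands in for the LLM step, which in the real pattern can rewrite code and prompts rather than just nudge numbers.

```python
import random

def run_experiment(config: dict) -> float:
    """Hypothetical stand-in for a real pipeline run; returns the metric (e.g. QPS).
    Toy surrogate: metric peaks when batch_size is near 64, with some noise."""
    return 1000.0 - abs(config["batch_size"] - 64) * 10 + random.uniform(-5, 5)

def propose_change(config: dict, history: list) -> dict:
    """Hypothetical stand-in for the LLM step: read the experiment log (history)
    and propose an edit. Here it only perturbs one hyperparameter."""
    candidate = dict(config)
    candidate["batch_size"] = config["batch_size"] + random.choice([-16, -8, 8, 16])
    return candidate

def auto_research_loop(config: dict, budget: int) -> tuple[dict, float]:
    best_metric = run_experiment(config)
    history = [(config, best_metric)]        # logs the loop can read back
    for _ in range(budget):
        candidate = propose_change(config, history)
        metric = run_experiment(candidate)   # real experiment, not a guess
        history.append((candidate, metric))
        if metric > best_metric:             # keep only verified improvements
            config, best_metric = candidate, metric
    return config, best_metric
```

The well-posed reward and the cheap inner-loop evaluation are exactly the two limits listed above: if `run_experiment` is expensive or the metric is "taste", the loop stalls.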
Relation to other concepts¶
- tokens-need-critique-loop — auto-research is a critique loop elevated to pipeline scope
- pipeline-as-specification / pipeline-as-verifier — the pipeline and its metric are the spec the loop optimizes
- simgym supplies the cheap simulator that Tangent needs for customer-facing code paths