Auto-Research Loop

A named automation pattern Shopify ships as Tangent, layered on top of Tangle: given a pipeline with a measurable metric, an LLM-driven loop iterates over code, hyperparameters, prompts, and infrastructure choices — running real experiments until the metric improves. Credited with Shopify's search throughput jump (800 → 4,200 QPS) and with distillations into liquid-ai models for narrow tasks.

What it is / isn't

Not just hyperparameter search. The loop can rewrite code (pipeline logic, kernel choices, prompts) and read its own experiment logs. Key enablers: reproducible workflows (Tangle), cheap experiment cloning, and a well-posed reward — usually an existing production metric. Limits (per Parakhin):

  • Needs a well-defined reward signal — "taste" problems stay with humans
  • Needs a cheap simulator for the inner loop; expensive sims bottleneck progress
  • Doesn't replace researcher intuition; democratizes it by letting non-researchers run experiments
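The structure described above — an LLM proposes edits, a cheap reproducible pipeline scores them against a well-posed reward, and only improvements are kept — can be sketched as a greedy outer loop. This is a minimal illustration, not Shopify's implementation; `run_experiment` and `propose_change` are hypothetical stand-ins for the Tangle-backed pipeline and the LLM proposer respectively:

```python
def auto_research_loop(baseline_config, run_experiment, propose_change, budget=20):
    """Greedy metric-hill-climbing loop (sketch).

    run_experiment(config) -> float   # cheap, reproducible; higher is better
    propose_change(best, history)     # stand-in for the LLM proposing
                                      # code/hyperparameter/prompt edits
    """
    best_config = baseline_config
    best_metric = run_experiment(best_config)
    history = [(best_config, best_metric)]  # the loop reads its own logs
    for _ in range(budget):
        candidate = propose_change(best_config, history)
        metric = run_experiment(candidate)  # real experiment, real reward
        history.append((candidate, metric))
        if metric > best_metric:            # well-posed reward, e.g. QPS
            best_config, best_metric = candidate, metric
    return best_config, best_metric
```

Note how the bullets map onto the sketch: no well-defined `run_experiment` means no loop, and an expensive `run_experiment` bottlenecks the `budget` iterations.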

Relation to other concepts

See also