Map-Reduce Classification¶
The pattern listen-labs uses to get quantitative structure out of hundreds or thousands of open-ended qualitative responses without paying frontier-model costs per row.
Mechanism¶
- The main research agent identifies a question that needs row-level labeling (e.g. "how many of these 500 interviews mention price sensitivity?").
- It calls a hardcoded classification tool that fans out to a small model — GPT-mini or Claude Haiku — one call per row.
- Results aggregate back into a new column on the virtual table.
- The main agent now has robust quantitative data derived from media-rich, open-ended conversations.
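The mechanism above can be sketched as a plain map-reduce over rows. This is a minimal illustration, not Listen's implementation: `classify_row` stands in for the per-row small-model call (here a keyword check so the sketch is runnable), and the function names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def classify_row(text: str) -> bool:
    # Stand-in for the small-model call (GPT-mini / Claude Haiku in the
    # note above). A real version would prompt the model with the
    # labeling question; a keyword check keeps the sketch self-contained.
    lowered = text.lower()
    return "price" in lowered or "expensive" in lowered

def map_reduce_classify(rows, label_fn, max_workers=8):
    # Map: one cheap classification call per row, fanned out in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        labels = list(pool.map(label_fn, rows))
    # Reduce: aggregate the per-row cells into a count and a percentage.
    count = sum(labels)
    pct = count / len(labels) if labels else 0.0
    return labels, count, pct

interviews = [
    "The price was too high for our team.",
    "Loved the onboarding flow.",
    "Too expensive compared to alternatives.",
]
labels, count, pct = map_reduce_classify(interviews, classify_row)
```

The key shape guarantee: one row in, one boolean cell out, then a single reduce — exactly the structure the hardcoded tool enforces.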
Florian frames the tool explicitly as a map-reduce:
"You can think of it more as like a map-reduce call… you can call it sub-agent or you can call it just LLM."
Why hardcode it?¶
Listen's research agent could in principle do this via free-form tool use, but Florian's take: some fan-outs are worth a specialized tool. Hardcoding guarantees the aggregation shape (one row in → one cell out, then reduce to a count/percentage) and makes the result live — when a 501st interview arrives, the tool re-maps only the new row, not the whole corpus. See live-report-numbers.
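The "live" property described above amounts to caching per-row cells and mapping only rows that lack one. A hedged sketch, with hypothetical names (`LiveColumn`, `update`) that are not Listen's API:

```python
class LiveColumn:
    """Caches per-row labels so a newly arrived row triggers one
    small-model call instead of a re-map of the whole corpus."""

    def __init__(self, label_fn):
        self.label_fn = label_fn
        self.cells = {}  # row_id -> label (the virtual-table column)

    def update(self, rows):
        # Map: classify only rows with no cached cell yet.
        for row_id, text in rows.items():
            if row_id not in self.cells:
                self.cells[row_id] = self.label_fn(text)
        # Reduce: recompute the live count/percentage over all cells.
        count = sum(self.cells.values())
        pct = count / len(self.cells) if self.cells else 0.0
        return count, pct
```

When interview 501 lands, `update` issues exactly one new classification call; the report's count and percentage stay current without reprocessing the first 500 rows.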
Related¶
- distill-to-small-task-model — same instinct: don't use frontier models for bulk classification
- virtual-table-architecture — the substrate this writes into
- subagent-architecture — when the "map" step uses a full sub-agent vs a single LM call