Map-Reduce Classification

The pattern listen-labs uses to get quantitative structure out of hundreds or thousands of open-ended qualitative responses without paying frontier-model cost per row.

Mechanism

  1. The main research agent identifies a question that needs row-level labeling (e.g. "how many of these 500 interviews mention price sensitivity?").
  2. It calls a hardcoded classification tool that fans out to a small model — GPT-mini or Claude Haiku — one call per row.
  3. Results aggregate back into a new column on the virtual table.
  4. The main agent now has robust quantitative data derived from media-rich, open-ended conversations.

Florian frames the tool explicitly as a map-reduce:

"You can think of it more as like a map-reduce call… you can call it sub-agent or you can call it just LLM."

Why hardcode it?

Listen's research agent could in principle do this via free-form tool use, but Florian's take is that some fan-outs are worth a specialized tool. Hardcoding guarantees the aggregation shape (one row in → one cell out, then reduce to a count/percentage) and makes the result live: when a 501st interview arrives, the tool re-maps only the new row, not the whole corpus. See live-report-numbers.