Virtual Table Architecture¶

Architectural pattern listen-labs uses to expose qualitative interview data to its research agent. Instead of a virtual file system — the currently fashionable harness pattern (cf. subagent-architecture, harness-engineering) — Listen exposes the corpus as a virtual table.

Shape¶

Each row = one response / one interview transcript.
Each column = a question or an extracted feature (sentiment, topic tag, emotional valence, custom classification).
The agent's primary affordance is creating new columns by calling a classification tool (e.g. classify-with-small-model) that spawns a map-reduce-classification job across all rows.
Underneath, the table is Postgres, not a CSV file. When Python execution is needed (roughly 20% of tasks, typically for bespoke analysis the structured tools don't cover), the data is materialized as a pandas DataFrame inside an E2B sandbox.
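
A minimal sketch of the shape described above, assuming nothing about Listen's actual implementation: rows are plain dicts, and a hypothetical `add_column` helper plays the role of the classification tool by mapping a classifier over every row and then reducing the new column to counts. The keyword-based `classify_sentiment` is only a stand-in for a small-model call, and all names and sample responses are invented for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical table: one dict per interview response, one key per column.
Row = dict[str, str]


def classify_sentiment(text: str) -> str:
    """Stand-in for a small-model call; a keyword rule keeps the sketch runnable."""
    lowered = text.lower()
    if any(word in lowered for word in ("love", "great", "easy")):
        return "positive"
    if any(word in lowered for word in ("hate", "slow", "confusing")):
        return "negative"
    return "neutral"


def add_column(rows: list[Row], source: str, name: str,
               classifier: Callable[[str], str]) -> None:
    """Map step: classify each row's `source` cell and store it as column `name`."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order, so labels line up with rows.
        labels = list(pool.map(classifier, (row[source] for row in rows)))
    for row, label in zip(rows, labels):
        row[name] = label


def summarize(rows: list[Row], column: str) -> Counter:
    """Reduce step: aggregate the new column across all rows."""
    return Counter(row[column] for row in rows)


rows = [
    {"respondent": "r1", "q1_onboarding": "I love how easy setup was."},
    {"respondent": "r2", "q1_onboarding": "The import step was slow and confusing."},
    {"respondent": "r3", "q1_onboarding": "It worked after I read the docs."},
]

add_column(rows, source="q1_onboarding", name="sentiment",
           classifier=classify_sentiment)
counts = summarize(rows, "sentiment")
print(counts)
```

Once the column exists, any later question about sentiment is an aggregate over a field that is already in the table, not a re-read of the transcripts.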

Why table > files for this domain¶

Florian Juengermann's argument (source): qualitative interview corpora are already tabular: each respondent is a row, each question is a column. Forcing them into a file-system abstraction loses the natural join/aggregate shape and leaves the agent to invent its own structure.

"Right now in the main agent, it's not directly file structure. We think of it more as a table… the agent can basically create new columns."
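
To illustrate the join/aggregate point, here is a small sketch that uses Python's built-in `sqlite3` as a stand-in for Postgres so it runs without a server. The `responses` table, its column names, the segments, and the hand-assigned sentiment labels are all hypothetical; the only claim is that once an agent-created column exists, a cross-cut by another column is a single `GROUP BY`.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE responses (respondent TEXT PRIMARY KEY, q1 TEXT, segment TEXT)"
)
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?)",
    [
        ("r1", "Setup was easy.", "enterprise"),
        ("r2", "Import was confusing.", "enterprise"),
        ("r3", "Great onboarding.", "smb"),
        ("r4", "Loved the templates.", "smb"),
    ],
)

# The agent "creates a new column": here the labels are hand-assigned,
# where a real system would fill them from a classification job.
conn.execute("ALTER TABLE responses ADD COLUMN sentiment TEXT")
conn.executemany(
    "UPDATE responses SET sentiment = ? WHERE respondent = ?",
    [("positive", "r1"), ("negative", "r2"), ("positive", "r3"), ("positive", "r4")],
)

# Aggregating the derived column against an existing one is one query.
by_segment = conn.execute(
    "SELECT segment, sentiment, COUNT(*) FROM responses "
    "GROUP BY segment, sentiment ORDER BY segment, sentiment"
).fetchall()
print(by_segment)
```

With a file-per-transcript layout, the same question would require the agent to open each file, extract the label, and build this grouping itself.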

Contrast with Mitchell Hashimoto's and Ryan Lopopolo's setups, where the repository/file tree IS the substrate: those are code domains where files are the native unit. My inference: the right substrate for an agent is whatever unit the domain already uses to think (code = files, research = rows).

Related¶

map-reduce-classification: the primary "write" operation on the table
contextual-prompt-engineering: how the agent knows which columns matter
subagent-architecture: complementary pattern for orchestration