
---
title: "Trigger: Long-Running Agent vs Early Results"
created: 2026-05-02
updated: 2026-05-02
type: concept
tags: [domain/ai, agents, ux]
sources: [raw/transcripts/florian-juengermann-listen-agents-2026.md]
confidence: medium
---


# Trigger: Long-Running Agent vs Early Results

A design decision Listen Labs makes at the first turn of every research conversation: is this a 30-second live answer or a 30-minute async deep dive? The answer changes the entire orchestration: same agent architecture, different parameters (context budget, tool depth, whether the feedback subagent blocks on review).

## Two modes

| Mode       | Latency | User affordance                                 | Feedback subagent role                               |
| ---------- | ------- | ----------------------------------------------- | ---------------------------------------------------- |
| Live chat  | seconds | Streams partial results; user can redirect      | Runs as an async eval after the fact                 |
| Long async | ~30 min | User leaves, comes back; full report delivered  | Runs inline; blocks final output until review passes |
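The "same architecture, different parameters" split could be modeled as two parameter profiles over one agent. A minimal sketch, assuming hypothetical field names and values (the source only says the modes differ in context budget, tool depth, and whether the feedback subagent blocks the output):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentParams:
    """One parameter profile for the shared agent architecture.

    Field names and numbers are illustrative assumptions, not
    Listen's actual configuration.
    """
    context_budget_tokens: int    # how much context the run may consume
    max_tool_depth: int           # how deep tool/subagent calls may nest
    feedback_blocks_output: bool  # inline review gate vs. async eval


# Same architecture, two parameter sets.
LIVE_CHAT = AgentParams(
    context_budget_tokens=20_000,
    max_tool_depth=1,             # e.g. one map-reduce-classification call
    feedback_blocks_output=False, # feedback subagent evals after the fact
)

LONG_ASYNC = AgentParams(
    context_budget_tokens=200_000,
    max_tool_depth=5,             # deep dives, e.g. a powerpoint-subagent
    feedback_blocks_output=True,  # final report waits until review passes
)
```

The point of the sketch is that mode selection swaps a config object, not the agent code.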

## How Listen decides

Not a user toggle: the main agent classifies the incoming question.

- "How many respondents mentioned X?" → live, single map-reduce-classification call
- "Produce a full competitive landscape report with slide deck" → long async, spawns the powerpoint-subagent
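That first-turn routing can be sketched as follows. In the real system the main agent itself does the classification; the keyword heuristic and the `classify_question` name below are illustrative assumptions only:

```python
def classify_question(question: str) -> str:
    """Toy stand-in for the main agent's first-turn classification.

    Returns "live" for quick aggregate questions and "long_async"
    for deliverable-style requests. A keyword heuristic is used
    here purely for illustration; Listen's agent makes this call
    with a model, not string matching.
    """
    deliverable_markers = ("report", "slide deck", "deck", "landscape")
    q = question.lower()
    if any(marker in q for marker in deliverable_markers):
        return "long_async"
    return "live"
```

For example, `classify_question("How many respondents mentioned X?")` routes to the live path, while the slide-deck request routes to the long-async path that would spawn the powerpoint-subagent.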

Florian (source):

> "We do run like one analysis run up front and that can take like 30 minutes… we use the same agent architecture for that as for the live interaction chat, there's some different parameters."

My inference: the bimodal split is likely a transitional design. As latency improves and evals tighten, the "long async" path shortens and the boundary drifts. Not something Florian said explicitly.