Stefano Fiorucci

AI and software engineer. Works on AI orchestration at deepset, where he develops Haystack, an open-source LLM framework for building production-grade NLP and AI pipelines. Outside of work, he focuses on small language models, fine-tuning, and reinforcement learning.

Talk: Let LLMs Wander (AI Engineer 2026)

Fiorucci presented "Let LLMs Wander: Engineering RL Environments" at the AI Engineer conference (uploaded 2026-04-08, ~40 min). The talk covered:

Mapping classic RL concepts (agent, environment, reward, trajectory) to the language model domain
Introduction to Verifiers, an open-source Python library by Prime Intellect for building RL environments as distributable software artifacts
A full experiment: training LFM-2 (Liquid AI) from weak tic-tac-toe play to master-level play via an SFT warm-up followed by GRPO/CISPO RL
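
To make the mapping of agent, environment, reward, and trajectory concrete, here is a minimal sketch of a tic-tac-toe environment in plain Python. It does not use the Verifiers API and is not the code from the talk. The class `TicTacToeEnv`, the functions `rollout` and `random_policy`, the random opponent, and the reward values (+1 win, 0 draw, -1 loss or illegal move) are all assumptions made for illustration. In the LLM setting, the policy would be a language model that reads the board as text and replies with a move.

```python
import random
from dataclasses import dataclass, field

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


@dataclass
class TicTacToeEnv:
    """Environment: owns the board and plays a random opponent (O)."""
    rng: random.Random
    board: list = field(default_factory=lambda: [" "] * 9)

    def observation(self):
        # For an LLM agent, this string would be part of the prompt.
        return "".join(self.board)

    def legal_moves(self):
        return [i for i, cell in enumerate(self.board) if cell == " "]

    def step(self, move):
        """Apply the agent's move (X), then the opponent's. Returns (reward, done)."""
        if move not in self.legal_moves():
            return -1.0, True  # assumed penalty for an illegal move
        self.board[move] = "X"
        if winner(self.board) == "X":
            return 1.0, True
        if not self.legal_moves():
            return 0.0, True  # board full: draw
        self.board[self.rng.choice(self.legal_moves())] = "O"
        if winner(self.board) == "O":
            return -1.0, True
        return 0.0, False


def random_policy(obs, legal_moves, rng):
    """Agent stand-in: a trained model would choose the move instead."""
    return rng.choice(legal_moves)


def rollout(policy, seed=0):
    """Play one episode and return the trajectory of (obs, action, reward)."""
    rng = random.Random(seed)
    env = TicTacToeEnv(rng=rng)
    trajectory, done = [], False
    while not done:
        obs = env.observation()
        action = policy(obs, env.legal_moves(), rng)
        reward, done = env.step(action)
        trajectory.append((obs, action, reward))
    return trajectory


if __name__ == "__main__":
    for obs, action, reward in rollout(random_policy, seed=0):
        print(repr(obs), action, reward)
```

The reward here is verifiable in the sense that it is computed by the game rules rather than by a learned judge, which is what makes a game like tic-tac-toe a convenient test bed.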

Key thesis: "We did not just show the model how to play. We gave it a space to play and guided it through rewards." This captures the shift from supervised fine-tuning (SFT), which amounts to statistical imitation, to RL with verifiable rewards, which is driven by exploration in an environment.
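
As a rough illustration of how rewards can guide a model, GRPO (one of the RL methods mentioned for the experiment) scores each sampled completion relative to the other completions drawn for the same prompt. The sketch below shows only that normalization step, under stated assumptions: the function name `group_relative_advantages` is hypothetical, the population standard deviation is used, and the small `eps` term is added to avoid division by zero. It is not the training code from the talk.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of samples for the same prompt.

    Completions that beat the group average get a positive advantage,
    and those below it get a negative one.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population standard deviation
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical rewards for four games played from the same prompt:
    # win, draw, loss, win.
    print(group_relative_advantages([1.0, 0.0, -1.0, 1.0]))
```

When every sample in a group earns the same reward, all advantages are zero and that group contributes no learning signal, which is one reason an environment needs outcomes the model can sometimes, but not always, achieve.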

Key contributions to the wiki

rl-environment-engineering — the main frame of the talk
rl-with-verifiable-rewards — DeepSeek R1 paradigm explained
llm-wandering — exploration vs exploitation for LLMs
verifiers-library — the Verifiers open-source tool
rl-curriculum-opponent-skill — curriculum via opponent difficulty ramping
synthetic-sft-bootstrap — SFT warm-up before RL
