[HN Gopher] Show HN: FLE v0.3 - Claude Code Plays Factorio
___________________________________________________________________
Show HN: FLE v0.3 - Claude Code Plays Factorio
We're excited to release v0.3.0 of the Factorio Learning
Environment (FLE), an open-source environment for evaluating AI
agents on long-horizon planning, spatial reasoning, and automation
tasks.

== What is FLE? ==

FLE uses the game Factorio to test whether AI can handle complex,
open-ended engineering challenges. Agents write Python code to build
automated factories, progressing from simple resource extraction
(~30 units/min) to sophisticated production chains (millions of
units/sec).

== What's new in 0.3.0 ==

- Headless scaling: No longer needs the game client, enabling
  massive parallelization!
- OpenAI Gym compatibility: Standard interface for RL research
- Claude Code integration: We're livestreaming Claude playing
  Factorio [on Twitch](http://twitch.tv/playsfactorio)
- Better tooling and SDK: 1-line CLI commands to run evaluations
  (with W&B logging)

== Key findings ==

We evaluated frontier models (Claude Opus 4.1, GPT-5, Gemini 2.5
Pro, Grok 4) on 24 production automation tasks of increasing
complexity. Even the best models struggle:

- Most models still rely on semi-manual strategies rather than true
  automation
- Agents rarely define helper functions or abstractions, limiting
  their ability to scale
- Error recovery remains difficult - agents often get stuck in
  repetitive failure loops

The performance gap between models on FLE correlates more closely
with real-world task benchmarks (like GDPVal) than with traditional
coding/reasoning evals.

== Why this matters ==

Unlike benchmarks based on exams that saturate quickly, Factorio's
exponential complexity scaling means there's effectively no
performance ceiling. The skills needed - system debugging,
constraint satisfaction, logistics optimization - transfer directly
to real challenges.

== Try it yourself ==

>>> uv add factorio-learning-environment
>>> uv add "factorio-learning-environment[eval]"
>>> fle cluster start
>>> fle eval --config configs/gym_run_config.json

We're looking for researchers, engineers, and modders interested in
pushing the boundaries of agent capabilities. Join our Discord if
you want to contribute. We look forward to meeting you and seeing
what you can build!

-- FLE Team
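
For readers unfamiliar with the Gym-style interface mentioned in the
post, here is a minimal sketch of the usual reset/step loop. It assumes
nothing about FLE's actual API: `ToyFactoryEnv`, its observation
dictionary, and the 30-units-per-drill reward are hypothetical
stand-ins, chosen only so the loop runs on its own.

```python
class ToyFactoryEnv:
    """Hypothetical stand-in environment. This is NOT the FLE API."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.steps = 0
        self.drills = 0

    def reset(self):
        # Gym-style reset: return an initial observation and an info dict.
        self.steps = 0
        self.drills = 0
        return {"drills": self.drills}, {}

    def step(self, action):
        # Action (assumed for this toy): number of drills to place this step.
        self.drills += max(0, int(action))
        self.steps += 1
        # Toy reward: each drill contributes 30 units/min (arbitrary value).
        reward = 30.0 * self.drills
        terminated = False                      # the toy has no goal state
        truncated = self.steps >= self.max_steps  # stop after a step budget
        return {"drills": self.drills}, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the summed reward."""
    obs, info = env.reset()
    total = 0.0
    done = False
    while not done:
        action = policy(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        done = terminated or truncated
    return total
```

With a policy that places one drill per step for five steps, the
per-step rewards are 30, 60, 90, 120, and 150, so
`run_episode(ToyFactoryEnv(max_steps=5), lambda obs: 1)` returns 450.0.
A real FLE run would go through the `fle` commands shown in the post
rather than a toy class like this one.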
Author : noddybear
Score : 38 points
Date : 2025-10-03 19:32 UTC (3 hours ago)
(HTM) web link (jackhopkins.github.io)
(TXT) w3m dump (jackhopkins.github.io)
| bottydim wrote:
| haha, I am sure somewhere, some PhD student told their
| supervisor: "No, seriously, I have to play 600 hours of
| Factorio... for science."
| georgeh4cks wrote:
| Loving the 'Claude plays' integration. Great work
| dang wrote:
| Related. Others?
|
| _Multi-Agent Coordination in Factorio: FLE v0.2.0_ -
| https://news.ycombinator.com/item?id=43926829 - May 2025 (5
| comments)
|
| _Show HN: Factorio Learning Environment - Agents Build
| Factories_ - https://news.ycombinator.com/item?id=43331582 -
| March 2025 (209 comments)
| noddybear wrote:
| This is our earlier work. Since May we've made it really easy
| for the community to build their own agents to play the game:
| you can now hook up your terminal to get Claude Code to play
| the game.
| dang wrote:
| That's great!
|
| (just for clarity: links to past threads in no way imply that
| the new post isn't welcome! They're just because some readers
| enjoy poking back through past related discussions as well)
| yeasku wrote:
| Are biters and cliffs disabled?
| noddybear wrote:
| Biters are disabled, but cliffs are not
| kyars wrote:
| Live-stream is epic
___________________________________________________________________
(page generated 2025-10-03 23:00 UTC)