You’ve probably seen MiroFish on your timeline. A Chinese student vibe-coded a multi-agent simulation engine in ten days, got 13K GitHub stars, raised $4M from Shanda Group, and moved out of his dorm. The pitch: upload any document — a press release, a policy draft, a financial report — and watch thousands of AI agents simulate how the public reacts. Posts, comments, arguments, opinion shifts, hour by hour. A digital rehearsal of reality.
It’s a genuinely impressive piece of work. Built for the Chinese market — Chinese UI, Chinese-language backend, cloud-based knowledge graph via Zep. Makes total sense for its original audience. But if you’re outside China, you’re looking at a powerful engine you can’t drive.
What I changed and why
The original MiroFish was built for Chinese users, and it served them well. But I wanted to use it for my own work — and I suspect a lot of people on this timeline do too. Two things stood between “cool demo on Twitter” and “tool I actually run”:
The interface was entirely in Chinese. Every button, label, log message, tooltip. Not a bug — just not built for an international audience yet.
The knowledge graph layer depended on Zep Cloud — a hosted service. Your documents leave your machine. For competitive intelligence, crisis simulation, or anything sensitive, that’s a non-starter.
What I did
I forked MiroFish and used a hybrid approach: harness engineering (as described by both Anthropic and OpenAI — progress tracking, context reconstruction, single-feature focus per session) combined with classic SDLC discipline (structured planning across the full development lifecycle). A pure harness would’ve been overkill for this project, but I wanted the planning rigor. Before writing a single line of code, I created four documents:
tech-spec.md — a full architectural spec. Mapped all 5,934 lines of Zep-dependent code across 6 files. Defined the target stack (Neo4j CE 5.15 + Ollama + nomic-embed-text). Specified the GraphStorage abstraction with 14 core methods, the Neo4j schema with vector indexes (768-dim) and fulltext indexes (BM25), hybrid search scoring (0.7 vector + 0.3 keyword), and detailed data contracts for every return type. Documented trade-offs: why Neo4j over FalkorDB, why LLM-based NER over spaCy, sync vs async.
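The 0.7 vector + 0.3 keyword rule is easy to sketch. A minimal illustration of that weighting (the function name and the BM25 normalization cap are my own assumptions, not MiroFish-Offline's actual code — raw BM25 scores are unbounded, so they need squashing into the same [0, 1] range as cosine similarity before mixing):

```python
def hybrid_score(vector_sim: float, bm25_score: float,
                 bm25_max: float = 10.0,
                 w_vec: float = 0.7, w_kw: float = 0.3) -> float:
    """Blend cosine similarity (already in [0, 1]) with a BM25 keyword
    score capped into the same range, per the spec's 0.7/0.3 weighting."""
    kw_norm = min(bm25_score / bm25_max, 1.0) if bm25_max > 0 else 0.0
    return w_vec * vector_sim + w_kw * kw_norm

# A strong semantic match with weak keyword overlap still outranks
# a pure keyword hit:
semantic_hit = hybrid_score(0.92, 1.5)  # 0.7*0.92 + 0.3*0.15 = 0.689
keyword_hit = hybrid_score(0.40, 9.0)   # 0.7*0.40 + 0.3*0.90 = 0.550
```

The weighting favors semantic recall while letting exact-term matches break ties — which is the usual reason to bother with hybrid search at all.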
migration-plan.md — 20 atomic tasks organized into 7 phases with a dependency graph. Phase 0–1: infrastructure (Docker, Neo4j, Ollama) and base modules. Phase 2–4: replacing Zep-dependent files, search tools, service layer. Phase 5–6: integration, E2E testing, CAMEL-AI config, cleanup. Phase 7: publishing. Each task had status, dependencies, specific file changes, and acceptance criteria. Estimated scope: 2,200–2,900 lines.
progress.md — a live tracker. Every phase marked COMPLETE/TODO as Claude Code worked through them. Files created, files deleted, files modified — all logged.
publishing-plan.md — license compliance (AGPL-3.0, mandatory from the original fork), repository setup, branding scope, attribution via NOTICE file, and a 10-item pre-publish checklist.
Then I pointed Claude Code at it.
Migrated the backend to a fully local stack. Replaced Zep Cloud with Neo4j CE 5.15 and Ollama. The knowledge graph pipeline now runs entirely on your machine — Neo4j for graph storage, local LLM calls for entity extraction. No cloud dependencies. Your documents, your simulation, your hardware. The whole backend migration landed almost as a one-shot — Claude Code read the spec and executed.
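The LLM-based entity extraction over Ollama can be sketched in a few lines of stdlib Python. This is an illustration of the pattern, not the fork's actual code — the prompt, the model name, and the lenient JSON parsing are my assumptions; only the `localhost:11434/api/generate` endpoint is Ollama's real default:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

NER_PROMPT = (
    "Extract entities from the text below. Reply with JSON only: "
    '{"entities": [{"name": "...", "type": "..."}]}\n\nText: '
)

def parse_entities(raw: str) -> list:
    """Pull the JSON object out of the model's reply, tolerating any
    chatter the model wraps around it."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        return []
    try:
        return json.loads(raw[start:end + 1]).get("entities", [])
    except json.JSONDecodeError:
        return []

def extract_entities(text: str, model: str = "llama3") -> list:
    """One blocking call to the local model; no data leaves the machine."""
    body = json.dumps({"model": model, "prompt": NER_PROMPT + text,
                       "stream": False}).encode()
    req = urllib.request.Request(OLLAMA_URL, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return parse_entities(json.loads(resp.read())["response"])
```

The defensive parser matters more than it looks: local models drift from "JSON only" instructions often enough that a strict `json.loads` on the whole reply would fail intermittently.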
Replaced all Zep references in the frontend with Neo4j. The backend was local, but the UI still talked about Zep everywhere. Claude Code cleaned that up across every component.
Translated the entire UI to English. Every view, every component, every log message, every code comment. Twenty files, over a thousand unique strings. Claude Code went through all of them — the landing page, the five-step workflow (Graph Build → Env Setup → Simulation → Report → Interaction), the real-time graph visualization, the history database, the agent interaction panel. All English now. I didn’t hand-translate a single line.
The only Chinese that remains is inside regex patterns in the report parser — those match response formats from the backend, which still returns data in Chinese. Changing those would break the parsing. Everything a user sees is English.
The whole localization — scanning 20 files, identifying 1000+ strings, translating them in context, preserving backend-facing regex patterns — took one Claude Code session. No spreadsheets, no i18n framework, no contractor. Just “translate this frontend to English” and watching it work through file after file.
Why this matters
MiroFish isn’t a toy. When you feed it a document, here’s what happens:
Step 1 — it builds a knowledge graph. Extracts every entity (people, companies, events, locations) and every relationship between them. Injects individual and group memory into the graph using Neo4j’s temporal model.
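The graph write behind this step is essentially an idempotent upsert with timestamps. A hedged sketch of the shape (label, property names, and the `valid_at`/`last_seen` split are my guesses at a temporal pattern, not the project's actual schema):

```python
from datetime import datetime, timezone

def merge_entity_cypher(name: str, etype: str):
    """Build a Cypher MERGE so re-ingesting a document updates an
    existing entity instead of duplicating it; the timestamps give
    the graph its temporal dimension."""
    query = (
        "MERGE (e:Entity {name: $name}) "
        "ON CREATE SET e.type = $type, e.valid_at = $now "
        "ON MATCH SET e.last_seen = $now"
    )
    params = {"name": name, "type": etype,
              "now": datetime.now(timezone.utc).isoformat()}
    return query, params
```

Parameterized queries (rather than string interpolation) are what make this safe against entity names containing quotes, which LLM extraction produces regularly.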
Step 2 — it generates agent personas. Hundreds of them. Each with a distinct personality, opinion bias, reaction speed, influence level, and memory of past events. These aren’t prompt templates — they’re full behavioral profiles derived from the knowledge graph.
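The traits listed above suggest a profile shape roughly like this. Field names and the reaction gate are mine, inferred for illustration — not MiroFish's actual persona schema:

```python
from dataclasses import dataclass, field

@dataclass
class AgentPersona:
    """Illustrative behavioral profile derived from the knowledge graph."""
    name: str
    personality: str          # e.g. "skeptical early adopter"
    opinion_bias: float       # -1.0 (hostile) .. +1.0 (supportive)
    reaction_speed: float     # 0.0 (lurker) .. 1.0 (replies instantly)
    influence: int            # follower-count-like reach
    memory: list = field(default_factory=list)  # past events this agent has seen

    def will_react(self, urgency: float) -> bool:
        """Simple gate: fast reactors engage even on low-urgency posts."""
        return urgency * self.reaction_speed >= 0.25
```

The point of full profiles over prompt templates is exactly this kind of per-agent state: two agents reading the same post behave differently because their fields, not their prompts, differ.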
Step 3 — it runs a simulation. The agents interact on simulated social platforms — posting, replying, arguing, shifting opinions. The system tracks sentiment evolution, topic propagation, and influence dynamics in real time.
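A toy version of the opinion dynamics being tracked — not MiroFish's actual update rule, just the kind of bounded drift-toward-what-you-read model that produces sentiment evolution and cascade effects:

```python
def shift_opinion(opinion: float, post_sentiment: float,
                  author_influence: float, susceptibility: float = 0.1) -> float:
    """Bounded update: an agent drifts toward the sentiment of a post
    it reads, weighted by the author's influence. All values in [-1, 1]."""
    delta = susceptibility * author_influence * (post_sentiment - opinion)
    return max(-1.0, min(1.0, opinion + delta))
```

Run over hundreds of agents and thousands of posts, even a rule this simple produces the emergent behavior simulations are after: high-influence posters pull the crowd, entrenched agents (opinion near ±1) barely move, and cascades form when early reactions shift the sentiment that later readers see.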
Step 4 — a ReportAgent analyzes the post-simulation environment. It interviews a randomly sampled focus group of agents, searches the knowledge graph for supporting evidence, and generates a structured analysis.

Step 5 — you can chat with any agent from the simulated world. Ask them why they posted what they posted. Ask the ReportAgent to dig deeper. The full memory and personality persists.
All of this now runs on your laptop with no data leaving your network.
The use cases people are already exploring
PR crisis testing. Draft a press release, simulate the reaction before publishing. See which narratives get amplified, which get attacked, and where the conversation drifts. Adjust the draft. Simulate again.
Trading signal generation. Feed financial news into the system and watch how simulated market participants react. Not a price predictor — a sentiment predictor. Different thing, arguably more useful.
Policy impact analysis. Government teams are testing draft regulations against simulated public response. Not polling — simulation. The difference is that simulation captures cascade effects that polling misses.
Creative experiments. Someone fed the system a classical Chinese novel with a lost ending. The agents role-played the characters and generated a narratively consistent conclusion. Not the intended use case, but fascinating.
The stack
Frontend: Vue 3 + Vite + D3.js for graph visualization. Backend: Python + Neo4j + whatever LLM you point it at (works with any OpenAI-compatible API). Deployment: Docker. Setup time: minutes.
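"Any OpenAI-compatible API" in practice means one base URL and one model name. A sketch of the settings involved (the helper and its defaults are mine; the `/v1` convention is what Ollama, vLLM, and LM Studio all expose):

```python
def llm_settings(base_url: str, model: str, api_key: str = "not-needed") -> dict:
    """Settings for any OpenAI-compatible server. Local servers typically
    ignore the API key but client libraries require one anyway."""
    return {"base_url": base_url.rstrip("/") + "/v1",
            "model": model,
            "api_key": api_key}

local = llm_settings("http://localhost:11434", "llama3")       # e.g. Ollama
hosted = llm_settings("https://api.openai.com", "gpt-4o")      # or bring your own key
```

Swapping between a fully local run and a hosted model is then a two-line config change, which is the whole appeal of targeting the OpenAI-compatible surface rather than any one provider's SDK.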
The fork is called MiroFish-Offline. It’s on GitHub under the same license as the original.
One thing you can do today
Clone the repo. Point it at a local LLM (or use your own API key for Claude/GPT). Upload a one-page document about something happening in your industry. Watch 200 agents argue about it for 30 simulated minutes.
The feeling — watching AI agents form opinions about your specific situation, in real time, on your own hardware — is unlike anything else I’ve experienced with AI tools. The people who compared it to a first ChatGPT moment weren’t exaggerating.
The difference now is that you don’t need to read Chinese or trust a cloud service to experience it.