Autonomous Multi-Agent Research-and-Ship System
Week 7 milestone
You are handed an enterprise mandate: the research division needs a launched product — a system that takes an open-ended technical question, autonomously researches it across many sources, synthesizes a defensible report, and ships the report as a published artifact, with zero human steps in the middle. Build an orchestrator-worker multi-agent system: a lead agent that decomposes the question and spawns specialized worker agents (search, read, synthesize, fact-check), coordinates their results through shared state, and produces a cited deliverable. This is not a notebook demo. The result must be a directly deployable, hyperscalable product: real public hosting, CI/CD on every commit, observability, security hardening, a polished and accessible web UI a non-technical analyst will happily use, and complete go-to-market material — a landing page, a pitch, and a recorded demo. The architecture must absorb concurrent research runs without falling over, and recover from a failed worker. We are not here to babysit the run; ship it as a real product.
Why it matters: Multi-agent research and synthesis systems are being deployed across consulting, finance, and R&D to compress weeks of analyst work into hours. Shipping a coordinated, fault-tolerant agent fleet makes a builder ready for an Agentic Systems Engineer or Applied AI Engineer role at the ₹1-crore tier, where the bar is production reliability, not a demo.
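The orchestrator-worker coordination described in the brief can be sketched in a few lines. This is a minimal illustration, not a prescribed design: `RunState`, `decompose`, `search_worker`, `read_worker`, and `run_research` are hypothetical names, the fixed three-way split stands in for an LLM planning call, and the `example.com` URLs are placeholders for real search results.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RunState:
    """Explicit shared state that the lead agent and every worker read and write."""
    question: str
    sub_questions: list[str] = field(default_factory=list)
    sources: dict[str, list[str]] = field(default_factory=dict)  # sub-question -> URLs
    notes: dict[str, str] = field(default_factory=dict)          # sub-question -> summary
    report: str = ""


def decompose(question: str) -> list[str]:
    # Assumption: a real lead agent would ask an LLM for this plan.
    aspects = ("background", "current state", "open risks")
    return [f"{question} :: {aspect}" for aspect in aspects]


async def search_worker(state: RunState, sub_q: str) -> None:
    await asyncio.sleep(0)  # stands in for a search tool call
    idx = state.sub_questions.index(sub_q)
    state.sources[sub_q] = [f"https://example.com/source-{idx}"]


async def read_worker(state: RunState, sub_q: str) -> None:
    await asyncio.sleep(0)  # stands in for fetching and reading the sources
    urls = state.sources[sub_q]
    state.notes[sub_q] = f"summary of {len(urls)} source(s) for: {sub_q}"


async def research_branch(state: RunState, sub_q: str) -> None:
    # Within one sub-question the steps are sequential: search, then read.
    await search_worker(state, sub_q)
    await read_worker(state, sub_q)


def synthesize(state: RunState) -> None:
    # Each report line carries the source it was derived from.
    lines = [f"- {state.notes[q]} [{state.sources[q][0]}]" for q in state.sub_questions]
    state.report = "\n".join(lines)


async def run_research(question: str) -> RunState:
    state = RunState(question=question)
    state.sub_questions = decompose(question)
    # Sub-questions are independent, so their branches run concurrently.
    await asyncio.gather(*(research_branch(state, q) for q in state.sub_questions))
    synthesize(state)
    return state


if __name__ == "__main__":
    print(asyncio.run(run_research("How mature is MCP tooling?")).report)
```

Keeping all intermediate results in one state object is what makes a run inspectable and replayable later; a production system would persist that state rather than hold it in memory.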
The deliverable
A publicly hosted product with its own domain or stable URL, plus a public repo: the orchestrator and worker agents, an MCP-based tool layer, a fast accessible web UI for submitting questions and reading results, CI/CD running lint/tests/build on every commit, persisted and inspectable run traces, a marketing landing page, a 10-slide pitch, a recorded demo video, and a README documenting the coordination design, the failure-recovery and scaling strategy, and three example end-to-end runs with their published reports.
What it ships
- Submit-a-question interface accepting an open-ended technical or market question with a depth setting (quick scan vs deep dive).
- A lead orchestrator agent that decomposes the question into a research plan and spawns specialized worker agents.
- Specialized workers — web search, source reading, synthesis, and an independent fact-checker that verifies every claim.
- An MCP tool layer exposing search, fetch, and document tools so the same tools are reusable across agents and projects.
- Live run view: a real-time graph of agent activity, sub-questions in flight, and sources being consumed.
- Inline-cited report output where every claim links to the exact retrieved passage that supports it.
- Export to PDF, Markdown, and a shareable public report URL.
- Persisted, replayable run traces with token spend and latency per agent for cost auditing.
- Automatic worker-failure detection and re-dispatch so a crashed worker never aborts a run.
- A workspace history of past research runs with search and one-click re-run.
- Concurrency controls and per-run budget caps so many users can run research in parallel safely.
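The failure re-dispatch, concurrency control, and per-run budget cap items above could fit together roughly as follows. This is a sketch under stated assumptions: `RunBudget`, `BudgetExceeded`, `dispatch_with_retry`, and `flaky_worker` are hypothetical names, the flat cost of 100 tokens per attempt is invented for illustration, and a crash is simulated with a `RuntimeError`.

```python
import asyncio


class BudgetExceeded(Exception):
    """Raised when a run would spend more than its cap."""


class RunBudget:
    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.spent = 0

    def charge(self, tokens: int) -> None:
        if self.spent + tokens > self.max_tokens:
            raise BudgetExceeded(f"cap of {self.max_tokens} tokens would be exceeded")
        self.spent += tokens


async def dispatch_with_retry(make_task, budget, limiter, *,
                              cost=100, max_attempts=3, timeout_s=5.0):
    """Run a worker; on crash or timeout, re-dispatch up to max_attempts times."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        budget.charge(cost)  # retries are charged too, so the cap still holds
        try:
            async with limiter:  # bounds how many workers run at once
                return await asyncio.wait_for(make_task(attempt), timeout=timeout_s)
        except (asyncio.TimeoutError, RuntimeError) as exc:
            last_error = exc  # treat as a failed worker and try again
    raise RuntimeError(f"worker failed after {max_attempts} attempts") from last_error


async def flaky_worker(attempt: int) -> str:
    # Hypothetical worker that crashes on its first dispatch only.
    if attempt == 1:
        raise RuntimeError("simulated worker crash")
    return f"result from attempt {attempt}"


async def main():
    budget = RunBudget(max_tokens=1000)
    limiter = asyncio.Semaphore(4)  # at most four workers in flight
    result = await dispatch_with_retry(flaky_worker, budget, limiter)
    return result, budget.spent


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Charging the budget before each attempt means a worker that keeps failing eventually stops the run with `BudgetExceeded` instead of retrying without limit.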
Stack you orchestrate
- Claude API or open-weight LLM
- Model Context Protocol
- LangGraph
- Node.js or Python
- Docker
- Google Cloud Run
- A tracing backend (LangSmith or OpenTelemetry)
Market signal: who wants this
Agentic deep research is one of the hottest 2026 categories. The AI agent market is projected to grow from $7.84B in 2025 to $52.62B by 2030 (roughly 46% CAGR, the rate those two figures imply), and a16z reports a portfolio pivot from copilots to autonomous systems, with Sierra, Glean, and Decagon as comparables and YC W26 funding multi-agent orchestration startups such as Tensol and Korso. Consulting, finance, and corporate R&D teams are actively buying systems that compress weeks of analyst work into hours; investors fund this because it sells time back to high-cost knowledge workers.
How it is graded
- The orchestrator decomposes a question and coordinates at least three specialized worker agents through explicit shared state.
- Tools are exposed through a standard protocol (MCP), not bespoke per-agent glue.
- The system is deployed to real public hosting with CI/CD on every commit and production observability (logs, traces, metrics).
- The architecture handles concurrent research runs under load, and a worker failure mid-run still yields a complete, correct deliverable.
- The web UI is fast, WCAG 2.2 AA accessible, and usable by a non-technical analyst without instruction.
- Every claim in the output report is traceable to a retrieved source, and run traces are persisted and inspectable.
- The project ships complete marketing: a landing page, a 10-slide pitch, and a recorded demo, presentable as a real product.
- The product is publicly reachable and fully reproducible from the repo by a stranger.
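The traceability criterion above (every claim tied to a retrieved source) lends itself to an automated check before a report is published. The sketch below is one possible approach, assuming each claim stores the passage ID and the exact quote it relies on; `Claim`, `normalize`, and `untraceable_claims` are hypothetical names, and the sample passage is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str       # the sentence as it appears in the report
    source_id: str  # ID of the retrieved passage it cites
    quote: str      # supporting span copied from that passage


def normalize(s: str) -> str:
    # Case-fold and collapse whitespace so line wrapping does not break matches.
    return " ".join(s.lower().split())


def untraceable_claims(claims: list[Claim], passages: dict[str, str]) -> list[Claim]:
    """Return every claim whose quote cannot be found in its cited passage."""
    failures = []
    for claim in claims:
        passage = passages.get(claim.source_id)
        if (
            passage is None
            or not claim.quote.strip()
            or normalize(claim.quote) not in normalize(passage)
        ):
            failures.append(claim)
    return failures


if __name__ == "__main__":
    passages = {"p1": "The protocol defines tools,\nresources, and prompts."}
    claims = [
        Claim("It defines three primitives.", "p1", "tools, resources, and prompts"),
        Claim("It is widely adopted.", "p2", "adopted by most vendors"),
    ]
    for bad in untraceable_claims(claims, passages):
        print("untraceable:", bad.text)
```

A check like this only proves that the quoted span exists in the cited passage. Whether the passage actually supports the claim is still the fact-checker agent's job.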
Bridges to Distributed Systems — coordination, message passing, and fault tolerance