Own Your Stack.

Own your research

deepdive

A research agent that plans, searches, fetches, and synthesizes into a cited report — on your machine, through your own router.

01 What it is

Hosted research tools solve a real problem and quietly take four decisions away from you: your data, your model, your search backend, and how deep the agent is allowed to dig. The question you ask, the sub-queries the planner invents, and every URL it chooses to read all leave for a vendor's servers, and the depth is capped wherever the vendor's unit economics say it must be.

deepdive does the same job on your own machine. Ask a question and it plans sub-queries, searches the web through a backend you pick, reads the pages in a real browser, and writes a cited markdown report. A critic loop reviews the draft, names the gaps, searches for them, and re-synthesizes until the answer stops having holes. Every LLM call routes through your own endpoint (dario, or any Anthropic- or OpenAI-compatible router), so the work bills against a subscription you already pay for. Beyond those calls, nothing leaves the box but the searches you run and the URLs the planner picked to read.

deepdive · 60 seconds
# start dario — your local LLM router
dario proxy

# install deepdive
npm install -g @askalf/deepdive

# ask
deepdive "how does claude's rate limiter work" --deep --out=report.md
Fig. 1 — one question in, a cited report out.

02 What it does

Plans, searches, fetches, synthesizes

One question becomes a plan of sub-queries. Each runs against the search backend you point at — DuckDuckGo by default with no key, or SearXNG, Brave, Tavily, Exa, Wikipedia, arXiv, and more. The planner's chosen URLs are read in a real headless browser, and the extracted text is synthesized into a single cited markdown report.
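
The flow above can be sketched in a few lines of Python. This is an illustration of the plan, search, fetch, synthesize shape only, not deepdive's code: run_pipeline, Source, and the four callables are hypothetical names, and the sketch reads every search hit, whereas deepdive's planner picks which URLs to read.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Source:
    url: str
    text: str  # extracted page text

def run_pipeline(
    question: str,
    plan: Callable[[str], list[str]],                 # question -> sub-queries
    search: Callable[[str], list[str]],               # sub-query -> URLs to read
    fetch: Callable[[str], str],                      # URL -> extracted text
    synthesize: Callable[[str, list[Source]], str],   # -> cited markdown report
) -> tuple[str, list[Source]]:
    sources: list[Source] = []
    seen: set[str] = set()
    for sub_query in plan(question):
        for url in search(sub_query):
            if url in seen:  # read each page once, even if two sub-queries find it
                continue
            seen.add(url)
            sources.append(Source(url, fetch(url)))
    return synthesize(question, sources), sources

# Stubbed usage: every callable is a stand-in for an LLM, a search backend, or a browser.
report, sources = run_pipeline(
    "how does the rate limiter work",
    plan=lambda q: ["limits", "headers"],
    search=lambda sq: ["https://example.com/overview", f"https://example.com/{sq}"],
    fetch=lambda url: f"text of {url}",
    synthesize=lambda q, srcs: " ".join(f"[{i}]" for i, _ in enumerate(srcs, 1)),
)
```

Passing the stages in as callables mirrors the idea that the model endpoint and the search backend are choices you make, so any of them can be swapped without touching the loop.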

A critic loop that closes the gaps

The --deep flag adds a critic that reads the draft, flags what's missing ("the draft didn't source the 429 header format"), and proposes follow-up queries. The loop re-searches, re-synthesizes from every source so far, and runs again until the critic says the draft is complete or the loop reaches the --deep=N round limit. You set the ceiling, not a vendor.
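
A critic loop with a round ceiling might look roughly like the sketch below. It assumes three hypothetical callables (synthesize, critique, gather) and a Critique record invented for illustration; how deepdive actually structures the critic's output is not stated here.

```python
from typing import Callable, NamedTuple

class Critique(NamedTuple):
    complete: bool
    follow_ups: list[str]  # queries the critic proposes to fill the gaps

def deepen(
    question: str,
    sources: list[str],
    max_rounds: int,                                # plays the role of N in --deep=N
    synthesize: Callable[[str, list[str]], str],
    critique: Callable[[str, str], Critique],
    gather: Callable[[str], list[str]],             # follow-up query -> new sources
) -> tuple[str, list[str]]:
    sources = list(sources)  # do not mutate the caller's list
    draft = synthesize(question, sources)
    for _ in range(max_rounds):
        review = critique(question, draft)
        if review.complete or not review.follow_ups:
            break
        for query in review.follow_ups:
            sources.extend(gather(query))
        draft = synthesize(question, sources)  # re-synthesize from every source so far
    return draft, sources

# Stubbed usage: the critic is satisfied once the draft mentions the missing topic.
def critic(question: str, draft: str) -> Critique:
    if "retry-after" in draft:
        return Critique(True, [])
    return Critique(False, ["retry-after header"])

draft, sources = deepen(
    "how does the rate limiter work",
    ["intro"],
    max_rounds=3,
    synthesize=lambda q, s: " | ".join(s),
    critique=critic,
    gather=lambda query: [f"notes on {query}"],
)
```

The loop ends for one of two reasons: the critic has nothing left to ask for, or the ceiling is reached. With max_rounds=0 the first draft is returned untouched.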

Citations checked against their sources

After every synthesis, each [N] citation is checked against the extracted text of source N — a cheap, deterministic, lexical recall pass, no second model call. It catches the dominant failure of cited-answer tools: a confident sentence whose cited source doesn't contain the claim. --strict-cites exits non-zero when one fails, so scripts get a hard failure instead of a hallucination.
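
One way to picture a lexical recall pass is the sketch below. It is an assumption-laden illustration, not deepdive's checker: the tokenizer, the stopword list, the sentence splitter, and the 0.6 threshold are all invented here, and check_citations is a hypothetical name.

```python
import re

STOPWORDS = {"the", "and", "with", "that", "this", "for", "are", "was", "from"}
TOKEN = re.compile(r"[a-z0-9]+")
CITE = re.compile(r"\[(\d+)\]")

def tokens(text: str) -> set[str]:
    """Lowercased content words: longer than two characters, not a stopword."""
    return {t for t in TOKEN.findall(text.lower()) if len(t) > 2 and t not in STOPWORDS}

def check_citations(report: str, sources: dict[int, str], min_recall: float = 0.6):
    """Return (sentence, n, recall) for each [n] whose source lacks the claim's words."""
    failures = []
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        claim = tokens(CITE.sub(" ", sentence))
        if not claim:
            continue
        for n in map(int, CITE.findall(sentence)):
            # A citation to a missing source scores zero recall.
            source_tokens = tokens(sources.get(n, ""))
            recall = len(claim & source_tokens) / len(claim)
            if recall < min_recall:
                failures.append((sentence, n, recall))
    return failures

sources = {
    1: "The API returns HTTP 429 with a retry-after header when the limit is exceeded.",
    2: "Pricing is billed monthly per seat.",
}
report = (
    "The API returns 429 with a retry-after header [1]. "
    "Limits reset every sixty seconds [2]."
)
failures = check_citations(report, sources)
```

Here the first sentence passes because every content word appears in source 1, while the second fails because source 2 says nothing about limits or resets. A strict mode in this sketch would amount to exiting with a non-zero status whenever failures is non-empty.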

Your data, your model, your endpoint

No telemetry, no analytics, no retention — deliberately, not aspirationally. The only outbound connections are your LLM endpoint, your search backend, and the URLs the planner decided to read; verify with lsof -i during a run. Mix in local files with --include and they're cited as file:// paths that never leave the machine.

A corpus you keep and can replay

Every run is saved: plan, sources, answer, verification, cost. The resume command re-synthesizes against the saved sources for the cost of one LLM call; continue deepens the corpus; diff and rerun show how an answer moved between two points in time. Your past research is your local corpus, so "what changed since last time?" is a thing you can actually ask.
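
To make "what changed since last time?" concrete, here is a hedged sketch of comparing two saved runs. The Run record and diff_runs are hypothetical: the fields mirror the list above, but deepdive's on-disk format and the output of its diff command are not described here.

```python
from dataclasses import dataclass

@dataclass
class Run:
    # Hypothetical record; field names are assumptions, not deepdive's schema.
    plan: list[str]
    sources: dict[str, str]   # URL -> extracted text
    answer: str
    verification: list[str]   # citations that failed the check
    cost: float

def diff_runs(old: Run, new: Run) -> dict:
    """Summarize how the source set and the answer moved between two runs."""
    old_urls, new_urls = set(old.sources), set(new.sources)
    return {
        "added": sorted(new_urls - old_urls),
        "removed": sorted(old_urls - new_urls),
        "changed": sorted(
            url for url in old_urls & new_urls if old.sources[url] != new.sources[url]
        ),
        "answer_changed": old.answer != new.answer,
    }

earlier = Run(
    ["rate limits"],
    {"https://example.com/a": "v1", "https://example.com/b": "same"},
    "old answer", [], 0.0,
)
later = Run(
    ["rate limits"],
    {"https://example.com/a": "v2", "https://example.com/c": "new"},
    "new answer", [], 0.0,
)
delta = diff_runs(earlier, later)
```

Because the extracted text is stored with each run, a comparison like this needs no network access and no model call.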


03 Where it sits

Part of The Instruments.

deepdive does the research; The Instruments are the agents that do real work with real tools — research, computer use, fleet nodes, and a browser — not just chat completions.

Own the research, not just the answer.

deepdive is open source and MIT-licensed. Read the code, read the build note, run it on your own box.
