swarmvault
Use SwarmVault when the user needs a local-first knowledge vault that writes durable markdown, graph, search, dashboard, review, and MCP artifacts to disk from books, notes, transcripts, exports, datasets, slide decks, files, URLs, code, and recurring source workflows.
SwarmVault
Use this skill when the user wants a local-first knowledge vault built on the LLM Wiki pattern: three layers (raw sources, wiki, schema), with the LLM maintaining a durable wiki that sits between the user and the raw sources. Also use it when the project already contains `swarmvault.config.json` or `swarmvault.schema.md`.
For onboarding, examples, command references, or troubleshooting, read the bundled README.md, examples/, references/, and TROUBLESHOOTING.md before improvising workflow advice.
Quick checks
- Work from the vault root.
- If the vault does not exist yet, run `swarmvault init`.
- Use `swarmvault demo --no-serve` when the user wants the fastest zero-config walkthrough before pointing SwarmVault at their own sources.
- Use `swarmvault scan <directory> --no-serve` when the user wants the fastest scratch pass over a local repo or docs tree without manually stepping through init + ingest + compile first.
- Read `swarmvault.schema.md` before compile or query work. It is the vault's operating contract.
- If `wiki/graph/report.md` exists, use it before broad repo search.
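The checks above can be sketched as a small shell helper that inspects a directory and prints what the list suggests doing next. The function name `vault_next_step` and its output wording are illustrative assumptions, not part of SwarmVault; only the file names and the `swarmvault init` command come from the text.

```shell
#!/bin/sh
# Hypothetical helper: print the next steps suggested by the quick checks.
# Only the file names and `swarmvault init` come from the skill text.
vault_next_step() {
  root="${1:-.}"
  if [ ! -f "$root/swarmvault.config.json" ] && [ ! -f "$root/swarmvault.schema.md" ]; then
    # Neither vault marker is present: the vault probably does not exist yet.
    echo "swarmvault init"
    return 0
  fi
  if [ -f "$root/swarmvault.schema.md" ]; then
    # The schema is the operating contract: read it before compile or query.
    echo "read swarmvault.schema.md"
  fi
  if [ -f "$root/wiki/graph/report.md" ]; then
    # A graph report exists: consult it before broad repo search.
    echo "read wiki/graph/report.md"
  fi
  return 0
}
```

Calling `vault_next_step .` from the vault root prints one suggestion per line and never runs `swarmvault` itself.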
Core loop
- Initialize a vault with `swarmvault init` when needed.
- Update `swarmvault.schema.md` before a serious compile. Use it for naming rules, categories, grounding, freshness expectations, and exclusions.
- Use `swarmvault source add <input>` when the input is a recurring local file, local directory, public GitHub repo root, or docs hub that should stay registered.
- Ingest one-off inputs with `swarmvault ingest <path-or-url>`, or ingest a whole repo tree with `swarmvault ingest <directory>`. Audio files use `tasks.audioProvider` when configured, and supported YouTube URLs go through direct transcript capture instead of generic URL ingest.
- Use `swarmvault ingest --guide`, `swarmvault source add --guide`, `swarmvault source reload --guide`, `swarmvault source guide <id>`, or `swarmvault source session <id>` when the human should integrate one source at a time before canonical pages change. Set `profile.guidedIngestDefault: true` in `swarmvault.config.json` to make guided mode the default; use `--no-guide` to override. Profiles using `guidedSessionMode: "canonical_review"` stage approval-queued canonical edits; `insights_only` profiles keep exploratory synthesis in `wiki/insights/`. Use `--review` only for the lighter review-only path.
- Use `swarmvault inbox import` for capture-style batches, then `swarmvault watch --lint --repo` when the workflow should stay automated. Add `--code-only` when the refresh should stay AST-only and defer non-code semantic re-analysis to a later `compile`. On tracked repos, code-only changes take that faster compile path automatically. Run `swarmvault hook install` when git checkouts and commits should trigger the same repo-aware code-only refresh automatically.
- Compile with `swarmvault compile`. Use `compile --max-tokens <n>` when the generated wiki must stay inside a bounded context budget, or `compile --approve` when changes should go through the local review queue first.
- Resolve staged work with `swarmvault review list|show|accept|reject` and `swarmvault candidate list|promote|archive`.
- Ask questions with `swarmvault query "<question>"`. It saves durable answers into `wiki/outputs/` by default; add `--no-save` only for ephemeral checks. When an embedding provider is configured, query can merge semantic page matches into local search; `search.rerank: true` lets the current `queryProvider` rerank the merged top hits before answering.
- Use `swarmvault explore "<question>" --steps <n>` for save-first multi-step research loops, or `--format report|slides|chart|image` when the artifact should be presentation-oriented.
- Run `swarmvault lint` whenever the schema changed, artifacts look stale, or compile/query results drift. Set `profile.deepLintDefault: true` in `swarmvault.config.json` when the advisory deep-lint pass should be the default, and use `--no-deep` when you need a structural-only run. Add `--web` only when deep lint is enabled and a `webSearch.tasks.deepLintProvider` adapter is configured; web evidence is scoped to deep lint and does not change compile or query behavior.
- Use `swarmvault mcp` when another agent or tool should browse, search, and query the vault through MCP.
- Use `swarmvault graph blast <target>` when the user wants reverse-import impact analysis, `swarmvault graph serve` when the live workspace or bookmarklet clipper will help, `swarmvault diff` when they need a graph-level change summary against the last committed baseline, or `swarmvault graph export --html <output>` / `graph export --report <output>` when sharing will help. `graph export` also supports `--html-standalone`, `--json`, `--obsidian`, and `--canvas` for lighter or Obsidian-native sharing.
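One pass through the loop above might look like the following sketch. The `run` wrapper, the `DRY_RUN` variable, the `core_loop` function, and the choice of which steps to chain are hypothetical; the `swarmvault` subcommands and flags are the ones named in the list. With `DRY_RUN` unset or set to `1`, the commands are only printed.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Hypothetical single pass: ingest, compile behind review, query, lint.
core_loop() {
  src="$1"        # path or URL of a one-off input
  question="$2"   # question to ask once the wiki is compiled
  run swarmvault ingest "$src" || return 1
  run swarmvault compile --approve || return 1   # stage changes for review
  run swarmvault review list || return 1         # inspect what was staged
  run swarmvault query "$question" || return 1   # saved to wiki/outputs/ by default
  run swarmvault lint
}
```

For example, `core_loop ./notes.md "What changed?"` prints the five commands in order; setting `DRY_RUN=0` would execute them and stop at the first failure.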
Working rules
- Prefer changing the schema before re-running compile when organization or grounding is wrong.
- Treat `wiki/` and `state/` as first-class outputs. Inspect them instead of trusting a single chat answer.
- Prefer `wiki/graph/report.md`, `state/graph.json`, and saved wiki pages over ad hoc broad search when they already exist.
- Use `source add` for recurring files, directories, public GitHub repo roots, and docs hubs. Use `ingest` and `add` for deliberate one-off inputs.
- When the vault lives in a git repo, `ingest|compile|query --commit` can commit `wiki/` and `state/` changes immediately after the run.
- The default heuristic provider is a valid local/offline starting point. Add a model provider only when the user wants richer synthesis quality or optional capabilities such as embeddings, vision, image generation, or audio transcription. The recommended fully-local setup is Ollama + Gemma: run `ollama pull gemma4`, then set `providers.llm` to `{ type: "ollama", model: "gemma4" }` and point `tasks.compileProvider`, `tasks.queryProvider`, and `tasks.lintProvider` at it.
- Audio ingest needs `tasks.audioProvider` to resolve to a provider that exposes `audio` capability. YouTube transcript ingest does not need a provider. Set `graph.communityResolution` when the user wants to pin community clustering instead of using the adaptive default.
- If an OpenAI-compatible backend cannot satisfy structured generation, reduce its declared capabilities instead of forcing every task through it.
- Keep raw sources immutable. Put corrections in schema, new sources, or saved outputs rather than manually rewriting generated provenance.
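The fully-local Ollama setup described above could be written down as a config fragment like the one this sketch prints. The function name `ollama_fragment` is hypothetical, and the fragment assumes that task settings refer to a provider by its key under `providers` (here `"llm"`); the text only says to point the three tasks at it, so check the bundled references for the exact shape before merging anything into `swarmvault.config.json`.

```shell
#!/bin/sh
# Sketch: print a provider fragment to merge into swarmvault.config.json by hand.
# Assumption: tasks reference the provider by its key ("llm") under "providers".
# Pull the model first with: ollama pull gemma4
ollama_fragment() {
  cat <<'EOF'
{
  "providers": {
    "llm": { "type": "ollama", "model": "gemma4" }
  },
  "tasks": {
    "compileProvider": "llm",
    "queryProvider": "llm",
    "lintProvider": "llm"
  }
}
EOF
}
```

The function only writes to standard output, so it cannot overwrite an existing config.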
Files and artifacts
- `swarmvault.schema.md`: vault-specific compile and query rules.
- `raw/sources/` and `raw/assets/`: canonical source storage.
- `wiki/`: generated pages plus saved outputs.
- `wiki/outputs/source-briefs/`: saved onboarding briefs for managed sources.
- `wiki/outputs/source-sessions/`: resumable guided-session anchors plus question/answer history for one-source-at-a-time integration.
- `wiki/outputs/source-reviews/`: staged source-scoped review pages.
- `wiki/outputs/source-guides/`: staged source-integration guides for one-source-at-a-time workflows.
- `wiki/dashboards/`: recent sources, reading log, timeline, source sessions, source guides, research map, contradiction, and open-question dashboards.
- `wiki/code/`: module pages for ingested JavaScript, JSX, TypeScript (including `.mts/.cts`), TSX, Bash/shell script (with shebang-based detection for extensionless scripts), Python, Go, Rust, Java, Kotlin, Scala, Dart, Lua, Zig, C#, C, C++ (including `.c/.cc/.cpp/.cxx` and `.h/.hh/.hpp/.hxx`), PHP, Ruby, PowerShell (`.ps1/.psm1/.psd1`), Elixir (`.ex/.exs`), OCaml (`.ml/.mli`), Objective-C (`.m/.mm`), ReScript (`.res/.resi`), Solidity (`.sol`), Vue single-file components (`.vue`), HTML (`.html/.htm`), and CSS sources.
- `state/extracts/`: extracted markdown and JSON sidecars for PDF, the full Word family (`.docx/.docm/.dotx/.dotm`), RTF (`.rtf`), OpenDocument (ODT/ODP/ODS), EPUB, CSV/TSV, the full Excel family (`.xlsx/.xlsm/.xlsb/.xls/.xltx/.xltm`), the full PowerPoint family (`.pptx/.pptm/.potx/.potm`), Jupyter notebooks (`.ipynb`), BibTeX (`.bib`), Org-mode (`.org`), AsciiDoc (`.adoc/.asciidoc`), transcripts, Slack exports, email, calendar, audio transcripts, YouTube transcript captures, and image sources (`.png/.jpg/.jpeg/.gif/.webp/.bmp/.tif/.tiff/.svg/.ico/.heic/.heif/.avif/.jxl`), plus structured previews for config/data files (JSON/JSONC/JSON5/TOML/YAML/XML/INI/ENV/PROPERTIES/CFG/CONF) and content-sniffed text ingest for developer manifests (`package.json`, `Cargo.toml`, `go.mod`, `LICENSE`, `.gitignore`, `Dockerfile`, `Makefile`, and similar plaintext files).
- `state/code-index.json`: repo-aware code aliases and local import resolution data.
- `wiki/projects/`: project rollups over canonical pages.
- `wiki/candidates/`: staged concept and entity pages awaiting promotion.
- `state/graph.json`: compiled graph.
- `state/search.sqlite`: local search index.
- `state/sources.json` and `state/sources/<id>/`: managed-source registry entries plus working sync state.
- `state/approvals/`: staged review bundles from `compile --approve`.
- `state/sessions/`: canonical session artifacts for compile, query, explore, lint, watch, review, and candidate actions.
- `state/jobs.ndjson`: watch-mode run log.
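Because these artifacts live on disk, a quick inventory can show how far a vault has progressed. The sketch below is an assumption-laden convenience, not a SwarmVault command: the function name `vault_artifacts` and the particular subset of paths it checks are chosen for illustration from the paths named in this document.

```shell
#!/bin/sh
# Hypothetical helper: report which documented artifacts exist under a vault root.
vault_artifacts() {
  root="${1:-.}"
  for path in swarmvault.schema.md wiki/graph/report.md state/graph.json \
              state/search.sqlite state/sources.json state/jobs.ndjson; do
    if [ -e "$root/$path" ]; then
      echo "present $path"
    else
      echo "missing $path"
    fi
  done
}
```

Running `vault_artifacts .` from the vault root prints one `present` or `missing` line per path; a missing `state/graph.json`, for instance, suggests the vault has not been compiled yet.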
Agent integration
- `swarmvault install --agent codex|claude|cursor|goose|pi|gemini|opencode|aider|copilot|trae|claw|droid` installs agent-specific rules into the current project.
- `swarmvault install --agent claude|opencode|gemini|copilot --hook` installs graph-first hook or plugin support for the agents that expose project hook APIs.
- `swarmvault install --agent aider` installs `CONVENTIONS.md` and wires `.aider.conf.yml` to read it when that config is valid YAML.
- `swarmvault mcp` exposes tools and resources for page search, page reads, source listing, query, ingest, compile, and lint.
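The split between hook-capable and rules-only agents can be sketched as a small lookup. The function name `install_command` is hypothetical, and the sketch assumes the user wants `--hook` whenever the agent supports it; the agent names and flags are taken from the list above. It prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for an agent, adding --hook
# only for the agents the text lists as exposing project hook APIs.
install_command() {
  agent="$1"
  case "$agent" in
    claude|opencode|gemini|copilot)
      echo "swarmvault install --agent $agent --hook" ;;
    codex|cursor|goose|pi|aider|trae|claw|droid)
      echo "swarmvault install --agent $agent" ;;
    *)
      # Not one of the agents named in the skill text.
      echo "unknown agent: $agent" >&2
      return 1 ;;
  esac
}
```

For example, `install_command gemini` prints the `--hook` variant, while `install_command cursor` prints the rules-only command.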
Defaults to preserve
- Keep raw source material immutable under `raw/`.
- Save useful answers unless the user explicitly wants ephemeral output.
- Prefer reviewable flows such as `compile --approve`, `review`, and `candidate` when a change should not activate silently.
- Treat provider setup as part of serious vault operation. If only `heuristic` is configured, say so clearly.
- When a vault uses the `profile` block in `swarmvault.config.json`, respect it as the deterministic behavior layer. `swarmvault.schema.md` still defines the human intent layer.








