There’s something satisfying about asking an AI a question about your own work and getting back an answer grounded in your context. Not the internet’s context. Yours.

It’s 8:45 AM. I have a product sync at 9, a 1:1 with a new team lead at 10, and a candidate interview at 11. I haven’t prepared for any of them. I open Claude and run my morning brief. It pulls the agenda from my calendar, searches my Obsidian vault for notes on each attendee, checks recent Confluence pages and email threads, and hands me three meeting briefs. The product sync brief shows what was carried over from last week. The 1:1 brief has the team lead’s recent projects and open questions from our last conversation. The interview brief pulls the candidate’s resume highlights alongside our rubric and even suggests a few questions about their specific experience.

None of this lives in one place. It’s scattered across a calendar, an inbox, a wiki, and a thousand-odd markdown files. Claude stitched it together because it knows where to look and what to look for.

How? Honestly, the interesting part isn’t the AI. It’s the notes.

The “RAG is dead” misread

In early April 2026, Andrej Karpathy published his LLM Wiki approach: treat your knowledge base as a persistent, compounding artifact maintained by an LLM. Raw sources go in, the model synthesizes them into structured markdown pages with summaries, and the wiki grows richer over time. The post went viral.

The popular takeaway was: context windows are big enough now, just throw your docs in. RAG is dead.

It isn’t.

Searching a few hundred markdown files in a personal vault is fundamentally different from running a production chatbot over millions of documents. At personal scale, sure, you can stuff things into a context window. At production scale, the latency and token cost kill you. The people declaring RAG dead are generalizing from a toy setup.

But Karpathy’s instinct is right, and the interesting question is why flat RAG feels inadequate. Traditional RAG treats your knowledge as chunks in a bag. Every document gets split, embedded, and thrown into a vector store. At query time, you retrieve the top-K nearest chunks and hope for the best. There’s no structure. A chunk about a person and a chunk about their project sit in the same flat index with no connection between them. It’s like a library where every book has been shredded and the pages shuffled together.

The answer isn’t to remove retrieval. It’s to build on top of it. Add entities. Add relationships. Give the retrieval layer something to grab onto beyond raw text similarity. At personal scale, QMD (the local search engine I describe below) with keyword and vector search is plenty. At larger scale, you still need a proper vector database, but the entity layer works either way.

I went deep on this in a previous post on building persistent AI memory with SurrealDB, where entity graphs and vector search work together. What follows here is the simpler, more practical version: how I structure an Obsidian vault so that an AI agent can actually use it.

From PARA to knowledge graph

I started with PARA (Projects, Areas, Resources, Archive), Tiago Forte’s method and the de facto standard for organizing an Obsidian vault. It lasted about three months.

PARA is a filing taxonomy. It tells you where to put things, not how to find them. When I asked Claude to pull context for a meeting, it had to know that the project lived under Projects, the attendee’s notes were in Areas, the relevant RFC was in Resources, and last quarter’s decision was in Archive. Four different places, organized by lifecycle stage rather than by what the information actually is. For a human clicking through folders, fine. For an agent trying to assemble context programmatically, a nightmare.

I switched to organizing by entity type: People, Teams, Projects, Services. Each entity gets a canonical page. The structure is flat, with no nesting beyond the top-level type directories.

Entities are nodes, not files in a taxonomy. A person’s page links to their team, their projects, and the services they own. A project page links back to its people, its services, and its dependencies. The relationships are explicit, not implied by which folder something landed in.
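
Concretely, the vault looks something like this (file and entity names illustrative):

```
vault/
├── People/
│   ├── Jonathan Smith.md
│   └── John Park.md
├── Teams/
│   └── Platform Team.md
├── Projects/
│   └── Data Migration.md
└── Services/
    └── Billing Service.md
```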

What entity pages look like

People have the richest frontmatter:

```yaml
---
Organization: "[[Acme Corp]]"
Title: Senior Engineer
Team: "[[Platform Team]]"
aliases: [jsmith, John]
---
```

Sections: Work (current role, projects, focus areas), Collaborators (linked people with context), Notes & Observations, Personal (background, former companies).

Projects follow a single generic structure: Summary, Goals, Documentation, People, Timeline. Minimal frontmatter, usually just aliases. The same skeleton works whether it’s a product initiative or a technical migration.
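
For instance, a freshly created project page might look like this (contents illustrative, structure as just described):

```markdown
---
aliases: [data-migration]
---

## Summary
One paragraph on what the project is and where it stands.

## Goals
- What done looks like.

## Documentation
- [[RFC - Data Migration]] and other references.

## People
- [[Jonathan Smith]] (tech lead), [[John Park]] (analyst)

## Timeline
- Kickoff, milestones, target dates.
```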

Services range from a short page (overview, limitations, relationship to similar services) to a detailed reference (capabilities, deployment options, compliance, pricing). I also keep pages for external products and services, stuff like Datadog or Terraform or whatever vendor we’re evaluating. I summarize the internet research and add notes on how we actually use it, what’s weird about our setup, and what broke. When Claude reasons about a tool, it gets our context, not the marketing page. Minimal frontmatter across all of these.

Common patterns matter more than specifics. Every entity page has YAML frontmatter with aliases for flexible linking. Every page uses H2 sections as a consistent skeleton. Every page uses [[wiki-links]] for cross-references.

These consistent shapes are what make agent search reliable. When every person page has a “Work” section and every project page has a “People” section, grep and semantic search hit predictably. The agent doesn’t need to understand your filing system. It just needs to know what to search for.
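
To make that concrete: because the skeleton never varies, pulling “the Work section of every person page” takes a few lines of code. A minimal sketch, not the actual pipeline, assuming H2 section headings as described above:

```python
import re
from pathlib import Path

def read_section(page: Path, heading: str) -> str | None:
    """Return the body of one H2 section, e.g. 'Work' on a person page."""
    text = page.read_text()
    # Everything from '## <heading>' up to the next H2 or end of file.
    pattern = rf"^## {re.escape(heading)}\n(.*?)(?=^## |\Z)"
    match = re.search(pattern, text, re.MULTILINE | re.DOTALL)
    return match.group(1).strip() if match else None

# The skeleton is consistent, so no per-file logic is ever needed.
for page in Path("vault/People").glob("*.md"):
    work = read_section(page, "Work")
    if work:
        print(page.stem, "->", work.splitlines()[0])
```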

Amortized computation

Every bit of preprocessing you do on your vault pays off at query time, on every future query. You’re building a database index, except the database is your notes, and the queries come from an LLM.

Entity pages are the obvious example: instead of the agent re-deriving “who is this person and what do they work on” from scattered notes each time, there’s a canonical page to land on. But the same principle applies to images without alt text (invisible to an LLM, so I enrich them with descriptions), static documents from Google Drive or PDFs (converted to markdown and saved in the vault, one format, one search index), and raw meeting transcripts (long and noisy, so a processed summary in the daily note is cheaper to retrieve and more useful when found).
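
The image case is the simplest to show. Here’s a sketch of the kind of scan that queues images for enrichment, assuming standard `![alt](path)` image syntax (Obsidian’s `![[embed]]` form would need a second pattern):

```python
import re
from pathlib import Path

# ![alt](path) with an empty alt is invisible to an LLM reading the file.
IMAGE = re.compile(r"!\[(?P<alt>[^\]]*)\]\((?P<path>[^)]+)\)")

def images_missing_alt(vault: Path):
    """Yield (note, image path) pairs where the alt text is empty."""
    for note in vault.rglob("*.md"):
        for match in IMAGE.finditer(note.read_text()):
            if not match.group("alt").strip():
                yield note, match.group("path")

for note, image in images_missing_alt(Path("vault")):
    print(f"{note}: {image} needs a description")
```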

LLMs can work without this structure. They just burn more tokens and take longer to get worse results.

The full loop

With the vault structured, here’s what a typical day looks like.

Morning brief

A scheduled skill runs against my calendar, pulls today’s events, and for each meeting searches the vault, recent emails, and Confluence for relevant context. Out comes a per-meeting brief in my daily note.

Different meeting types get different templates. A recurring status sync shows what was discussed last time and what’s still open. A 1:1 pulls up the person’s page, their recent activity, and any open threads between us. An interview pulls the candidate’s profile alongside our rubric.
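
Mechanically, template selection is just a dispatch table mapping meeting type to the lookups that feed the brief. A toy version (types and search targets illustrative):

```python
# Illustrative: which lookups feed each kind of brief.
TEMPLATES = {
    "status_sync": ["last meeting's notes", "open action items"],
    "one_on_one":  ["person page", "recent activity", "open threads"],
    "interview":   ["candidate profile", "interview rubric"],
}

def brief_plan(meeting_type: str) -> list[str]:
    """What to search for before writing this meeting's brief."""
    return TEMPLATES.get(meeting_type, ["attendee pages", "recent mentions"])

print(brief_plan("one_on_one"))
# ['person page', 'recent activity', 'open threads']
```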

It’s not always right. Sometimes it surfaces stale context or misses something recent. But it’s a better starting point than walking in cold, and I can skim and correct in a minute.

Capture

During meetings, I use Granola for transcription. After each meeting, the transcript gets summarized and inserted into the daily note.

Then Claude does post-processing. The most valuable part is name resolution: the transcription says “John mentioned the migration timeline,” but which John? Claude checks the entity graph. It knows the meeting was with the Platform Team, searches for people linked to that team, finds Jonathan Smith (Senior Engineer, Platform) and John Park (Data Analyst, Finance), and picks the right one based on context. When phonetic matching is needed (the transcriber hears “Sean,” but the team has a “Shawn”), it handles that too.
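
Claude’s reasoning makes the final pick, but the shortlisting step is mechanical enough to sketch. A rough version using stdlib fuzzy matching as a stand-in for real phonetic handling (data illustrative):

```python
from difflib import get_close_matches

# In the real vault this comes from the entity graph: people whose pages
# link to the meeting's team, plus the aliases from their frontmatter.
platform_team = {
    "Jonathan Smith": ["jsmith", "John"],
    "Shawn Lee": ["Shawn"],
}

def candidates(heard: str, people: dict[str, list[str]]) -> list[str]:
    """Canonical names whose name or aliases resemble what was transcribed."""
    hits = []
    for name, aliases in people.items():
        variants = [name, name.split()[0], *aliases]
        if get_close_matches(heard, variants, n=1, cutoff=0.6):
            hits.append(name)
    return hits

print(candidates("John", platform_team))  # ['Jonathan Smith']
print(candidates("Sean", platform_team))  # ['Shawn Lee']
```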

Graph update

After capture, Claude updates the entity graph. Mentioned people get their pages updated with whatever changed: a project status moved, someone took on a new responsibility, a decision was made. Projects and services get the same treatment.

Unrecognized entities get flagged. If the transcript mentions a name that doesn’t match anyone in the vault, Claude asks: new person, or mistranscription? Context usually settles it. “The new contractor on the data team” is probably someone new. “Michele from infrastructure” is probably Michael with a transcription error.

The updates don’t need to be 100% correct. I review them, fix what’s wrong, and move on. Over time, the signal-to-noise ratio improves naturally as correct information gets reinforced and errors get corrected. It’s more wiki than database. Eventual consistency through volume and curation.

Enrichment

The last stage runs asynchronously. Image attachments get enriched with alt text. External documents get converted to markdown. New entity pages get their skeleton filled in.
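
The conversion step can be as simple as this sketch, assuming Microsoft’s markitdown package and a hypothetical inbox folder for incoming documents:

```python
from pathlib import Path

from markitdown import MarkItDown  # assumption: pip install markitdown

converter = MarkItDown()
inbox = Path("vault/_inbox")      # hypothetical drop folder for external docs
out = Path("vault/Documents")     # hypothetical destination

for doc in inbox.glob("*.pdf"):
    # One format, one search index: everything becomes markdown.
    markdown = converter.convert(str(doc)).text_content
    (out / f"{doc.stem}.md").write_text(markdown)
    print(f"converted {doc.name}")
```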

Under the hood

Structure without search is a library with no catalog.

I use QMD, a local search engine that indexes markdown files with both keyword and vector search. It connects to Claude via a skill, a reusable prompt that teaches Claude how to search and what the vault contains. I prefer skills over MCP servers for this: simpler to maintain, version-controlled as markdown, no running process.

```mermaid
flowchart LR
    V["Obsidian Vault\na thousand+ markdown files"] --> Q["QMD Index\nkeyword + vector"]
    Q --> S["Skill\nsearch instructions"]
    S --> C["Claude\nquery + reasoning"]
```
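
The skill in the middle of that pipeline is just a markdown file. For flavor, a sketch of what it might contain: the frontmatter shape follows the usual skill convention, but the `qmd` invocation is a placeholder, not QMD’s real CLI:

```markdown
---
name: vault-search
description: Find context in my Obsidian vault (people, teams, projects, services).
---

When you need context about a person, team, project, or service:

1. Query the index first (placeholder command): `qmd search "<query>"`
2. Entity pages live under People/, Teams/, Projects/, Services/.
   Person pages always have Work, Collaborators, Notes & Observations,
   and Personal sections; project pages have Summary, Goals,
   Documentation, People, and Timeline.
3. Resolve [[wiki-links]] by opening the linked file directly.
```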

One thing worth calling out: Obsidian’s link graph, the backlinks and outbound links that make it powerful for human navigation, doesn’t matter much for agents. Humans click links to traverse relationships. Agents don’t click anything. They grep. They search semantically. The entity structure matters because it creates consistent search targets. This plays to coding agents’ strengths. They’re already wired to search for things, not browse for them.

For meeting capture, I reverse-engineered Granola’s local cache to get the transcripts into my pipeline rather than going through their API. Local caches are often simpler and safer than API access, which might trigger rate limits or security flags. Use existing integrations when they exist; when they don’t, look at what’s already on disk. Profile photos get pulled from Google Meet screenshots and processed through an automated pipeline (resize, crop, adjust), then attached to the person’s page. Small touch, but it makes the vault more navigable when I’m the one browsing.

I built all of this as skills first, single-purpose prompts that each handle one step. As individual skills matured, I promoted them to specialized sub-agents that run in parallel, each in its own context window. Cheaper, faster, and easier to debug than cramming everything into one session. If I were starting over, I’d do it the same way: skills first, agents later.

What this changes

My notes used to be a write-only archive. I’d take notes in meetings, file them somewhere reasonable, and never look at them again; they’d sit buried under months of accumulated markdown. I used to spend hours cleaning up notes, adding links, and maintaining structure. That overhead is gone now.

The vault is a working memory. Every meeting makes it a little smarter. The morning brief surfaces context I’d forgotten I had. The post-processing catches connections I’d have missed.

It’s not perfect. Name resolution still trips up on mispronounced foreign names, some entity pages drift out of date, and the whole thing requires enough structure that you can’t just dump files in and expect magic. But it’s a different relationship with notes. They’re not records of what happened. They’re context for what’s about to happen.

The gap between “AI can read my files” and “AI understands my work” turns out to be mostly a data structure problem. Context windows and retrieval get you partway there. The rest is in how you organize the vault.