LLM Wiki for Business Owners: What Karpathy's Pattern Means for Your Business
Every time you open a new Claude or ChatGPT session, you brief it from scratch. You paste in context. You re-explain what the business does, who the client is, what decision you made three months ago. The AI answers your question, and the answer disappears when you close the tab. Nothing compounds. Nothing builds. Your business knowledge still lives in your head, and every session starts from zero.
This is not a personal productivity problem. It is an architectural one. The entire first wave of "AI for business" was built on retrieval: find the relevant chunk and answer the question. RAG (Retrieval-Augmented Generation) looks smart in demos. It collapses on the nuanced questions that actually matter, the ones that require synthesizing everything your business has learned over time.
The LLM Wiki is a knowledge architecture pattern by Andrej Karpathy where an AI agent synthesizes documents into structured, interlinked pages at the moment of ingestion. For businesses, it replaces stateless document uploads with a compounding knowledge base. Every meeting transcript, client note, and decision gets organized so AI agents can reason from it immediately, not re-discover it from raw files.
This is what that pattern means for service business owners, and why the business angle is the one story nobody has told yet.
What Did Karpathy Actually Publish, and Why Did It Spread So Fast?
On April 4, 2026, Karpathy posted a GitHub gist called "LLM Wiki". It describes a self-maintaining knowledge base where an AI agent synthesizes every new document at ingestion rather than storing raw text for later retrieval. Three layers: raw sources, a wiki of synthesized pages, and a schema that tells the AI how the wiki is organized.
Within days it had 296 Hacker News points. Within weeks: 5,000+ GitHub stars and over 100,000 bookmarks on X. Garry Tan's GBrain implementation hit 13,800+ stars. Developers were building versions in Python, TypeScript, and Go. Ten-plus open-source implementations appeared in the first month.
Why it spread so fast: Karpathy co-founded OpenAI in 2015 and led Tesla's Autopilot vision team as Senior Director of AI from 2017 to 2022. He founded Eureka Labs in 2024. When he publishes an architecture pattern, it gets taken seriously. The developer community moved on it immediately.
What nobody moved on: the business owner version. Every tutorial, every implementation, every YouTube breakdown targets engineers and researchers. The question of what this means for a 10-person service firm running on meeting transcripts, client notes, and SOPs? Still unanswered.
How Does RAG Work, and Where Does It Fall Apart?
RAG works like this. You upload documents. When someone asks a question, the system retrieves the most relevant chunks and feeds them to the AI. The AI reads the chunks and answers.
Think of it like a filing cabinet. You ask "what did we decide about pricing for this client?" The system pulls the most relevant file. The AI reads it. It answers the question.
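If you want the mechanics, here is a deliberately minimal sketch of that retrieval loop in Python. It uses naive word-overlap scoring where production systems use embedding search, and the documents are invented, but the shape is the point: the model only ever sees the handful of chunks the retriever picks.

```python
# Toy RAG: split documents into chunks, score each chunk against the
# question, and hand only the top-k chunks to the model as context.
# Word-overlap scoring stands in for the embedding search real systems use.

def chunk(doc: str, size: int = 50) -> list[str]:
    words = doc.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def score(question: str, passage: str) -> int:
    return len(set(question.lower().split()) & set(passage.lower().split()))

def retrieve(question: str, docs: list[str], k: int = 3) -> list[str]:
    chunks = [c for d in docs for c in chunk(d)]
    return sorted(chunks, key=lambda c: score(question, c), reverse=True)[:k]

docs = [
    "Pricing call: agreed to hold the lower tier through Q3 given budget.",
    "Intake notes: budget constraints flagged; growth plan still pending.",
]
context = retrieve("what did we decide about pricing?", docs)
# The model answers from `context` alone. Whatever wasn't retrieved
# might as well not exist.
```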
It works for "what" questions. It breaks on "why" questions.
"Why did we price that client at the lower tier?" requires the AI to connect the pricing call from six months ago to the conversation about their budget constraints from the intake notes to the note you wrote about their growth trajectory. RAG retrieves chunks. It doesn't connect them. It wasn't built to.
Stateless AI tools fail on the questions that actually matter to a business: What does this client care about most? What's the pattern across our last five engagements? What did we learn from the mistake we made in Q3?
Those questions require synthesis across everything the business has learned. Not just what's in the document you uploaded today.
(If you've tried solving this by connecting Claude directly to Obsidian or a second brain app, here's why that approach runs into the same stateless wall.)
What Does LLM Wiki Do Differently From RAG?

Where RAG stores raw text and reads it only when a question arrives, the LLM Wiki has the AI do its reading up front: every new document is synthesized into structured, interlinked wiki pages at the moment of ingestion, so the knowledge base compounds instead of resetting with every upload.
Three layers:
The raw sources are the original documents: meeting transcripts, emails, contracts, notes. The AI reads them but never modifies them. They're the record.
The wiki is LLM-maintained markdown pages that synthesize and cross-reference the raw sources. Not a copy of the raw text. A structured understanding of it. "Client X: what they care about, what they've decided, what the pattern is." One page per topic, updated every time new raw sources come in.
The schema is a configuration file, basically a table of contents plus instructions, that tells the AI how the wiki is organized and how to maintain it. This is what makes the whole thing self-maintaining.
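To make that concrete, here is a hypothetical example of what a schema file might contain for a service business. The directory names and maintenance rules are mine, purely illustrative, not from Karpathy's gist:

```markdown
# Wiki Schema (illustrative example)

## Layout
- clients/    one page per client: priorities, history, open questions
- decisions/  one page per significant decision, with reasoning and dates
- patterns/   cross-engagement observations, one page per pattern
- sops/       procedures, each with an owner and a last-reviewed date

## Maintenance rules
- On ingest, update every page the new source touches. Link, don't duplicate.
- Every claim on a wiki page cites the raw source it came from.
- Never modify anything under raw/. Raw sources are the record.
```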
RAG makes AI answer faster. LLM Wiki makes AI answer better, because it remembers everything the business has learned, not just what's in the document you uploaded today.
The difference: RAG retrieves at query time. LLM Wiki synthesizes at ingestion. By the time you ask a question, the wiki already understands what the answer should draw from.
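In sketch form, the two architectures put the expensive LLM reading on opposite sides of the question. The names below (`index`, `wiki`, `llm`) are stand-ins for whatever tooling you use, not a real library:

```python
# RAG: the model reads at query time, scoped to whatever was retrieved.
def rag_answer(question, index, llm):
    chunks = index.search(question, k=5)   # retrieval happens now
    return llm(f"Context: {chunks}\n\nQuestion: {question}")

# LLM Wiki: the model reads at ingestion and writes synthesis into pages.
def wiki_ingest(raw_doc, wiki, schema, llm):
    for page in wiki.pages_touched_by(raw_doc, schema):  # synthesis happens now
        page.content = llm(
            f"Schema:\n{schema}\n\nCurrent page:\n{page.content}\n\n"
            f"New source:\n{raw_doc}\n\nRewrite the page to absorb the source."
        )

# By query time, the connecting of dots is already done.
def wiki_answer(question, wiki, llm):
    pages = wiki.relevant_pages(question)
    return llm(f"Context: {pages}\n\nQuestion: {question}")
```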
Why Does LLM Wiki Matter More for Service Businesses Than for Developers?
Every developer implementation of Karpathy's pattern is about code repositories, research papers, or personal notes. That makes sense. Developers have the tooling and the motivation to experiment first.
But here's what most developer implementations don't capture: service businesses run on a different kind of knowledge.
Meeting transcripts. Discovery call notes. The pricing decision you made for a client based on three different conversations. The SOP that was in your head until you tried to hand it off to someone. The pattern you've noticed across your last six engagements that you've never written down because you haven't had time.
This is the raw material for a business LLM Wiki. And it's the exact kind of knowledge that RAG handles worst, because it requires connecting dots across time, not finding the most relevant paragraph.
A developer's LLM Wiki ingests commit logs and documentation. A service business LLM Wiki ingests what the founder knows. Clients, decisions, patterns, failures, SOPs. The institutional memory that currently lives only in one person's head.
That's not just an interesting technical distinction. It's the difference between a business that runs when the founder is there and a business that runs when they're not.
How Does the Business AI Operating System Context Layer Implement This?

I've been running this pattern in my own business since November 2025. The knowledge/ directory in my Business AI Operating System workspace runs exactly the four-operation cycle Karpathy describes: ingest, reflect, commit, lint-wiki.
The /ingest command takes a raw source (a meeting transcript, a research report, a decision document) and synthesizes it into a structured wiki page, connecting it to what's already in the knowledge base. Not just storing it. Synthesizing it.
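A stripped-down sketch of what that looks like with plain markdown files on disk. The paths mirror a knowledge/ layout, but `llm` is a stand-in for the model call and the prompts are compressed for illustration; this is the shape of the operation, not the actual command:

```python
from pathlib import Path

WIKI = Path("knowledge/wiki")
SCHEMA = Path("knowledge/CLAUDE.md").read_text()  # the schema, read each run

def ingest(source: Path, llm) -> None:
    """Synthesize one raw source into the wiki. Never modify the source."""
    raw = source.read_text()
    # First pass: ask the model which wiki pages this source touches.
    targets = llm(f"{SCHEMA}\n\nList the wiki pages this source affects, "
                  f"one name per line:\n\n{raw}")
    for name in targets.splitlines():
        page = WIKI / f"{name.strip()}.md"
        existing = page.read_text() if page.exists() else ""
        # Second pass: rewrite the page so old knowledge and the new
        # source are synthesized together, not appended.
        page.write_text(llm(
            f"{SCHEMA}\n\nRewrite this page to absorb the new source. "
            f"Cross-reference, don't duplicate.\n\n"
            f"PAGE:\n{existing}\n\nSOURCE:\n{raw}"
        ))
```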
The /lint-wiki command runs a monthly health check across the entire knowledge base: it flags stale claims, orphan pages, and contradictions between what was decided in January and what was updated in March.
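Part of that check is mechanical before a model ever needs to judge anything. A sketch of the mechanical half, assuming `[[wikilink]]`-style cross-references and a 90-day staleness threshold (both assumptions, not Karpathy's spec):

```python
import re
import time
from pathlib import Path

WIKI = Path("knowledge/wiki")
STALE_DAYS = 90  # assumption: pages untouched this long get flagged

def lint_wiki(wiki: Path = WIKI) -> list[str]:
    issues: list[str] = []
    linked: set[str] = set()
    pages = sorted(wiki.glob("*.md"))
    for page in pages:
        text = page.read_text()
        # Collect every [[Page Name]] reference to find orphans later.
        linked |= {m.strip() for m in re.findall(r"\[\[(.+?)\]\]", text)}
        age_days = (time.time() - page.stat().st_mtime) / 86400
        if age_days > STALE_DAYS:
            issues.append(f"stale: {page.name} untouched {age_days:.0f} days")
    for page in pages:
        if page.stem not in linked:
            issues.append(f"orphan: {page.name} has no inbound links")
    return issues

# Contradiction checks still need the model: pass suspect page pairs to
# the LLM and ask whether their claims conflict.
```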
My CLAUDE.md file is the schema. It tells the AI how the knowledge base is organized, what each subdirectory contains, and how to navigate it. Every session, the AI reads the schema first. It already knows where everything is.
The first client implementation is running now. For Shawn Hamilton at High Holler Whiskey, the raw inputs are investor data, ClickUp records, and transcripts from our work sessions. The wiki is Shawn's structured founder brain: client history, decisions made, the reasoning behind them, the patterns he's noticed across his market. The agent interface is the dashboard he uses to interact with it.
He can ask his AI "what was the reasoning behind our pricing decision for the distributor conversation last fall?" and get an answer that draws from three different source documents, synthesized into something coherent, not just a retrieved chunk.
This is what the Context layer of a Business AI Operating System does. It's the foundation every other layer depends on. Without compounding context, the intelligence layer is static. The automation layer is brittle. And every session starts from zero.
For how this plays out when knowledge management is the core constraint, read How AI-Powered Knowledge Management Turns Founder Expertise Into a Scalable Business Asset.
When Is RAG Still the Right Choice?
RAG wins in specific situations, and it's worth being clear about them.
If you're running a legal practice and need to search across 10,000 case files, RAG is right. The synthesis would be impractical at that volume, and the questions are mostly retrieval questions anyway: find the precedent, find the clause, find the date.
Same for medical literature search, enterprise content repositories, or any situation where the document library is massive and heterogeneous and the questions are mostly "find the document that contains X."
LLM Wiki is better when you want the knowledge to compound. When the questions are about patterns, history, reasoning, and context. When the answer to "what should I do here?" requires understanding what the business has learned, not just what's in the most recent document.
For a 10-person service business: LLM Wiki. For a law firm needing to search case archives: RAG.
Neither is a magic fix. Both require you to actually feed them good raw material.
If you're not sure whether your business has this problem or what it would take to fix it, that's what the fit call is for. It's a free 45-minute diagnostic — think of it like going to the doctor. I ask questions about what's actually going on, and at the end I'll tell you honestly whether what I do is a fit. No pitch, no proposal. Just an honest read. Book a fit call.