codebase-rag
A Python RAG system for chatting with codebases: chunking source code into a vector store, then using LLMs for semantic search and natural-language Q&A over code.
Roy Zhu · Waterloo, ON
Seventeen years of commerce, gaming, and platform engineering, from Beijing-scale distributed systems to a single SQLite file. Now consulting from Waterloo, learning every new field the same way: build something real, find where it breaks, write down what holds up.
A TypeScript multi-channel AI agent runtime: Telegram and Discord connected to one runtime with session persistence, task scheduling, job resumption, and structured observability.
LLM extraction is brutally slow. We solved it without a message broker: a single SQLite table acts as a durable async queue, a transient worker process drains it, and a lease in the meta table guarantees only one worker runs at a time, with built-in crash recovery.
Why pure graph traversal fails on long passages and pure vector search loses logical links, and how fusing SQLite FTS5 BM25 ranking with graph BFS, plus a dynamic knapsack-style token-budget cutoff, hit 92.3% recall in our offline setting.
How to fix SQLite database locks when multiple AI agents write to memory simultaneously, using WAL mode, threading.local, and application-layer RLocks.
Why Neo4j is overkill for a personal AI agent's memory, and how to implement fast, multi-hop context retrieval using BFS in pure Python over SQLite.
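As a rough illustration of that last idea, here is a minimal sketch of hop-limited BFS over an edge table in SQLite, using only the Python standard library. The `edges` table, its `src`/`dst` columns, the sample rows, and the function names are all hypothetical, chosen for the example; the schema in the actual post may differ.

```python
import sqlite3
from collections import deque


def build_demo_graph() -> sqlite3.Connection:
    """Create a tiny in-memory graph. Table and column names are assumptions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
    # An index on src keeps each neighbor lookup cheap as the table grows.
    conn.execute("CREATE INDEX idx_edges_src ON edges (src)")
    conn.executemany(
        "INSERT INTO edges (src, dst) VALUES (?, ?)",
        [
            ("alice", "project_x"),
            ("project_x", "sqlite"),
            ("sqlite", "wal_mode"),
            ("alice", "waterloo"),
        ],
    )
    conn.commit()
    return conn


def bfs_neighbors(conn: sqlite3.Connection, start: str, max_hops: int = 2) -> dict[str, int]:
    """Return {node: hop_distance} for nodes reachable from `start` within `max_hops`.

    Edges are followed in the src -> dst direction only.
    """
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        hop = distances[node]
        if hop == max_hops:
            continue  # do not expand nodes that already sit at the hop limit
        rows = conn.execute("SELECT dst FROM edges WHERE src = ?", (node,))
        for (neighbor,) in rows:
            if neighbor not in distances:  # first visit is the shortest path in BFS
                distances[neighbor] = hop + 1
                queue.append(neighbor)
    return distances
```

Calling `bfs_neighbors(build_demo_graph(), "alice", max_hops=2)` reaches `project_x` and `waterloo` at one hop and `sqlite` at two, while `wal_mode` stays out of range. This version issues one query per visited node; batching a whole frontier into a single `IN (...)` query is one possible refinement if round trips become the bottleneck.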