Concurrency in Local-First AI: Making SQLite Multi-Agent Safe

The Crash of Concurrent Agents

When you run multiple local AI agents at once, one monitoring your calendar, another summarizing unread emails, a third answering a direct question, they all extract knowledge triples and try to write to the memory graph at the same time.

If you are using a standard SQLite setup, your Python console will quickly light up with sqlite3.OperationalError: database is locked.

By default, SQLite is designed for simple, serialized access. When one thread begins writing, the entire database file is locked. No other thread can write, and worse, no other thread can read. For a memory layer that needs to stay responsive across multiple threads, this blocking is a real problem.

You don’t need to migrate to PostgreSQL. In membox, we solved this with three changes to the SQLite setup.

Step 1: Bypassing Default Locks with WAL Mode

The first and most important fix happens at the database engine level. The default SQLite rollback journal requires exclusive file locks. We switch this to Write-Ahead Logging (WAL) mode.

conn.execute("PRAGMA journal_mode=WAL;")
conn.execute("PRAGMA synchronous=NORMAL;")

In WAL mode, SQLite appends writes to a separate log file (.wal) instead of modifying the main database directly. The concurrency benefit: readers no longer block writers, and writers no longer block readers. An agent traversing the graph to answer a question will not be paused just because another agent is dumping 50 new relations into the store.

Step 2: Thread-Local Connections

In Python, sqlite3.Connection objects cannot safely be shared across threads. If multiple agent threads share a single global connection, the C-level SQLite library will throw ProgrammingError.

The solution is to maintain a separate database connection for each thread using Python’s threading.local(). In our ConnectionManager class:

import sqlite3
import threading

class ConnectionManager:
    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._local = threading.local()
        self.write_lock = threading.RLock()

    def connection(self) -> sqlite3.Connection:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # Opened in autocommit mode
            conn = sqlite3.connect(self.db_path, isolation_level=None, timeout=30.0)
            conn.execute("PRAGMA journal_mode=WAL;")
            conn.execute("PRAGMA foreign_keys=ON;")
            self._local.conn = conn
        return conn

This ensures that every worker thread gets its own isolated connection state, while still writing concurrently to the exact same .db file.

Step 3: Protecting the Critical Section with `RLock`

WAL mode and thread-local connections solve database-level locks, but they don’t solve application-logic race conditions.

Consider the “find-or-create” entity pipeline we discussed in previous posts. It is a multi-step operation: check alias → calculate embedding → insert new. If two agents process the entity “Acme Corp” simultaneously, both might see that it doesn’t exist, and both will attempt to INSERT it. The second thread will crash with an IntegrityError due to SQL UNIQUE constraints.

To serialize this specific critical section without locking the entire database for reads, we use an application-layer Reentrant Lock (threading.RLock()):

    def find_or_create_entity(self, name: str, type_: str, description: str, ...) -> int:
        alias = normalize_name(name)

        # Fast path check (no lock)
        eid = self.find_entity_by_alias(alias)
        if eid is not None: return eid

        # The critical section
        with self._cm.write_lock:
            # Re-check under lock (another thread may have created it)
            eid = self.find_entity_by_alias(alias)
            if eid is not None: return eid

            # ... Embedding logic ...

            try:
                return self.create_entity(name, type_, description, embedding)
            except sqlite3.IntegrityError:
                # Fallback: Another distinct process won the race
                return self._resolve_race_winner(name, alias)

The RLock ensures that threads within the same Python process line up orderly to create new entities. The try...except sqlite3.IntegrityError block acts as a safety net: if a completely separate Python process (not just a thread) sneaks an insert in, we catch the database-level error and resolve to the winner’s newly minted entity_id.

Testing and Tradeoffs

In membox, this setup passes our concurrency tests. We throw threads at the database with identical entities and get no lockups, no duplicate rows, and consistent data.

The tradeoff: The RLock is bounded to a single Python process. If you deploy 50 separate Docker containers all hitting the same mounted SQLite file, the Python lock does nothing, and you rely entirely on SQLite’s 30-second connection timeout and the IntegrityError fallback.

For a local-first AI memory layer where the orchestrator runs in a single desktop app or background process, this setup (WAL + threading.local + RLock) gives you reliable concurrency without running a separate database server.

Up Next

With concurrency solved, our memory layer is solid at the infrastructure level. But as the agent operates for months, the graph will grow massive. The next post looks at where pure graph traversal breaks down on long documents, and how combining BM25 keyword search with graph BFS fixes it.