System architecture
Version: 8.1.0 | Status: Active | Audience: Developers, Systems Architects, Maintainers
1. High-level system overview
Vault Intelligence is an Obsidian plugin that transforms a static markdown vault into an active knowledge base using local and cloud-based AI. It is designed as a Hybrid System that bridges local privacy (Web Workers, Orama) with cloud capability (Gemini).
System context diagram (C4 level 1)
Core responsibilities
- Indexing and retrieval: Converting markdown notes into vector embeddings and maintaining a searchable index.
- Semantic search: Finding relevant notes based on meaning, not just keywords.
- Agentic reasoning: An AI agent that uses tools (Search, Code, Read) to answer user questions using vault data. Supports multilingual system prompts.
- Vault hygiene (Gardener): A specialised agent that proposes metadata and structural improvements to the vault based on a shared ontology.
- Knowledge graph: Maintaining a formal graph structure of note connections (wikilinks) and metadata.
- Ontology management: Defining and enforcing a consistent vocabulary (concepts, entities) across the vault.
2. Core architecture and design patterns
Architectural pattern
The system follows a Service-Oriented Architecture (SOA) adapted for a monolithic client-side application.
- Services (e.g. `GraphService`, `ProviderRegistry`) encapsulate business logic and are instantiated as singletons in `main.ts`.
- Strategy pattern: used for the embedding layer (`RoutingEmbeddingService` switches between `Local` and `Gemini`).
- Facade pattern: `GraphService` acts as a facade over the complex `WebWorker <-> MainThread` communication. It provides high-level methods like `getGraphEnhancedSimilar` for views.
- Delegation pattern: `AgentService` delegates search and context assembly to `SearchOrchestrator` and `ContextAssembler`. It exposes `reflexSearch` for fast-path UI feedback.
- Plan-review-apply pattern: used by the `GardenerService` to ensure user oversight of vault modifications.
- Safe mutation: `MetadataManager` centralises all vault frontmatter updates, ensuring thread safety and idempotency.
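The strategy pattern above can be sketched as follows. This is illustrative only: the real `RoutingEmbeddingService` takes more dependencies, and the `EmbeddingBackend` interface and constructor shape here are assumptions.

```typescript
// Illustrative sketch of the embedding-layer strategy pattern.
// `EmbeddingBackend` is an assumed name, not the plugin's actual type.
interface EmbeddingBackend {
  embedQuery(text: string): Promise<{ vector: number[]; tokenCount: number }>;
}

class RoutingEmbeddingService implements EmbeddingBackend {
  constructor(
    private local: EmbeddingBackend,
    private gemini: EmbeddingBackend,
    private useCloud: () => boolean, // e.g. driven by plugin settings
  ) {}

  // Callers depend on one interface; the concrete backend is chosen
  // per call based on the current settings.
  embedQuery(text: string) {
    return this.useCloud()
      ? this.gemini.embedQuery(text)
      : this.local.embedQuery(text);
  }
}
```

Because consumers only see the shared interface, swapping local for cloud embeddings requires no changes outside the router.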
Brain vs Body
- The body (views): React is NOT used. Views (`ResearchChatView.ts`) are built with native DOM manipulation or simple rendering helpers to keep the bundle small and performance high. State is local to the view.
- The brain (services): all heavy lifting happens in services. Views never touch `app.vault` directly; they ask dedicated managers like `VaultManager` or `MetadataManager` to perform operations.
Dependency injection
Manual Dependency Injection is used in main.ts. Services are instantiated in a specific order and passed via constructor injection to dependent services.
```typescript
// main.ts
this.geminiProvider = new GeminiProvider(settings); // Implements IModelProvider, IReasoningClient, IEmbeddingClient
this.providerRegistry = new ProviderRegistry(settings, app, this.geminiProvider); // Manages available AI providers (Gemini, Ollama)
this.embeddingService = new RoutingEmbeddingService(..., this.geminiProvider); // Injects dependency
this.graphService = new GraphService(..., this.embeddingService); // Injects dependency
```

3. Detailed data flow
3.1. The "Vectorization" pipeline (indexing)
Indexing pipeline architecture
- Intent: Converts raw markdown edits into searchable vector embeddings and graph relationships.
- Trigger mechanism: `vault.on('modify')` event (debounced).
- The "black box" contract:
  - Input: `TFile`
  - Output: `OramaDocument` + `GraphNode`
- Mechanics:
- Processing details:
- Excalidraw sanitization: the worker automatically detects and strips `compressed-json` blocks from drawings, preserving only the actual text labels, to prevent high-entropy JSON metadata from "poisoning" the vector space.
- Semantic context injection: the system prepends a standard header (Title, Folder Structure, Topics, Tags, Author) to every document chunk. This creates "semantic bridges" that let the index associate concepts even without explicit wikilinks.
- Implicit graph injection: the indexing worker monitors the `implicitFolderSemantics` matrix to inject physical folder paths as structural graph edges (`source: implicit-folder`). It uses `aliasMap` validation to promote valid overarching directories into the semantic graph without cluttering the WebGL visualization with generic storage names like `/Inbox/`.
- Hybrid storage (Slim-Sync):
  - Hot store (IndexedDB): the primary, full-content Orama state used for fast local searches. It is sharded by model hash (e.g. `orama_index_<model-hash>`) to ensure isolation between different embedding models.
  - Persistence safety: to prevent split-brain collisions between the main thread and the background worker, the main thread uses a separate `orama_index_buffer_<model-hash>` namespace for its serialization buffers.
  - Cold store (vault file): a "slim" serialized version synced to the `.vault-intelligence` folder, sharded by model hash (e.g. `graph-state-<model-hash>.msgpack`). For cross-device efficiency, note content is stripped from the documents (`content: ""`) before save.
- Serial queue: `GraphService` implements a serial `processingQueue` to handle rate limiting and prevent worker overload.
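The serial-queue technique can be sketched with promise chaining. This is a minimal illustration; the real `processingQueue` also handles rate limiting and worker-specific error recovery, and the class name here is an assumption.

```typescript
// Minimal sketch of a serial processing queue: each task starts only
// after the previous one settles, so the worker never receives two
// indexing jobs concurrently.
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Run the task after the current tail settles (success or failure).
    const next = this.tail.then(task, task);
    // Swallow errors on the chain itself so one failed job
    // does not poison all subsequent jobs.
    this.tail = next.catch(() => undefined);
    return next;
  }
}
```

Callers still receive the original promise (including rejections), while the internal chain keeps advancing.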
3.1.1 Graph Link Resolution (Systemic Path Resolution)
To support Obsidian's flexible [[Basename]] linking without creating "Ghost Nodes", the GraphService maintains a global alias map.
- Synchronization: `GraphService.syncAliases()` iterates all vault files and maps `basename.toLowerCase() -> fullPath`.
- Worker update: this map is pushed to the `IndexerWorker`.
- Resolution: during indexing, `resolvePath` uses this map to canonicalize all links (e.g. `[[Agentic AI]]` -> `Ontology/Concepts/Agentic AI.md`).
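A minimal sketch of the resolution above. The real `resolvePath` likely handles more cases; the helper names and the fallback behaviour for unknown links are assumptions.

```typescript
// Build the global alias map: basename.toLowerCase() -> fullPath.
function buildAliasMap(paths: string[]): Record<string, string> {
  const map: Record<string, string> = {};
  for (const path of paths) {
    const basename = path.split("/").pop()!.replace(/\.md$/, "");
    map[basename.toLowerCase()] = path;
  }
  return map;
}

// Canonicalize a [[Basename]] link case-insensitively. Unknown links are
// returned unchanged here (an assumption) rather than minting ghost nodes.
function resolvePath(link: string, aliasMap: Record<string, string>): string {
  return aliasMap[link.toLowerCase()] ?? link;
}
```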
3.2. Search and answer loop (data flow)
The RAG cycle
- Intent: User asks a question in the chat.
- Mechanics:
- Tool calling loop (control flow):
The AgentService uses a deliberative loop to handle multiple tool calls (up to `maxAgentSteps`) before providing a final answer. In Version 8.1.0, this loop is implemented as an asynchronous generator (`chatStream`) that yields partial results (tokens and status) to the UI.
Streaming architecture
- Token-driven UI: `GeminiProvider` yields raw tokens immediately as they arrive from the Google AI SDK.
- Stateful metadata aggregation: the provider maintains state during the stream to capture and aggregate structural metadata (e.g. `thought_signature`) that may be split across multiple chunks, ensuring exact reconstruction of conversation history.
- Status interleaving: `AgentService` interleaves text tokens with tool status updates (e.g. `isThinking: true`) in the same stream.
- Recursive orchestration: if a tool is called, `chatStream` recurses to handle subsequent LLM calls while continuing to yield to the original UI consumer.
- Cancellation (`AbortSignal`): a shared `AbortSignal` is passed from the View to the Provider. If aborted, the loop breaks instantly and any active network requests are terminated.
- Flicker-free render swap: `ResearchChatView` performs throttled (100 ms) Markdown rendering into a detached off-screen element, followed by a synchronous `appendChild` swap. This preserves smooth streaming and real-time formatting without layout thrashing.
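The status interleaving and cancellation described above can be sketched as an async generator. The `StreamChunk` shape below is an assumption for illustration, not the plugin's actual type, and the real `chatStream` also drives tool execution.

```typescript
// Assumed chunk shape: text tokens and status updates share one stream.
type StreamChunk =
  | { kind: "token"; text: string }
  | { kind: "status"; isThinking: boolean };

async function* chatStream(
  tokens: AsyncIterable<string>,
  signal: AbortSignal,
): AsyncGenerator<StreamChunk> {
  yield { kind: "status", isThinking: true };
  for await (const text of tokens) {
    if (signal.aborted) return; // cancellation breaks the loop instantly
    yield { kind: "token", text };
  }
  yield { kind: "status", isThinking: false };
}
```

The UI consumer simply iterates the generator, rendering tokens and toggling spinners on status chunks, without knowing whether a tool call happened underneath.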
3.3. Context assembly (relative accordion)
To maximise the utility of the context window while staying within precise token budgets (extracted from LLM usage metadata), the ContextAssembler employs Relative Accordion Logic: it dynamically scales document density based on the gap between the top match and secondary results:
| Relevance Tier | Threshold | Strategy |
|---|---|---|
| Primary | >= 90% of top | Full file content (subject to 10% soft limit cap). |
| Supporting | >= 70% of top | Contextual snippets extracted around search terms. |
| Structural | >= 35% of top | Note structure (headers) only. Capped at top 10 files. |
| Filtered | < 35% of top | Skipped entirely to reduce prompt noise. |
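The tier table translates directly into a threshold function. A sketch only; the function and tier names are illustrative, and the real assembler also applies the soft-limit cap and top-10 structural cutoff.

```typescript
type Tier = "primary" | "supporting" | "structural" | "filtered";

// Thresholds are RELATIVE to the top match's score, per the table above.
function classifyTier(score: number, topScore: number): Tier {
  const ratio = score / topScore;
  if (ratio >= 0.9) return "primary";     // full file content
  if (ratio >= 0.7) return "supporting";  // contextual snippets
  if (ratio >= 0.35) return "structural"; // headers only
  return "filtered";                      // skipped entirely
}
```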
Hybrid score calibration
To ensure keyword matches do not overwhelm semantic similarity, the SearchOrchestrator applies a Sigmoid Calibration to BM25 scores before blending.
The formula used is: `normalizedScore = score / (score + keywordWeight)`
Where keywordWeight (default 1.2) is a configurable parameter in the plugin settings. This ensures that while keyword scores are unbounded, the normalized result always approaches 1.0 asymptotically, preserving ranking granularity without breaking the 0-100% scale.
This "Relative Ranking" approach ensures that even in large vaults, the agent only receives high-confidence information, preventing "hallucination by bloat".
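The calibration formula above, as code. The function name is illustrative; the default of 1.2 follows the `keywordWeight` setting described above.

```typescript
// Map an unbounded BM25 score into [0, 1). The result approaches 1.0
// asymptotically, so keyword hits can never overwhelm the 0-100%
// semantic similarity scale when the two are blended.
function calibrateKeywordScore(score: number, keywordWeight = 1.2): number {
  return score / (score + keywordWeight);
}
```

Note that a score equal to `keywordWeight` normalizes to exactly 0.5, which makes the setting an intuitive "half-saturation" knob.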
3.4. Dynamic model ranking and fetching
The ModelRegistry synchronises available Gemini models and ranks them to ensure the user always has access to the most capable stable versions.
- Fetch: Models are fetched from the Google AI API and cached locally.
- Scoring: a weighted scoring system (`ModelRegistry.sortModels`) ranks models based on:
  - Tier: Gemini 3 > Gemini 2.5 > Gemini 2 > Gemini 1.5.
  - Capability: Pro > Flash > Lite.
  - Stability: preview or experimental versions receive a penalty.
- Budget scaling: when switching models, `calculateAdjustedBudget` scales the user's context configuration proportionally (e.g. a 10% budget on a 1M-token model remains a 10% budget, in absolute-token terms, on a 32k-token model).
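A sketch of the proportional scaling. Per the interface later in this document, the real `calculateAdjustedBudget` takes model IDs and looks up their token limits; this illustrative version takes the limits directly.

```typescript
// Rescale a token budget so its FRACTION of the context window stays
// constant across models. Limits here are illustrative inputs.
function calculateAdjustedBudget(
  currentBudget: number,
  oldInputTokenLimit: number,
  newInputTokenLimit: number,
): number {
  const fraction = currentBudget / oldInputTokenLimit; // e.g. 10% of 1M
  return Math.round(fraction * newInputTokenLimit);    // 10% of 32k
}
```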
3.5. Model fetching and budget scaling (metadata flow)
Dynamic model reconfiguration
- Intent: Synchronize available Gemini models and ensure context budgets are scaled proportionally to model limits.
- Mechanics:
- Models are ranked based on their capabilities (Flash vs Pro) and version (Gemini 3 > 2 > 1.5). Preview and experimental models receive a penalty in ranking to prefer stable releases for the main user interface.
3.6. System mechanics and orchestration
- Pipeline registry: there is no central registry. Pipelines are implicit in the event listeners registered by `GraphService` in `registerEvents()`.
- Extension points: currently closed. New pipelines require modifying `GraphService`.
- The event bus: the plugin relies on Obsidian's global `app.metadataCache` and `app.vault` events. UI events are handled by Views; system events are handled by `VaultManager`.
3.7. The "Gardening" cycle (vault hygiene)
Gardener plan-act cycle
- Intent: Systematic improvement of vault metadata and structure.
- Trigger mechanism: Manual command or periodic background scan.
- The "black box" contract:
- Input: Vault subset + Ontology context.
- Output: Interactive Gardener Plan (JSON-in-Markdown).
- Stages:
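The "Interactive Gardener Plan (JSON-in-Markdown)" output might look like the following. The actual schema is not documented here, so every field name below is an invented assumption for illustration only.

```typescript
// HYPOTHETICAL plan shape -- NOT the plugin's actual schema. Only the
// "Gardener Plan" title prefix is taken from this document's constants.
interface GardenerAction {
  type: "update-frontmatter" | "move-file";
  path: string;
  payload: Record<string, unknown>;
}

interface GardenerPlan {
  title: string; // begins with the "Gardener Plan" prefix
  actions: GardenerAction[];
}

const examplePlan: GardenerPlan = {
  title: "Gardener Plan (example)",
  actions: [
    {
      type: "update-frontmatter",
      path: "Notes/AI.md",
      payload: { topics: ["Agentic AI"] },
    },
  ],
};
```

Embedding such a JSON block in a rendered Markdown note gives the user a reviewable, editable artifact before any mutation is applied, matching the plan-review-apply pattern.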
4. Control flow and interfaces
4.1. Core Service Relationships
4.2. Tool Execution Control Flow
The AgentService manages a deliberative loop where it consults the LLM, executes proposed tool calls, and feeds the results back into the conversation context until a final answer is reached or the step limit is hit.
Service interface documentation
IProvider
Defines the lifecycle of an AI provider.
```typescript
export interface IProvider {
  initialize?(): Promise<void>;
  terminate?(): Promise<void>;
}
```

IReasoningCapabilities
Defines the features supported by a reasoning provider.
```typescript
export interface IReasoningCapabilities {
  supportsCodeExecution: boolean;
  supportsStructuredOutput: boolean;
  supportsTools: boolean;
  supportsWebGrounding: boolean;
}
```

IModelProvider
Unified interface for reasoning and lifecycle.
```typescript
export interface IModelProvider extends IProvider, IReasoningCapabilities {}
```

IReasoningClient
Contract for chat and reasoning capabilities.
```typescript
export interface IReasoningClient {
  generateMessage(messages: UnifiedMessage[], options: ChatOptions): Promise<UnifiedMessage>;
  generateMessageStream(messages: UnifiedMessage[], options: ChatOptions): AsyncGenerator<StreamChunk>;
  generateStructured<T>(messages: UnifiedMessage[], schema: z.ZodType<T>, options: ChatOptions): Promise<T>;
  searchWithGrounding(query: string): Promise<{ text: string }>;
  solveWithCode(prompt: string): Promise<{ text: string }>;
}

/**
 * Exact History Preservation
 *
 * To support reasoning models (Gemini 3) that utilize multi-part messages
 * (including hidden thought/signature parts), the system captures the
 * raw SDK response parts in `UnifiedMessage.rawContent`. This ensures
 * history is reconstructed exactly as the provider expects in tool loops.
 */
```

IEmbeddingClient
Contract for vector embedding generation.
```typescript
export interface IEmbeddingClient {
  embedQuery(text: string, priority?: EmbeddingPriority): Promise<{ vector: number[], tokenCount: number }>;
  embedDocument(text: string, title?: string, priority?: EmbeddingPriority): Promise<{ vectors: number[][], tokenCount: number }>;
  updateConfiguration?(): void;
}
```

WorkerAPI (Comlink interface)
The contract exposed by the Web Worker to the main thread.
```typescript
export interface WorkerAPI {
  initialize(config: WorkerConfig, fetcher?: unknown, embedder?: (text: string, title: string) => Promise<number[]>): Promise<void>;
  updateFile(path: string, content: string, mtime: number, size: number, title: string): Promise<void>;
  getFileStates(): Promise<Record<string, { mtime: number, hash: string }>>;
  deleteFile(path: string): Promise<void>;
  renameFile(oldPath: string, newPath: string): Promise<void>;
  search(query: string, limit?: number): Promise<GraphSearchResult[]>;
  keywordSearch(query: string, limit?: number): Promise<GraphSearchResult[]>;
  searchInPaths(query: string, paths: string[], limit?: number): Promise<GraphSearchResult[]>;
  getSimilar(path: string, limit?: number): Promise<GraphSearchResult[]>;
  getNeighbors(path: string, options?: { direction?: 'both' | 'inbound' | 'outbound'; mode?: 'simple' | 'ontology'; decay?: number }): Promise<GraphSearchResult[]>;
  getCentrality(path: string): Promise<number>;
  getBatchCentrality(paths: string[]): Promise<Record<string, number>>;
  getBatchMetadata(paths: string[]): Promise<Record<string, { title?: string, headers?: string[] }>>;
  getFileState(path: string): Promise<{ mtime: number, hash: string } | null>;
  updateAliasMap(map: Record<string, string>): Promise<void>;
  saveIndex(): Promise<Uint8Array>;
  loadIndex(data: string | Uint8Array): Promise<void>;
  updateConfig(config: Partial<WorkerConfig>): Promise<void>;
  clearIndex(): Promise<void>;
  fullReset(): Promise<void>;
}
```

IOntologyService (Internal)
Manages the knowledge model and classification rules.
```typescript
export interface IOntologyService {
  getValidTopics(): Promise<{ name: string, path: string }[]>;
  getOntologyContext(): Promise<{ folders: Record<string, string>, instructions?: string }>;
  validateTopic(topicPath: string): boolean;
}
```

IModelRegistry (Static Interface)
Registers and sorts available AI models.
```typescript
export interface ModelDefinition {
  id: string;
  label: string;
  provider: 'gemini' | 'local';
  inputTokenLimit?: number;
  outputTokenLimit?: number;
}

export class ModelRegistry {
  public static fetchModels(app: App, apiKey: string, cacheDurationDays?: number): Promise<void>;
  public static getChatModels(): ModelDefinition[];
  public static getEmbeddingModels(provider?: 'gemini' | 'local'): ModelDefinition[];
  public static getGroundingModels(): ModelDefinition[];
  public static calculateAdjustedBudget(current: number, oldId: string, newId: string): number;
}
```

GraphService (Facade)
Manages the semantic graph and vector index worker.
```typescript
export class GraphService {
  public initialize(): Promise<void>;
  public search(query: string, limit?: number): Promise<GraphSearchResult[]>;
  public keywordSearch(query: string, limit?: number): Promise<GraphSearchResult[]>;
  public getSimilar(path: string, limit?: number): Promise<GraphSearchResult[]>;
  public getNeighbors(path: string, options?: any): Promise<GraphSearchResult[]>;
  public getGraphEnhancedSimilar(path: string, limit: number): Promise<GraphSearchResult[]>;
  public scanAll(forceWipe?: boolean): Promise<void>;
  public forceSave(): Promise<void>;
}
```

SearchOrchestrator
Orchestrates hybrid search strategies.
```typescript
export class SearchOrchestrator {
  public search(query: string, limit: number): Promise<VaultSearchResult[]>;
}
```

AgentService
Orchestrates the AI agent activities, manages tool execution, chat history, and context assembly.
```typescript
export class AgentService {
  public chat(messages: ChatMessage[], currentPrompt: string, contextFiles?: TFile[], options?: any): Promise<{ createdFiles: string[]; files: string[]; text: string }>;
  public chatStream(messages: ChatMessage[], currentPrompt: string, contextFiles?: TFile[], options?: any): AsyncIterableIterator<StreamChunk>;
  public prepareContext(inputMessage: string, modelId?: string): Promise<{ contextFiles: TFile[], cleanMessage: string, warnings: string[] }>;
  public reflexSearch(query: string, limit: number): Promise<VaultSearchResult[]>;
}
```

McpClientManager
Manages connections to standalone Model Context Protocol (MCP) servers, executing tools and retrieving resources.
```typescript
export class McpClientManager implements IProvider {
  public initialize(): Promise<void>;
  public terminate(): Promise<void>;
  public getAvailableTools(): Promise<IToolDefinition[]>;
  public executeTool(namespaceName: string, args: Record<string, unknown>, signal?: AbortSignal): Promise<{ text: string }>;
  public getAvailableResources(): Promise<Record<string, unknown>[]>;
  public readResource(serverId: string, uri: string): Promise<Record<string, unknown>>;
}
```

MetadataManager
Centralizes and safely manages vault modifications and frontmatter updates to prevent race conditions.
```typescript
export class MetadataManager {
  public updateFrontmatter(file: TFile, updates: (frontmatter: Record<string, unknown>) => void): Promise<void>;
  public hasKey(file: TFile, key: string): boolean;
  public getKeyValue(file: TFile, key: string): unknown;
  public createFolderIfMissing(path: string): Promise<void>;
  public createFileIfMissing(path: string, content: string): Promise<void>;
}
```

ProviderRegistry
Central registry for instantiating and retrieving reasoning and embedding clients dynamically.
```typescript
export class ProviderRegistry {
  public getReasoningClient(modelId?: string): IReasoningClient;
  public getEmbeddingClient(provider?: 'gemini' | 'local' | 'ollama'): IEmbeddingClient;
  public getModelProvider(modelId?: string): IModelProvider;
}
```

5. Magic and configuration
Constants reference (src/constants.ts)
| Constant | Value | Description |
|---|---|---|
| `WORKER_INDEXER_CONSTANTS.SEARCH_LIMIT_DEFAULT` | 5 | Default number of results for vector search. |
| `WORKER_INDEXER_CONSTANTS.SIMILARITY_THRESHOLD_STRICT` | 0.001 | Minimum cosine similarity to consider a note "related". |
| `WORKER_INDEXER_CONSTANTS.KEYWORD_TOLERANCE` | 2 | Levenshtein distance allowed for fuzzy keyword matching. |
| `WORKER_INDEXER_CONSTANTS.RECALL_THRESHOLD_PERMISSIVE` | 1.0 | Orama threshold setting for maximum recall (permissive/OR logic). |
| `SEARCH_CONSTANTS.CHARS_PER_TOKEN_ESTIMATE` | 4 | Heuristic for budget calculation (English text). |
| `SEARCH_CONSTANTS.SINGLE_DOC_SOFT_LIMIT_RATIO` | 0.10 | Prevents any single doc from starving others in context assembly. |
| `GARDENER_CONSTANTS.PLAN_PREFIX` | "Gardener Plan" | Prefix for generated hygiene plans. |
| `WORKER_CONSTANTS.CIRCUIT_BREAKER_RESET_MS` | 300000 | Time (5 min) before retrying a crashed worker. |
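A worked example combining two of the constants above: deciding whether a document fits its per-document share of the context budget. The helper name is illustrative; the real assembler lives in the ContextAssembler.

```typescript
// Values mirror SEARCH_CONSTANTS in the table above.
const CHARS_PER_TOKEN_ESTIMATE = 4;      // ~4 chars per token (English)
const SINGLE_DOC_SOFT_LIMIT_RATIO = 0.10; // max 10% of budget per doc

function fitsSoftLimit(docChars: number, contextBudgetTokens: number): boolean {
  const estimatedTokens = docChars / CHARS_PER_TOKEN_ESTIMATE;
  // No single document may consume more than 10% of the assembled context.
  return estimatedTokens <= contextBudgetTokens * SINGLE_DOC_SOFT_LIMIT_RATIO;
}
```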
Anti-pattern watchlist
- Direct `app.vault` access in views: NEVER access the vault directly in a View for write operations. Use `VaultManager` or `MetadataManager`.
- Blocking the main thread: NEVER perform synchronous heavy math or huge JSON parsing on the main thread. Use the indexer worker.
- Local state in services: services should remain stateless where possible, deferring state to `settings` or the `GardenerStateService`.
6. External integrations
LLM provider abstraction
As of Version 8.1.0, the system is provider-agnostic and uses a `ProviderRegistry` to dynamically manage multiple AI engines. While Google Gemini is the default cloud implementation (`GeminiProvider`), the system also supports fully local models (`OllamaProvider`). All core services (`AgentService`, `GardenerService`, `SearchOrchestrator`) communicate via generic interfaces:
- Reasoning: `IReasoningClient` (messages, structured JSON, grounding, code).
- Embeddings: `IEmbeddingClient` (vectors).
- Capabilities: `IModelProvider` (flag-based feature detection).
Failover and retry logic
- API clients: providers should implement internal retry logic (e.g. exponential backoff for `429 Too Many Requests`). `GeminiProvider` uses this for all model interactions.
- Local worker: implements "Progressive Stability Degradation" (ADR-003). If the worker crashes, it restarts with simpler settings (threads -> 1, SIMD -> off).
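The backoff expectation can be sketched as a generic wrapper. The delays, retry predicate, and function name are illustrative, not `GeminiProvider`'s actual implementation.

```typescript
// Retry a call with exponential backoff when the error looks like an
// HTTP 429. Assumes the thrown error carries a numeric `status` field.
async function withBackoff<T>(
  call: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err: any) {
      const retriable = err?.status === 429 && attempt < maxRetries;
      if (!retriable) throw err;
      // 500 ms, 1 s, 2 s, ... doubling per attempt.
      await new Promise(r => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}
```

Non-429 errors propagate immediately, so genuine failures (bad API key, malformed request) are never masked by retries.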
7. Developer onboarding guide
Build pipeline
- Tool: `esbuild`.
- Config: `esbuild.config.mjs`.
- Worker bundling: the worker source (`src/workers/*.ts`) is inlined as base64 strings and injected into `main.js` using `esbuild-plugin-inline-worker`. This lets the plugin remain a single-file distributable.
Testing strategy
- Automated tests:
  - Unit tests for utilities (e.g. link parsing).
  - Regression tests for worker-side token accumulation logic (`tests/worker_accumulation.test.ts`).
  - Lifecycle and sharding integration tests for `GraphService`.
- Manual testing:
  - Use the "Debug Sidebar" (in Dev settings) to inspect the worker state.
  - Use `npm run dev` to watch for changes and hot-reload.