# Architecture Documentation
## Table of Contents
- [Overview](#overview)
- [Code Quality & Production Readiness](#code-quality--production-readiness)
- [System Architecture](#system-architecture)
- [Agentic Team Architecture](#agentic-team-architecture)
- [Component Design](#component-design)
- [Data Flow](#data-flow)
- [Adapter Pattern](#adapter-pattern)
- [Local Model Integration and Limitations](#local-model-integration-and-limitations)
- [Workflow Engine](#workflow-engine)
- [Security Architecture](#security-architecture)
- [Monitoring & Observability](#monitoring--observability)
- [Deployment Architecture](#deployment-architecture)
- [Design Patterns](#design-patterns)
- [Graph Context System](#graph-context-system)
- [Project-Scoped Graphs](#project-scoped-graphs)
- [Project Scanner](#project-scanner)
- [Obsidian Vault Export](#obsidian-vault-export)
- [Graphify — Code Knowledge Graph Engine](#graphify--code-knowledge-graph-engine)
- [Agentic Infrastructure](#agentic-infrastructure)
- [Performance Considerations](#performance-considerations)
- [Scalability](#scalability)
- [Optional: MCP Integration Layer](#optional-mcp-integration-layer)
## Overview
The AI Coding Tools Orchestrator is built on a modular, extensible architecture that enables multiple AI agents to collaborate effectively. The system follows enterprise design patterns and best practices for scalability, reliability, and maintainability.
### Core Principles
- **Modularity**: Clear separation of concerns between components
- **Extensibility**: Easy to add new agents and workflows
- **Reliability**: Robust error handling and retry logic
- **Performance**: Async execution and intelligent caching
- **Security**: Input validation, rate limiting, and audit logging
- **Observability**: Comprehensive metrics, structured logging, and automated report generation
## Code Quality & Production Readiness
The codebase has undergone a production-readiness overhaul achieving **Pylint 10.00/10** (up from 9.39/10) with 520 warnings eliminated, 386 tests passing, and all 15 pre-commit hooks green.
### Quality Metrics
| Metric | Value |
|--------|-------|
| **Pylint Score** | 10.00 / 10 (perfect — zero warnings) |
| **Test Suite** | 386 tests passing |
| **Pre-commit Hooks** | 15/15 passing (black, isort, flake8, mypy, bandit, pyupgrade, …) |
| **Warnings Eliminated** | 520 across the entire codebase |
### Pylint Configuration Philosophy
The `pyproject.toml` pylint configuration follows a strict philosophy: **suppress intentional design-pattern violations; fix everything else**.
**Line length** — `max-line-length = 120`. Black formats at 100 characters; pylint allows 120 to give slack for long strings, URLs, and generated code.
**Intentional suppressions (with documented rationale):**
| Code | Name | Rationale |
|------|------|-----------|
| R0801 | `duplicate-code` | `orchestrator/` and `agentic_team/` are independent by architectural design — parallel structure is intentional |
| R0902 | `too-many-instance-attributes` | Domain dataclasses legitimately carry many fields |
| R0917 | `too-many-positional-arguments` | Domain methods require multiple parameters |
| C0415 | `import-outside-toplevel` | Lazy imports for optional dependencies (Ollama, llama.cpp) |
| W0718 | `broad-exception-caught` | Error boundaries at adapter/CLI layer intentionally catch broadly |
| R0914 | `too-many-locals` | Complex algorithms (graph traversal, search ranking) |
| W0613 | `unused-argument` | Interface conformance — adapters implement `BaseAdapter` signatures |
| W0603 | `global-statement` | Singleton patterns for configuration and metrics |
**Similarity analysis** — minimum 8 similar lines, ignoring imports, docstrings, and comments to avoid false positives from boilerplate.
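A sketch of how this maps onto `pyproject.toml` (the section layout is an assumption; the suppressed codes and thresholds are those documented above):
```toml
[tool.pylint.format]
max-line-length = 120

[tool.pylint."messages control"]
disable = [
    "duplicate-code",                 # R0801: parallel orchestrator/agentic_team structure
    "too-many-instance-attributes",   # R0902: domain dataclasses
    "too-many-positional-arguments",  # R0917: domain methods
    "import-outside-toplevel",        # C0415: lazy optional imports
    "broad-exception-caught",         # W0718: adapter/CLI error boundaries
    "too-many-locals",                # R0914: complex algorithms
    "unused-argument",                # W0613: interface conformance
    "global-statement",               # W0603: singletons
]

[tool.pylint.similarities]
min-similarity-lines = 8
ignore-imports = true
ignore-docstrings = true
ignore-comments = true
```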
### Code Quality Patterns Enforced
- **Logging**: All logging uses lazy `%s` formatting — no f-string overhead in log calls
- **File I/O**: All file operations use explicit `encoding="utf-8"`
- **Abstract methods**: Use docstring-only body (no `pass` or `...`)
- **Subprocess calls**: Annotated with pylint disable comments where context manager usage isn't feasible
- **No stray `print()`**: All print statements in production code converted to `logger` calls
- **Pydantic compatibility**: `FieldInfo` false positives suppressed with inline comments
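Two of these patterns in a minimal sketch:
```python
import logging

logger = logging.getLogger(__name__)

def save_report(path: str, content: str, agent: str) -> None:
    # Lazy %s formatting: the message is only interpolated if the
    # record is actually emitted, so there is no f-string overhead.
    logger.info("Writing report for agent %s to %s", agent, path)
    # Explicit encoding on every file operation.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)
```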
## System Architecture
### High-Level Architecture
```mermaid
graph TD
subgraph "User Interfaces"
CLI[CLI Interface
Click + Rich]
UI[Web UI
Vue 3 + Flask]
end
subgraph "Core Orchestration Layer"
ORCH[Orchestrator Core]
WF[Workflow Engine]
TM[Task Manager]
CFG[Config Manager]
end
subgraph "Cross-Cutting Concerns"
SEC[Security Layer]
CACHE[Cache Layer]
METRICS[Metrics System]
LOG[Logging System]
RETRY[Retry Logic]
end
subgraph "Adapter Layer"
BASE[Base Adapter]
COMM[CLI Communicator]
CLA[Claude Adapter]
COD[Codex Adapter]
GEM[Gemini Adapter]
COP[Copilot Adapter]
OLL[Ollama Adapter]
LLAMA[LlamaCpp Adapter]
end
subgraph "Runtime Controls"
OFF[Offline Detector]
FB[Fallback Manager]
end
subgraph "External AI Services"
CLAUDE[Claude Code CLI]
CODEX[Codex CLI]
GEMINI[Gemini CLI]
COPILOT[Copilot CLI]
OLLAMA_API[Ollama API]
OPENAI_LOCAL[OpenAI-Compatible Local API]
end
CLI --> ORCH
UI --> ORCH
ORCH --> WF
ORCH --> TM
ORCH --> CFG
ORCH --> OFF
ORCH --> FB
ORCH -.-> SEC
ORCH -.-> CACHE
ORCH -.-> METRICS
ORCH -.-> LOG
ORCH -.-> RETRY
WF --> BASE
BASE --> COMM
BASE --> CLA
BASE --> COD
BASE --> GEM
BASE --> COP
BASE --> OLL
BASE --> LLAMA
CLA --> CLAUDE
COD --> CODEX
GEM --> GEMINI
COP --> COPILOT
OLL --> OLLAMA_API
LLAMA --> OPENAI_LOCAL
```
### Component Layers
1. **Interface Layer** - User-facing interfaces (CLI and Web UI)
2. **Orchestration Layer** - Core business logic and workflow management
3. **Cross-Cutting Layer** - Security, caching, metrics, logging
4. **Adapter Layer** - AI agent integrations
5. **Runtime Controls** - Offline detection and fallback routing
6. **External Services** - Third-party AI CLIs and local model APIs
## Agentic Team Architecture
`AGENTIC_TEAM` is a separate runtime path for role-based autonomous team communication. It does not execute through the orchestrator workflow engine.
### Runtime Boundary
```mermaid
flowchart TB
subgraph Orchestrator Runtime
OCLI[ai-orchestrator run/shell]
OCORE[orchestrator.core]
OWF[Workflow Engine]
end
subgraph Agentic Team Runtime
AUI[agentic_team/ui/app.py]
ASHELL[ai-orchestrator agentic-shell]
AENGINE[agentic_team.engine]
end
OCLI --> OCORE --> OWF
AUI --> AENGINE
ASHELL --> AENGINE
```
### Core Components
```mermaid
graph TD
subgraph Agentic Team Runtime
ENG[AgenticTeamEngine]
CFG[Team Config Loader]
VAL[Role Mapping Validator]
FB[Fallback Manager]
ADP[Adapter Pool]
end
subgraph Interfaces
UIAPI[Standalone UI Backend]
REPL[Agentic Shell REPL]
end
subgraph UI Runtime
EVT[Socket Events]
GRAPH[Live Communication Graph]
TL[Turn Timeline]
LOGS[Runtime Logs]
end
UIAPI --> ENG
REPL --> ENG
ENG --> CFG
ENG --> VAL
ENG --> FB
ENG --> ADP
UIAPI --> EVT
EVT --> GRAPH
EVT --> TL
EVT --> LOGS
```
### Turn Loop and Decision Routing
```mermaid
sequenceDiagram
participant Lead as Lead Role
participant Engine as AgenticTeamEngine
participant Role as Target Role
participant Adapter as Bound Model Adapter
Lead->>Engine: initial request + message
loop each turn
Engine->>Adapter: role prompt (task + roster + transcript + incoming message)
Adapter-->>Engine: decision JSON
Engine->>Engine: parse/normalize action and route
alt action=message
Engine->>Role: next turn handoff
else action=finalize and role=lead
Engine-->>Lead: final output complete
end
end
```
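The decision payload below is a hypothetical example of the "decision JSON" in the diagram; its field names (`action`, `to_role`, `message`) and the normalization helper are assumptions, not the engine's actual schema:
```python
import json
from typing import Any, Dict

# Hypothetical decision payload a role model might return.
RAW = '{"action": "message", "to_role": "reviewer", "message": "Please check the diff."}'

def route_decision(raw: str, current_role: str, lead_role: str = "lead") -> Dict[str, Any]:
    """Parse a decision JSON and normalize it into a routing instruction."""
    decision = json.loads(raw)
    action = str(decision.get("action", "message")).lower()
    # Only the lead role may finalize; anything else is a hand-off.
    if action == "finalize" and current_role == lead_role:
        return {"done": True, "output": decision.get("message", "")}
    return {
        "done": False,
        "next_role": decision.get("to_role"),
        "message": decision.get("message", ""),
    }

print(route_decision(RAW, current_role="lead"))
```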
### Communication Event Pipeline
```mermaid
flowchart LR
STEP[Engine turn_callback step] --> T1[team_turn event]
STEP --> T2[team_communication event]
STEP --> T3[progress_log event]
T1 --> UI1[Timeline]
T2 --> UI2[Directed edge graph]
T3 --> UI3[Runtime log panel]
```
### Graph Aggregation Model
```mermaid
classDiagram
class TeamTurn {
+int turn
+string from_role
+string to_role
+string from_agent
+string to_agent
+string action
+string message
}
class CommunicationEdge {
+string from_role
+string to_role
+int count
+bool latest
+bool selected
}
TeamTurn --> CommunicationEdge : grouped by route
```
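A minimal sketch of the grouping step, using the field names from the class diagram (the real aggregation lives in the UI backend, and the `selected` flag is omitted here):
```python
from collections import Counter
from typing import List, Tuple

def aggregate_edges(turns: List[dict]) -> List[dict]:
    """Group TeamTurn records into CommunicationEdge records keyed by route."""
    counts: Counter = Counter((t["from_role"], t["to_role"]) for t in turns)
    latest = (turns[-1]["from_role"], turns[-1]["to_role"]) if turns else None
    return [
        {"from_role": src, "to_role": dst, "count": n, "latest": (src, dst) == latest}
        for (src, dst), n in counts.items()
    ]

turns = [
    {"from_role": "lead", "to_role": "reviewer"},
    {"from_role": "reviewer", "to_role": "lead"},
    {"from_role": "lead", "to_role": "reviewer"},
]
print(aggregate_edges(turns))
```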
### Validation and Fallback Pipeline
```mermaid
flowchart TD
START[Task request] --> V1{Any available agents?}
V1 -->|No| ERR1[Reject run]
V1 -->|Yes| V2{All role mappings valid?}
V2 -->|No| ERR2[Reject run with missing role:agent map]
V2 -->|Yes| RUN[Execute turn loop]
RUN --> EXE[Execute role agent via fallback manager]
EXE --> F{Primary success?}
F -->|Yes| DEC[Parse decision]
F -->|No| FBTRY[Try fallback adapter]
FBTRY --> DEC
DEC --> NEXT{Lead finalized?}
NEXT -->|Yes| DONE[Return final output]
NEXT -->|No and max turns reached| TIMEOUT[Return bounded fallback output]
NEXT -->|No| RUN
```
## Component Design
### Orchestrator Core
The central component that coordinates all operations.
```mermaid
graph LR
A[Orchestrator Core] --> B[Workflow Manager]
A --> C[Task Manager]
A --> D[Context Manager]
A --> E[Result Aggregator]
B --> F[Workflow Execution]
C --> G[Task Distribution]
D --> H[Session Storage]
E --> I[Output Formatting]
```
**Responsibilities:**
- Task reception and parsing
- Workflow selection and execution
- Agent coordination
- Result aggregation
- Session management
**Key Files:**
- `orchestrator/core.py` - Main orchestrator logic
- `orchestrator/workflow.py` - Workflow management
- `orchestrator/task_manager.py` - Task distribution
### Workflow Engine
Manages workflow definitions and execution.
```mermaid
stateDiagram-v2
[*] --> LoadWorkflow
LoadWorkflow --> ValidateWorkflow
ValidateWorkflow --> InitializeAgents
InitializeAgents --> ExecuteStep
ExecuteStep --> CollectFeedback
CollectFeedback --> ShouldIterate
ShouldIterate --> ExecuteStep: Yes
ShouldIterate --> AggregateResults: No
AggregateResults --> [*]
```
**Workflow Execution Characteristics:**
1. **Sequential Steps** - Agents execute one after another
2. **Iterative Refinement** - Workflow cycles until stop conditions are met
3. **Step-Level Fallback** - If a step fails due to recoverable connectivity/API issues, a fallback agent can run
4. **Offline Filtering** - In offline mode, non-local agents are skipped at initialization
**Configuration (Supported Forms):**
```yaml
agents:
codex:
type: cli
command: codex
enabled: true
my-custom-llama:
type: llamacpp
endpoint: http://localhost:9000
offline: true
enabled: true
workflows:
default:
- agent: "codex"
task: "implement"
- agent: "gemini"
task: "review"
- agent: "claude"
task: "refine"
offline-default:
description: "Local-only workflow"
steps:
- agent: "local-code"
role: "implementer"
- agent: "local-instruct"
role: "reviewer"
```
### Adapter Layer
Abstracts AI agent interactions through a common interface.
```mermaid
classDiagram
class BaseAdapter {
<<abstract>>
+name: str
+command: str
+timeout: int
+get_capabilities() List[AgentCapability]
+execute_task(task, context) AgentResponse
+execute_task_async(task, context) AgentResponse
+is_available() bool
}
class ClaudeAdapter {
+execute_task(task, context)
}
class CodexAdapter {
+execute_task(task, context)
}
class GeminiAdapter {
+execute_task(task, context)
}
class CopilotAdapter {
+execute_task(task, context)
}
class OllamaAdapter {
+execute_task(task, context)
+execute_task_async(task, context)
+list_models()
+pull_model()
+remove_model()
}
class LlamaCppAdapter {
+execute_task(task, context)
+execute_task_async(task, context)
+list_models()
}
BaseAdapter <|-- ClaudeAdapter
BaseAdapter <|-- CodexAdapter
BaseAdapter <|-- GeminiAdapter
BaseAdapter <|-- CopilotAdapter
BaseAdapter <|-- OllamaAdapter
BaseAdapter <|-- LlamaCppAdapter
```
**Base Adapter Interface:**
```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class BaseAdapter(ABC):
    @abstractmethod
    def get_capabilities(self) -> List[AgentCapability]:
        """Declare supported capability set."""

    @abstractmethod
    def execute_task(self, task: str, context: Dict[str, Any]) -> AgentResponse:
        """Execute task with the AI agent."""

    async def execute_task_async(self, task: str, context: Dict[str, Any]) -> AgentResponse:
        """Async execution hook (default delegates to sync)."""
        return self.execute_task(task, context)
```
### CLI Communicator
Handles robust communication with external CLI tools.
```mermaid
sequenceDiagram
participant O as Orchestrator
participant C as CLI Communicator
participant A as AI Agent CLI
O->>C: execute_command(cmd, input)
C->>C: validate_input()
C->>C: apply_timeout()
C->>A: spawn_process(cmd)
A-->>C: stdout/stderr
C->>C: parse_output()
C->>C: handle_errors()
C-->>O: AgentResponse
```
**Features:**
- Process management
- Timeout handling
- Error recovery
- Output parsing
- Retry logic
## Data Flow
### Task Execution Flow
```mermaid
sequenceDiagram
participant U as User
participant CLI as CLI/UI
participant O as Orchestrator
participant W as Workflow Engine
participant A as Adapter
participant AI as AI Agent
U->>CLI: Submit task
CLI->>O: execute_task(task, workflow)
O->>O: Validate input
O->>O: Load configuration
O->>W: set_workflow(steps)
O->>W: execute_workflow_iteration(...)
loop For each agent in workflow
W->>A: execute_task(task, context)
A->>AI: Send command
AI-->>A: Response
A->>A: Parse & normalize
A-->>W: AgentResponse
W->>W: Update context
end
W-->>O: WorkflowResult
O->>O: Aggregate results
O-->>CLI: Final output
CLI-->>U: Display results
```
### Conversation Mode Flow
```mermaid
sequenceDiagram
participant U as User
participant S as Shell
participant C as Context Manager
participant O as Orchestrator
U->>S: Initial task
S->>O: execute(task)
O-->>S: Result
S->>C: store_context(task, result)
U->>S: Follow-up message
S->>S: detect_followup()
S->>C: get_context()
C-->>S: Previous context
S->>O: execute(followup, context)
O-->>S: Result
S->>C: update_context(result)
```
### File Generation Flow
```mermaid
graph LR
A[Task Execution] --> B[Agent Response]
B --> C[Extract Code Blocks]
C --> D[Validate File Paths]
D --> E[Check Workspace]
E --> F{File Exists?}
F -->|Yes| G[Create Backup]
F -->|No| H[Create New File]
G --> H
H --> I[Write Content]
I --> J[Update File Registry]
J --> K[Return File Paths]
```
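A sketch of the validate/backup/write portion of this flow, assuming hypothetical helper names; the real implementation also maintains a file registry:
```python
import shutil
from pathlib import Path

def write_generated_file(workspace: Path, rel_path: str, content: str) -> Path:
    """Validate the path, back up any existing file, then write new content."""
    target = (workspace / rel_path).resolve()
    if workspace.resolve() not in target.parents:
        raise ValueError("path escapes workspace")  # path validation step
    if target.exists():
        # Backup before overwrite, mirroring the "Create Backup" branch above.
        shutil.copy2(target, target.with_suffix(target.suffix + ".bak"))
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target
```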
## Adapter Pattern
### Why Adapters?
Adapters provide a consistent interface to heterogeneous AI agent CLIs:
- **Abstraction**: Hide CLI-specific details
- **Consistency**: Uniform interface for all agents
- **Flexibility**: Easy to swap or add agents
- **Testability**: Mock adapters for testing
- **Resilience**: Isolated error handling
### Adapter Implementation
```python
import httpx  # async HTTP client for the Ollama REST API

class OllamaAdapter(BaseAdapter):
    def __init__(self, config: Dict[str, Any]):
        # Local backends default to offline-capable.
        local_config = dict(config)
        local_config.setdefault("offline", True)
        super().__init__(local_config)
        self.model = local_config.get("model", "codellama:13b")
        self.endpoint = str(local_config.get("endpoint", "http://localhost:11434")).rstrip("/")
        self.timeout = int(local_config.get("timeout", 300))

    async def execute_task_async(self, task: str, context: Dict[str, Any]) -> AgentResponse:
        prompt = self._build_local_llm_prompt(task, context)
        async with httpx.AsyncClient(timeout=self.timeout) as client:
            resp = await client.post(
                f"{self.endpoint}/api/generate",
                json={"model": self.model, "prompt": prompt, "stream": False},
            )
            resp.raise_for_status()
            data = resp.json()
        return AgentResponse(success=True, output=data.get("response", ""))
```
## Local Model Integration and Limitations
Local backends (Ollama, llama.cpp, LocalAI, and other OpenAI-compatible servers) are integrated as standard adapters and participate in:
- workflow step execution,
- offline-only filtering,
- cloud-to-local fallback routing,
- local model health/model discovery endpoints.
Execution semantics differ from CLI adapters:
| Adapter family | Transport | Workspace edit path |
|---|---|---|
| CLI adapters (`codex`, `claude`, `gemini`, `copilot`) | Local CLI process | Can modify files when workspace execution is used |
| Local model adapters (`ollama`, `llamacpp`, `localai`, `openai-compatible`) | HTTP completion endpoints | Text output only; no direct file writes |
Design implication:
- Assigning a local model to an "implement" role is supported, but that step behaves as advisory text generation unless another editing-capable agent applies changes.
Best use:
- local drafting, critique/review, and resilience fallback in hybrid workflows.
> [!CAUTION]
> The local model itself doesn’t edit files, but you can make it do so by adding an agent/tool layer around it (same idea as Claude/Codex/Copilot CLIs): give it tools like read_file, write_file, apply_patch, run_tests, then let an orchestrator execute those tool calls.
>
> In this project, that would mean extending local adapters from “text completion only” to a workspace-execution loop (or routing local models through an MCP/tool-calling bridge). The hard part is not feasibility, it’s safety and reliability: permissions, diff constraints, validation/tests before write, rollback, and preventing bad edits.
## Workflow Engine
### Workflow Execution
```mermaid
graph TD
START([Start Workflow]) --> LOAD[Load Workflow Definition / Dynamic Planner]
LOAD --> VALIDATE[Validate Workflow]
VALIDATE --> INIT[Initialize Agents]
INIT --> ITER{Iteration < Max?}
ITER -->|Yes| EXEC[Execute Workflow Steps]
EXEC --> STEP1[Agent 1: Implementation]
STEP1 --> STEP2[Agent 2: Review]
STEP2 --> STEP3[Agent 3: Refinement]
STEP3 --> COLLECT[Collect Feedback]
COLLECT --> CHECK{Sufficient
Suggestions?}
CHECK -->|Yes| UPDATE[Update Context]
UPDATE --> ITER
CHECK -->|No| AGGREGATE[Aggregate Results]
ITER -->|No| AGGREGATE
AGGREGATE --> REPORT[Generate Report]
REPORT --> END([End])
```
### Dynamic Planner Agent & Metrics-Based Routing
The Orchestrator features a **Dynamic Planner Agent** (`orchestrator/core/planner.py`) that can replace static YAML workflows. When a task is executed using the `dynamic` workflow (or if a requested workflow is missing), the Planner Agent:
1. **Reads Observability Metrics:** It fetches live success/failure rates from Prometheus metrics (`orchestrator_agent_calls_total`).
2. **Evaluates Routing Policy:** Any agent with a success rate below `0.6` is deprioritized to avoid cascading failures.
3. **Generates a Plan:** It uses a preferred LLM adapter to break the task down into sequential steps (e.g., `implement`, `review`, `refine`) and assigns healthy, available agents dynamically.
This metrics-based routing ensures the system automatically adapts to API outages, degraded model performance, or local backend unavailability without manual configuration changes.
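A hedged sketch of the deprioritization policy in step 2 (the 0.6 threshold and metric name come from the list above; the helper, its inputs, and the ordering details are illustrative):
```python
from typing import Dict, List

SUCCESS_THRESHOLD = 0.6  # agents below this success rate are deprioritized

def rank_agents(call_stats: Dict[str, Dict[str, int]], available: List[str]) -> List[str]:
    """Order available agents, pushing unhealthy ones to the back.

    `call_stats` maps agent name -> {"success": n, "failure": m}, as would be
    derived from the orchestrator_agent_calls_total Prometheus counter.
    """
    def success_rate(agent: str) -> float:
        stats = call_stats.get(agent, {})
        total = stats.get("success", 0) + stats.get("failure", 0)
        return stats.get("success", 0) / total if total else 1.0  # no data: assume healthy

    healthy = [a for a in available if success_rate(a) >= SUCCESS_THRESHOLD]
    degraded = [a for a in available if success_rate(a) < SUCCESS_THRESHOLD]
    return healthy + degraded

print(rank_agents({"claude": {"success": 2, "failure": 8}}, ["claude", "codex"]))
```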
### Workflow Configuration
Workflows can still be defined statically in YAML (if not using the dynamic planner):
```yaml
workflows:
thorough:
- agent: "codex"
task: "implement"
description: "Create initial implementation"
- agent: "copilot"
task: "suggestions"
description: "Get alternative approaches"
- agent: "gemini"
task: "review"
description: "Comprehensive code review"
- agent: "claude"
task: "refine"
description: "Implement feedback"
- agent: "gemini"
task: "review"
description: "Verify improvements"
hybrid:
description: "Local draft with cloud review + fallback"
steps:
- agent: "local-code"
role: "implementer"
- agent: "claude"
role: "reviewer"
fallback: "local-instruct"
settings:
max_iterations: 5
fallback:
enabled: true
map:
claude: local-instruct
offline:
enabled: false
auto_detect: true
```
### Offline and Fallback Runtime
`Orchestrator` resolves runtime mode and adapter availability before execution:
1. Determine offline mode from `--offline`, `settings.offline.enabled`, and cached connectivity auto-detection.
2. Initialize adapters dynamically from each `agents.<name>.type` entry.
3. In offline mode, skip non-local agents.
4. For each step, try primary adapter.
5. On recoverable connection/API failure, execute the mapped or step-level fallback adapter, as sketched below.
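A hedged sketch of steps 4 and 5 (the adapter objects and the exact exception taxonomy are assumptions; the real logic lives in the orchestrator core):
```python
from typing import Any, Dict

RECOVERABLE = (ConnectionError, TimeoutError)  # stand-ins for recoverable API failures

def execute_step(primary, fallback, task: str, context: Dict[str, Any]):
    """Run a workflow step on the primary adapter, falling back on recoverable errors.

    `primary` and `fallback` are adapter objects exposing execute_task();
    non-recoverable errors propagate unchanged.
    """
    try:
        return primary.execute_task(task, context)
    except RECOVERABLE:
        if fallback is None:
            raise
        return fallback.execute_task(task, context)
```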
## Security Architecture
### Security Layers
```mermaid
graph TD
INPUT[User Input] --> VAL[Input Validation]
VAL --> SANITIZE[Sanitization]
SANITIZE --> RATE[Rate Limiting]
RATE --> AUTH[Authorization Check]
AUTH --> EXECUTE[Execute Task]
EXECUTE --> AUDIT[Audit Logging]
AUDIT --> OUTPUT[Return Output]
```
### Security Components
1. **Input Validation**
- Command injection prevention
- Path traversal protection
- Malicious payload detection
2. **Rate Limiting**
- Token bucket algorithm
- Per-user limits
- Global rate limits
3. **Secret Management**
- Environment variables
- Secure key storage
- No hardcoded credentials
4. **Audit Logging**
- All security events logged
- Tamper-proof logs
- Retention policies
**Implementation:**
```python
class SecurityManager:
def validate_input(self, user_input: str) -> bool:
# Check for command injection
if self._contains_shell_metacharacters(user_input):
raise SecurityError("Potential command injection")
# Check for path traversal
if self._contains_path_traversal(user_input):
raise SecurityError("Path traversal detected")
return True
def rate_limit_check(self, user_id: str) -> bool:
if not self.rate_limiter.allow_request(user_id):
raise RateLimitError("Rate limit exceeded")
return True
```
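The `rate_limiter` used above is assumed to exist elsewhere; a minimal sketch of the token bucket algorithm it names:
```python
import time

class TokenBucket:
    """Per-user token bucket: `rate` tokens/second up to `capacity` burst."""

    def __init__(self, rate: float = 1.0, capacity: float = 10.0):
        self.rate = rate
        self.capacity = capacity
        self._state = {}  # user_id -> (tokens, last_refill_timestamp)

    def allow_request(self, user_id: str) -> bool:
        now = time.monotonic()
        tokens, last = self._state.get(user_id, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._state[user_id] = (tokens, now)
            return False
        self._state[user_id] = (tokens - 1.0, now)
        return True
```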
## Monitoring & Observability
### Metrics Architecture
```mermaid
graph LR
A[Application] --> B[Metrics Collector]
B --> C[Prometheus]
C --> D[Grafana]
D --> E[Dashboards]
A --> F[Structured Logging]
F --> G[Log Aggregator]
G --> H[Log Analysis]
A --> I[Report Generator]
I --> J[JSON Reports]
I --> K[HTML Dashboard]
```
### Report Generation
The `ReportGenerator` (`orchestrator/observability/report_generator.py`) automatically produces reports after each task execution when `create_reports: true` is set in config. Reports are written as JSON files plus an interactive HTML dashboard.
```mermaid
flowchart LR
ENG[Engine.execute_task] --> RG[ReportGenerator]
RG --> EXEC[exec_*.json
Execution Summary]
RG --> PERF[perf_*.json
Agent Performance]
RG --> WF[workflow_*.json
Workflow Analytics]
RG --> HEALTH[health_*.json
System Health]
RG --> CFG[config_*.json
Config Audit]
RG --> DASH[dashboard_*.html
Chart.js Dashboard]
RG --> IDX[INDEX.json
Report Catalog]
style DASH fill:#276749,stroke:#22543d,color:#fff
```
**Report types:**
- **Execution Summary** — Per-task results with steps, agents, fallbacks, suggestions, and duration
- **Agent Performance** — Aggregated success rates, call counts, and task type distribution
- **Workflow Analytics** — Per-workflow run counts, success rates, and average iterations
- **System Health** — Health check results with disk, memory, Python version, and platform info
- **Config Audit** — Agent availability, workflow structure, and settings snapshot
- **HTML Dashboard** — Interactive Chart.js dashboard with KPI cards, daily volume bar chart, agent success/failure stacked bar, duration trend line, and workflow distribution doughnut
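Report generation is opt-in. A minimal settings sketch (the key name comes from the description above; its exact placement is assumed to match the other `settings:` blocks in this document):
```yaml
settings:
  create_reports: true   # ReportGenerator writes JSON reports + HTML dashboard after each task
```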
### Key Metrics
**Task Metrics:**
- `orchestrator_tasks_total` - Counter
- `orchestrator_task_duration_seconds` - Histogram
- `orchestrator_task_failures_total` - Counter
**Agent Metrics:**
- `orchestrator_agent_calls_total` - Counter
- `orchestrator_agent_errors_total` - Counter
- `orchestrator_agent_response_time_seconds` - Histogram
**System Metrics:**
- `orchestrator_cache_hits_total` - Counter
- `orchestrator_cache_misses_total` - Counter
- `orchestrator_active_sessions` - Gauge
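A sketch of how these could be declared with `prometheus_client` (the metric names are the ones listed; the label names are assumptions):
```python
from prometheus_client import Counter, Gauge, Histogram

TASKS_TOTAL = Counter(
    "orchestrator_tasks_total", "Tasks executed", ["workflow", "status"]
)
TASK_DURATION = Histogram(
    "orchestrator_task_duration_seconds", "Task duration", ["workflow"]
)
AGENT_CALLS = Counter(
    "orchestrator_agent_calls_total", "Agent calls", ["agent", "status"]
)
ACTIVE_SESSIONS = Gauge("orchestrator_active_sessions", "Active sessions")

# Usage: record one successful default-workflow task taking 1.23s.
TASKS_TOTAL.labels(workflow="default", status="success").inc()
TASK_DURATION.labels(workflow="default").observe(1.23)
```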
### Structured Logging
```python
import structlog
logger = structlog.get_logger()
logger.info(
"task_executed",
task_id="task-123",
workflow="default",
duration_ms=1234.56,
agent="codex",
success=True
)
```
## Deployment Architecture
### Container Architecture
```mermaid
graph TD
subgraph "Kubernetes Cluster"
subgraph "Namespace: ai-orchestrator"
POD1[Pod: Orchestrator]
POD2[Pod: UI Backend]
POD3[Pod: UI Frontend]
SVC1[Service: Orchestrator]
SVC2[Service: UI]
ING[Ingress Controller]
end
subgraph "Namespace: monitoring"
PROM[Prometheus]
GRAF[Grafana]
end
PVC1[PersistentVolume: Workspace]
PVC2[PersistentVolume: Sessions]
PVC3[PersistentVolume: Logs]
end
POD1 --> SVC1
POD2 --> SVC2
POD3 --> SVC2
SVC2 --> ING
POD1 -.-> PVC1
POD1 -.-> PVC2
POD1 -.-> PVC3
POD1 -.-> PROM
PROM -.-> GRAF
```
### Docker Compose Setup
```yaml
version: '3.8'
services:
orchestrator:
build: .
volumes:
- ./workspace:/app/workspace
- ./sessions:/app/sessions
ports:
- "9090:9090" # Metrics
environment:
- LOG_LEVEL=INFO
- ENABLE_METRICS=true
prometheus:
image: prom/prometheus
volumes:
- ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml
ports:
- "9091:9090"
grafana:
image: grafana/grafana
ports:
- "3000:3000"
environment:
- GF_SECURITY_ADMIN_PASSWORD=admin
```
## Design Patterns
### Patterns Used
#### 1. Adapter Pattern
Provides a uniform interface to different AI agent CLIs.
#### 2. Strategy Pattern
Workflows implement different strategies for task execution.
#### 3. Chain of Responsibility
Request processing through validation, execution, and post-processing.
#### 4. Observer Pattern
Real-time updates in Web UI via Socket.IO.
#### 5. Factory Pattern
Agent and workflow creation.
#### 6. Singleton Pattern
Configuration manager, metrics collector.
#### 7. Decorator Pattern
Retry logic, caching, logging decorators.
### Example: Retry Decorator
```python
from functools import wraps
from tenacity import retry, stop_after_attempt, wait_exponential
def with_retry(max_attempts=3):
def decorator(func):
@wraps(func)
@retry(
stop=stop_after_attempt(max_attempts),
wait=wait_exponential(multiplier=1, min=2, max=10)
)
def wrapper(*args, **kwargs):
return func(*args, **kwargs)
return wrapper
return decorator
@with_retry(max_attempts=3)
def execute_agent_task(agent, task):
return agent.execute_task(task, {"role": "implement"})
```
## Graph Context System
The Graph Context System provides persistent memory capabilities for AI agents, enabling learning from past conversations, tasks, and mistakes. The system follows the project's core architectural boundary: **two fully independent context implementations** — one for the Orchestrator and one for the Agentic Team — with zero shared imports between them.
### Dual Context Architecture
Both engines maintain isolated context databases with identical schemas but independent codebases:
```mermaid
graph TB
subgraph "Orchestrator Context"
direction TB
ODB[(~/.ai-orchestrator/context.db)]
subgraph "orchestrator/context/"
OMM[memory_manager.py]
subgraph "models/"
OSCH[schemas.py
Node, Edge, NodeType, EdgeType]
end
subgraph "store/"
OGS[graph_store.py
SQLite + WAL + FTS5]
end
subgraph "search/"
OBM[bm25_index.py]
OEM[embeddings.py
sentence-transformers]
OHY[hybrid_search.py
RRF fusion]
OAD[advanced_search.py
temporal, tags, importance]
end
subgraph "ops/"
OAN[analytics.py]
OPR[pruning.py]
OEX[export.py]
OVR[versioning.py]
end
end
end
subgraph "Agentic Team Context"
direction TB
ADB[(~/.agentic-team/context.db)]
subgraph "agentic_team/context/"
AMM[memory_manager.py]
subgraph "models/ "
ASCH[schemas.py
Node, Edge, NodeType, EdgeType]
end
subgraph "store/ "
AGS[graph_store.py
SQLite + WAL + FTS5]
end
subgraph "search/ "
ABM[bm25_index.py]
AFT[fts_search.py
SQLite FTS5 native]
end
subgraph "ops/ "
AAN[analytics.py]
APR[pruning.py]
AEX[export.py]
end
end
end
subgraph "Context Dashboard"
DASH[Flask + vis.js + Chart.js
port 5003]
end
ODB --> DASH
ADB --> DASH
```
> **Key difference:** The Orchestrator context includes sentence-transformer embeddings for semantic search, a full hybrid search engine with RRF fusion, advanced search (temporal, tag, importance queries), and a versioning system. The Agentic Team context uses SQLite FTS5 natively for lighter-weight full-text search without embedding dependencies.
### Node Types
Both systems define 10 node types via the `NodeType` enum in their respective `models/schemas.py` files:
| Node Type | Description | Key Fields |
|-----------|-------------|------------|
| `conversation` | Past chat sessions with AI agents | messages, agent, timestamp |
| `task` | Completed tasks with outcomes | description, outcome, duration, agent |
| `mistake` | Errors with corrections and prevention | description, correction, prevention, category |
| `pattern` | Reusable code patterns and techniques | code, language, use_case, tags |
| `decision` | Architectural decisions with rationale | decision, rationale, alternatives, status |
| `code_snippet` | Useful code fragments for reuse | code, language, description, tags |
| `preference` | Learned user preferences | key, value, confidence |
| `file` | File references tracked in context | path, language, summary |
| `concept` | Domain concepts and definitions | name, definition, related_topics |
| `agent_output` | Raw AI agent outputs | agent, task, output, quality_score |
Each node carries: `id`, `node_type`, `content`, `title`, `metadata` (dict), `tags` (list), `created_at`, `updated_at`, `embedding` (optional float vector), and `importance_score` (float, default 1.0).
### Edge Types
Both systems define 12 semantic edge types via the `EdgeType` enum for building the knowledge graph:
| Edge Type | Purpose | Example |
|-----------|---------|---------|
| `RELATED_TO` | General relationship | Task ↔ Conversation |
| `CAUSED_BY` | Error causation chain | Mistake → Root Cause Task |
| `FIXED_BY` | Solution mapping | Mistake → Fix Pattern |
| `SIMILAR_TO` | Semantic similarity | Task ↔ Similar Task |
| `DEPENDS_ON` | Task/concept dependencies | Task → Prerequisite Task |
| `PRECEDED_BY` | Temporal ordering (before) | Task → Earlier Task |
| `FOLLOWED_BY` | Temporal ordering (after) | Task → Later Task |
| `LEARNED_FROM` | Learning provenance | Pattern → Source Conversation |
| `REFERENCES` | Cross-referencing | Decision → Code Snippet |
| `CONTAINS` | Hierarchical containment | Conversation → Task |
| `PRODUCED_BY` | Output attribution | Code Snippet → Agent Output |
| `USED_IN` | Usage tracking | Pattern → Task |
Each edge carries: `id`, `source_id`, `target_id`, `edge_type`, `weight` (float), `metadata` (dict), and `created_at`.
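A sketch of these shapes as dataclasses (the real definitions live in each system's `models/schemas.py`; apart from `importance_score`, the defaults shown are assumptions):
```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional

@dataclass
class Node:
    id: str
    node_type: str            # one of the 10 NodeType values
    content: str
    title: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)
    created_at: datetime = field(default_factory=datetime.utcnow)
    updated_at: datetime = field(default_factory=datetime.utcnow)
    embedding: Optional[List[float]] = None   # populated only in the Orchestrator context
    importance_score: float = 1.0

@dataclass
class Edge:
    id: str
    source_id: str
    target_id: str
    edge_type: str            # one of the 12 EdgeType values
    weight: float = 1.0
    metadata: Dict[str, Any] = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.utcnow)
```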
### Search Architecture
The Orchestrator context implements a three-tier search system — keyword, semantic, and advanced — combined via Reciprocal Rank Fusion:
```mermaid
sequenceDiagram
participant C as Caller
participant MM as MemoryManager
participant HS as HybridSearch
participant BM as BM25 Index
participant EM as Embeddings (all-MiniLM-L6-v2)
participant RRF as RRF Fusion
participant AS as AdvancedSearch
C->>MM: search("auth patterns", mode="hybrid")
MM->>HS: hybrid_search(query, top_k)
par Parallel Execution
HS->>BM: keyword_search(query)
BM-->>HS: keyword_results (ranked by term frequency)
and
HS->>EM: generate_embedding(query)
EM-->>HS: semantic_results (ranked by cosine similarity)
end
HS->>RRF: fuse(keyword_results, semantic_results, k=60)
RRF-->>HS: merged_ranking
HS-->>MM: top_k results
Note over C,AS: Advanced queries bypass hybrid search
C->>MM: search(mode="temporal")
MM->>AS: search_temporal(start, end)
AS-->>MM: time-filtered results
```
**Search modes:**
| Mode | Engine | Description |
|------|--------|-------------|
| BM25 keyword | `bm25_index.py` | Term frequency–inverse document frequency ranking |
| Semantic | `embeddings.py` | Sentence-transformer (`all-MiniLM-L6-v2`) cosine similarity |
| Hybrid | `hybrid_search.py` | Parallel BM25 + semantic merged via RRF (k=60) |
| FTS5 native | `fts_search.py` (Agentic Team) | SQLite FTS5 full-text search with `MATCH` syntax |
| Temporal | `advanced_search.py` | Filter nodes by `created_at` date ranges |
| Tag-based | `advanced_search.py` | Filter nodes by tag sets |
| Importance | `advanced_search.py` | Filter nodes above an importance threshold |
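Reciprocal Rank Fusion itself is compact enough to sketch directly, using the k=60 constant from the table:
```python
from typing import Dict, List

def rrf_fuse(keyword: List[str], semantic: List[str], k: int = 60) -> List[str]:
    """Merge two ranked result-ID lists via Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 damps the influence of any single list's top ranks.
    """
    scores: Dict[str, float] = {}
    for ranking in (keyword, semantic):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf_fuse(["n1", "n2", "n3"], ["n2", "n4", "n1"]))  # n2 and n1 rise to the top
```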
### Operations
Each context system provides operational tooling via its `ops/` sub-package:
**Analytics** (`ops/analytics.py`) — Comprehensive graph metrics:
- `get_node_distribution()` — Count of nodes grouped by type
- `get_edge_distribution()` — Count of edges grouped by relationship type
- `get_temporal_growth(days)` — Node creation timeline
- `get_success_rate_by_agent()` — Per-agent task success/failure rates
- `get_top_patterns(limit)` / `get_top_mistakes(limit)` — Most referenced patterns and most common mistakes
- `get_database_stats()` — DB size, table counts, index health
- `get_agent_activity_heatmap(days)` — Activity by agent over time
- `get_comprehensive_report()` — Full analytics summary
**Pruning** (`ops/pruning.py`) — Graph maintenance strategies:
- `prune_by_age(max_age_days)` — Remove nodes older than threshold
- `prune_duplicates()` — Detect and merge near-duplicate nodes
- `prune_low_importance(threshold)` — Remove nodes below importance score
- `prune_all()` — Run all strategies in sequence
**Export / Import** (`ops/export.py`) — Data portability:
- `export_json(output_path, node_types)` — Export graph to JSON (optional type filter)
- `import_json(input_path)` — Import graph from JSON
- `export_graphml(output_path)` — Export to GraphML for external graph tools
- `export_obsidian(output_path, node_types)` — Export as [Obsidian](https://obsidian.md) vault with `[[wikilinks]]` and graph-view colors
- `get_export_summary()` — Preview of export contents
**Versioning** (`ops/versioning.py`, Orchestrator only) — Node history tracking:
- `record_version(node_id, change_type)` — Snapshot node state on update
- `get_versions(node_id)` / `get_version(node_id, version)` — Retrieve version history
- `rollback(node_id, version)` — Restore a previous version
- `get_change_log(limit)` — Recent changes across all nodes
- `diff_versions(node_id, v1, v2)` — Compare two versions of a node
### Context Dashboard
The Context Dashboard (`context_dashboard/`) provides a web-based visualization and management UI at **port 5003**:
```mermaid
graph LR
subgraph "Context Dashboard (Flask)"
APP[app.py
Flask + CORS]
TPL[templates/dashboard.html]
end
subgraph "Frontend Libraries"
VIS[vis-network 9.1.6
Graph visualization]
CHT[Chart.js 4.4.0
Analytics charts]
end
subgraph "Data Sources"
ODB[(Orchestrator
context.db)]
ADB[(Agentic Team
context.db)]
end
ODB --> APP
ADB --> APP
APP --> TPL
TPL --> VIS
TPL --> CHT
```
The dashboard aggregates both context databases and provides:
- **Interactive graph explorer** — vis-network powered node/edge visualization with click-to-inspect
- **Analytics charts** — Node distribution, temporal growth, agent activity heatmaps via Chart.js
- **Search interface** — Query across both context systems
- **Export controls** — Download graph data as JSON, GraphML, or **Obsidian vault**
### Integration
Both engines automatically store task results and mistakes into their respective context graphs:
```python
# Automatic storage in execute_task()
result = engine.execute_task("Build login system")
# → Task node automatically stored with outcome, duration, agent metadata
# Log a mistake to prevent repetition
manager.log_mistake(
description="Used string formatting in SQL query",
correction="Changed to parameterized query",
prevention="Always use ? placeholders",
category="security"
)
# → Mistake node stored, edges link to related tasks/patterns
# Retrieve relevant context for a new task
context = engine.get_relevant_context("authentication patterns")
# → Hybrid search returns ranked results from past tasks, mistakes, patterns
```
### Auto-Seeding
The script `scripts/seed_context_graphs.py` pre-populates both context databases with sample data (conversations, tasks, mistakes, patterns, decisions) for development and testing. The Context Dashboard also calls `_auto_seed_if_empty()` on startup to ensure a non-empty graph for first-run exploration.
### Project-Scoped Graphs
Both systems support project-scoped context graphs that isolate knowledge per user project. This enables portable, multi-project operation without context bleed.
```mermaid
graph TB
subgraph "Project-Scoped Architecture"
direction TB
ENV["PROJECT_PATH env var
or settings.project_path"] --> ENGINE[Engine Startup]
ENGINE --> REG[register_project]
REG --> SCANNER[ProjectScanner]
SCANNER --> PID["project_id = SHA-256[:16]
of normalized absolute path"]
subgraph "Graph Partitioning"
direction LR
G1["Project A
All nodes tagged with pid_A"]
G2["Project B
All nodes tagged with pid_B"]
G3["Global
project_id='' (universal knowledge)"]
end
PID --> G1
PID --> G2
subgraph "Atomic Operations"
UPSERT["add_node: INSERT ON CONFLICT UPDATE
(preserves edges)"]
BULK["delete_nodes_by_project
(single transaction)"]
RESCAN["rescan_project: delete + rebuild
(atomic swap)"]
end
end
style G1 fill:#2b6cb0,stroke:#2c5282,color:#fff
style G2 fill:#276749,stroke:#22543d,color:#fff
style G3 fill:#744210,stroke:#975a16,color:#fff
```
**Data Integrity Guarantees:**
| Operation | Guarantee | Implementation |
|-----------|-----------|----------------|
| Node upsert | Edge-preserving | `INSERT ... ON CONFLICT(id) DO UPDATE SET` (no cascade delete) |
| Project deletion | Atomic | Single-transaction `DELETE FROM nodes WHERE project_id = ?` |
| Project rescan | Atomic swap | `delete_nodes_by_project()` then `register_project()` |
| Schema migration | Race-safe | `ALTER TABLE` with catch on existing column |
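A sketch of the project-ID derivation and the edge-preserving upsert (the table and column names follow the diagram and table above and should be treated as illustrative):
```python
import hashlib
import os
import sqlite3

def project_id(path: str) -> str:
    """SHA-256 of the normalized absolute path, truncated to 16 hex chars."""
    normalized = os.path.normpath(os.path.abspath(path))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

UPSERT = """
INSERT INTO nodes (id, node_type, title, content, project_id)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT(id) DO UPDATE SET
    node_type = excluded.node_type,
    title     = excluded.title,
    content   = excluded.content
"""  # updates in place, so edges referencing the node id survive

def delete_project(conn: sqlite3.Connection, pid: str) -> None:
    with conn:  # single transaction: all-or-nothing removal
        conn.execute("DELETE FROM nodes WHERE project_id = ?", (pid,))
```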
**Configuration:**
```yaml
# In orchestrator/config/agents.yaml or agentic_team/config/agents.yaml
settings:
project_path: "/path/to/user/project" # or set PROJECT_PATH env var
```
### Project Scanner
The `ProjectScanner` module (`orchestrator/context/ops/project_scanner.py` and its independent copy at `agentic_team/context/ops/project_scanner.py`) analyzes a project directory and produces context graph nodes.
```mermaid
flowchart TD
PATH[Project Root Path] --> WALK[os.walk with SKIP_DIRS filter]
WALK --> FILES["File Metadata
(path, size, language, extension)"]
WALK --> DETECT["Language Detection
(extension mapping)"]
WALK --> FW["Framework Detection
(indicator files)"]
WALK --> STRUCT["Structure Analysis
(top-level directories)"]
FILES --> FN[File Nodes]
DETECT --> PN[Pattern Nodes]
FW --> DN[Decision Nodes]
STRUCT --> PROJ[Project Node]
FN & PN & DN & PROJ --> EDGES[Relationship Edges]
EDGES --> GRAPH[(Context Graph)]
style GRAPH fill:#2b6cb0,stroke:#2c5282,color:#fff
```
**Scanner capabilities:**
- Detects 30+ programming languages via file extension mapping
- Identifies 20+ frameworks from indicator files (package.json, requirements.txt, Cargo.toml, etc.)
- Respects `.gitignore`-style skip patterns (node_modules, __pycache__, .git, etc.)
- Safety limit of 5,000 files per scan to prevent runaway on monorepos
- Produces `ProjectNode`, `FileNode`, `PatternNode`, and `DecisionNode` objects with relationship edges
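A simplified sketch of the walk-and-limit core (the `SKIP_DIRS` entries and extension map below are abbreviated stand-ins for the full mappings described above):
```python
import os

SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}
EXT_TO_LANG = {".py": "Python", ".ts": "TypeScript", ".rs": "Rust"}  # abbreviated
MAX_FILES = 5_000  # safety limit for monorepos

def scan(root: str):
    """Yield (path, language) pairs, skipping ignored dirs, up to MAX_FILES."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skip-listed directories in place so os.walk never descends.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            if count >= MAX_FILES:
                return
            ext = os.path.splitext(name)[1]
            yield os.path.join(dirpath, name), EXT_TO_LANG.get(ext, "unknown")
            count += 1
```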
### Obsidian Vault Export
All three graph systems support exporting to [Obsidian](https://obsidian.md)-compatible vaults, enabling interactive visual exploration of code structure, context memory, and team interactions through Obsidian's native graph view.
```mermaid
flowchart TD
subgraph Sources["Graph Data Sources"]
GS[Graphify GraphStore
Code structure: classes, functions, imports]
OS[Orchestrator GraphStore
Context memory: tasks, decisions, patterns]
AS[Agentic Team GraphStore
Team context: tasks, decisions, agent outputs]
end
subgraph Export["Export Pipeline"]
GS --> GE["GraphExporter.to_obsidian()"]
OS --> OE["ContextExporter.export_obsidian()"]
AS --> AE["ContextExporter.export_obsidian()"]
end
subgraph Vault["Obsidian Vault Structure"]
direction TB
GE & OE & AE --> NOTES["Per-Node Markdown Notes
YAML frontmatter + body + [[wikilinks]]"]
GE & OE & AE --> FOLDERS["Typed Folders
Classes/ Tasks/ Decisions/ ..."]
GE & OE & AE --> INDEX["_Index.md
Map of Content with stats"]
GE & OE & AE --> CONFIG[".obsidian/ Config
graph.json · appearance.json · core-plugins.json"]
end
subgraph Obsidian["Obsidian App"]
CONFIG --> GRAPH["Graph View (Ctrl/Cmd+G)
Color-coded node types
Interactive exploration"]
NOTES --> LINKS["Backlink Navigation
Click [[wikilinks]] to traverse"]
INDEX --> MOC["Map of Content
Browse by category"]
end
style GS fill:#4CAF50,color:#fff
style OS fill:#2196F3,color:#fff
style AS fill:#FF9800,color:#fff
style CONFIG fill:#7C3AED,color:#fff
style GRAPH fill:#7C3AED,color:#fff
```
**Note format (per node):**
```markdown
---
type: "task"
tags: ["task", "auth", "security"]
importance: 0.85
created: "2025-06-15T10:30:00Z"
project_id: "a1b2c3d4"
---
# ✅ Implement JWT Authentication
Task content and description...
## Relationships
### → Related To
- [[Decisions/Use SQLite for storage|Use SQLite for storage]]
### ← Used In
- [[Patterns/Adapter pattern|Adapter pattern]]
```
**`.obsidian/graph.json` color configuration:**
Each exporter generates a `graph.json` with `colorGroups` that assign distinct colors to each node type using tag-based queries (`tag:#task`, `tag:#class`, etc.). This means the graph view immediately renders a color-coded relationship web with no manual configuration required.
| Component | Graphify Colors | Context System Colors |
|-----------|----------------|----------------------|
| Core nodes | 🟢 Classes, 🔵 Functions, 📄 Files | ✅ Tasks, ⚖️ Decisions, 🔁 Patterns |
| Structural | 📦 Modules, 📂 Directories | 💬 Conversations, ❌ Mistakes |
| References | 📥 Imports, 🧪 Tests | 💻 Code Snippets, 💡 Concepts |
### Graphify — Code Knowledge Graph Engine
`graphify/` is a standalone system (zero imports from orchestrator or agentic_team) that builds deep, queryable knowledge graphs from any project directory using AST parsing and pattern analysis.
```mermaid
graph TB
subgraph "Graphify Pipeline"
DIR[Project Directory] --> SCAN[Scanner]
SCAN --> CACHE[SHA-256 Cache]
SCAN --> PY[Python AST Analyzer]
SCAN --> JS[JavaScript Analyzer]
SCAN --> DOC[Doc Analyzer]
SCAN --> CFG[Config Analyzer]
SCAN --> GEN[Generic Analyzer
Go/Rust/Java/C++]
PY & JS & DOC & CFG & GEN --> STORE[GraphStore
SQLite + FTS5]
STORE --> SEARCH[FTS Search + Query Engine]
STORE --> API[REST API]
STORE --> EXPORT[JSON / DOT / GraphML / HTML / Obsidian]
STORE --> METRICS[Scan Metrics]
STORE --> SNAP[Snapshots & Diffs]
end
```
**Key capabilities:**
- **6 analyzers**: Python (AST-based), JavaScript/TypeScript, Markdown/RST, YAML/JSON/TOML/Dockerfile, Go/Rust/Java/C++/etc. (generic)
- **15 node types**: PROJECT, DIRECTORY, FILE, MODULE, CLASS, FUNCTION, IMPORT, DEPENDENCY, CONFIG, DOCUMENTATION, TEST, PATTERN, VARIABLE, RATIONALE, COMMUNITY
- **11 edge types**: CONTAINS, IMPORTS, INHERITS, CALLS, DEPENDS_ON, TESTS, DOCUMENTS, CONFIGURED_BY, EXPORTS, SIBLING, MEMBER_OF
- **23 languages**, including Python, JS, TS, Java, Go, Rust, Ruby, C++, C, C#, Swift, Kotlin, PHP, Shell, SQL, HTML, CSS, YAML, JSON, TOML, Markdown, and Dockerfile
- **Schema migrations**: v1 → v2 (confidence/provenance) → v3 (metrics/snapshots tables)
- **Intelligence**: God node analysis, community detection, BFS path finding, complexity hotspots
- **Operations**: File watching (watchdog + polling), graph snapshots & diffing, scan metrics, SHA-256 content cache
See [GRAPHIFY.md](GRAPHIFY.md) for comprehensive documentation with Mermaid diagrams.
## Agentic Infrastructure
The platform provides comprehensive infrastructure to empower AI agents through specialized agents, a skills library, domain rules, and MCP tools. Configuration lives in `.claude/`, `.codex/`, and the project root (`AGENTS.md`).
### Specialized Agents
Agents are role-specific AI personas with deep domain expertise. Each agent file defines a system prompt, preferred tools, and references to relevant skills and rules.
```mermaid
mindmap
root((Specialized
Agents))
Web Development
web-frontend
Backend
backend-api
database-architect
Security
security-specialist
code-reviewer
Infrastructure
devops-infrastructure
performance-engineer
AI/ML
ai-ml-engineer
Mobile
mobile-developer
Quality
test-runner
Documentation
documentation-writer
```
#### Claude Agents (11)
Defined as Markdown files in `.claude/agents/`. Invoked via `@agent-name` in Claude Code:
| Agent | File | Expertise |
|-------|------|-----------|
| web-frontend | `.claude/agents/web-frontend.md` | React, Vue, Angular, CSS, Accessibility |
| backend-api | `.claude/agents/backend-api.md` | REST, GraphQL, Microservices, Flask/FastAPI |
| security-specialist | `.claude/agents/security-specialist.md` | OWASP, Secure Coding, Audits |
| devops-infrastructure | `.claude/agents/devops-infrastructure.md` | Docker, K8s, CI/CD, Cloud |
| ai-ml-engineer | `.claude/agents/ai-ml-engineer.md` | ML Pipelines, LLMs, RAG, Embeddings |
| database-architect | `.claude/agents/database-architect.md` | Schema Design, Query Optimization, Migrations |
| mobile-developer | `.claude/agents/mobile-developer.md` | React Native, Flutter, Native iOS/Android |
| performance-engineer | `.claude/agents/performance-engineer.md` | Profiling, Caching, Load Testing |
| documentation-writer | `.claude/agents/documentation-writer.md` | API Docs, Architecture, READMEs, Tutorials |
| code-reviewer | `.claude/agents/code-reviewer.md` | Code Quality, Best Practices, PR Reviews |
| test-runner | `.claude/agents/test-runner.md` | Test Execution, Failure Diagnosis, Coverage |
#### Codex Agents (13)
Defined as TOML files in `.codex/agents/`. Invoked via Codex CLI agent selection:
| Agent | File | Expertise |
|-------|------|-----------|
| web-frontend | `.codex/agents/web-frontend.toml` | React, Vue, Angular, CSS |
| backend-api | `.codex/agents/backend-api.toml` | REST, GraphQL, Server Architecture |
| security-specialist | `.codex/agents/security-specialist.toml` | OWASP, Vulnerability Analysis |
| devops-infrastructure | `.codex/agents/devops-infrastructure.toml` | Docker, K8s, CI/CD |
| ai-ml-engineer | `.codex/agents/ai-ml-engineer.toml` | ML Pipelines, LLM Integration |
| database-architect | `.codex/agents/database-architect.toml` | Schema Design, Query Optimization |
| mobile-developer | `.codex/agents/mobile-developer.toml` | React Native, Flutter, Native |
| performance-engineer | `.codex/agents/performance-engineer.toml` | Profiling, Caching, Optimization |
| documentation-writer | `.codex/agents/documentation-writer.toml` | API Docs, Architecture Docs |
| code-reviewer | `.codex/agents/code-reviewer.toml` | Code Quality, PR Reviews |
| test-runner | `.codex/agents/test-runner.toml` | Test Execution, Failure Diagnosis |
| explorer | `.codex/agents/explorer.toml` | Codebase Navigation, Research |
| implementer | `.codex/agents/implementer.toml` | Feature Implementation, Refactoring |
### Skills Library (22 Skills)
Skills are reusable knowledge documents in `.claude/skills/` that provide patterns, best practices, and guidelines. They auto-activate based on task context — when an agent works on a task matching a skill's domain, the relevant skill content is injected into the prompt.
```mermaid
graph TB
subgraph "Skills Library — .claude/skills/"
subgraph "Development (6)"
D1["react-components.md"]
D2["rest-api-design.md"]
D3["python-async.md"]
D4["graphql-development.md"]
D5["database-queries.md"]
D6["error-handling.md"]
end
subgraph "Testing (4)"
T1["unit-testing.md"]
T2["integration-testing.md"]
T3["test-driven-development.md"]
T4["performance-testing.md"]
end
subgraph "Security (4)"
S1["input-validation.md"]
S2["authentication.md"]
S3["secure-coding.md"]
S4["vulnerability-assessment.md"]
end
subgraph "DevOps (3)"
O1["docker-containerization.md"]
O2["ci-cd-pipelines.md"]
O3["kubernetes-deployment.md"]
end
subgraph "AI/ML (3)"
M1["embeddings-retrieval.md"]
M2["llm-integration.md"]
M3["rag-pipeline.md"]
end
subgraph "Documentation (3)"
C1["api-documentation.md"]
C2["architecture-docs.md"]
C3["code-documentation.md"]
end
end
```
| Category | Skill | File Path | Description |
|----------|-------|-----------|-------------|
| Development | react-components | `.claude/skills/development/react-components.md` | Component patterns, hooks, state management |
| Development | rest-api-design | `.claude/skills/development/rest-api-design.md` | RESTful endpoint design, status codes, pagination |
| Development | python-async | `.claude/skills/development/python-async.md` | asyncio, async/await, concurrency patterns |
| Development | graphql-development | `.claude/skills/development/graphql-development.md` | Schema design, resolvers, N+1 prevention |
| Development | database-queries | `.claude/skills/development/database-queries.md` | Query optimization, parameterized queries, ORMs |
| Development | error-handling | `.claude/skills/development/error-handling.md` | Exception hierarchies, retry logic, graceful degradation |
| Testing | unit-testing | `.claude/skills/testing/unit-testing.md` | pytest patterns, mocking, fixtures, assertions |
| Testing | integration-testing | `.claude/skills/testing/integration-testing.md` | Service integration, test databases, API testing |
| Testing | test-driven-development | `.claude/skills/testing/test-driven-development.md` | Red-green-refactor, test-first workflows |
| Testing | performance-testing | `.claude/skills/testing/performance-testing.md` | Load testing, benchmarking, profiling |
| Security | input-validation | `.claude/skills/security/input-validation.md` | Pydantic validation, sanitization, injection prevention |
| Security | authentication | `.claude/skills/security/authentication.md` | JWT, OAuth, session management, bcrypt |
| Security | secure-coding | `.claude/skills/security/secure-coding.md` | OWASP top 10, secure defaults, least privilege |
| Security | vulnerability-assessment | `.claude/skills/security/vulnerability-assessment.md` | CVE analysis, dependency auditing, threat modeling |
| DevOps | docker-containerization | `.claude/skills/devops/docker-containerization.md` | Multi-stage builds, layer optimization, security |
| DevOps | ci-cd-pipelines | `.claude/skills/devops/ci-cd-pipelines.md` | GitHub Actions, Jenkins, pipeline patterns |
| DevOps | kubernetes-deployment | `.claude/skills/devops/kubernetes-deployment.md` | Manifests, Helm charts, scaling strategies |
| AI/ML | embeddings-retrieval | `.claude/skills/ai-ml/embeddings-retrieval.md` | Vector stores, similarity search, indexing |
| AI/ML | llm-integration | `.claude/skills/ai-ml/llm-integration.md` | API integration, prompt engineering, token management |
| AI/ML | rag-pipeline | `.claude/skills/ai-ml/rag-pipeline.md` | Retrieval-augmented generation, chunking, re-ranking |
| Documentation | api-documentation | `.claude/skills/documentation/api-documentation.md` | OpenAPI specs, endpoint docs, examples |
| Documentation | architecture-docs | `.claude/skills/documentation/architecture-docs.md` | System diagrams, ADRs, component docs |
| Documentation | code-documentation | `.claude/skills/documentation/code-documentation.md` | Docstrings, type hints, inline comments |
Additionally, three operational skills live at the top level of `.claude/skills/`:
- `generate-reports/SKILL.md` — Generate orchestrator execution and analytics reports
- `run-tests/SKILL.md` — Run pytest suite with marker/file filtering
- `health-check/SKILL.md` — System health checks and status reports
### Domain Rules (11)
Rules in `.claude/rules/` define **mandatory coding standards** that Claude enforces across all tasks. Unlike skills (which provide guidance), rules are always active constraints:
| Rule | File | Enforces |
|------|------|----------|
| adapters | `.claude/rules/adapters.md` | BaseAdapter interface, adapter patterns |
| api-design | `.claude/rules/api-design.md` | RESTful conventions, status codes, versioning |
| testing | `.claude/rules/testing.md` | pytest standards, coverage thresholds, markers |
| performance | `.claude/rules/performance.md` | Caching, async patterns, profiling requirements |
| config | `.claude/rules/config.md` | YAML config, environment variables, no hardcoded secrets |
| ai-ml | `.claude/rules/ai-ml.md` | Model integration, embedding standards, fallback handling |
| observability | `.claude/rules/observability.md` | Logging, metrics, error reporting |
| frontend | `.claude/rules/frontend.md` | Component patterns, accessibility, styling |
| ci-cd | `.claude/rules/ci-cd.md` | Pipeline standards, deployment gates |
| security | `.claude/rules/security.md` | Input validation, parameterized queries, no shell=True |
| database | `.claude/rules/database.md` | Schema conventions, migration patterns, indexing |
### How It All Connects
```mermaid
flowchart LR
USER[User Task] --> SELECT{Agent Selection}
SELECT --> CLAUDE["Claude Agent
(.claude/agents/*.md)"]
SELECT --> CODEX["Codex Agent
(.codex/agents/*.toml)"]
CLAUDE --> RULES["Domain Rules
(.claude/rules/*.md)
Always active"]
CLAUDE --> SKILLS["Skills Library
(.claude/skills/**/*.md)
Auto-activated by context"]
CODEX --> SKILLS
RULES --> EXEC[Task Execution]
SKILLS --> EXEC
EXEC --> MCP["MCP Tools (34+)
Code Analysis · Security
Testing · DevOps · Context"]
MCP --> CTX["Context Memory
Store task + mistakes"]
CTX --> RESULT[Result + Learning]
RESULT -.->|Future tasks| SELECT
```
### MCP Tools
34+ tools are exposed via the Model Context Protocol. The major categories:
| Category | Tools | Purpose |
|----------|-------|---------|
| Code Analysis | 4 | Complexity, patterns, dependencies |
| Security | 4 | Secrets, injection, headers, audit |
| Testing | 4 | Test cases, stubs, coverage |
| DevOps | 5 | Docker, compose, CI, deploy |
| Context | 7 | Store, search, retrieve, learn |
### Configuration Files
| File | Purpose |
|------|---------|
| `AGENTS.md` | Shared instructions read by Codex, Gemini CLI, and other agentic tools |
| `.claude/CLAUDE.md` | Claude Code-specific instructions (imports AGENTS.md) |
| `.claude/settings.json` | Claude Code tool permissions and settings |
| `.codex/agents/*.toml` | Codex agent role definitions |
📚 **See [AGENTIC_INFRA.md](AGENTIC_INFRA.md) for complete documentation.**
## Performance Considerations
### Caching Strategy
```mermaid
graph LR
A[Request] --> B{Cache Hit?}
B -->|Yes| C[Return Cached]
B -->|No| D[Execute Task]
D --> E[Store in Cache]
E --> F[Return Result]
```
**Cache Types:**
- **In-memory**: Fast, volatile (TTL: 5 minutes)
- **File-based**: Persistent, slower (TTL: 24 hours)
- **Distributed**: Redis/Memcached (optional)
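A minimal sketch of the in-memory tier as a TTL decorator (the 5-minute TTL comes from the list above; the decorator shape mirrors the Decorator Pattern section earlier in this document):
```python
import time
from functools import wraps

def ttl_cache(ttl_seconds: float = 300.0):  # 5-minute in-memory tier
    """Cache results keyed by positional args until the TTL expires."""
    def decorator(func):
        store = {}

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and now - hit[0] < ttl_seconds:
                return hit[1]  # cache hit: return stored result
            result = func(*args)
            store[args] = (now, result)
            return result
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=300)
def expensive_lookup(key: str) -> str:
    return key.upper()  # placeholder for a real task execution
```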
### Async Execution
```python
import asyncio
from typing import List

async def execute_workflow_async(tasks: List[Task]):
    # Adapter-level async execution for HTTP-backed local agents;
    # `agent` stands for each task's bound adapter.
    results = await asyncio.gather(
        *[agent.execute_task_async(task.description, task.context) for task in tasks],
        return_exceptions=True,
    )
    return results
```
## Scalability
### Horizontal Scaling
- **Stateless Design**: Sessions stored externally
- **Load Balancing**: Multiple orchestrator instances
- **Database**: Shared configuration and state
- **Message Queue**: Task distribution (future enhancement)
### Vertical Scaling
- **Connection Pooling**: Reuse connections to AI services
- **Worker Threads**: Parallel task processing
- **Memory Management**: Efficient caching strategies
- **Resource Limits**: CPU and memory constraints
## Optional: MCP Integration Layer
Both systems can optionally be exposed to external MCP-compatible clients via a FastMCP 3.x server (`mcp_server/`). This is a **separate, optional component** — neither system depends on it.
```mermaid
graph TD
subgraph "MCP Clients (optional)"
CD[Claude Desktop]
CC[Claude Code]
LA[LLM Agent]
end
subgraph "MCP Server (mcp_server/)"
S[FastMCP 3.x]
S --> OT[Orchestrator Tools ×4]
S --> ATT[Agentic Team Tools ×5]
S --> ST[Shared Tools ×1]
end
subgraph "Core Systems (independent)"
ORCH[Orchestrator]
ATE[Agentic Team]
end
CD & CC & LA -->|MCP Protocol| S
OT --> ORCH
ATT --> ATE
```
See [`MCP.md`](MCP.md) for the complete MCP documentation.
---
For more information:
- [Features Documentation](FEATURES.md)
- [Agentic Team Documentation](AGENTIC_TEAM.md)
- [Orchestrator Documentation](ORCHESTRATOR.md)
- [MCP Server Documentation](MCP.md)
- [Setup Guide](SETUP.md)
- [Adding Agents Guide](ADD_AGENTS.md)
> **Easter egg:** Go to our [wiki page](https://hoangsonww.github.io/AI-Agents-Orchestrator/) and enter Konami code (↑ ↑ ↓ ↓ ← → ← → B A) for a surprise!