▶
About this layer
The topmost layer handles all persistent knowledge, memory, and organizational intelligence. This is where the system accumulates context across sessions — daily logs, curated long-term memory, living entity graphs, and cross-referenced knowledge bases. Every conversation, decision, and insight flows upward to this layer for permanent storage and retrieval. The Knowledge & Data layer is what gives the stack continuity — without it, each session starts from zero.
📓 Obsidian Vaults
6 vaults: altnativ, nexdex, vibestreet, inclination, infra, strategy
All at ~/Documents/
▸ Technical Details
Purpose
Obsidian serves as the human-readable knowledge base. Each company has its own vault with standardized folder structure (00-CHAIRMAN-VAULT, Ideas, Decisions, Knowledge, Inbox, People, etc.). Markdown files are the source of truth for strategy docs, meeting notes, research, and operational context.
Architecture
- Local-first: all vaults stored on-disk as plain Markdown
- Obsidian Git plugin provides version control per vault
- Indexed by gBrain for hybrid vector + full-text search
- Cross-vault connections discovered by Knowledge Graph Connection Finder
Integration Points
- gbrain import <vault> — syncs vault pages into knowledge graph
- gbrain embed --stale — re-embeds changed pages
- Knowledge Distiller cron (Sundays 6AM) — structural analysis across vaults
- Capture Endpoint — auto-routes thoughts/ideas to correct vault
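The two gbrain steps above can be chained into a single sync pass. A minimal sketch in Python, assuming gbrain is on PATH and using the vault names from the summary card; the real cron wiring may differ:

```python
import subprocess

VAULTS = ["altnativ", "nexdex", "vibestreet", "inclination", "infra", "strategy"]

def build_sync_commands(vaults):
    """Build the gbrain invocations for a full vault sync:
    one import per vault, then a single incremental re-embed."""
    cmds = [["gbrain", "import", v] for v in vaults]
    cmds.append(["gbrain", "embed", "--stale"])
    return cmds

def run_sync(vaults):
    # Executes each command in order; requires gbrain on PATH.
    for cmd in build_sync_commands(vaults):
        subprocess.run(cmd, check=True)
```

Running `embed --stale` once after all imports avoids re-embedding a page twice when it appears in more than one import batch.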
⚡ Custom Skills
31 skills including:
code-pipeline, contentloop, knowledge-graph
gstack/* (16 sub-skills)
grill-me, browse-learn, daily-briefing
▸ Technical Details
Purpose
Skills are reusable, composable instruction sets that teach the agent how to perform specialized tasks. Each skill is a directory containing a SKILL.md (instructions) plus optional scripts, reference files, and config. They load on-demand when the agent detects a matching task.
Skill Categories
- Build: code-pipeline, gstack (16 sub-skills for Claude Code orchestration)
- Research: browse-learn, grill-me, content-recycler
- Operations: daily-briefing, healthcheck, fullbackup, gitbackup
- Communication: linkedin-post, linkedin-feedback, linkedin-engagement, xurl
- Knowledge: knowledge-graph, supermemory, obsidian
- Infrastructure: openclaw-ops, session-logs, node-connect
Resolution
When a task arrives, the agent scans all skill descriptions. If exactly one matches, it reads the SKILL.md and follows instructions. Skills are never stacked — only one per task.
🧠 Context Persistence Layer
entity-state.json — Living ontology
active-threads.json — WIP tracker
cross-ref-index.json — Entity relations
session-snapshots/ — Pre-compaction state
▸ Technical Details
Purpose
The CPL bridges the gap between session-based memory (volatile) and permanent storage (Obsidian vaults). It maintains a living model of the current operational state — what's being worked on, who's involved, and how entities connect. This is what allows the agent to resume context after compaction or restart.
Components
- entity-state.json — Tracks all known people, companies, projects, and their current status. Updated after every significant interaction.
- active-threads.json — WIP tracker. Records unfinished tasks, pending decisions, and open conversations across channels.
- cross-ref-index.json — Bidirectional entity relations (person → company → project → vault). Enables "who works on what and where is it documented" queries.
- session-snapshots/ — Full state capture before context compaction. Run via bash context/session-snapshot.sh "<topic>"
Lifecycle
Read at session start → updated during session → saved before compaction. The snapshot script captures the full state so no context is lost during token compaction.
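The snapshot/restore cycle could be sketched roughly as follows, assuming the three state files live alongside a session-snapshots/ directory; the actual context/session-snapshot.sh script may capture more:

```python
import json
import time
from pathlib import Path

STATE_FILES = ["entity-state.json", "active-threads.json", "cross-ref-index.json"]

def snapshot(context_dir: Path, topic: str) -> Path:
    """Copy the CPL state files into a timestamped folder under
    session-snapshots/ so context survives compaction or restart."""
    stamp = time.strftime("%Y-%m-%d-%H%M%S")
    dest = context_dir / "session-snapshots" / f"{stamp}-{topic}"
    dest.mkdir(parents=True, exist_ok=True)
    for name in STATE_FILES:
        src = context_dir / name
        if src.exists():
            (dest / name).write_text(src.read_text())
    return dest

def restore_latest(context_dir: Path) -> dict:
    """Read the newest snapshot back as file name -> parsed JSON."""
    snaps = sorted((context_dir / "session-snapshots").iterdir())
    latest = snaps[-1]
    return {p.name: json.loads(p.read_text()) for p in latest.glob("*.json")}
```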
💾 Memory & Recovery
MEMORY.md — Curated long-term
memory/YYYY-MM-DD.md — Daily logs
Disaster-Recovery/ — Rolling 3-backup
Git repo + hooks (pre/post tool-use)
▸ Technical Details
Purpose
Three-tier memory system providing different persistence guarantees: daily logs for raw events (append-only), MEMORY.md for curated long-term knowledge (manually maintained), and Disaster-Recovery for full system restore capability.
Memory Hierarchy
- Working memory: Current session context window (up to 100K tokens, compacted via qwen3.5)
- Daily logs: memory/YYYY-MM-DD.md — raw chronological events, append-only during pre-compaction flush
- Long-term: MEMORY.md — curated, deduplicated, human-readable summary of permanent knowledge
- Recovery: Disaster-Recovery/ — rolling 3-backup of full workspace (Git bundle + tar.gz)
Backup Process
Each backup creates a timestamped subfolder with Git bundle + full workspace tar.gz. After every new backup, only the 3 most recent are retained — older ones auto-deleted. Triggered via /fullbackup or /gitbackup commands.
▶
About this layer
The AI Model Fleet provides the reasoning, coding, and creative capabilities of the stack. Each model is selected for its strengths: GLM-5.1 for cost-effective general intelligence as the primary brain, Claude Sonnet for code-critical tasks, Kimi K2.6 for deep research, and Fal.ai for visual assets. The layer is designed for model diversity — no two consecutive pipeline stages share the same model, ensuring quality through independent verification.
🟣 GLM-5.1 PRIMARY
Provider: zai | Cost: ~$45/mo
Role: Apex main brain | Context: 128K tokens
▸ Technical Details
Purpose
Primary reasoning engine for all non-code tasks: strategy, research, communication, decision-making, memory management, and agent orchestration. Handles the majority of daily operations at ~$1.50/day.
Capabilities
- 128K token context window — handles full whitepapers, code reviews, multi-hour sessions
- Strong multi-language support (English, Urdu, Chinese)
- Cost-effective for high-volume operations
- Used in Code Pipeline as Architect and Tester stages
When It's Used
- All channel message routing (Discord, WhatsApp, Web)
- Heartbeat checks, daily briefings, cron jobs
- Knowledge graph operations, content generation
- Paperclip issue management
🟠 Claude Sonnet 4.6 FALLBACK
Provider: OpenRouter | Cost: Pay-per-use
Role: Code-critical, fallback | Compaction: Haiku 4.5
▸ Technical Details
Purpose
High-fidelity model reserved for tasks where code accuracy and nuanced reasoning are critical. Also serves as the fallback when GLM-5.1 is unavailable or when the task requires stronger instruction-following capabilities.
When It's Used
- Code Pipeline Coder stage (implementation from Architect's spec)
- Complex debugging, security-sensitive code review
- Whitepaper debate partner (independent perspective)
- Fallback routing when primary model fails
Compaction Model
Claude Haiku 4.5 handles context compaction — summarizing long conversations to fit within the 100K token hard cap. This is a cost-saving measure: Haiku is fast and cheap for summarization tasks.
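One plausible shape for the compaction step, with a crude character-based token estimate and a pluggable summarize callback standing in for Haiku; the trigger threshold and budget split are illustrative assumptions, not documented values:

```python
HARD_CAP = 100_000                 # token hard cap from the gateway
COMPACT_AT = int(HARD_CAP * 0.8)   # assumed trigger threshold

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return max(1, len(text) // 4)

def compact(messages: list[str], summarize) -> list[str]:
    """When total estimated tokens exceed the trigger threshold,
    replace the oldest messages with a single summary produced by the
    cheap compaction model, keeping recent messages verbatim."""
    total = sum(estimate_tokens(m) for m in messages)
    if total <= COMPACT_AT:
        return messages
    keep: list[str] = []
    budget = COMPACT_AT // 2   # leave headroom for the summary itself
    for msg in reversed(messages):
        if sum(estimate_tokens(m) for m in keep) + estimate_tokens(msg) > budget:
            break
        keep.insert(0, msg)
    older = messages[: len(messages) - len(keep)]
    return [summarize(older)] + keep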
🔵 Kimi K2.6 RESEARCH
Provider: OpenRouter | Cost: $10/mo cap
Role: Research & deep analysis
▸ Technical Details
Purpose
Deep research and analysis model. Used for tasks requiring extensive web search synthesis, long-document analysis, and cross-domain reasoning. Monthly cost capped at $10 via OpenRouter usage limits.
When It's Used
- Industry research and competitive analysis
- Long-document summarization and synthesis
- Cross-referencing multiple sources for due diligence
- Research tasks that benefit from its large context window
🎨 Fal.ai Flux Dev IMAGES
Role: Image generation | Cost: Pay-per-use
Watercolor illustrations, UI assets
▸ Technical Details
Purpose
Visual asset generation for all companies. Produces watercolor-style illustrations, social media graphics, UI mockups, and presentation assets on demand.
Integration
- Called via the image_generate tool in OpenClaw
- Supports text-to-image, image editing, and style transfer
- Output automatically saved to managed media directory
- Used for LinkedIn post images, whitepaper diagrams, pitch deck assets
▶
About this layer
The human-facing interface layer. All interactions with the Chairman, co-workers, and external parties flow through these channels. OpenClaw Gateway handles channel multiplexing — a single agent session can receive and respond across Discord, WhatsApp, and the Web UI simultaneously. Each channel has its own formatting rules and delivery semantics, but the agent logic is channel-agnostic.
💬 Discord PRIMARY
Channels: #apex-infra, #apex-verysmart, #apex-nexdex, #apex-altnativ, #apex-vibestreet
Bridge: Paperclip Bridge (Node.js sync)
▸ Technical Details
Purpose
Primary communication hub for all company operations. Each company has a dedicated channel for focused context. Discord also hosts the Paperclip Bridge — a Node.js service that syncs issue status between Discord and the Paperclip backend.
Channel Architecture
- #apex-infra — Infrastructure ops, system alerts, gateway status
- #apex-verysmart — Personal brand, social media, job hunting
- #apex-nexdex — Trading platform development
- #apex-altnativ — Automation agency, client work
- #apex-vibestreet — ZHC marketplace
- #apex-emerge — emergE Architecture project
Co-Worker Access
Trusted co-workers (Brenda, Alizain) interact via Discord with scoped permissions. Brenda gets read access + sandboxed workspace. Alizain gets full technical access across company channels with Paperclip CRUD.
📱 WhatsApp
Chairman direct line, Heartbeat alerts, Briefing delivery
▸ Technical Details
Purpose
Direct line to the Chairman for time-sensitive alerts, daily briefings, and quick interactions. Used for heartbeat notifications when the system detects urgent items requiring human attention.
Message Types
- Heartbeat alerts: Urgent emails, events <2h away, prolonged silence detection
- Daily briefings: Morning summary of overnight activity, pending decisions, schedule
- Quick commands: Chairman can send short instructions from mobile
- Approval flow: External action confirmations (email drafts, social posts)
Rules
- Quiet hours: 23:00–08:00 PKT (no outbound unless urgent)
- No markdown tables — use bold text or CAPS for structure
- Batch information — no drip-feeding
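The quiet-hours rule reduces to a small time-window check. A sketch assuming PKT is a fixed UTC+5 offset (Pakistan observes no DST):

```python
from datetime import datetime, time, timedelta, timezone

PKT = timezone(timedelta(hours=5))  # Pakistan Standard Time, no DST
QUIET_START = time(23, 0)
QUIET_END = time(8, 0)

def may_send(now: datetime, urgent: bool = False) -> bool:
    """Outbound WhatsApp is suppressed 23:00-08:00 PKT unless urgent."""
    local = now.astimezone(PKT).time()
    in_quiet = local >= QUIET_START or local < QUIET_END  # wraps midnight
    return urgent or not in_quiet
```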
🌐 Web UI
OpenClaw control panel, Session management
▸ Technical Details
Purpose
Browser-based control panel for the OpenClaw Gateway. Provides session management, configuration editing, tool testing, and real-time agent monitoring. Used primarily for debugging and administration.
Features
- Session listing and history inspection
- Agent configuration and model overrides
- Tool testing and approval management
- Gateway status and health monitoring
- Cron job management
▶
About this layer
The automation layer provides the tools and agents that execute tasks beyond simple text generation. This includes browser automation for web scraping and interaction, voice interfaces, scheduled task execution, and remote terminal access. These are the "hands" of the system — they turn decisions into actions across digital interfaces.
🦊 CamoFox Browser :9377
- Stealth browser automation, VNC support
- LinkedIn scraping, Cookie injection
- CamoFox anti-detection engine
▸ Technical Details
Purpose
Stealth browser automation for web scraping, social media interaction, and any task requiring a real browser footprint. Uses CamoFox anti-detection engine to avoid bot detection while performing legitimate automation tasks.
Technical Stack
- Based on Playwright with anti-fingerprint patches
- VNC support for visual debugging
- Cookie injection for authenticated sessions
- Proxy rotation for rate-limit avoidance
Use Cases
- LinkedIn profile and job scraping
- Competitive research across protected sites
- Social media posting via browser (not API)
- Form filling and data entry automation
🎙️ Apex Voice Bot
Wake word: "apex" | Stack: Whisper STT, macOS TTS, Python
Discord voice integration
▸ Technical Details
Purpose
Voice-activated interface for hands-free interaction. The bot listens for the wake word "apex" and routes spoken commands through the same OpenClaw Gateway as text messages.
Technical Stack
- STT: OpenAI Whisper (local inference via Whisper.cpp)
- TTS: macOS native text-to-speech
- Runtime: Python, runs as LaunchAgent
- Integration: Discord voice channel capture
Limitations
- Discord DAVE encryption blocks live voice capture — uses voice messages in text channels instead
- Emojis stripped before speaking (configurable)
⏰ Cron Jobs
- Weekly Vault Digest (Mon 9AM)
- gBrain sync, Git repo sync
- Knowledge Distiller, Connection Finder
▸ Technical Details
Purpose
Scheduled automation for maintenance, monitoring, and recurring tasks. All cron jobs run through OpenClaw's built-in scheduler and use Ollama models (qwen3.5/gemma4) for zero incremental cost.
Active Jobs
- Mon 9AM: Weekly Vault Digest — summarizes all vault changes
- Daily 9AM: LinkedIn Performance Check — engagement metrics analysis
- Daily 8AM: CXO Job Scout — discovers and scores executive roles
- Mon-Fri 10AM: LinkedIn Engagement — monitors feeds, drafts comments
- Sun 6PM: LinkedIn Network — batch connection requests
- Sun 8PM: Social Listening Sweep — industry trend analysis
- Sun 9PM: Weekly Content Planner — schedules next week's content
- Sun 6AM: Knowledge Distiller — structural analysis across 6 vaults
- Sun 7AM: Connection Finder — cross-vault relationship discovery
- Daily 6AM: Structural Distiller — cognitive analysis
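If the scheduler accepts standard five-field cron syntax (an assumption; OpenClaw may use its own schedule format), the job list above maps to:

```python
# Five-field cron expressions (minute hour day-of-month month day-of-week)
# for the schedule above; field order per crontab(5), Sunday = 0.
CRON_JOBS = {
    "weekly-vault-digest":  "0 9 * * 1",     # Mon 9AM
    "linkedin-performance": "0 9 * * *",     # Daily 9AM
    "cxo-job-scout":        "0 8 * * *",     # Daily 8AM
    "linkedin-engagement":  "0 10 * * 1-5",  # Mon-Fri 10AM
    "linkedin-network":     "0 18 * * 0",    # Sun 6PM
    "social-listening":     "0 20 * * 0",    # Sun 8PM
    "content-planner":      "0 21 * * 0",    # Sun 9PM
    "knowledge-distiller":  "0 6 * * 0",     # Sun 6AM
    "connection-finder":    "0 7 * * 0",     # Sun 7AM
    "structural-distiller": "0 6 * * *",     # Daily 6AM
}
```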
💻 Web Terminal :8900
Browser-based terminal access
Remote management
▸ Technical Details
Purpose
Browser-based terminal for remote system administration. Allows the Chairman or authorized users to access the Mac mini's command line from any device without SSH setup.
Capabilities
- Full shell access (zsh) via web browser
- Real-time command output streaming
- Useful for emergency debugging from mobile
- No VPN/SSH required — accessible from any network
▶
About this layer
The nervous system of the stack. OpenClaw Gateway is the central orchestrator routing messages, managing sessions, and coordinating all layers. Paperclip provides the Company OS — structured project management across 6 companies with issue tracking, status boards, and Discord integration. gBrain gives the stack a searchable knowledge graph spanning all vaults. Together, these three services form the operational backbone that everything else depends on.
⚡ OpenClaw Gateway :18789
- Agent orchestration, Session management
- Context compaction (qwen3.5)
- Tool routing, Channel multiplexing
- Sub-agent spawning, Heartbeat system
- Skill loading
▸ Technical Details
Purpose
Central orchestrator for all agent operations. Routes incoming messages from channels (Discord, WhatsApp, Web) to the appropriate agent session, manages context windows, handles tool calls, and coordinates sub-agent spawning.
Core Functions
- Session management: Each channel+conversation gets a persistent session with its own context window
- Context compaction: When context approaches the 100K token hard cap, qwen3.5 summarizes and compresses older messages
- Tool routing: Maps agent tool calls to actual system operations (file read/write, shell exec, web fetch, etc.)
- Channel multiplexing: One agent can serve multiple channels simultaneously
- Sub-agent spawning: Isolate complex tasks in child sessions (debates, coding, research)
- Heartbeat system: Periodic check-ins for urgent items (email, calendar, schedule conflicts)
Startup Flow
Gateway starts → loads skills → initializes channels → restores sessions → begins heartbeat loop. All state persists across restarts via on-disk session storage.
📎 Paperclip :3100
Company OS — Issue tracking, Status boards
Entities: ALTA NEX VIB VER INC SPE
Backend: PostgreSQL :54329 + Discord Bridge
▸ Technical Details
Purpose
Company OS providing structured project management across all Paperclip companies. Every task for a tracked company goes through Paperclip — create issue, get identifier, execute, mark done. No exceptions.
Entity Prefixes
- ALTA — Altnativ (enterprise automation agency)
- NEX — Nexdex (trading automation platform)
- VIB — Vibestreet (ZHC marketplace)
- VER — VerySmart (personal brand management)
- INC — Inclination (AI shopping assistant)
- SPE — Special (cross-company tasks)
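Identifier minting from these prefixes can be sketched as below; this is illustrative only, since the real backend assigns IDs in PostgreSQL:

```python
PREFIXES = {"ALTA", "NEX", "VIB", "VER", "INC", "SPE"}

def next_issue_id(prefix: str, counters: dict[str, int]) -> str:
    """Mint the next identifier for a company, e.g. ALTA-42.
    counters maps prefix to the last number issued."""
    if prefix not in PREFIXES:
        raise ValueError(f"unknown company prefix: {prefix}")
    counters[prefix] = counters.get(prefix, 0) + 1
    return f"{prefix}-{counters[prefix]}"
```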
Integration
- Discord Bridge syncs issue status to company channels
- Code Pipeline tracks stages via Paperclip issue labels
- Agent creates issues automatically when tasks are identified
- PostgreSQL backend on custom port 54329
🧠 gBrain Knowledge Graph
- PGLite embedded DB, 720+ pages
- Hybrid search (vector + full-text)
- 6 vaults indexed, gbrain CLI
- Knowledge Distiller, Connection Finder
▸ Technical Details
Purpose
Searchable knowledge graph that indexes all 6 Obsidian vaults and enables hybrid search across the entire organizational knowledge base. Powers the agent's ability to retrieve relevant context from hundreds of documents instantly.
Technical Architecture
- Storage: PGLite (embedded PostgreSQL) — zero-config, file-based
- Search: Hybrid vector similarity + full-text search
- Embeddings: nomic-embed-text via Ollama (local, free)
- CLI: gbrain command at ~/.bun/bin/gbrain
Key Commands
- gbrain import <vault> — Import vault pages into knowledge graph
- gbrain embed --stale — Re-embed changed pages only
- gbrain search <query> — Hybrid search across all vaults
- gbrain doctor — Verify index health and coverage
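How gbrain search fuses the two signals isn't documented here; reciprocal-rank fusion is one common way to combine a vector ranking with a full-text ranking, sketched under that assumption:

```python
import math

def cosine(a, b):
    """Vector similarity score for the embedding side of the search."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def hybrid_rank(vector_scores: dict[str, float],
                text_scores: dict[str, float],
                k: int = 60) -> list[str]:
    """Reciprocal-rank fusion: each page's fused score is the sum of
    1/(k + rank) over both result lists; absent pages get a rank one
    past the end of that list."""
    def ranks(scores):
        ordered = sorted(scores, key=scores.get, reverse=True)
        return {page: i + 1 for i, page in enumerate(ordered)}
    vr, tr = ranks(vector_scores), ranks(text_scores)
    pages = set(vector_scores) | set(text_scores)
    fused = {
        p: 1 / (k + vr.get(p, len(vr) + 1)) + 1 / (k + tr.get(p, len(tr) + 1))
        for p in pages
    }
    return sorted(pages, key=fused.get, reverse=True)
```

A page ranked well by both signals beats a page that tops only one list, which is the point of the hybrid approach.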
Stats
720+ pages indexed across 6 vaults, 99% coverage. 4 oversized distillations hit Ollama context limit (non-critical).
🐍 Hermes Agent
77 skills, Python venv, Browser automation engine
Reserve agent (not daily driver)
▸ Technical Details
Purpose
Legacy agent platform with 77 built-in skills and browser automation. Currently in reserve — not the daily driver but available for tasks that benefit from its extensive skill library or Python-native tooling.
Capabilities
- 77 pre-built skills for various automation tasks
- Python virtual environment with full library access
- Browser automation engine (pre-CamoFox)
- Can be activated for specific workloads
Status
Active reserve. Superseded by OpenClaw as the primary agent framework, but maintained for specialized tasks and as a fallback.
▶
About this layer
The foundation services that everything else builds on. Ollama provides free local inference for compaction, embeddings, and backup reasoning — keeping costs near zero for routine operations. PostgreSQL and Redis provide persistent and ephemeral data storage. LM Studio enables local model testing and inference for experimentation. All services run as macOS LaunchAgents for automatic startup and recovery.
🤖 Ollama :11434
qwen3.5 — compaction, safeguard
gemma4 — compaction, backup
gemma4-paperclip — Paperclip agent
nomic-embed-text — embeddings
▸ Technical Details
Purpose
Local LLM inference engine providing zero-cost AI capabilities for high-volume, latency-tolerant tasks. Runs on Apple Silicon GPU via Metal acceleration.
Models
- qwen3.5: Context compaction, safeguard checks, background tasks. Primary workhorse.
- gemma4: Backup compaction model, cognitive distiller analysis
- gemma4-paperclip: Dedicated model for Paperclip issue management agent
- nomic-embed-text: Text embeddings for gBrain knowledge graph
Why Local
- Zero incremental cost — critical for high-frequency operations like compaction
- No network latency — compaction runs in <10 seconds
- Privacy — sensitive context never leaves the machine
- Resilience — works offline, no API dependency
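The embedding path can be sketched against Ollama's public HTTP API (the endpoint and field names below follow that API; gBrain's internal client may differ):

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/embeddings"

def embedding_payload(text: str, model: str = "nomic-embed-text") -> bytes:
    """Build the JSON body for Ollama's embeddings endpoint."""
    return json.dumps({"model": model, "prompt": text}).encode()

def embed(text: str) -> list[float]:
    # Requires a local Ollama server on :11434 with the model pulled.
    req = request.Request(OLLAMA_URL, data=embedding_payload(text),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```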
🔴 Redis :6379
Cache, Queue, Session state
▸ Technical Details
Purpose
In-memory data store used for caching, message queuing, and ephemeral session state. Provides sub-millisecond access for frequently needed data.
Usage
- Cache: LLM response caching, tool result memoization
- Queue: Background job queue for cron tasks and sub-agent coordination
- Session state: Active session metadata, heartbeat state tracking
- Rate limiting: API call tracking and throttling
🐘 PostgreSQL :54329
Paperclip embedded DB, Issue tracking
▸ Technical Details
Purpose
Primary relational database for the Paperclip Company OS. Stores all issue data, company entities, status boards, and project metadata. Runs on a non-standard port (54329) to avoid conflicts with any development PostgreSQL instances.
Schema
- Issues table with company prefix, status, assignee, labels
- Company entities (ALTA, NEX, VIB, VER, INC, SPE)
- Status transition history and audit log
- Discord Bridge sync state
🧪 LM Studio :41345
Local inference, Model testing
▸ Technical Details
Purpose
GUI-based local model testing environment. Used for experimenting with new models before deploying them to Ollama, and for ad-hoc inference tasks that benefit from a visual interface.
Usage
- Model evaluation before Ollama deployment
- Prompt engineering and testing
- Ad-hoc inference with custom model parameters
- Not a production dependency — testing only
⬇ LAYER 1 — HARDWARE & RUNTIME
Mac mini M4
macOS Darwin 25.4.0
Hardware & Runtime Details
The entire stack runs on a single Mac mini M4 — Apple Silicon ARM64 architecture. The unified memory architecture provides high bandwidth for local LLM inference via Metal GPU acceleration. All services run as macOS LaunchAgents (16 total) for automatic startup, crash recovery, and resource management.
- CPU: Apple M4 (ARM64)
- OS: macOS Darwin 25.4.0
- Runtime: Node.js v25.9.0 (OpenClaw), Python 3 (Hermes/Voice Bot), Bun (gBrain)
- Services: 16 LaunchAgents — auto-start, crash recovery, resource-managed
- Cost: $0 hosting — single machine, residential connection