@chrija76
Created January 25, 2026 17:26
Knowledge Hub: Technical Deep Dive

A companion to "Building a Knowledge Hub in 13 Days"


What It Is

Knowledge Hub is an enterprise knowledge aggregation platform that syncs data from multiple sources into a searchable vector database with AI-powered retrieval-augmented generation (RAG).

Primary use case: Ask natural language questions across all company data and get AI-generated answers with source citations.


Tech Stack

| Layer | Technology |
| --- | --- |
| Backend | Python 3.11, Flask 3.1, Gunicorn |
| Database | PostgreSQL (production), SQLite (development) |
| ORM | Flask-SQLAlchemy with Flask-Migrate (Alembic) |
| Vector Store | Qdrant (1536-dimensional embeddings) |
| Embeddings | OpenAI text-embedding-3-small |
| LLM | Anthropic Claude (claude-sonnet-4-20250514) |
| Reranking | Cohere API with local fallback |
| Search | Hybrid: semantic + BM25 keyword + cross-encoder reranking |
| Spell Correction | SymSpellPy |
| Auth | Google OAuth 2.0 with encrypted credential storage |
| Background Jobs | APScheduler |
| Slack | Slack Bolt SDK with Socket Mode |
| MCP | FastMCP (stdio + SSE transports) |
| Document Processing | PDFPlumber, python-docx, openpyxl, python-pptx, Pillow |
| Deployment | Replit |

Architecture

flowchart TB
    subgraph SUPERVISOR[" ☑ SUPERVISOR "]
        health["Health Checks<br/>Auto-Restart"]
    end

    subgraph SERVICES[" ⚡ SERVICES "]
        flask["Flask App<br/>119 Endpoints"]
        mcp["MCP Server<br/>Claude Integration"]
        slack["Slackbot<br/>Team Q&A"]
    end

    subgraph CORE[" ⚙ CORE - 36 Modules "]
        sync["Sync<br/>Manager"]
        rag["RAG<br/>Engine"]
        query["Query<br/>Processor"]
        search["Hybrid<br/>Search"]
        rerank["Reranker"]
        circuit["Circuit<br/>Breaker"]
    end

    subgraph CONNECTORS[" 🔌 CONNECTORS "]
        sources["Gmail • Drive • Slack • Zendesk<br/>Attio • Granola • ChatGPT • Dropbox"]
    end

    subgraph STORAGE[" 💾 STORAGE "]
        postgres[("PostgreSQL<br/>Users, OAuth<br/>Sync State")]
        qdrant[("Qdrant<br/>Vectors<br/>BM25 Index")]
    end

    subgraph EXTERNAL[" 🌐 EXTERNAL APIs "]
        openai["OpenAI<br/>Embeddings"]
        anthropic["Anthropic<br/>Claude LLM"]
        cohere["Cohere<br/>Reranking"]
    end

    SUPERVISOR --> flask & mcp & slack
    flask & mcp & slack --> CORE
    CORE --> CONNECTORS
    CORE --> postgres & qdrant
    CORE --> openai & anthropic & cohere

Component Overview

| Component | Purpose |
| --- | --- |
| Supervisor | Process manager with health checks and auto-restart |
| Flask App | Web dashboard, 119 REST API endpoints, Google OAuth |
| MCP Server | Enables Claude Desktop/Web to query the knowledge base |
| Slackbot | Team members can ask questions from Slack |
| Sync Manager | Orchestrates parallel data sync (up to 6 concurrent sources) |
| Query Processor | Spell correction, intent classification, query optimization |
| RAG Engine | Answer synthesis with source citations |
| Hybrid Search | Combines semantic vectors + BM25 keyword search |
| Reranker | Cross-encoder reranking via Cohere with local fallback |
| Circuit Breaker | Resilience pattern for external service failures |

Data Sources

| Source | What's Synced | Special Features |
| --- | --- | --- |
| Gmail | Emails, attachments (PDF, DOCX, images) | 20 parallel workers, attachment extraction |
| Google Drive | Docs, Sheets, Slides, PDFs | Format conversion, 10MB file limit |
| Slack | Messages, threads, channels | User enrichment, rate-limit aware |
| Zendesk | Investment opportunities, deal tracking | HTML stripping, 5 parallel workers |
| Attio | Companies, contacts, notes, lists | Configurable object filtering, 365-day recency |
| Granola | AI meeting notes, transcripts | ProseMirror JSON parsing |
| ChatGPT | Exported conversation history | Staging DB with approval workflow |
| Dropbox | Documents, PDFs, text files | 10+ file types, 10MB limit |

Base Connector Features:

  • HTTP session pooling for connection reuse
  • Retry logic with exponential backoff (max 3 retries)
  • Rate limit handling (429 status code detection)
  • Request timeout handling (30 seconds default)
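
The retry behaviour listed above could be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual `base.py`: the `RateLimited` exception, the `call_with_retry` helper, the 1-second base delay, and the jitter are all assumptions. Only the limit of 3 retries and the 429 handling come from the list.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error a connector raises on an HTTP 429 response."""

    def __init__(self, retry_after=None):
        super().__init__("rate limited (429)")
        self.retry_after = retry_after  # seconds suggested by the server, if any


def call_with_retry(request_fn, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call request_fn(); on failure retry up to max_retries times.

    The wait doubles after each failed attempt (base_delay, 2x, 4x, ...),
    unless a 429 response supplied its own retry_after value.
    """
    for attempt in range(max_retries + 1):
        try:
            return request_fn()
        except RateLimited as exc:
            if attempt == max_retries:
                raise
            if exc.retry_after is not None:
                delay = exc.retry_after
            else:
                delay = base_delay * 2 ** attempt
        except (TimeoutError, ConnectionError):
            if attempt == max_retries:
                raise
            delay = base_delay * 2 ** attempt
        # Small random jitter keeps parallel workers from retrying in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; a connector would wrap each outgoing request in `call_with_retry`.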

Key Features

Search & Retrieval

  • Hybrid search – Vector similarity + BM25 keyword matching
  • Query intent classification – Factual, Exploratory, Navigational, Troubleshooting, Person Lookup, Temporal
  • Spell correction – SymSpellPy integration
  • Cross-encoder reranking – Cohere API with local fallback
  • HyDE – Hypothetical Document Embeddings (optional advanced retrieval)
  • Dynamic relevancy thresholding – Adjusts cutoff based on result quality
  • Source-specific weighting – Freshness decay per source type
  • Query caching – LRU cache with TTL
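
One common way to combine a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). The source does not say which fusion method `hybrid_search.py` uses, so the sketch below is an assumption chosen for illustration; the function name and document IDs are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is the constant commonly used for RRF.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


semantic_hits = ["doc_a", "doc_b", "doc_c"]  # e.g. nearest vectors
keyword_hits = ["doc_b", "doc_d", "doc_a"]   # e.g. BM25 matches
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

RRF needs only ranks, not comparable scores, which is convenient when cosine similarities and BM25 scores live on different scales. A reranker would then reorder the top of the fused list.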

RAG (Retrieval-Augmented Generation)

  • Multi-turn conversation context (up to 10 turns)
  • Session-based history tracking
  • Entity mention tracking across turns
  • Source citations in responses
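
Session-based history with a 10-turn cap can be kept with a bounded queue per session. The sketch below is one possible shape, not the project's implementation; the `ConversationStore` class and its method names are hypothetical.

```python
from collections import defaultdict, deque

MAX_TURNS = 10  # the cap stated above


class ConversationStore:
    """Keeps the most recent turns per session; older turns fall off."""

    def __init__(self, max_turns=MAX_TURNS):
        self._sessions = defaultdict(lambda: deque(maxlen=max_turns))

    def add_turn(self, session_id, question, answer):
        self._sessions[session_id].append(
            {"question": question, "answer": answer}
        )

    def context(self, session_id):
        """Return prior turns, oldest first, ready to prepend to a prompt."""
        return list(self._sessions[session_id])
```

A `deque` with `maxlen` discards the oldest turn automatically, so the prompt context stays bounded without explicit trimming.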

Data Sync

  • Parallel multi-source syncing (up to 6 concurrent)
  • Full and incremental sync modes
  • Automatic stall detection (watchdog)
  • Scheduled syncing (hourly/daily/weekly/monthly)
  • Comprehensive sync logging to database
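
Capping concurrency at 6 sources maps naturally onto a thread pool. The following is a minimal sketch under assumed names: `sync_all` and the mapping of source names to zero-argument callables are hypothetical, and the real Sync Manager also handles logging, scheduling, and stall detection.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_CONCURRENT_SOURCES = 6  # the cap stated above


def sync_all(sources, max_workers=MAX_CONCURRENT_SOURCES):
    """Run each source's sync callable in parallel; collect results and errors.

    `sources` maps a source name to a zero-argument callable (hypothetical
    interface). One failing source does not stop the others.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(fn): name for name, fn in sources.items()}
        for future in as_completed(futures):
            name = futures[future]
            try:
                results[name] = future.result()
            except Exception as exc:  # record and continue with other sources
                errors[name] = exc
    return results, errors
```

Threads suit this workload because connector syncs spend most of their time waiting on network I/O.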

Resilience

  • Circuit breaker pattern – Auto state transitions (CLOSED → OPEN → HALF_OPEN)
  • Retry queue – Failed sync items automatically retried
  • Health monitoring – Database, Qdrant, OpenAI API checks
  • Auto-restart – Failed services automatically recovered
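
The three circuit breaker states can be illustrated with a small state machine. This is a generic sketch of the pattern, not the contents of `circuit_breaker.py`; the class names, the failure threshold of 3, and the 30-second reset timeout are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self.state = "CLOSED"
        self._failures = 0
        self._opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("service temporarily disabled")
            self.state = "HALF_OPEN"  # timeout elapsed: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if (self.state == "HALF_OPEN"
                    or self._failures >= self.failure_threshold):
                self.state = "OPEN"
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and resets the failure count.
        self.state = "CLOSED"
        self._failures = 0
        return result
```

While the circuit is OPEN, calls fail immediately instead of waiting on a timeout, which is what lets a fallback (such as the local reranker) take over quickly.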

Security & Privacy

  • Multi-user with data isolation (user-scoped vector queries)
  • Role-based access control (user/admin)
  • GDPR-compliant user deletion with verification
  • OAuth credential encryption
  • API key management with scopes and rate limiting
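
User-scoped vector queries usually mean that every search carries a mandatory filter on the owner. The sketch below builds a filter in the shape of Qdrant's JSON filter conditions (`must` / `key` / `match`); the payload field names `user_id` and `source` and the helper itself are assumptions, since the source does not show the actual schema.

```python
def user_scoped_filter(user_id, source=None):
    """Build a filter that restricts a vector search to one user's documents."""
    if not user_id:
        # Refuse to build an unscoped query rather than risk leaking data.
        raise ValueError("user_id is required for every vector query")
    # "user_id" and "source" are assumed payload field names.
    conditions = [{"key": "user_id", "match": {"value": user_id}}]
    if source is not None:
        conditions.append({"key": "source", "match": {"value": source}})
    return {"must": conditions}
```

Building the filter in one place, and failing loudly when the user is missing, makes it harder for a new endpoint to query across tenants by accident.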

Integrations

  • Slack – @mention handling, company-specific search, context-aware answers
  • Claude Desktop – Local MCP server (stdio transport)
  • Claude.ai – Remote MCP server (SSE transport)

Codebase Statistics

Lines of Code

| Category | Lines |
| --- | --- |
| Total Python | ~43,800 |
| Main application (app.py) | 6,290 |
| Sync Manager | 1,515 |
| Vector DB wrapper | 1,352 |
| API v1 endpoints | 1,053 |
| Auth module | 708 |
| Database models | 193 |
| Templates (HTML) | ~12,400 |

File Counts

| Type | Count |
| --- | --- |
| Python files | 95 |
| HTML templates | 18 |
| Core modules | 36 |
| Data connectors | 8 |
| Test files | 12 |
| API endpoints | 119 |

Project Structure

knowledgehub/
├── app.py                   # Main Flask app (6,290 LOC, 119 endpoints)
├── supervisor.py            # Process manager
├── run_slackbot.py          # Slack bot runner
├── mcp_server.py            # Claude Desktop MCP (stdio)
├── remote_mcp_server.py     # Claude Web MCP (SSE)
├── src/
│   ├── core/                # 36 core modules
│   │   ├── sync_manager.py  # Parallel sync orchestration
│   │   ├── vector_db.py     # Qdrant wrapper
│   │   ├── embeddings.py    # OpenAI embeddings with cache
│   │   ├── rag_generator.py # Answer synthesis
│   │   ├── query_processor.py # Intent classification
│   │   ├── hybrid_search.py # Vector + BM25 fusion
│   │   ├── reranker.py      # Cross-encoder reranking
│   │   ├── circuit_breaker.py # Resilience pattern
│   │   ├── health_monitor.py # Component health checks
│   │   ├── user_deletion.py # GDPR compliance
│   │   └── ...
│   ├── connectors/          # 8 data source connectors
│   │   ├── base.py          # Common patterns
│   │   ├── gmail.py
│   │   ├── google_drive.py
│   │   ├── slack.py
│   │   ├── zendesk.py
│   │   ├── attio.py
│   │   ├── granola.py
│   │   ├── chatgpt/staging_db.py
│   │   └── dropbox.py
│   └── api/                 # REST API
│       ├── v1/__init__.py   # API v1 endpoints
│       └── openapi.py       # OpenAPI spec
├── templates/               # 18 HTML templates
├── static/                  # CSS, JS assets
└── tests/                   # 12 test files

Performance Optimizations

| Optimization | Implementation |
| --- | --- |
| Parallel embedding generation | 20 workers for Gmail, configurable per connector |
| Batched vector writes | 4 parallel workers for Qdrant operations |
| Embedding cache | LRU cache (1000 entries, 1-hour TTL) |
| Lazy module loading | Heavy modules imported only when needed |
| Dashboard cache | 5-second TTL on expensive calculations |
| HTTP connection pooling | Session reuse across all connectors |
| Stall detection | Watchdog timer with auto-recovery |
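
An LRU cache with a TTL, as used for embeddings above, can be built on an ordered dictionary. This is a minimal single-threaded sketch, not the project's `embeddings.py`; the `TTLCache` class is hypothetical, and only the 1000-entry size and 1-hour TTL defaults come from the table.

```python
import time
from collections import OrderedDict


class TTLCache:
    """LRU cache whose entries also expire after ttl seconds."""

    def __init__(self, max_entries=1000, ttl=3600.0, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl = ttl
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self.ttl, value)
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict least recently used
```

Keyed on the input text (or a hash of it), such a cache avoids paying for the same embedding twice within the hour. With parallel workers it would additionally need a lock around `get` and `put`.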

Development Statistics

Timeline

| Milestone | Date |
| --- | --- |
| Project started | January 13, 2026 |
| Current state | January 25, 2026 |
| Total duration | 13 days |

Git Activity

| Metric | Count |
| --- | --- |
| Total commits | ~275 |
| Pull requests | 95 |
| Avg commits/day | ~21 |

Authorship

All code was generated by AI:

| Tool | Percentage |
| --- | --- |
| Replit Agent | ~77% |
| Claude Code | ~23% |

A human directed the work through prompts and reviewed pull requests, but wrote zero lines of code directly.

Dependencies

  • Production dependencies: 42
  • Development dependencies: 4

Key Challenges

MCP Server Reliability

Stable process supervision, health monitoring, and automatic recovery took multiple iterations to get right. The MCP protocol's transport mechanisms and authentication requirements had to be worked out through extensive debugging.

External API Integration

The 8 source APIs differ in timeouts, rate limits, and response formats, so each connector required custom retry logic and error handling.

Sync State Management

Sync state had to persist reliably and recover after interruptions. A watchdog timer detects stalled syncs and restarts them automatically.
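
Stall detection of this kind is often built on heartbeats: each sync reports progress, and a watchdog flags any sync that has gone quiet. The sketch below illustrates the idea only; the `SyncWatchdog` class, its method names, and the 300-second timeout are assumptions, not the project's values.

```python
import time


class SyncWatchdog:
    """Flags a sync as stalled when it reports no progress for too long."""

    def __init__(self, stall_timeout=300.0, clock=time.monotonic):
        self.stall_timeout = stall_timeout
        self._clock = clock
        self._last_progress = {}  # source name -> time of last heartbeat

    def heartbeat(self, source):
        """Called by a sync job each time it makes progress."""
        self._last_progress[source] = self._clock()

    def finished(self, source):
        """Stop watching a source whose sync has completed."""
        self._last_progress.pop(source, None)

    def stalled(self):
        """Return sources whose last heartbeat is older than stall_timeout."""
        now = self._clock()
        return sorted(
            name for name, seen in self._last_progress.items()
            if now - seen > self.stall_timeout
        )
```

A scheduler job could call `stalled()` periodically and restart whatever it returns, resuming from the last persisted sync state.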

Performance at Scale

With sequential processing, the initial sync was extremely slow. Parallel processing, batching, and connection pooling yielded a 10-50x speedup.
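
Batching combined with a small worker pool might look like the sketch below. The `write_batch` callable stands in for whatever actually writes to the vector store and is hypothetical, as is the batch size of 100; the pool size of 4 mirrors the figure given for Qdrant writes.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def write_in_batches(points, write_batch, batch_size=100, max_workers=4):
    """Send batches to write_batch (hypothetical writer) from a worker pool.

    write_batch must return the number of points it wrote. The total is
    returned; pool.map re-raises the first failure when results are read.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(write_batch, batched(points, batch_size)))
```

Fewer, larger requests cut per-request overhead, and running several at once hides network latency. Both effects compound, which is consistent with the large speedups reported here.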

Query Quality

Building an effective search pipeline required combining multiple techniques: semantic search, keyword matching, spell correction, intent classification, and cross-encoder reranking.


Summary

| Metric | Value |
| --- | --- |
| Development time | 13 days |
| Lines of code | ~43,800 |
| Data sources | 8 |
| API endpoints | 119 |
| Core modules | 36 |
| Commits | ~275 |
| Human-written code | 0 lines |

Last updated: January 25, 2026
