Data Harbor is a Technical & Coding Research Agent: a privacy-first, fully offline Retrieval-Augmented Generation (RAG) system designed to intelligently analyze user-uploaded PDF documents. It extracts text, performs context-aware chunking, generates semantic embeddings, and retrieves the most relevant content using Qdrant vector search before producing structured answers via a locally hosted Mistral LLM. Built with Streamlit, SentenceTransformers, Qdrant, and Ollama, the system runs entirely on a local GPU, ensuring zero API costs, no external data exposure, and high-performance inference. With transparent retrieval, confidence scoring, and a modular architecture, it provides accurate, explainable, and enterprise-ready technical document analysis.
The Technical & Coding Research Agent follows a modular Retrieval-Augmented Generation (RAG) architecture designed for efficient, private, and GPU-accelerated document analysis.
Pipeline Flow:
User Query
↓
Query Embedding Generation
↓
Vector Similarity Search (Top-K via Qdrant)
↓
Relevant Context Retrieval
↓
Prompt Construction
↓
Local LLM Inference (Mistral via Ollama)
↓
Structured Response + Confidence Score
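The vector similarity search stage above can be illustrated in plain Python. In production this work is delegated to Qdrant; the `cosine` and `top_k` functions below are illustrative names for a minimal sketch, not part of the actual codebase:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3):
    """Rank stored chunk embeddings against the query embedding
    and return the k best (index, score) pairs, highest first."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

Qdrant performs the same ranking over an indexed collection, which is why only the Top-K chunks (rather than the whole document) are passed on to prompt construction.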
The system is divided into independent modules for PDF extraction, chunking, embeddings, vector storage, LLM inference, and orchestration — ensuring maintainability and scalability.
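One way the modular layout described above could be organized (file names are hypothetical, shown only to make the module boundaries concrete):

```
data_harbor/
├── pdf_extractor.py   # text extraction from uploaded PDFs
├── chunker.py         # context-aware chunking with overlap
├── embedder.py        # SentenceTransformers embeddings
├── vector_store.py    # Qdrant storage and Top-K search
├── llm.py             # Mistral inference via Ollama
└── app.py             # Streamlit UI and orchestration
```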
The user uploads a PDF document.
Text is extracted and split into context-aware chunks with overlap.
Each chunk is converted into semantic embeddings.
Embeddings are stored in a local Qdrant vector database.
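The chunking step can be sketched as a simple character-window splitter. The function name and the default sizes below are assumptions for illustration, not the project's actual parameters:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping fixed-size windows.

    The overlap preserves context across chunk boundaries: a sentence
    cut off at the end of one window is still intact at the start of
    the next, so retrieval does not lose it.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]
```

Each resulting chunk is then embedded independently and upserted into Qdrant together with its source metadata.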
When a user asks a question:
The query is embedded.
Top-K relevant chunks are retrieved.
Retrieved context is injected into a structured prompt.
The local Mistral model generates a grounded, structured answer.
A confidence score is computed based on retrieval coverage.
The entire process runs locally with no external API calls.
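The confidence step in the flow above can be sketched as follows. The exact formula the system uses is not specified here; this sketch simply treats the mean similarity of the retrieved chunks, rescaled between an assumed floor and ceiling, as a proxy for retrieval coverage:

```python
def confidence_score(similarities: list[float],
                     floor: float = 0.2, ceiling: float = 0.8) -> float:
    """Map retrieval similarities onto a 0-1 confidence value.

    Low mean similarity (weak coverage) maps toward 0, high mean
    similarity toward 1; the floor/ceiling bounds are illustrative.
    """
    if not similarities:
        return 0.0  # nothing retrieved: no grounds for confidence
    mean = sum(similarities) / len(similarities)
    return round(min(max((mean - floor) / (ceiling - floor), 0.0), 1.0), 3)
```

Displaying this score alongside the retrieved chunks is what makes the answer auditable: a user can see both what the model was grounded on and how well it matched.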
Key Features:
Fully offline RAG pipeline
GPU-accelerated local inference
Semantic vector search (Top-K retrieval)
Context-aware chunking with overlap
Fast Mode (concise answers)
Deep Mode (detailed technical analysis)
Confidence score estimation
Transparent retrieval display
Modular and clean architecture
Planned Enhancements:
Multi-PDF indexing and cross-document querying
Conversational memory support
Similarity-weighted confidence scoring
Hybrid search (semantic + keyword)
Authentication and user management
Optional cloud deployment mode