LectureLens: Turn Any Lecture Video URL Into Smart, Searchable Chapters — No Download Needed

Semantic video chaptering from a YouTube, news, Instagram, or other video URL: processes thousands of frames in minutes using YOLO + CLIP + PaddleOCR, streaming the video instead of saving it to disk.

Description

I built LectureLens because I was tired of wasting hours scrubbing through long lectures. Just paste any video link and get chapter timestamps plus a fully searchable database in minutes, all without downloading the video. I have also used it on long movies saved locally on my device to find specific scenes, such as an airplane crash or a car chase.

Project Inspiration

I’m a 3rd-year student, and I was fed up with wasting hours scrubbing through 3-4 hour university lectures on YouTube or recorded classes. The professor would spend 15 minutes explaining one concept, then suddenly write something crucial on the board, and I had no idea where it was. Existing tools like YouTube auto-captions or Whisper-based summarizers only listen to audio and completely miss visual content like diagrams, board text, and slide changes. That frustration became the spark for LectureLens. I wanted a tool that actually watches the video like a human would: intelligently, efficiently, and without forcing me to download gigabytes of data. So I built it for myself first, and now I’m sharing it with everyone who hates long video scrubbing. It also generates PPT slides and notes, with equations and graphs in LaTeX, that can be exported to Notion.

What It Does

LectureLens takes any video URL or file (YouTube, a private lecture recording, a Drive link, a local video, an Instagram video, etc.) and turns it into clean, meaningful chapters with timestamps and an explanation for each frame it has processed. It detects board text, diagrams, scene changes, and important moments using computer vision, then collapses boring repetitive parts (like 10 minutes of talking) into one smart timestamp while highlighting every new concept or slide. At the end, it builds a fully searchable vector database: you can type “explain torque” or “show the circuit diagram” and jump straight to the exact moment. No full download, no storage bloat, and results stream live as it processes. It’s built especially for technical lectures where 70% of the value is visual, not just spoken words. It also generates video-specific notes and quizzes.
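The "collapse repetitive parts into one timestamp" idea can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the `Chapter` class, the `collapse_into_chapters` function, and the use of a plain text label per frame are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Chapter:
    start: float  # seconds where the chapter begins
    end: float    # timestamp of the last frame folded into it
    label: str    # hypothetical per-frame description


def collapse_into_chapters(frames):
    """Merge consecutive frames that share a label into one chapter.

    `frames` is a list of (timestamp_seconds, label) pairs sorted by time.
    """
    chapters = []
    for ts, label in frames:
        if chapters and chapters[-1].label == label:
            # Same content as the previous frame: extend the open chapter.
            chapters[-1].end = ts
        else:
            # New content: open a new chapter at this timestamp.
            chapters.append(Chapter(start=ts, end=ts, label=label))
    return chapters


# Hypothetical sampled frames: ten minutes of talking, then a diagram.
frames = [
    (0, "talking"),
    (120, "talking"),
    (600, "torque diagram"),
    (660, "torque diagram"),
    (900, "talking"),
]
chapters = collapse_into_chapters(frames)
```

In this sketch the five sampled frames become three chapters, starting at 0, 600, and 900 seconds. A real pipeline would likely compare embeddings or OCR text rather than exact labels.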

How I Built It

I started with the core idea of sparse sampling — why process every single frame when most of a lecture is static? I combined yt-dlp for streaming with FFmpeg’s I-frame extraction to grab only keyframes super fast. Then I layered YOLOv8 for object/scene detection, PaddleOCR for board text (much better than Tesseract on handwritten content), and CLIP embeddings for semantic understanding. The magic is the real-time deduplication engine using Redis and cosine similarity — it decides on the fly whether a new frame is worth keeping. Everything runs asynchronously with Celery so the results appear live in the browser. I built the entire thing in under 3 weeks during nights after college.
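To make the deduplication step concrete, here is a minimal sketch of deciding whether a new frame is worth keeping based on cosine similarity. It is only an illustration under assumptions: the function names and the `0.9` threshold are hypothetical, the embeddings are plain lists of floats, and it compares each frame only with the last kept one in memory instead of using Redis.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def keep_novel_frames(embeddings, threshold=0.9):
    """Return indices of frames that differ enough from the last kept frame.

    `threshold` is an assumed value; a frame is dropped when its similarity
    to the last kept frame is at or above it.
    """
    kept = []
    last = None
    for i, emb in enumerate(embeddings):
        if last is None or cosine_similarity(last, emb) < threshold:
            kept.append(i)
            last = emb
    return kept


# Toy 2-D "embeddings": frames 0 and 1 are near-duplicates, as are 2 and 3.
embeddings = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.0, 1.0]]
novel = keep_novel_frames(embeddings)
```

Here `novel` contains the indices `[0, 2]`, so only one frame per near-duplicate run survives. With real CLIP vectors the threshold would need tuning on actual lecture footage.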

Tech Stack

  • Backend & Orchestration: FastAPI + Celery + Redis (for async & live streaming)

  • Video Streaming & Frame Extraction: yt-dlp + FFmpeg (I-frame only) + OpenCV

  • Vision Models: YOLOv8 (Ultralytics) + PaddleOCR (best for lecture boards)

  • Semantic Understanding: CLIP (ViT-B/32) + Sentence-Transformers embeddings

  • Search & Storage: pgvector + Reciprocal Rank Fusion (hybrid BM25 + vector search)
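For readers unfamiliar with Reciprocal Rank Fusion, the sketch below shows the general technique of merging a keyword (BM25) ranking with a vector-search ranking. It is not taken from the LectureLens codebase: the function name and the chapter IDs are hypothetical, and `k=60` is the constant commonly used for RRF, assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into one list.

    Each item scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1, so items ranked high in several lists win.
    """
    scores = {}
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] = scores.get(item, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda item: scores[item], reverse=True)


# Hypothetical chapter IDs returned by each retriever, best match first.
bm25_hits = ["c12", "c03", "c07"]
vector_hits = ["c03", "c20", "c12"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

In this example `fused` is `["c03", "c12", "c20", "c07"]`: the two chapters found by both retrievers rise to the top. RRF uses only ranks, so the BM25 and vector scores never need to be put on a common scale.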

Some Code From Project

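The speed comes mainly from this FFmpeg command, which keeps only I-frames. The -skip_frame nokey input option lets the decoder skip non-keyframes instead of decoding everything, and -f bestvideo makes yt-dlp print a single stream URL:

ffmpeg -skip_frame nokey -i "$(yt-dlp -f bestvideo -g "$VIDEO_URL")" -vf "select='eq(pict_type,I)'" -vsync vfr -frame_pts true keyframes/%04d.jpg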
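Once the keyframes are on disk, each file name has to be mapped back to a position in the video. The sketch below shows one way to do that, assuming the number in a name like `keyframes/0250.jpg` counts frames at a constant source frame rate. That assumption should be checked against your FFmpeg version and input, and both helper names are hypothetical.

```python
import re
from pathlib import Path


def frame_number(path):
    """Extract the integer from a keyframe name like 'keyframes/0250.jpg'."""
    match = re.fullmatch(r"(\d+)", Path(path).stem)
    if match is None:
        raise ValueError(f"unexpected keyframe name: {path}")
    return int(match.group(1))


def to_timestamp(frame_idx, fps):
    """Convert a frame index to HH:MM:SS, assuming a constant frame rate."""
    total = int(frame_idx / fps)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


# Hypothetical example: frame 250 of a 25 fps video is 10 seconds in.
stamp = to_timestamp(frame_number("keyframes/0250.jpg"), fps=25.0)
```

For variable-frame-rate sources this approximation drifts. Reading presentation timestamps directly, for example with ffprobe, would be more robust.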

Achievements I’m Proud Of

  • Built a complete end-to-end system that actually works on real lecture videos.

  • Made it fully open-source and community-friendly so other students can extend it.

  • Created something that genuinely saves hours of study time — I’ve already used it for my own semester prep.

Try It Out

GitHub Repo: https://github.com/adityagupta9695/LectureLens
