AI video learning

Watch, ask AI, auto-generate notes

VideoLearn imports videos from Bilibili or your local files, transcribes speech in real time, and answers questions grounded in the current playback position. Each conversation is distilled into a structured note.

Try it now
1. Import a video

Paste a Bilibili link or upload a local video; download and processing run automatically.

2. AI transcription + Q&A

Speech is converted to text in real time; ask questions grounded in the video context.

3. Auto-generated notes

AI distills the key points and concepts from the conversation into a structured note.

Core capabilities

A complete learning loop from import to notes

🎥

Bilibili QR-code import

Paste a Bilibili link and authorize via QR scan on your phone. BV IDs and b23.tv short links are both supported. yt-dlp runs server-side to bypass IP-based anti-scraping.
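As a rough illustration of accepting both input forms, the sketch below classifies pasted text as a BV ID or a b23.tv short link. The function name, the return shape, and the "BV plus 10 alphanumeric characters" pattern are assumptions made for this example, not VideoLearn's actual code. Resolving a short link to its full URL needs a network redirect and is not shown.

```python
import re
from typing import Tuple
from urllib.parse import urlparse

# Assumed shape of a BV ID for illustration: "BV" followed by 10 alphanumerics.
BV_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def classify_bilibili_input(text: str) -> Tuple[str, str]:
    """Return (kind, value), where kind is 'bvid', 'short_link', or 'unknown'.

    Hypothetical helper: the name and return shape are not from the product.
    """
    text = text.strip()

    # A BV ID may appear bare or embedded in a full video URL.
    match = BV_PATTERN.search(text)
    if match:
        return ("bvid", match.group(0))

    # Otherwise check whether the host is the b23.tv short-link domain.
    url = text if "://" in text else "https://" + text
    if urlparse(url).netloc.lower() == "b23.tv":
        return ("short_link", text)

    return ("unknown", text)
```

A caller would typically pass a `bvid` result straight to the download step and resolve a `short_link` result first.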

🎤

AI speech transcription

Cloud-based recognition backed by Zhipu GLM-ASR, with audio sliced into 30-second segments. On-demand transcription fills in missing segments while you chat, so there is no waiting for the whole video to finish.
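One way to picture on-demand transcription: map the time range a question needs onto 30-second segment indices, then request only the indices that have not been transcribed yet. This is a minimal sketch under that assumption; the function names and the set of completed indices are hypothetical.

```python
import math
from typing import List, Set

SEGMENT_SECONDS = 30  # segment length stated in the feature description


def segments_for_range(start_s: float, end_s: float) -> List[int]:
    """Indices of the 30-second segments that overlap [start_s, end_s)."""
    start_s = max(0.0, start_s)
    if end_s <= start_s:
        return []
    first = int(start_s // SEGMENT_SECONDS)
    # ceil gives the exclusive upper bound, so an end exactly on a boundary
    # does not pull in the following segment.
    stop = math.ceil(end_s / SEGMENT_SECONDS)
    return list(range(first, stop))


def missing_segments(start_s: float, end_s: float, done: Set[int]) -> List[int]:
    """Segments in the range that still need transcription (hypothetical helper)."""
    return [i for i in segments_for_range(start_s, end_s) if i not in done]
```

For example, a question covering 45 s to 100 s touches segments 1 through 3; if segments 1 and 3 are already transcribed, only segment 2 needs to be sent for recognition.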

💬

Video-context Q&A

The assistant builds a context window that spans from your last question to the current playback position, then answers using the transcript text plus key frames from that span, so answers stay tied to what you are watching.
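The context window can be sketched as a filter over timestamped transcript segments. The data class, field names, and timestamp formatting below are assumptions for illustration; key-frame selection is omitted, and the sketch simply returns an empty context if playback is at or before the last question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TranscriptSegment:
    # Hypothetical record shape; the real storage schema is not described.
    start_s: float
    end_s: float
    text: str


def build_context(segments: List[TranscriptSegment],
                  last_question_s: float,
                  playback_s: float) -> str:
    """Join transcript text overlapping (last_question_s, playback_s)."""
    if playback_s <= last_question_s:
        return ""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        # Keep any segment that overlaps the window, even partially.
        if seg.end_s > last_question_s and seg.start_s < playback_s:
            minutes, seconds = divmod(int(seg.start_s), 60)
            lines.append(f"[{minutes:02d}:{seconds:02d}] {seg.text}")
    return "\n".join(lines)
```

Prefixing each line with its start time lets the model refer back to specific moments in its answer.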

🔖

Timeline bookmarks

Mark key moments with one click during playback. Ask questions tied to a specific bookmark to quickly locate and revisit important parts.

📝

AI-generated notes

From your conversation with the assistant, AI distills headings, key points, and core concepts into a structured JSON note. No manual cleanup — you finish learning with a complete note.
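To make "structured JSON note" concrete, here is one possible shape for such a note. The field names (`title`, `key_points`, `concepts`) are guesses based on the description above, not the product's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Concept:
    # Hypothetical fields for a core concept entry.
    name: str
    definition: str


@dataclass
class Note:
    # Hypothetical note schema: a heading, key points, and core concepts.
    title: str
    key_points: List[str] = field(default_factory=list)
    concepts: List[Concept] = field(default_factory=list)


def note_to_json(note: Note) -> str:
    """Serialize a note; ensure_ascii=False keeps non-ASCII text readable."""
    return json.dumps(asdict(note), ensure_ascii=False, indent=2)
```

A fixed schema like this makes it straightforward to validate the model's output before saving it.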

SSE streaming

Answers stream back in real time over Server-Sent Events, word by word. Markdown rendering and code highlighting keep the interaction smooth and natural.
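For readers unfamiliar with the wire format, each Server-Sent Event is plain text: optional `event:` and one or more `data:` lines, terminated by a blank line. The helper below is a minimal sketch of that framing; the function name and the `token` event name are assumptions, and how the backend actually emits events is not described here.

```python
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Frame one message in the Server-Sent Events text format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # Multi-line payloads must be sent as one "data:" line per line of text.
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    # A blank line marks the end of the event.
    return "\n".join(lines) + "\n\n"
```

A streaming endpoint would yield one such frame per generated chunk, and the browser reassembles them as they arrive.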

Who it's for

Every video session ends with something to keep

🎓 Online course learning

Import a Bilibili tutorial or open lecture; ask "why is this code written this way" or "how is this formula derived" while you watch, and get chapter-level notes automatically.

📚 Academic lecture capture

Upload a recorded lecture or meeting; the assistant transcribes it in full. Mark the important moments with bookmarks, ask questions to go deeper, and finish with a complete note.

💻 Tech-talk review

For team-internal tech talks or conference talks, the assistant extracts the core approach, architecture, and technical details — no need to re-watch repeatedly.

🏫 Teaching aid

Teachers upload class recordings, and the assistant helps check which topics were covered. Students use Q&A to fill in whatever they missed in class, for personalized learning on demand.

Tech stack

A modern full-stack architecture — async and performant

React 19 + TypeScript: frontend SPA with Zustand for state and TanStack Query for server cache
FastAPI + Celery: async REST API + SSE streaming; Celery handles video downloads and transcoding
PostgreSQL + MongoDB: hybrid storage architecture with seven relational tables + three document collections
FFmpeg + yt-dlp: video pipeline covering audio extraction, thumbnails, and scene keyframe detection
Zhipu GLM-4 + ASR: cloud speech recognition + streaming LLM chat; on-demand transcription cuts wait time
JWT + Docker: HttpOnly cookie auth; one-command Docker Compose deployment

Video learning, reimagined with AI

Import a video, talk to AI, generate notes automatically. Try it free with the demo account: demo / demo123.

Visit VideoLearn
