Progress Log

Documenting our thinking as we build the knowledge base

Current Status

Phase: Wan KB COMPLETE - Two models live

Knowledge Bases Live:

  • LTX 2 (kb/ltx2/)
  • Wan (kb/wan/)

Next: Consider HunyuanVideo, FLUX, or CogVideoX extraction.

February 3, 2026 - Night

Wan Knowledge Base COMPLETE

Built comprehensive static HTML knowledge base for the Wan ecosystem at kb/wan/.

KB Structure (10 sections, ~1,200 lines):

  • Overview: Wan family explanation (2.1, 2.2, VACE, Fun)
  • Choosing a Model: Decision tree + 2.1 vs 2.2 comparison table
  • Hardware: VRAM requirements for all 10+ models
  • Generation Modes: T2V, I2V, FLF, S2V with recommended settings
  • Control Methods: VACE, Fun Control, Camera (ReCamMaster), Motion (ATI, WanAnimate)
  • Character & Likeness: Phantom, MAGREF, EchoShot, Lynx
  • Lip-Sync & Audio: HuMo, MultiTalk, InfiniteTalk, FantasyTalking
  • Speed & Optimization: LightX2V, CausVid, Wan2GP, TeaCache
  • Training: LoRA tips, frameworks
  • Troubleshooting: 15+ common errors with solutions
Content synthesis approach: Rather than dumping the knowledge extracted from all 316K messages, we curated the most useful items from the Discord extractions and combined them with technical specs from external docs (GitHub READMEs, tutorials). The result is a readable reference guide rather than a data dump.

February 3, 2026 - Late

Wan enrichment COMPLETE - Ready for static KB

Gathered all source materials for the Wan knowledge base. Now ready to synthesize into static HTML pages.

External sources gathered:

  • 60+ URLs from official repos, community projects, ComfyUI docs
  • Technical content fetched: VRAM requirements, features, installation steps
  • Covers: Phantom, MAGREF, HuMo, MultiTalk, LightX2V, CausVid, ReCamMaster, VideoX-Fun, and more

#updates channel extracted:

  • 1,987 curated posts from @pom (Aug 2023 - Jan 2026)
  • 946 knowledge items extracted ($1.44)
  • High-value editorial content: 214 resources, 187 community creations, 142 workflows
  • Covers full history: AnimateDiff → SVD → SDXL → Wan → LTX → FLUX
Key insight: The #updates channel has a very different profile than raw chat - it's curated highlights, with 83% of posts having 10+ reactions. It's high on resources and community creations, low on troubleshooting.

February 3, 2026 - Evening

Wan extraction 100% COMPLETE

Completed the 5 previously failed months (Jul-Nov 2025). Full Wan ecosystem now extracted.

Final 5 months extracted:

  • July 2025: 32,584 msgs → $6.42
  • August 2025: 41,050 msgs → $7.68
  • September 2025: 25,790 msgs → $4.95
  • October 2025: 18,574 msgs → $3.45
  • November 2025: 10,533 msgs → $2.10

Total Wan extraction:

  • ~316,000 messages across 5 channels
  • 11 months of wan_chatter (Feb 2025 - Feb 2026)
  • Full runs of wan_gens, wan_training, wan_comfyui, wan_resources
  • Estimated cost: ~$65-70 total
Next steps: (1) Combine all extractions into a NotebookLM-ready file, (2) add external sources (official docs, blog posts, #updates channel), (3) synthesize into a cohesive static KB with better attribution.

February 3, 2026 - Earlier

Wan extraction 90% complete - Pipeline insights

Ran full Wan ecosystem extraction over ~5.5 hours. Most channels complete, 5 monthly chunks failed due to network errors.

Completed extractions (~3.5MB total):

  • wan_chatter: Feb-Jun 2025, Dec 2025-Feb 2026 (8 months)
  • wan_gens: 487KB - gallery/showcase content
  • wan_training: 457KB - LoRA training knowledge
  • wan_comfyui: 233KB - workflow implementation
  • wan_resources: 206KB - curated resources

Failed (need retry): wan_chatter Jul, Aug, Sep, Oct, Nov 2025 - network connection errors mid-extraction.

Pipeline insight - Extraction is just step 1: The raw extraction produces thousands of fragmented knowledge items. For a good static KB, we need additional steps: (1) enrich with external sources (official docs, blog posts), (2) synthesize/deduplicate with another LLM pass, (3) improve attribution format from "— Username" to "— Discord, Jan 2026" with links where possible.
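As a sketch of step (3), the attribution rewrite could be as simple as replacing the trailing username with a source-and-date string. The function name, parameters, and default date below are our own assumptions, not the pipeline's actual code:

```python
def reformat_attribution(line, month_year="Jan 2026", link=None):
    """Rewrite a trailing '— Username' attribution into '— Discord, Jan 2026'.

    `month_year` and `link` are hypothetical parameters; in the real
    pipeline they would come from the message's metadata.
    """
    if "—" not in line:
        return line  # nothing to rewrite
    body, _username = line.rsplit("—", 1)
    attribution = f"— Discord, {month_year}"
    if link:
        attribution += f" ({link})"
    return f"{body.rstrip()} {attribution}"
```

For example, `reformat_attribution("Use shift 5 for distilled LoRAs — Kijai")` returns `"Use shift 5 for distilled LoRAs — Discord, Jan 2026"`.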

Updated project plan: Added Step 3 (External Sources) and Step 4 (Synthesis) to the pipeline. See docs/project-plan.md.

Actual cost: ~$45 for Wan (vs $35 estimate) - more messages than expected (316K vs 200K).

February 2, 2026

LTX 2 KB validated - Project plan complete

Major milestone: LTX 2 knowledge base is functional end-to-end. NotebookLM tested and "works pretty well" per user feedback. Static HTML KB live.

What we built:

  • NotebookLM upload: Combined 8 extraction files into for_notebooklm/ltx2/2026-02-01/ltx2_january_combined.md (695KB, organized with section headers)
  • Static HTML KB: kb/ltx2/ - Comprehensive page with sticky TOC, collapsible sections, hardware tables, settings tables, troubleshooting guides
  • KB index: kb/index.html - Model selector page (LTX2 active, Wan/FLUX/others coming soon)
  • Project plan: docs/project-plan.md - Full scope, Wan ecosystem breakdown, cost estimates, 4-week timeline
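The NotebookLM upload step amounts to concatenating the extraction files under one title with a section header per source. A minimal sketch, assuming the inputs arrive as (name, text) pairs rather than file paths:

```python
def combine_sections(sections, title="LTX 2 - Combined Extractions"):
    """Merge (name, markdown_text) pairs into one document with
    a '## Source: ...' header per extraction file."""
    parts = [f"# {title}\n"]
    for name, text in sections:
        parts.append(f"\n## Source: {name}\n\n{text.strip()}\n")
    return "".join(parts)
```

The section headers matter: they let NotebookLM cite which channel an answer came from.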
Key insight - NotebookLM vs Static KB: These serve complementary roles. NotebookLM excels at specific Q&A ("What VRAM for 720p 97 frames?"). Static KB excels at browsing, decision trees ("Should I use Wan 2.1 or 2.2?"), and rich media (embedded videos, downloadable workflows). Build both.

Wan ecosystem is complex: Asked user to test NotebookLM with Wan questions. Learned it's not one model but an entire ecosystem:

  • Generations: Wan 2.1 (standard DiT) vs 2.2 (MoE architecture with High/Low noise split)
  • Control systems: VACE (full video control, inpainting, style transfer) vs Fun Control (Canny/Depth/Pose inputs)
  • Character models: Phantom (T2V consistency), MAGREF (I2V likeness), HuMo (audio-reactive)
  • Lip-sync: MultiTalk, InfiniteTalk, HuMo S2V
  • Optimization LoRAs: LightX2V, CausVid, Pusa
  • Implementations: WanVideoWrapper (Kijai) vs ComfyUI Native

Media handling solved: Discord CDN URLs expire, but @pom built a refresh API endpoint. Will use that initially; migrate to Cloudflare R2 if reliability issues arise.
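For deciding when a refresh is needed: signed Discord CDN attachment URLs carry an `ex` query parameter holding a hex Unix expiry timestamp. A small check along these lines could gate calls to the refresh endpoint (the endpoint itself isn't sketched, since we only know it exists):

```python
import time
from urllib.parse import parse_qs, urlparse

def cdn_url_expired(url, now=None):
    """Return True if a Discord CDN URL's `ex` parameter (a hex Unix
    timestamp) has passed. URLs without `ex` are treated as non-expiring."""
    params = parse_qs(urlparse(url).query)
    if "ex" not in params:
        return False
    expiry = int(params["ex"][0], 16)
    return (now if now is not None else time.time()) >= expiry
```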

Cost estimates for full project:

  • LTX 2: $7.65 (complete)
  • Wan ecosystem (~200K msgs): ~$35
  • FLUX (~80K msgs): ~$14
  • All models combined (~800K msgs): ~$140

Next: Start Wan extraction - scope channels, estimate costs, begin processing.

February 1, 2026 - Evening

LTX 2 January extraction COMPLETE

Processed all LTX-related channels for January 2026. Total: ~44,500 messages across 4 channels, extracting ~4,345 knowledge items.

Channel                    Messages    Items    Cost
ltx_chatter (full month)     34,751    3,053   $5.52
ltx_training                  2,850      358   $0.59
ltx_gens                      4,100      554   $0.83
ltx_resources                 2,891      380   $0.71
Total                       ~44,500   ~4,345   $7.65

Output: 8 markdown files in data/ directory (~19,000 lines total), ready for NotebookLM. Also JSON versions for structured use.

Lessons learned:

  • Running extractions in parallel hits rate limits (30K tokens/min); run sequentially for reliability.
  • Cost tracking was close to accurate: actual $7.65 vs the estimated $5-6. Each ~400-message chunk costs ~$0.15-0.20.
  • The forum channel (ltx_resources) works with the time-chunked approach - it still captures valuable content even without thread structure.
Insight: We now have comprehensive LTX 2 knowledge ready for NotebookLM testing. The extracted data includes technical discoveries, troubleshooting guides, hardware requirements, limitations, workflows, and community creations - everything needed for a useful knowledge base.

Next: Combine files into single comprehensive document, test in NotebookLM, then build static HTML knowledge base.

February 1, 2026 - Afternoon

LTX 2 chunked extraction working

Built extract_chat_chunks.py - processes chat in time-ordered chunks to capture ALL knowledge, not just Q&A pairs.

Why chunked approach: Chat contains more than Q&A - discoveries, comparisons, tips shared proactively, hardware benchmarks, links to resources. Processing in 400-message chunks preserves conversation context.
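The chunking itself could look like the sketch below. The small overlap between adjacent chunks is our own assumption (so a conversation that straddles a boundary lands whole in at least one chunk); we don't know whether extract_chat_chunks.py actually overlaps.

```python
def chunk_messages(messages, chunk_size=400, overlap=20):
    """Split a time-ordered message list into fixed-size chunks for
    LLM extraction, with a small overlap between adjacent chunks."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(messages), step):
        chunks.append(messages[start:start + chunk_size])
        if start + chunk_size >= len(messages):
            break
    return chunks
```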

12 extraction categories:

  • Original 8: discoveries, troubleshooting, comparisons, tips, news, workflows, settings, concepts
  • New 4: resources (links to models/repos), limitations (what doesn't work), hardware (VRAM/RAM requirements), community_creations (LoRAs/nodes people made)

Prompt improvements (learned from pom's code):

  • Added accuracy guidelines: "Do NOT jump to conclusions unsupported by evidence"
  • Use reactions as quality signal (marked with ★) but don't over-index
  • Explicit skip instructions for jokes, casual chat, unsubstantiated claims
Insight: The new categories capture critical info. The Jan 7 extraction found 46 limitations (things LTX 2 can't do well), 44 hardware requirements (specific VRAM/RAM figures for different GPUs), and 44 resource links (HuggingFace models, GitHub repos).

Sample extractions:

  • Limitation: "Can't do people turning around - gets back-to-front mutation horrors"
  • Hardware: "3090: safe at 81 frames, OOMs at 121 frames"
  • Resource: LTX official workflows link, SageAttention installation guide

Output: data/ltx_chatter_20260106_knowledge.md and data/ltx_chatter_20260107_knowledge.md - clean markdown ready for NotebookLM.

February 1, 2026 - Morning

Prototype extraction successful

Built and tested extraction scripts for both forum threads and chat Q&A. Results are high quality.

Forum thread extraction (4 threads tested):

  • FlippinRad Motion Morph LoRA (394 msgs, 80 reactions) - Extracted LoRA details, requirements, 6 issues with solutions, 8 contributors
  • Wan HuMo SVI Pro v5 Workflow (935 msgs) - Lip-sync workflow with HuMo, settings, 5 issues/solutions
  • SYSTMS Transition Workflow (140 msgs) - VACE transitions, shift settings, 6 troubleshooting entries
  • Creative Video Upscaler (406 msgs) - Multi-pass 480p→1080p upscaling, AnimateDiff techniques

Chat Q&A extraction (99 pairs from wan_chatter):

  • 3 troubleshooting fixes (TeaCache compatibility, VACE frame errors, direction mask inversion)
  • 5 tips (InfiniteTalk recommendation, VACE 14B preference, Qwen LoRAs for likeness)
  • 3 settings recommendations (speed/quality optimization compatibility chart, Krea LoRA settings)
  • 4 concept explanations (direction masks, self forcing, block-based LoRA training)
Insight: Forum threads yield richer, more structured knowledge (~$0.05/thread with Sonnet). Chat Q&A is thinner but captures troubleshooting that doesn't appear in forum posts. Both are valuable.

Scripts created:

  • scripts/extract_forum_thread.py - Process a single forum thread
  • scripts/extract_chat_qa.py - Extract Q&A pairs from chatter channels

Output files:

  • data/thread_*_knowledge.json - Extracted forum knowledge
  • data/chat_qa_*.json - Extracted Q&A knowledge

January 30, 2026 - Morning

Discovered forum structure & planned cost-effective extraction

Key realization: Resources channels use Discord's forum feature, not regular chat. The thread_id field identifies which "post" each message belongs to.

Actual forum post counts:

  • wan_resources: 50 posts (not 6,600 messages - those are replies within posts)
  • ltx_resources: 45 posts
  • resources: 114 posts
  • Total: ~209 curated workflow/resource posts
Insight: The initial query showed 6,605 "messages without reference_id", which looked like 6,605 posts. But these are actually all the messages across ~50 forum threads - each forum post averages ~200 comments/replies. The thread_id field (not reference_id) is what groups forum messages.
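In SQL terms, counting forum posts means grouping on thread_id rather than counting messages with a null reference_id. A runnable sketch against a toy schema (the real discord_messages table certainly has more columns than this):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE discord_messages (
    id INTEGER PRIMARY KEY, channel TEXT,
    thread_id INTEGER, reference_id INTEGER, content TEXT)""")
conn.executemany(
    "INSERT INTO discord_messages VALUES (?, ?, ?, ?, ?)",
    [(1, "wan_resources", 100, None, "Post: HuMo workflow"),
     (2, "wan_resources", 100, 1, "reply"),
     (3, "wan_resources", 100, 1, "another reply"),
     (4, "wan_resources", 200, None, "Post: VACE transitions"),
     (5, "wan_resources", 200, 4, "reply")])

# One row per forum *post*, with its reply volume.
rows = conn.execute("""
    SELECT thread_id, COUNT(*) AS msgs
    FROM discord_messages
    WHERE channel = 'wan_resources'
    GROUP BY thread_id
    ORDER BY msgs DESC""").fetchall()
```

Here `rows` is `[(100, 3), (200, 2)]`: two posts, not five "messages".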

LLM cost analysis:

  • Processing all 1M messages naively: ~$1,500+ with Opus (too expensive)
  • Smart filtering to ~100K high-value messages: ~$60-110 with Opus
  • Same with Sonnet: ~$15-25

High-value subsets identified:

  • Messages with 3+ reactions: 42K (community-validated)
  • Messages with attachments: 155K (workflows, examples)
  • Long messages (>300 chars): ~21K (substantive content)
  • Kijai's messages: 104K (expert knowledge)
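Together these subsets suggest a simple OR-filter. The thresholds come from the numbers above; the field names are assumptions about the message schema:

```python
def is_high_value(msg, experts=("Kijai",)):
    """True if a message falls into any high-value subset:
    community-validated, has attachments, substantive length,
    or authored by a known expert."""
    return (msg.get("reaction_count", 0) >= 3
            or msg.get("attachment_count", 0) > 0
            or len(msg.get("content", "")) > 300
            or msg.get("author") in experts)
```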

Decision: Don't trust the existing daily summaries. They cover only 87 days, and earlier ones may have errors (GPT verification was only added in late January 2026). We'll regenerate from raw messages to cover the full 2.5-year archive.

January 29, 2026 - Late Evening

Found the summary generation source code

Nathan found the code that generates daily summaries: brain-of-bdnc/news_summary.py

How summaries are generated:

  • Model: Claude Sonnet 4.5 for generation, GPT-5.2 with "high reasoning effort" for verification
  • Chunking: 1000 messages at a time, then combined to top 3-5 items
  • Verification checks: Attribution errors, unsupported claims, logical leaps, invented details

The prompt explicitly prioritizes (in order):

  1. Original creations by community members (nodes, workflows, tools, LoRAs, scripts)
  2. Notable achievements and demonstrations
  3. High-engagement content (reactions/comments signal community interest)
  4. New features people are excited about
  5. Shared workflows with examples

Key prompt instructions:

  • "Do NOT jump to conclusions unsupported by evidence"
  • "Only report what is explicitly stated or clearly demonstrated"
  • "Distinguish between facts, opinions, and speculation"
  • "Always credit creators with bold usernames"
Insight: The summaries ARE capturing reference knowledge - but framed as "news". When someone discovers "FP32 compute improves quality", it's captured as a news item even though it's durable reference knowledge. For a KB, we need to re-process to extract the timeless content and organize by topic rather than date.

Important caveat: Peter (@pom) noted that the GPT-5.2 verification step was only added this week. Summaries before ~late January 2026 may contain inaccuracies (attribution errors, unsupported claims, etc.). This adds even more reason to re-process everything rather than using summaries as-is.

Proposed KB approach:

  1. Re-process summaries - Extract reference content, strip the news framing
  2. Cross-reference with raw Q&A - Summaries miss troubleshooting that happens in back-and-forth chat
  3. Organize by topic - All Z-Image tips together, all Wan troubleshooting together, not scattered across dates

January 29, 2026 - Evening

Daily summaries contain more than "news"

Re-examined daily summaries after initially thinking they were mostly "news" (model releases, community activity). Found they actually contain significant reference knowledge:

  • Technical settings: FP32 vs BF16 compute flags, sampler recommendations, resolution tables
  • Workflow techniques: Dual-model approaches (Base + Turbo), step counts for different effects
  • Training knowledge: LoRA strength conversions, captioning best practices, specific commit versions
  • Troubleshooting: SageAttention breaking Z-Image Base, facial changes during relighting
Insight: Daily summaries may be better than raw chat for many KB use cases - they're already synthesized, structured, and attributed. The "news" framing was too narrow.

January 29, 2026 - Afternoon

Synthesized first troubleshooting guide from raw chat

Took the extracted Q&A data from wan_chatter and synthesized it into a structured troubleshooting guide. Created both JSON and Markdown outputs.

Results: 14 troubleshooting entries, 6 tips, 5 FAQs. Examples:

  • mat1/mat2 CLIP loader fix: pip install transformers==4.48.0
  • NAG attention error: disable WanVideo Apply NAG node
  • Sampler preview missing: check ComfyUI settings, not Manager
  • Shift values: use 5 for distilled LoRAs, increase for higher res

Files: data/troubleshooting_wan_chatter.json, data/troubleshooting_wan_chatter.md

January 29, 2026 - Afternoon

Extracted reference knowledge from raw chat

Built scripts/extract_reference_knowledge.py to find Q&A patterns, errors, and solutions buried in Discord messages. Ran against wan_chatter channel (50K messages).

Results surprised us:

  • 11,544 potential questions (~23% of messages match question patterns)
  • 5,605 Q&A reply pairs (using reference_id to track who replied to what)
  • 662 messages mentioning fixes/solutions
  • 793 error-related discussions
Insight: There's substantial reference knowledge in raw chat that doesn't appear in daily summaries. The reference_id field is key - it lets us connect questions to their answers.
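The pairing logic is essentially a dictionary lookup on reference_id. The question heuristic below is a simplified stand-in for the script's actual patterns, and the field names are assumptions:

```python
import re

QUESTION = re.compile(r"\?|how do i|anyone know|why does", re.IGNORECASE)

def pair_qa(messages):
    """Pair replies with the questions they answer via reference_id."""
    by_id = {m["id"]: m for m in messages}
    pairs = []
    for m in messages:
        parent = by_id.get(m.get("reference_id"))
        if parent and QUESTION.search(parent["content"]):
            pairs.append((parent["content"], m["content"]))
    return pairs
```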

January 29, 2026 - Morning

Discussion: What makes a good knowledge base?

Nathan shared that NotebookLM (chat-with-docs) has been the most useful KB approach he's tried. This led to thinking about what makes knowledge useful:

  • "News" vs "Reference" - Daily summaries capture what happened (news), but users often need how-to information (reference)
  • Update frequency - AI video/image space moves fast. Content becomes outdated quickly.
  • Audience - Primarily technical users who want to get unstuck or learn techniques

Initial hypothesis: Daily summaries = news, raw chat = buried reference knowledge. (This hypothesis was later revised - see evening entry.)

January 28, 2026

Analyzed top contributors

Built script to find who contributes most to the community. Key finding: Kijai has sent 103,556 messages - about 10% of all messages in the database. Clear power-law distribution.

Top 5: Kijai (103K), pom (34K), Juampab12 (26K), spacepxl (21K), Juan Gea (20K)
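The ranking itself is a one-liner with collections.Counter (the author field name is an assumption about the message rows):

```python
from collections import Counter

def top_contributors(messages, n=5):
    """Rank authors by message count, descending."""
    return Counter(m["author"] for m in messages).most_common(n)
```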

Created stats.html to display top 20 contributors with their most active channels.

January 28, 2026

Database gap filled!

Peter (@pom) filled in the 12-month data gap (Feb 2024 - Jan 2025). Database now has:

  • 1,046,692 messages (up from 727K)
  • 6,624 members (up from 4,477)
  • 272,750 messages recovered from the gap period

This data includes FLUX release, Stable Diffusion 3, CogVideoX, and early HunyuanVideo discussions.

January 28, 2026

Project started

Goal: Transform the Banodoco Discord database into a useful knowledge base about open source AI tools (video generation, image generation, training, ComfyUI, etc.)

Initial exploration revealed:

  • 4 core tables: discord_messages, discord_members, discord_channels, daily_summaries
  • 29 months of data (Aug 2023 - Jan 2026)
  • AI-generated daily summaries with structured JSON, attribution, and media links
  • A 12-month gap in the data (Feb 2024 - Jan 2025) - later filled

Created database.html to visualize the database structure and coverage.