chore: update 6 file(s)

2025-12-17 22:30:41 +01:00
parent a53c0e2902
commit 4343b7a5a2
6 changed files with 1122 additions and 220 deletions
--- a/QUICK_START.md
+++ b/QUICK_START.md
@@ -1,105 +1,156 @@
 # Quick Start Guide
-## Dutch Language (Nederlands)
+## 1. Setup Audio Devices
 ### Basic Dutch Transcription
 ```bash
-./RUN_DUTCH.sh
+# List available audio devices
 ./run_transcribe.sh --list-devices
 ```
 - ✅ GPU-accelerated (RTX 4060 Ti)
 - ✅ Sentence extraction (complete sentences)
 - ✅ Base model (good balance of speed and accuracy)
-### Dutch with LLM Analysis
+Find your:
 - **Microphone** - Your input device (e.g., "USB Microphone")
 - **Monitor** - Speaker capture device (e.g., "Monitor of Built-in Audio")
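
Device names are matched by case-insensitive substring, so a short fragment such as "USB" is usually enough. A minimal sketch of that matching follows; the device list is a hypothetical stand-in for what `sounddevice.query_devices()` would return, and `find_input_device` is an illustrative name:

```python
# Hypothetical device list; real entries come from sounddevice.query_devices().
DEVICES = [
    {"name": "Built-in Audio Analog Stereo", "max_input_channels": 0},
    {"name": "USB Microphone", "max_input_channels": 1},
    {"name": "Monitor of Built-in Audio", "max_input_channels": 2},
]


def find_input_device(devices, fragment):
    """Return the index of the first input-capable device whose name contains fragment."""
    needle = fragment.lower()
    for index, dev in enumerate(devices):
        # Skip output-only devices: they cannot be opened as a capture stream.
        if needle in dev["name"].lower() and dev["max_input_channels"] > 0:
            return index
    raise RuntimeError(f"Device '{fragment}' not found")


print(find_input_device(DEVICES, "usb mic"))  # index of the USB microphone
print(find_input_device(DEVICES, "monitor"))  # index of the monitor source
```

Note that a fragment matching several devices resolves to the first input-capable one, so a more specific fragment is safer on machines with many devices.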
 ---
 ## 2. Basic Usage
 ### Simple Transcription
 ```bash
-./RUN_DUTCH_LLM.sh
+# Auto-detect devices
 ./run_transcribe.sh --model medium --language en
 # Specify devices
 ./run_transcribe.sh --mic "USB Mic" --monitor "Monitor"
 ```
 - ✅ All features from basic version
 - ✅ Fact-checking of statements
 - ✅ Automatic question generation
 - Uses llama3.2:latest model
-### Save to File
+### With File Output
 ```bash
-./RUN_DUTCH.sh --output transcript.txt
+./run_transcribe.sh --model medium --language en --output transcript.txt
-./RUN_DUTCH_LLM.sh --output enriched.txt
+```
 ### With LLM Analysis
 ```bash
 ./run_transcribe.sh --model medium --enable-llm --output enriched.txt
 ```
 ---
-## English Language
+## 3. Language Examples
-### Basic English Transcription
+### Dutch (Nederlands)
 ```bash
-./RUN_GPU.sh
+./run_transcribe.sh --model medium --language nl --enable-llm
 ```
 ### English with LLM
 ```bash
 ./RUN_GPU.sh --enable-llm
 ```
 ---
 ## Other Languages
 ### Spanish
 ```bash
-./RUN_GPU.sh --language es
+./run_transcribe.sh --model medium --language es
 ```
 ### French
 ```bash
-./RUN_GPU.sh --language fr
+./run_transcribe.sh --model medium --language fr
 ```
 ### German
 ```bash
-./RUN_GPU.sh --language de
+./run_transcribe.sh --model medium --language de
 ```
 ---
-## Available Ollama Models
+## 4. Model Selection
-You have these models installed:
+| Model  | Speed    | Quality | Command                          |
- `llama3.2:latest` (2.0 GB) - **Default** - Fast and accurate
+|--------|----------|---------|----------------------------------|
- `llama3:8b` (4.7 GB) - More powerful
+| tiny   | Fastest  | Basic   | `--model tiny`                   |
- `qwen2.5:3b` (1.9 GB) - Fast alternative
+| base   | Fast     | Good    | `--model base`                   |
- `qwen2.5:7b` (4.7 GB) - Powerful alternative
+| small  | Moderate | Better  | `--model small`                  |
- `qwen2.5:0.5b` (397 MB) - Very fast, less accurate
+| medium | Slow     | Great   | `--model medium` **(recommended)** |
 | large  | Slowest  | Best    | `--model large`                  |
-To use a different model:
+---
 ## 5. Optimization Tips
 ### High Quality Transcription
 ```bash
-./RUN_DUTCH_LLM.sh --llm-model "llama3:8b"
+./run_transcribe.sh --model large --interval 8 --min-duration 4
 ```
 ### Fast Real-Time
 ```bash
 ./run_transcribe.sh --model tiny --interval 3 --min-duration 2
 ```
 ### Best Dutch Transcription (Your Setup)
 ```bash
 ./run_transcribe.sh --model medium --interval 8 --min-duration 4 --enable-llm --language nl
 ```
 ---
-## Tips
+## 6. LLM Configuration
-### Better Accuracy
+### Default Model (qwen2.5:3b - Fast)
 Use larger Whisper model (slower):
 ```bash
-./RUN_DUTCH.sh --model medium  # or: large
+./run_transcribe.sh --enable-llm
 ```
-### Faster Processing
+### Larger Model (Better Analysis)
 Use smaller model or reduce interval:
 ```bash
-./RUN_DUTCH.sh --model tiny --interval 3
+# Install model first
-```
+ollama pull llama3.2
-### Debug LLM Issues
+# Use it
-```bash
+./run_transcribe.sh --enable-llm --llm-model llama3.2
 ./RUN_DUTCH_LLM.sh --llm-debug
 ```
 ---
-## Controls
+## 7. Output Examples
- **Ctrl+C** to stop transcription
+### Console Output
- Speak clearly into your microphone
+```
- Wait ~5 seconds for transcription to appear
+🎤 [14:23:15] User speaking via microphone
- Sentences appear with 📝 emoji
+🔊 [14:23:20] Audio from speakers
 🎤 [14:23:25] The Earth orbits the Sun in 365 days.
   ✅ FACTUAL (0.98): Scientifically accurate.
   ❓ Questions:
      1. Why do we need leap years?
      2. How does orbital speed vary?
      3. What affects Earth's orbit?
 ```
 ### File Output
 Saved to the file given with `--output` (for example `transcript.txt`), with timestamps and, when `--enable-llm` is set, the analysis.
 ---
 ## 8. Controls
 - **Ctrl+C** - Stop transcription
 - Audio is processed every `--interval` seconds (default: 5s)
 - A source is transcribed only once at least `--min-duration` seconds of audio have accumulated (default: 2s)
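
How the interval and the minimum duration interact can be sketched with plain numbers. This is an illustration only: `ready_sources` is a hypothetical helper, and the 16 kHz sample rate is assumed from the script's defaults.

```python
SAMPLE_RATE = 16000  # samples per second (assumed capture rate)


def ready_sources(buffer_lengths, min_duration=2.0, sample_rate=SAMPLE_RATE):
    """Return the sources whose buffered audio is long enough to transcribe.

    buffer_lengths maps a source name to its number of buffered samples.
    """
    return [
        source
        for source, n_samples in buffer_lengths.items()
        if n_samples / sample_rate >= min_duration
    ]


# At an interval tick: the mic holds 3.0 s of audio, the monitor only 1.0 s.
buffers = {"mic": 48000, "monitor": 16000}
print(ready_sources(buffers))  # ['mic']
```

A source that is not yet ready keeps its audio, so it is picked up at a later tick once enough has accumulated.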
 ---
 ## Troubleshooting
 **No devices found:**
 ```bash
 ./run_transcribe.sh --list-devices
 ```
 **Ollama errors:**
 ```bash
 ollama serve
 ollama pull qwen2.5:3b
 ```
 **Force CPU (GPU issues):**
 ```bash
 ./run_transcribe.sh --force-cpu
 ```
--- a/README.md
+++ b/README.md
@@ -1,16 +1,15 @@
 # Verbatim Dicta
-Real-time audio transcription using Whisper AI with optional LLM-powered analysis. Captures system audio via loopback and transcribes it with configurable models and processing options.
+Real-time audio transcription using Whisper AI with optional LLM analysis. Captures microphone and speaker audio simultaneously for comprehensive transcription.
 ## Features
- Real-time transcription of system audio (Windows/Linux)
+- **Dual audio capture** - Record microphone and speaker output simultaneously
- Multiple Whisper model sizes (tiny to large)
+- **Real-time transcription** - Process audio as it's captured with Whisper models
- Multi-language support
+- **LLM analysis** - Optional fact-checking and question generation via Ollama
- **Sentence extraction mode** - Stitches audio chunks into complete sentences
+- **Multi-language** - Support for 50+ languages
- Optional LLM analysis for fact-checking and question generation (via Ollama)
+- **File output** - Save transcripts with timestamps and analysis
- GPU acceleration support
+- **GPU acceleration** - CUDA support for faster processing
 - Flexible audio device configuration
 ## Quick Start
@@ -18,17 +17,14 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 # Install dependencies
 pip install -r requirements.txt
 # Basic transcription (no LLM)
 python transcribe_speakers.py
 # With LLM analysis (optional)
 python transcribe_speakers.py --enable-llm
 # With sentence extraction
 python transcribe_speakers.py --sentence-mode
 # List audio devices
-python transcribe_speakers.py --list-devices
+./run_transcribe.sh --list-devices
 # Basic transcription
 ./run_transcribe.sh --model medium --language en
 # With LLM analysis and file output
 ./run_transcribe.sh --model medium --enable-llm --output transcript.txt
 ```
 ## Requirements
@@ -58,172 +54,153 @@ For CUDA 12.1:
 pip install torch==2.8.0+cu121 --index-url https://download.pytorch.org/whl/cu121
 ```
-### 3. Audio Loopback Setup
+### 3. Audio Setup
-**Windows - Option A (Stereo Mix):**
+**Linux (PulseAudio/PipeWire):**
-1. Right-click speaker icon → Sounds → Recording tab
+```bash
-2. Right-click → Show Disabled Devices
+# List devices to find your monitor device
-3. Enable and set Stereo Mix as default
+./run_transcribe.sh --list-devices
-**Windows - Option B (VB-Cable, recommended):**
+# Use with monitor device
-1. Download from [vb-audio.com](https://vb-audio.com/Cable/)
+./run_transcribe.sh --monitor "alsa_output.monitor"
-2. Install and restart
+```
 3. Use `--device "CABLE Output"`
-**Linux:**
+**Windows:**
-Configure PulseAudio loopback or use `transcribe_dual_linux.py`
+- Enable "Stereo Mix" in Sound settings, or
 - Install VB-Cable from [vb-audio.com](https://vb-audio.com/Cable/)
-### 4. LLM Features (Optional)
+### 4. LLM Support (Optional)
 ```bash
 # Install Ollama from ollama.ai
-ollama pull llama3.2
+ollama pull qwen2.5:3b
 ```
 ## Usage
-### Available Scripts
+### Command Line Options
 - `transcribe_speakers.py` - Main script with all features (LLM optional via `--enable-llm`)
 - `transcribe_dual_linux.py` - Linux-specific with dual audio support
 ### Common Commands
 ```bash
-# Quick start with GPU (English)
+python transcribe.py [OPTIONS]
 ./RUN_GPU.sh
-# Dutch language
+Options:
-./RUN_DUTCH.sh
+  --model {tiny,base,small,medium,large}  Whisper model (default: tiny)
-
+  --language CODE                         Language code (default: en)
-# Dutch with LLM analysis
+  --mic DEVICE                            Microphone device name
-./RUN_DUTCH_LLM.sh
+  --monitor DEVICE                        Speaker monitor device name
-
+  --interval SECONDS                      Processing interval (default: 5.0)
-# With LLM analysis
+  --min-duration SECONDS                  Minimum audio duration (default: 2.0)
-./RUN_GPU.sh --enable-llm
+  --enable-llm                            Enable LLM analysis
-
+  --llm-model MODEL                       Ollama model (default: qwen2.5:3b)
-# Save to file
+  --output FILE                           Save transcript to file
-./RUN_GPU.sh --output transcript.txt
+  --force-cpu                             Force CPU processing
-
+  --list-devices                          List audio devices
 # Other languages (Spanish, French, German, etc.)
 ./RUN_GPU.sh --language es  # Spanish
 ./RUN_GPU.sh --language fr  # French
 ./RUN_GPU.sh --language de  # German
 # Maximum accuracy with LLM and sentence extraction
 python transcribe_speakers.py --model large --enable-llm --sentence-mode --output enriched.txt
 # Force CPU (if GPU issues)
 python transcribe_speakers.py --force-cpu
 ```
-### Key Options
+### Examples
-| Option | Description | Default |
+```bash
-|--------|-------------|---------|
+# Dutch transcription with LLM
-| `--model` | Model size: tiny/base/small/medium/large | base |
+./run_transcribe.sh --model medium --language nl --enable-llm
-| `--language` | Language code (en/es/fr/de/ja/etc.) | en |
+
-| `--device` | Audio device name (partial match) | Auto |
+# High-quality meeting transcription
-| `--interval` | Processing interval (seconds) | 8.0 |
+./run_transcribe.sh --model large --interval 8 --output meeting.txt
-| `--min-duration` | Minimum audio duration | 3.0 |
+
-| `--fast-mode` | Fast mode (3-5x faster, lower accuracy) | False |
+# Fast real-time transcription
-| `--enable-llm` | Enable fact-checking and questions | False |
+./run_transcribe.sh --model tiny --interval 3 --min-duration 2
-| `--llm-model` | Ollama model to use | llama3.2 |
+
-| `--output` | Save to file | None |
+# Specific devices
-| `--force-cpu` | Disable GPU | False |
+./run_transcribe.sh --mic "USB Mic" --monitor "Monitor of Speakers"
-| `--gpu-index` | GPU device index | 0 |
+```
 | `--sentence-mode` | Extract complete sentences from chunks | False |
 ## Model Performance
-| Model | Size | Speed | Quality | Best For |
+| Model  | Size   | Speed    | Quality | Use Case               |
-|-------|------|-------|---------|----------|
+|--------|--------|----------|---------|------------------------|
-| tiny | ~75 MB | Fastest | Basic | Quick tests, low-latency |
+| tiny   | 75 MB  | Fastest  | Basic   | Real-time, low latency |
-| base | ~145 MB | Fast | Good | General real-time use |
+| base   | 145 MB | Fast     | Good    | General use            |
-| small | ~485 MB | Moderate | Better | Balanced accuracy/speed |
+| small  | 485 MB | Moderate | Better  | Balanced               |
-| medium | ~1.5 GB | Slow | Great | High accuracy needs |
+| medium | 1.5 GB | Slow     | Great   | High accuracy          |
-| large | ~3 GB | Slowest | Best | Maximum accuracy |
+| large  | 3 GB   | Slowest  | Best    | Maximum quality        |
 ## Optimization Presets
 **Low Latency (Real-Time):**
 ```bash
 python transcribe_speakers.py --model tiny --fast-mode --interval 2 --min-duration 1.5
 ```
 **Balanced:**
 ```bash
 python transcribe_speakers.py --model base --interval 5
 ```
 **High Accuracy:**
 ```bash
 python transcribe_speakers.py --model large --interval 10 --enable-llm
 ```
 ## Troubleshooting
-**No loopback device:**
+**No audio devices found:**
- Windows: Enable Stereo Mix or install VB-Cable
+```bash
- Linux: Configure PulseAudio loopback
+# List all devices
 ./run_transcribe.sh --list-devices
 # Specify devices explicitly
 ./run_transcribe.sh --mic "device_name" --monitor "monitor_name"
 ```
 **CUDA errors:**
 ```bash
-python transcribe_speakers.py --force-cpu
+# Force CPU processing
 ./run_transcribe.sh --force-cpu
 ```
-**No audio captured:**
+**Ollama connection failed:**
- Verify audio is playing
+```bash
- Check device: `--list-devices`
+# Start Ollama service
- Increase system volume
+ollama serve
-**Poor quality:**
+# Pull required model
- Use larger model: `--model medium`
+ollama pull qwen2.5:3b
 ```
 **Poor transcription quality:**
 - Use larger model: `--model medium` or `--model large`
 - Increase interval: `--interval 10`
- Specify language: `--language <code>`
+- Specify language: `--language nl`
-
+- Ensure good audio quality (reduce background noise)
 **Ollama errors:**
 - Ensure Ollama is running
 - Pull model: `ollama pull llama3.2`
 ## Output Format
-**Standard:**
+### Standard Output
 ```
-[14:23:15] Transcribed audio segment.
+🎤 [14:23:15] User speaking into microphone
-[14:23:23] Another segment with timestamp.
+🔊 [14:23:18] Audio from speakers or system
 ```
-**With LLM (--enable-llm):**
+### With LLM Analysis
 ```
 🎤 [14:23:15] The Earth orbits the Sun in 365 days.
   ✅ FACTUAL (0.98): Scientifically accurate orbital period.
   ❓ Questions:
      1. Why do we need leap years?
      2. How does the elliptical orbit affect seasons?
      3. What factors influence Earth's orbital velocity?
 ```
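
The verdict line is built from a short structured reply (`VERDICT`, `CONFIDENCE`, `REASON`) requested from the model. A sketch of that parsing with the standard `re` module; the sample reply is made up for illustration, and `parse_fact_check` is a hypothetical name:

```python
import re


def parse_fact_check(reply):
    """Extract verdict, confidence and reason from a structured LLM reply."""
    verdict = re.search(r"VERDICT:\s*(\w+)", reply, re.I)
    confidence = re.search(r"CONFIDENCE:\s*([\d.]+)", reply, re.I)
    reason = re.search(r"REASON:\s*(.+?)(?:\n|$)", reply, re.I)
    return {
        # Fall back to neutral values when the model ignores the format.
        "verdict": verdict.group(1).lower() if verdict else "unknown",
        "confidence": float(confidence.group(1)) if confidence else 0.5,
        "reason": reason.group(1).strip() if reason else reply[:150],
    }


# Made-up reply in the requested format.
sample = (
    "VERDICT: factual\n"
    "CONFIDENCE: 0.98\n"
    "REASON: Scientifically accurate orbital period.\n"
)
result = parse_fact_check(sample)
print(f"{result['verdict'].upper()} ({result['confidence']:.2f}): {result['reason']}")
```

The fallbacks matter in practice because small local models do not always follow the requested format.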
 ### File Output
 ```
 [14:23:15] MIC: User speaking into microphone
 [14:23:18] SPEAKER: Audio from speakers
 ======================================================================
-[14:23:15] The Earth revolves around the Sun in 365 days.
+[14:23:25] MIC: The Earth orbits the Sun in 365 days.
 📊 Fact Check: FACTUAL (confidence: 0.98)
-💡 Scientifically accurate. Earth's orbital period is 365.25 days.
+💡 Scientifically accurate orbital period.
 ❓ Questions:
 1. Why do we need leap years?
-2. How does Earth's orbit affect seasons?
+2. How does the elliptical orbit affect seasons?
-======================================================================
+3. What factors influence Earth's orbital velocity?
 ```
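
Transcript lines follow a `[HH:MM:SS] SOURCE: text` pattern, which makes the file easy to post-process. A minimal sketch, assuming only that pattern; `parse_line` is a hypothetical helper and not part of the project:

```python
import re

# Matches lines such as "[14:23:15] MIC: some text".
LINE_RE = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\] (MIC|SPEAKER): (.*)$")


def parse_line(line):
    """Return a dict for a transcript line, or None for separators and analysis lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp, source, text = match.groups()
    return {"time": timestamp, "source": source, "text": text}


print(parse_line("[14:23:15] MIC: User speaking into microphone"))
print(parse_line("=" * 70))  # separator lines are skipped
```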
-## Technical Stack
+## Architecture
- **Audio**: sounddevice, soundfile (16kHz mono, 16-bit PCM)
+- **Audio Capture**: sounddevice with dual-stream support
- **Transcription**: faster-whisper (optimized Whisper)
+- **Transcription**: faster-whisper (optimized Whisper implementation)
- **LLM**: Ollama (local inference)
+- **LLM**: Ollama for local inference
- **Capture**: WASAPI loopback (Windows), PulseAudio (Linux)
+- **Format**: 16kHz mono, 16-bit PCM
 - **Processing**: Independent mic/speaker buffers with beam_size=3
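
The 16-bit PCM capture format has to be scaled to floating point before transcription. A sketch of that conversion and of the duration arithmetic, written with the standard library only (the script itself uses NumPy); the function names are hypothetical:

```python
from array import array

SAMPLE_RATE = 16000  # 16 kHz mono


def pcm16_to_float(samples):
    """Scale signed 16-bit samples into the range [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def duration_seconds(n_samples, sample_rate=SAMPLE_RATE):
    """Length of a mono buffer in seconds."""
    return n_samples / sample_rate


chunk = array("h", [0, 16384, -32768])  # "h" = signed 16-bit integers
print(pcm16_to_float(chunk))   # [0.0, 0.5, -1.0]
print(duration_seconds(2048))  # one 2048-sample block
```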
-## Future Work
+## Contributing
- Real-time streaming transcription with reduced buffering
+Contributions welcome! Please open issues or submit pull requests.
 - Speaker diarization improvements
 - Web interface for remote monitoring
 - Multi-device simultaneous transcription
 - Cloud LLM integration options
 - Custom vocabulary and domain adaptation
 - Noise reduction preprocessing
 ## License
--- a/run_transcribe.sh
+++ b/run_transcribe.sh
@@ -11,4 +11,4 @@ CUBLAS_PATH=".venv/lib/python3.13/site-packages/nvidia/cublas/lib"
 export LD_LIBRARY_PATH="${CUDNN_PATH}:${CUBLAS_PATH}:${LD_LIBRARY_PATH}"
 # Run the transcription script with all arguments
-python3 transcribe_dual_linux.py "$@"
+python3 transcribe.py "$@"
--- a/transcribe.py
+++ b/transcribe.py
@@ -0,0 +1,437 @@
 #!/usr/bin/env python3
 """
 Real-time audio transcription with dual capture and optional LLM analysis.
 Supports microphone + speaker monitor, file output, and fact-checking.
 """
 import sounddevice as sd
 import numpy as np
 import threading
 import queue
 import time
 import os
 import argparse
 from datetime import datetime
 from faster_whisper import WhisperModel
 try:
    import ollama
    OLLAMA_AVAILABLE = True
 except ImportError:
    OLLAMA_AVAILABLE = False
 class DualAudioCapture:
    """Capture both microphone and speaker output simultaneously"""
    def __init__(self, mic_device=None, monitor_device=None, sample_rate=16000, chunk_size=2048):
        self.sample_rate = sample_rate
        self.chunk_size = chunk_size
        self.audio_queue = queue.Queue()
        # Find devices
        devices = sd.query_devices()
        # Microphone (default input or specified)
        if mic_device is None:
            self.mic_device = sd.default.device[0]  # Default input
        else:
            self.mic_device = self._find_device(mic_device, input_required=True)
        # Monitor/Loopback (for speaker output)
        if monitor_device:
            self.monitor_device = self._find_device(monitor_device, input_required=True)
        else:
            self.monitor_device = None
        print(f"✓ Microphone: {devices[self.mic_device]['name']} (index {self.mic_device})")
        if self.monitor_device is not None:  # index 0 is a valid device
            print(f"✓ Monitor: {devices[self.monitor_device]['name']} (index {self.monitor_device})")
        else:
            print("⚠ No monitor device - capturing microphone only")
        # Start streams
        self.mic_stream = sd.InputStream(
            device=self.mic_device,
            channels=1,
            samplerate=sample_rate,
            blocksize=chunk_size,
            dtype='int16',
            callback=self._mic_callback
        )
        if self.monitor_device is not None:  # index 0 is a valid device
            self.monitor_stream = sd.InputStream(
                device=self.monitor_device,
                channels=1,
                samplerate=sample_rate,
                blocksize=chunk_size,
                dtype='int16',
                callback=self._monitor_callback
            )
        else:
            self.monitor_stream = None
        self.mic_stream.start()
        if self.monitor_stream:
            self.monitor_stream.start()
        print("✓ Audio capture started")
    def _find_device(self, device_name, input_required=True):
        """Find device by name substring"""
        devices = sd.query_devices()
        for i, dev in enumerate(devices):
            if device_name.lower() in dev['name'].lower():
                if not input_required or dev['max_input_channels'] > 0:
                    return i
        raise RuntimeError(f"Device '{device_name}' not found")
    def _mic_callback(self, indata, frames, time_info, status):
        """Microphone audio callback"""
        if status:
            print(f"⚠ Mic status: {status}")
        self.audio_queue.put(('mic', indata.copy()))
    def _monitor_callback(self, indata, frames, time_info, status):
        """Monitor/speaker audio callback"""
        if status:
            print(f"⚠ Monitor status: {status}")
        self.audio_queue.put(('monitor', indata.copy()))
    def read_chunk(self):
        """Read audio data from queue"""
        try:
            return self.audio_queue.get(timeout=0.05)
        except queue.Empty:
            return None
    def close(self):
        """Cleanup resources"""
        self.mic_stream.stop()
        self.mic_stream.close()
        if self.monitor_stream:
            self.monitor_stream.stop()
            self.monitor_stream.close()
 class WhisperTranscriber:
    """Process audio with Whisper"""
    def __init__(self, model_name="base", language="en", force_cpu=False):
        print(f"Loading Whisper model '{model_name}'...")
        import torch
        has_cuda = torch.cuda.is_available() and not force_cpu
        device = "cpu"
        compute_type = "int8"
        if has_cuda:
            try:
                import ctranslate2
                if ctranslate2.get_cuda_device_count() > 0:
                    device = "cuda"
                    compute_type = "float16"
                    print(f"✓ Using GPU: {torch.cuda.get_device_name(0)}")
            except Exception as e:
                print(f"⚠ CUDA unavailable: {e}")
        if device == "cpu":
            print("✓ Using CPU")
        model_kwargs = {"device": device, "compute_type": compute_type}
        if device == "cpu":
            model_kwargs["cpu_threads"] = 4
        self.model = WhisperModel(model_name, **model_kwargs)
        self.language = language
        self.mic_buffer = np.array([], dtype=np.float32)
        self.monitor_buffer = np.array([], dtype=np.float32)
        self.lock = threading.Lock()
    def add_audio(self, source, audio_chunk):
        """Add audio to appropriate buffer"""
        with self.lock:
            audio_float = audio_chunk.flatten().astype(np.float32) / 32768.0
            if source == 'mic':
                self.mic_buffer = np.concatenate([self.mic_buffer, audio_float])
            else:
                self.monitor_buffer = np.concatenate([self.monitor_buffer, audio_float])
    def transcribe_chunk(self, min_duration=3.0):
        """Transcribe accumulated audio"""
        with self.lock:
            mic_duration = len(self.mic_buffer) / 16000
            monitor_duration = len(self.monitor_buffer) / 16000
            results = {}
            # Transcribe microphone
            if mic_duration >= min_duration:
                mic_audio = self.mic_buffer.copy()
                self.mic_buffer = np.array([], dtype=np.float32)
                results['mic'] = self._transcribe(mic_audio)
            # Transcribe monitor
            if monitor_duration >= min_duration:
                monitor_audio = self.monitor_buffer.copy()
                self.monitor_buffer = np.array([], dtype=np.float32)
                results['monitor'] = self._transcribe(monitor_audio)
            return results if results else None
    def _transcribe(self, audio):
        """Internal transcription"""
        try:
            segments, _ = self.model.transcribe(
                audio,
                language=self.language,
                beam_size=3,
                vad_filter=True,
                vad_parameters=dict(min_silence_duration_ms=500)
            )
            text = " ".join([seg.text for seg in segments]).strip()
            return text if text else None
        except Exception as e:
            print(f"❌ Transcription error: {e}")
            return None
 class LLMAnalyzer:
    """LLM analysis with fact-checking and question generation"""
    def __init__(self, model="qwen2.5:3b"):
        if not OLLAMA_AVAILABLE:
            raise RuntimeError("Ollama not installed: pip install ollama")
        self.model = model
        try:
            ollama.list()
            print(f"✓ Ollama connected: {self.model}")
        except Exception as e:
            raise RuntimeError(f"Ollama not running: {e}")
    def fact_check(self, text):
        """Quick fact-check"""
        prompt = f"""Fact-check this statement. Reply ONLY with:
 VERDICT: factual/dubious/false
 CONFIDENCE: 0.0-1.0
 REASON: one sentence
 Statement: "{text}" """
        try:
            response = ollama.generate(
                model=self.model,
                prompt=prompt,
                options={"temperature": 0.1, "num_predict": 80}
            )
            import re
            response_text = response['response']
            verdict = re.search(r'VERDICT:\s*(\w+)', response_text, re.I)
            confidence = re.search(r'CONFIDENCE:\s*([\d.]+)', response_text, re.I)
            reason = re.search(r'REASON:\s*(.+?)(?:\n|$)', response_text, re.I | re.DOTALL)
            return {
                'verdict': verdict.group(1).lower() if verdict else 'unknown',
                'confidence': float(confidence.group(1)) if confidence else 0.5,
                'reason': reason.group(1).strip() if reason else response_text[:150]
            }
        except Exception as e:
            return {'verdict': 'error', 'confidence': 0.0, 'reason': str(e)}
    def generate_questions(self, text):
        """Generate follow-up questions"""
        prompt = f"""Generate 3 insightful questions about this. Reply ONLY with:
 Q1: [question]
 Q2: [question]
 Q3: [question]
 Statement: "{text}" """
        try:
            response = ollama.generate(
                model=self.model,
                prompt=prompt,
                options={"temperature": 0.7, "num_predict": 120}
            )
            import re
            response_text = response['response']
            questions = []
            for i in range(1, 4):
                q_match = re.search(rf'Q{i}:\s*(.+?)(?:\n|$)', response_text, re.I)
                if q_match:
                    question = q_match.group(1).strip()
                    if not question.endswith('?'):
                        question += '?'
                    questions.append(question)
            # Fallback defaults
            while len(questions) < 3:
                defaults = ["What are the implications?", "What evidence supports this?", "What's the context?"]
                questions.append(defaults[len(questions)])
            return questions[:3]
        except Exception:
            return ["What are the key points?", "What supports this?", "What are the implications?"]
 def save_transcript(text, source, timestamp, filename):
    """Append transcript to file"""
    os.makedirs(os.path.dirname(filename) or '.', exist_ok=True)
    with open(filename, "a", encoding="utf-8") as f:
        source_label = "MIC" if source == 'mic' else "SPEAKER"
        f.write(f"[{timestamp}] {source_label}: {text}\n")
 def save_enriched_transcript(text, source, timestamp, fact_check, questions, filename):
    """Save enriched transcript with LLM analysis"""
    os.makedirs(os.path.dirname(filename) or '.', exist_ok=True)
    with open(filename, "a", encoding="utf-8") as f:
        source_label = "MIC" if source == 'mic' else "SPEAKER"
        f.write(f"\n{'='*70}\n")
        f.write(f"[{timestamp}] {source_label}: {text}\n\n")
        if fact_check:
            f.write(f"📊 Fact Check: {fact_check['verdict'].upper()} ")
            f.write(f"(confidence: {fact_check['confidence']:.2f})\n")
            f.write(f"💡 {fact_check['reason']}\n\n")
        if questions:
            f.write("❓ Questions:\n")
            for i, q in enumerate(questions, 1):
                f.write(f"{i}. {q}\n")
            f.write("\n")
 def main():
    parser = argparse.ArgumentParser(description="Real-time audio transcription with dual capture")
    parser.add_argument("--model", default="tiny", choices=["tiny", "base", "small", "medium", "large"],
                        help="Whisper model (default: tiny)")
    parser.add_argument("--language", default="en", help="Language code (default: en)")
    parser.add_argument("--mic", help="Microphone device name (partial match)")
    parser.add_argument("--monitor", help="Monitor device name for speaker capture")
    parser.add_argument("--interval", type=float, default=5.0, help="Processing interval in seconds (default: 5.0)")
    parser.add_argument("--min-duration", type=float, default=2.0, help="Minimum audio duration (default: 2.0)")
    parser.add_argument("--enable-llm", action="store_true", help="Enable LLM analysis (fact-checking + questions)")
    parser.add_argument("--llm-model", default="qwen2.5:3b", help="Ollama model (default: qwen2.5:3b)")
    parser.add_argument("--output", "-o", help="Save transcript to file")
    parser.add_argument("--list-devices", action="store_true", help="List audio devices and exit")
    parser.add_argument("--force-cpu", action="store_true", help="Force CPU processing")
    args = parser.parse_args()
    if args.list_devices:
        print("\nAvailable audio devices:")
        for i, dev in enumerate(sd.query_devices()):
            in_ch = dev['max_input_channels']
            out_ch = dev['max_output_channels']
            if in_ch > 0:
                print(f"  [{i:2d}] {dev['name']:<50} IN:{in_ch} OUT:{out_ch}")
        return
    print("=== Real-Time Audio Transcription ===")
    print(f"Model: {args.model} | Language: {args.language} | Interval: {args.interval}s")
    if args.output:
        print(f"Output: {args.output}")
    if args.enable_llm:
        print(f"LLM Analysis: Enabled ({args.llm_model})")
    # Initialize capture
    try:
        capturer = DualAudioCapture(
            mic_device=args.mic,
            monitor_device=args.monitor,
            sample_rate=16000,
            chunk_size=2048
        )
    except Exception as e:
        print(f"\n❌ Audio Error: {e}")
        print("\nTip: Use --list-devices to see available devices")
        print("     Use --mic and --monitor to specify devices")
        return
    # Initialize transcriber
    try:
        transcriber = WhisperTranscriber(
            model_name=args.model,
            language=args.language,
            force_cpu=args.force_cpu
        )
    except Exception as e:
        print(f"\n❌ Whisper Error: {e}")
        return
    # Initialize LLM analyzer
    llm_analyzer = None
    if args.enable_llm:
        try:
            llm_analyzer = LLMAnalyzer(model=args.llm_model)
        except Exception as e:
            print(f"\n⚠ LLM Error: {e}")
            print("Continuing without LLM analysis...")
    # Main loop
    print(f"\n✅ Started. Press Ctrl+C to stop.\n{'='*60}")
    last_process = time.time()
    try:
        while True:
            # Collect audio
            chunk = capturer.read_chunk()
            if chunk:
                source, audio = chunk
                transcriber.add_audio(source, audio)
            # Process at intervals
            if time.time() - last_process >= args.interval:
                results = transcriber.transcribe_chunk(min_duration=args.min_duration)
                if results:
                    timestamp = datetime.now().strftime("%H:%M:%S")
                    for source, text in results.items():
                        if text:
                            source_emoji = "🎤" if source == 'mic' else "🔊"
                            print(f"\n{source_emoji} [{timestamp}] {text}")
                            # LLM analysis
                            fact_check = None
                            questions = None
                            if llm_analyzer:
                                fact_check = llm_analyzer.fact_check(text)
                                questions = llm_analyzer.generate_questions(text)
                                verdict_emoji = {'factual': '✅', 'dubious': '⚠️', 'false': '❌'}.get(
                                    fact_check['verdict'], '❓')
                                print(f"   {verdict_emoji} {fact_check['verdict'].upper()} "
                                      f"({fact_check['confidence']:.2f}): {fact_check['reason']}")
                                print("   ❓ Questions:")
                                for i, q in enumerate(questions, 1):
                                    print(f"      {i}. {q}")
                            # Save to file
                            if args.output:
                                if llm_analyzer:
                                    save_enriched_transcript(text, source, timestamp, fact_check, questions, args.output)
                                else:
                                    save_transcript(text, source, timestamp, args.output)
                last_process = time.time()
    except KeyboardInterrupt:
        print(f"\n{'='*60}\n🛑 Stopping...")
    capturer.close()
    if args.output and os.path.exists(args.output):
        print(f"\n💾 Transcript saved: {os.path.abspath(args.output)}")
    print("\n✅ Done!")
 if __name__ == "__main__":
    main()
--- a/transcribe_dual_linux.py
+++ b/transcribe_dual_linux.py
@@ -1,7 +1,7 @@
 #!/usr/bin/env python3
 """
-Real-time transcription with dual audio capture (microphone + speaker monitor).
+Real-time audio transcription with dual capture and optional LLM analysis.
-Linux/PipeWire optimized with Ollama LLM fact-checking.
+Supports microphone + speaker monitor, file output, and fact-checking.
 """
 import sounddevice as sd
@@ -9,6 +9,7 @@ import numpy as np
 import threading
 import queue
 import time
 import os
 import argparse
 from datetime import datetime
 from faster_whisper import WhisperModel
@@ -197,8 +198,8 @@ class WhisperTranscriber:
            return None
-class LLMFactChecker:
+class LLMAnalyzer:
-    """Fast fact-checking with Ollama"""
+    """LLM analysis with fact-checking and question generation"""
    def __init__(self, model="qwen2.5:3b"):
        if not OLLAMA_AVAILABLE:
@@ -228,34 +229,100 @@ Statement: "{text}" """
            )
            import re
-            text = response['response']
+            response_text = response['response']
-            verdict = re.search(r'VERDICT:\s*(\w+)', text, re.I)
+            verdict = re.search(r'VERDICT:\s*(\w+)', response_text, re.I)
-            confidence = re.search(r'CONFIDENCE:\s*([\d.]+)', text, re.I)
+            confidence = re.search(r'CONFIDENCE:\s*(\d*\.?\d+)', response_text, re.I)
-            reason = re.search(r'REASON:\s*(.+?)(?:\n|$)', text, re.I | re.DOTALL)
+            reason = re.search(r'REASON:\s*(.+?)(?:\n|$)', response_text, re.I | re.DOTALL)
            return {
                'verdict': verdict.group(1).lower() if verdict else 'unknown',
                'confidence': float(confidence.group(1)) if confidence else 0.5,
-                'reason': reason.group(1).strip() if reason else text[:150]
+                'reason': reason.group(1).strip() if reason else response_text[:150]
            }
        except Exception as e:
            return {'verdict': 'error', 'confidence': 0.0, 'reason': str(e)}
    def generate_questions(self, text):
        """Generate follow-up questions"""
        prompt = f"""Generate 3 insightful questions about this. Reply ONLY with:
 Q1: [question]
 Q2: [question]
 Q3: [question]
 Statement: "{text}" """
        try:
            response = ollama.generate(
                model=self.model,
                prompt=prompt,
                options={"temperature": 0.7, "num_predict": 120}
            )
            import re
            response_text = response['response']
            questions = []
            for i in range(1, 4):
                q_match = re.search(rf'Q{i}:\s*(.+?)(?:\n|$)', response_text, re.I)
                if q_match:
                    question = q_match.group(1).strip()
                    if not question.endswith('?'):
                        question += '?'
                    questions.append(question)
            # Fallback defaults
            while len(questions) < 3:
                defaults = ["What are the implications?", "What evidence supports this?", "What's the context?"]
                questions.append(defaults[len(questions)])
            return questions[:3]
        except Exception:
            return ["What are the key points?", "What supports this?", "What are the implications?"]
 def save_transcript(text, source, timestamp, filename):
    """Append transcript to file"""
    os.makedirs(os.path.dirname(filename) or '.', exist_ok=True)
    with open(filename, "a", encoding="utf-8") as f:
        source_label = "MIC" if source == 'mic' else "SPEAKER"
        f.write(f"[{timestamp}] {source_label}: {text}\n")
 def save_enriched_transcript(text, source, timestamp, fact_check, questions, filename):
    """Save enriched transcript with LLM analysis"""
    os.makedirs(os.path.dirname(filename) or '.', exist_ok=True)
    with open(filename, "a", encoding="utf-8") as f:
        source_label = "MIC" if source == 'mic' else "SPEAKER"
        f.write(f"\n{'='*70}\n")
        f.write(f"[{timestamp}] {source_label}: {text}\n\n")
        if fact_check:
            f.write(f"📊 Fact Check: {fact_check['verdict'].upper()} ")
            f.write(f"(confidence: {fact_check['confidence']:.2f})\n")
            f.write(f"💡 {fact_check['reason']}\n\n")
        if questions:
            f.write("❓ Questions:\n")
            for i, q in enumerate(questions, 1):
                f.write(f"{i}. {q}\n")
            f.write("\n")
 def main():
-    parser = argparse.ArgumentParser(description="Dual audio transcription with fact-checking")
+    parser = argparse.ArgumentParser(description="Real-time audio transcription with dual capture")
-    parser.add_argument("--model", default="tiny", choices=["tiny", "base", "small", "medium"],
+    parser.add_argument("--model", default="tiny", choices=["tiny", "base", "small", "medium", "large"],
-                        help="Whisper model (default: tiny for speed)")
+                        help="Whisper model (default: tiny)")
-    parser.add_argument("--language", default="en", help="Language code")
+    parser.add_argument("--language", default="en", help="Language code (default: en)")
    parser.add_argument("--mic", help="Microphone device name (partial match)")
    parser.add_argument("--monitor", help="Monitor device name for speaker capture")
-    parser.add_argument("--interval", type=float, default=5.0, help="Processing interval (seconds)")
+    parser.add_argument("--interval", type=float, default=5.0, help="Processing interval in seconds (default: 5.0)")
-    parser.add_argument("--min-duration", type=float, default=2.0, help="Min audio duration")
+    parser.add_argument("--min-duration", type=float, default=2.0, help="Minimum audio duration (default: 2.0)")
-    parser.add_argument("--enable-llm", action="store_true", help="Enable fact-checking")
+    parser.add_argument("--enable-llm", action="store_true", help="Enable LLM analysis (fact-checking + questions)")
-    parser.add_argument("--llm-model", default="qwen2.5:3b", help="Ollama model")
+    parser.add_argument("--llm-model", default="qwen2.5:3b", help="Ollama model (default: qwen2.5:3b)")
-    parser.add_argument("--list-devices", action="store_true", help="List audio devices")
+    parser.add_argument("--output", "-o", help="Save transcript to file")
-    parser.add_argument("--force-cpu", action="store_true", help="Force CPU")
+    parser.add_argument("--list-devices", action="store_true", help="List audio devices and exit")
    parser.add_argument("--force-cpu", action="store_true", help="Force CPU processing")
    args = parser.parse_args()
@@ -268,8 +335,12 @@ def main():
                print(f"  [{i:2d}] {dev['name']:<50} IN:{in_ch} OUT:{out_ch}")
        return
-    print("=== Dual Audio Transcription with Fact-Checking ===")
+    print("=== Real-Time Audio Transcription ===")
    print(f"Model: {args.model} | Language: {args.language} | Interval: {args.interval}s")
    if args.output:
        print(f"Output: {args.output}")
    if args.enable_llm:
        print(f"LLM Analysis: Enabled ({args.llm_model})")
    # Initialize capture
    try:
@@ -296,14 +367,14 @@ def main():
        print(f"\n❌ Whisper Error: {e}")
        return
-    # Initialize fact checker
+    # Initialize LLM analyzer
-    fact_checker = None
+    llm_analyzer = None
    if args.enable_llm:
        try:
-            fact_checker = LLMFactChecker(model=args.llm_model)
+            llm_analyzer = LLMAnalyzer(model=args.llm_model)
        except Exception as e:
            print(f"\n⚠ LLM Error: {e}")
-            print("Continuing without fact-checking...")
+            print("Continuing without LLM analysis...")
    # Main loop
    print(f"\n✅ Started. Press Ctrl+C to stop.\n{'='*60}")
@@ -329,10 +400,27 @@ def main():
                            source_emoji = "🎤" if source == 'mic' else "🔊"
                            print(f"\n{source_emoji} [{timestamp}] {text}")
-                            if fact_checker:
+                            # LLM analysis
-                                fc = fact_checker.fact_check(text)
+                            fact_check = None
-                                verdict_emoji = {'factual': '✅', 'dubious': '⚠️', 'false': '❌'}.get(fc['verdict'], '❓')
+                            questions = None
-                                print(f"   {verdict_emoji} {fc['verdict'].upper()} ({fc['confidence']:.2f}): {fc['reason']}")
+                            if llm_analyzer:
                                fact_check = llm_analyzer.fact_check(text)
                                questions = llm_analyzer.generate_questions(text)
                                verdict_emoji = {'factual': '✅', 'dubious': '⚠️', 'false': '❌'}.get(
                                    fact_check['verdict'], '❓')
                                print(f"   {verdict_emoji} {fact_check['verdict'].upper()} "
                                      f"({fact_check['confidence']:.2f}): {fact_check['reason']}")
                                print("   ❓ Questions:")
                                for i, q in enumerate(questions, 1):
                                    print(f"      {i}. {q}")
                            # Save to file
                            if args.output:
                                if llm_analyzer:
                                    save_enriched_transcript(text, source, timestamp, fact_check, questions, args.output)
                                else:
                                    save_transcript(text, source, timestamp, args.output)
                last_process = time.time()
@@ -340,6 +428,8 @@ def main():
        print(f"\n{'='*60}\n🛑 Stopping...")
    capturer.close()
    if args.output and os.path.exists(args.output):
        print(f"\n💾 Transcript saved: {os.path.abspath(args.output)}")
    print("\n✅ Done!")
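Review note: `LLMAnalyzer` relies on the model replying in a fixed `VERDICT/CONFIDENCE/REASON` and `Q1..Q3` layout. The sketch below pulls the parsing out into two standalone functions so it can be exercised without Ollama. The function names and the `DEFAULT_QUESTIONS` constant are hypothetical. The confidence pattern is deliberately stricter than the `[\d.]+` used in the diff, on the assumption that a reply such as `0.8.` should still parse as `0.8` rather than raise in `float()`.

```python
import re

# Fallback questions, taken from the defaults used in generate_questions.
DEFAULT_QUESTIONS = [
    "What are the implications?",
    "What evidence supports this?",
    "What's the context?",
]


def parse_fact_check(response_text):
    """Extract verdict, confidence, and reason from a structured reply."""
    verdict = re.search(r"VERDICT:\s*(\w+)", response_text, re.I)
    # Assumption: match one well-formed number so a trailing period is ignored.
    confidence = re.search(r"CONFIDENCE:\s*(\d*\.?\d+)", response_text, re.I)
    reason = re.search(r"REASON:\s*(.+?)(?:\n|$)", response_text, re.I)
    return {
        "verdict": verdict.group(1).lower() if verdict else "unknown",
        "confidence": float(confidence.group(1)) if confidence else 0.5,
        "reason": reason.group(1).strip() if reason else response_text[:150],
    }


def parse_questions(response_text):
    """Extract Q1..Q3, append a missing '?', and pad with defaults up to three."""
    questions = []
    for i in range(1, 4):
        match = re.search(rf"Q{i}:\s*(.+?)(?:\n|$)", response_text, re.I)
        if match:
            question = match.group(1).strip()
            if not question.endswith("?"):
                question += "?"
            questions.append(question)
    # Pad by position, as the diff does: the n-th missing slot gets default n.
    while len(questions) < 3:
        questions.append(DEFAULT_QUESTIONS[len(questions)])
    return questions[:3]


fact_check = parse_fact_check(
    "VERDICT: Dubious\nCONFIDENCE: 0.8.\nREASON: Needs a source.\n"
)
questions = parse_questions("Q1: Who measured it\nQ2: When?\n")
```

Keeping the parsing free of network calls makes it straightforward to unit-test against replies that drift from the requested format.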
--- a/transcribe_duil_linux_old.py
+++ b/transcribe_duil_linux_old.py
@@ -0,0 +1,347 @@
 #!/usr/bin/env python3
 """
 Real-time transcription with dual audio capture (microphone + speaker monitor).
 Linux/PipeWire optimized with Ollama LLM fact-checking.
 """
 import sounddevice as sd
 import numpy as np
 import threading
 import queue
 import time
 import argparse
 from datetime import datetime
 from faster_whisper import WhisperModel
 try:
    import ollama
    OLLAMA_AVAILABLE = True
 except ImportError:
    OLLAMA_AVAILABLE = False
 class DualAudioCapture:
    """Capture both microphone and speaker output simultaneously"""
    def __init__(self, mic_device=None, monitor_device=None, sample_rate=16000, chunk_size=2048):
        self.sample_rate = sample_rate
        self.chunk_size = chunk_size
        self.audio_queue = queue.Queue()
        # Find devices
        devices = sd.query_devices()
        # Microphone (default input or specified)
        if mic_device is None:
            self.mic_device = sd.default.device[0]  # Default input
        else:
            self.mic_device = self._find_device(mic_device, input_required=True)
        # Monitor/Loopback (for speaker output)
        if monitor_device:
            self.monitor_device = self._find_device(monitor_device, input_required=True)
        else:
            self.monitor_device = None
        print(f"✓ Microphone: {devices[self.mic_device]['name']} (index {self.mic_device})")
        if self.monitor_device is not None:  # index 0 is a valid device
            print(f"✓ Monitor: {devices[self.monitor_device]['name']} (index {self.monitor_device})")
        else:
            print("⚠ No monitor device - capturing microphone only")
        # Start streams
        self.mic_stream = sd.InputStream(
            device=self.mic_device,
            channels=1,
            samplerate=sample_rate,
            blocksize=chunk_size,
            dtype='int16',
            callback=self._mic_callback
        )
        if self.monitor_device is not None:  # index 0 is a valid device
            self.monitor_stream = sd.InputStream(
                device=self.monitor_device,
                channels=1,
                samplerate=sample_rate,
                blocksize=chunk_size,
                dtype='int16',
                callback=self._monitor_callback
            )
        else:
            self.monitor_stream = None
        self.mic_stream.start()
        if self.monitor_stream:
            self.monitor_stream.start()
        print("✓ Audio capture started")
    def _find_device(self, device_name, input_required=True):
        """Find device by name substring"""
        devices = sd.query_devices()
        for i, dev in enumerate(devices):
            if device_name.lower() in dev['name'].lower():
                if not input_required or dev['max_input_channels'] > 0:
                    return i
        raise RuntimeError(f"Device '{device_name}' not found")
    def _mic_callback(self, indata, frames, time_info, status):
        """Microphone audio callback"""
        if status:
            print(f"⚠ Mic status: {status}")
        self.audio_queue.put(('mic', indata.copy()))
    def _monitor_callback(self, indata, frames, time_info, status):
        """Monitor/speaker audio callback"""
        if status:
            print(f"⚠ Monitor status: {status}")
        self.audio_queue.put(('monitor', indata.copy()))
    def read_chunk(self):
        """Read audio data from queue"""
        try:
            return self.audio_queue.get(timeout=0.05)
        except queue.Empty:
            return None
    def close(self):
        """Cleanup resources"""
        self.mic_stream.stop()
        self.mic_stream.close()
        if self.monitor_stream:
            self.monitor_stream.stop()
            self.monitor_stream.close()
 class WhisperTranscriber:
    """Process audio with Whisper"""
    def __init__(self, model_name="base", language="en", force_cpu=False):
        print(f"Loading Whisper model '{model_name}'...")
        import torch
        has_cuda = torch.cuda.is_available() and not force_cpu
        device = "cpu"
        compute_type = "int8"
        if has_cuda:
            try:
                import ctranslate2
                if ctranslate2.get_cuda_device_count() > 0:
                    device = "cuda"
                    compute_type = "float16"
                    print(f"✓ Using GPU: {torch.cuda.get_device_name(0)}")
            except Exception as e:
                print(f"⚠ CUDA unavailable: {e}")
        if device == "cpu":
            print("✓ Using CPU")
        model_kwargs = {"device": device, "compute_type": compute_type}
        if device == "cpu":
            model_kwargs["cpu_threads"] = 4
        self.model = WhisperModel(model_name, **model_kwargs)
        self.language = language
        self.mic_buffer = np.array([], dtype=np.float32)
        self.monitor_buffer = np.array([], dtype=np.float32)
        self.lock = threading.Lock()
    def add_audio(self, source, audio_chunk):
        """Add audio to appropriate buffer"""
        with self.lock:
            audio_float = audio_chunk.flatten().astype(np.float32) / 32768.0
            if source == 'mic':
                self.mic_buffer = np.concatenate([self.mic_buffer, audio_float])
            else:
                self.monitor_buffer = np.concatenate([self.monitor_buffer, audio_float])
    def transcribe_chunk(self, min_duration=3.0):
        """Transcribe accumulated audio"""
        with self.lock:
            mic_duration = len(self.mic_buffer) / 16000
            monitor_duration = len(self.monitor_buffer) / 16000
            results = {}
            # Transcribe microphone
            if mic_duration >= min_duration:
                mic_audio = self.mic_buffer.copy()
                self.mic_buffer = np.array([], dtype=np.float32)
                results['mic'] = self._transcribe(mic_audio)
            # Transcribe monitor
            if monitor_duration >= min_duration:
                monitor_audio = self.monitor_buffer.copy()
                self.monitor_buffer = np.array([], dtype=np.float32)
                results['monitor'] = self._transcribe(monitor_audio)
            return results if results else None
    def _transcribe(self, audio):
        """Internal transcription"""
        try:
            segments, _ = self.model.transcribe(
                audio,
                language=self.language,
                beam_size=3,  # Faster than default 5
                vad_filter=True,
                vad_parameters=dict(min_silence_duration_ms=500)
            )
            text = " ".join([seg.text for seg in segments]).strip()
            return text if text else None
        except Exception as e:
            print(f"❌ Transcription error: {e}")
            return None
 class LLMFactChecker:
    """Fast fact-checking with Ollama"""
    def __init__(self, model="qwen2.5:3b"):
        if not OLLAMA_AVAILABLE:
            raise RuntimeError("Ollama not installed: pip install ollama")
        self.model = model
        try:
            ollama.list()
            print(f"✓ Ollama connected: {self.model}")
        except Exception as e:
            raise RuntimeError(f"Ollama not running: {e}")
    def fact_check(self, text):
        """Quick fact-check"""
        prompt = f"""Fact-check this statement. Reply ONLY with:
 VERDICT: factual/dubious/false
 CONFIDENCE: 0.0-1.0
 REASON: one sentence
 Statement: "{text}" """
        try:
            response = ollama.generate(
                model=self.model,
                prompt=prompt,
                options={"temperature": 0.1, "num_predict": 80}
            )
            import re
            text = response['response']
            verdict = re.search(r'VERDICT:\s*(\w+)', text, re.I)
            confidence = re.search(r'CONFIDENCE:\s*([\d.]+)', text, re.I)
            reason = re.search(r'REASON:\s*(.+?)(?:\n|$)', text, re.I | re.DOTALL)
            return {
                'verdict': verdict.group(1).lower() if verdict else 'unknown',
                'confidence': float(confidence.group(1)) if confidence else 0.5,
                'reason': reason.group(1).strip() if reason else text[:150]
            }
        except Exception as e:
            return {'verdict': 'error', 'confidence': 0.0, 'reason': str(e)}
 def main():
    parser = argparse.ArgumentParser(description="Dual audio transcription with fact-checking")
    parser.add_argument("--model", default="tiny", choices=["tiny", "base", "small", "medium"],
                        help="Whisper model (default: tiny for speed)")
    parser.add_argument("--language", default="en", help="Language code")
    parser.add_argument("--mic", help="Microphone device name (partial match)")
    parser.add_argument("--monitor", help="Monitor device name for speaker capture")
    parser.add_argument("--interval", type=float, default=5.0, help="Processing interval (seconds)")
    parser.add_argument("--min-duration", type=float, default=2.0, help="Min audio duration")
    parser.add_argument("--enable-llm", action="store_true", help="Enable fact-checking")
    parser.add_argument("--llm-model", default="qwen2.5:3b", help="Ollama model")
    parser.add_argument("--list-devices", action="store_true", help="List audio devices")
    parser.add_argument("--force-cpu", action="store_true", help="Force CPU")
    args = parser.parse_args()
    if args.list_devices:
        print("\nAvailable audio devices:")
        for i, dev in enumerate(sd.query_devices()):
            in_ch = dev['max_input_channels']
            out_ch = dev['max_output_channels']
            if in_ch > 0:
                print(f"  [{i:2d}] {dev['name']:<50} IN:{in_ch} OUT:{out_ch}")
        return
    print("=== Dual Audio Transcription with Fact-Checking ===")
    print(f"Model: {args.model} | Language: {args.language} | Interval: {args.interval}s")
    # Initialize capture
    try:
        capturer = DualAudioCapture(
            mic_device=args.mic,
            monitor_device=args.monitor,
            sample_rate=16000,
            chunk_size=2048
        )
    except Exception as e:
        print(f"\n❌ Audio Error: {e}")
        print("\nTip: Use --list-devices to see available devices")
        print("     Use --mic and --monitor to specify devices")
        return
    # Initialize transcriber
    try:
        transcriber = WhisperTranscriber(
            model_name=args.model,
            language=args.language,
            force_cpu=args.force_cpu
        )
    except Exception as e:
        print(f"\n❌ Whisper Error: {e}")
        return
    # Initialize fact checker
    fact_checker = None
    if args.enable_llm:
        try:
            fact_checker = LLMFactChecker(model=args.llm_model)
        except Exception as e:
            print(f"\n⚠ LLM Error: {e}")
            print("Continuing without fact-checking...")
    # Main loop
    print(f"\n✅ Started. Press Ctrl+C to stop.\n{'='*60}")
    last_process = time.time()
    try:
        while True:
            # Collect audio
            chunk = capturer.read_chunk()
            if chunk:
                source, audio = chunk
                transcriber.add_audio(source, audio)
            # Process at intervals
            if time.time() - last_process >= args.interval:
                results = transcriber.transcribe_chunk(min_duration=args.min_duration)
                if results:
                    timestamp = datetime.now().strftime("%H:%M:%S")
                    for source, text in results.items():
                        if text:
                            source_emoji = "🎤" if source == 'mic' else "🔊"
                            print(f"\n{source_emoji} [{timestamp}] {text}")
                            if fact_checker:
                                fc = fact_checker.fact_check(text)
                                verdict_emoji = {'factual': '✅', 'dubious': '⚠️', 'false': '❌'}.get(fc['verdict'], '❓')
                                print(f"   {verdict_emoji} {fc['verdict'].upper()} ({fc['confidence']:.2f}): {fc['reason']}")
                last_process = time.time()
    except KeyboardInterrupt:
        print(f"\n{'='*60}\n🛑 Stopping...")
    capturer.close()
    print("\n✅ Done!")
 if __name__ == "__main__":
    main()
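Review note: `_find_device` returns a device index, and index 0 is a legitimate result, so callers should compare against `None` rather than rely on truthiness. A minimal sketch of the lookup against a hypothetical device table shaped like the entries from `sd.query_devices()` (only the `name` and `max_input_channels` keys used in this file are assumed). The function name `find_input_device` and the sample devices are illustrative.

```python
def find_input_device(devices, name):
    """Return the index of the first input device whose name contains `name`.

    Matching is a case-insensitive substring test, as in _find_device.
    """
    needle = name.lower()
    for index, dev in enumerate(devices):
        if needle in dev["name"].lower() and dev["max_input_channels"] > 0:
            return index
    raise RuntimeError(f"Device '{name}' not found")


# Hypothetical device table; real entries come from sd.query_devices().
devices = [
    {"name": "Monitor of Built-in Audio", "max_input_channels": 2},
    {"name": "USB Microphone", "max_input_channels": 1},
    {"name": "HDMI Output", "max_input_channels": 0},
]

monitor_index = find_input_device(devices, "monitor")
mic_index = find_input_device(devices, "usb mic")

# The monitor sits at index 0, which is falsy, so test against None.
has_monitor = monitor_index is not None
```

Here `bool(monitor_index)` is `False` even though a monitor was found, which is the case a plain `if self.monitor_device:` check would miss.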