This commit is contained in:
mike
2025-12-17 16:30:46 +01:00
parent c0ca907b01
commit 918e96ad21
14 changed files with 358 additions and 1951 deletions

@@ -7,6 +7,7 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 - Real-time transcription of system audio (Windows/Linux)
 - Multiple Whisper model sizes (tiny to large)
 - Multi-language support
+- **Sentence extraction mode** - Stitches audio chunks into complete sentences
 - Optional LLM analysis for fact-checking and question generation (via Ollama)
 - GPU acceleration support
 - Flexible audio device configuration
@@ -17,12 +18,15 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 # Install dependencies
 pip install -r requirements.txt
-# Basic transcription
+# Basic transcription (no LLM)
 python transcribe_speakers.py
-# With LLM analysis
+# With LLM analysis (optional)
 python transcribe_speakers.py --enable-llm
+# With sentence extraction
+python transcribe_speakers.py --sentence-mode
 # List audio devices
 python transcribe_speakers.py --list-devices
 ```
@@ -80,9 +84,7 @@ ollama pull llama3.2
 ### Available Scripts
-- `transcribe_speakers.py` - Main script with all features
-- `transcribe_speakers_llm.py` - LLM-enabled version
-- `transcribe_No_llm.py` - Basic version without LLM support
+- `transcribe_speakers.py` - Main script with all features (LLM optional via `--enable-llm`)
 - `transcribe_dual_linux.py` - Linux-specific with dual audio support
 ### Common Commands
@@ -97,8 +99,11 @@ python transcribe_speakers.py --language es --output transcript.txt
 # Fast mode (low latency)
 python transcribe_speakers.py --fast-mode --model tiny --interval 3
-# Maximum accuracy with LLM
-python transcribe_speakers.py --model large --enable-llm --output enriched.txt
+# Extract complete sentences from chunks
+python transcribe_speakers.py --sentence-mode --output sentences.txt
+# Maximum accuracy with LLM and sentence extraction
+python transcribe_speakers.py --model large --enable-llm --sentence-mode --output enriched.txt
 # Force CPU (avoid GPU issues)
 python transcribe_speakers.py --force-cpu
@@ -119,6 +124,7 @@ python transcribe_speakers.py --force-cpu
 | `--output` | Save to file | None |
 | `--force-cpu` | Disable GPU | False |
 | `--gpu-index` | GPU device index | 0 |
+| `--sentence-mode` | Extract complete sentences from chunks | False |
 ## Model Performance
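
Note on `--sentence-mode`: the README describes it as stitching audio chunks into complete sentences, but this part of the diff does not show how `transcribe_speakers.py` does that. The sketch below is one plausible way to get that behavior, not the script's actual code. The class `SentenceStitcher`, its methods `feed` and `flush`, and the punctuation-based boundary rule are all assumptions made for illustration.

```python
import re
from typing import List

# Hypothetical helper, not taken from transcribe_speakers.py.
# A sentence is assumed to be the shortest run of text ending in '.', '!' or '?'
# that is followed by whitespace or by the end of the buffered text.
_SENTENCE = re.compile(r"(.+?[.!?]+)(?:\s+|$)", re.S)


class SentenceStitcher:
    """Buffers transcribed chunks and releases text one full sentence at a time."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> List[str]:
        """Add one transcribed chunk and return the sentences it completed."""
        chunk = chunk.strip()
        if not chunk:
            return []
        # Join with a single space so words from adjacent chunks do not fuse.
        self._buffer = f"{self._buffer} {chunk}".strip()

        sentences = []
        consumed = 0
        for match in _SENTENCE.finditer(self._buffer):
            sentences.append(match.group(1).strip())
            consumed = match.end()
        # Keep the unfinished tail for the next chunk.
        self._buffer = self._buffer[consumed:]
        return sentences

    def flush(self) -> str:
        """Return whatever text is still buffered (e.g. at shutdown) and reset."""
        rest, self._buffer = self._buffer, ""
        return rest
```

In this sketch each chunk's text would be passed to `feed()`, the returned sentences written to the `--output` file, and `flush()` called once on exit so a trailing fragment is not lost. A purely punctuation-based rule splits on abbreviations such as "Dr. Smith", so a real implementation may need something more robust.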