init

2025-12-17 16:30:46 +01:00
parent c0ca907b01
commit 918e96ad21
14 changed files with 358 additions and 1951 deletions
--- a/README.md
+++ b/README.md
@@ -7,6 +7,7 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 - Real-time transcription of system audio (Windows/Linux)
 - Multiple Whisper model sizes (tiny to large)
 - Multi-language support
+- **Sentence extraction mode** - Stitches audio chunks into complete sentences
 - Optional LLM analysis for fact-checking and question generation (via Ollama)
 - GPU acceleration support
 - Flexible audio device configuration
@@ -17,12 +18,15 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 # Install dependencies
 pip install -r requirements.txt

-# Basic transcription
+# Basic transcription (no LLM)
 python transcribe_speakers.py

-# With LLM analysis
+# With LLM analysis (optional)
 python transcribe_speakers.py --enable-llm

+# With sentence extraction
+python transcribe_speakers.py --sentence-mode
+
 # List audio devices
 python transcribe_speakers.py --list-devices
 ```
@@ -80,9 +84,7 @@ ollama pull llama3.2

 ### Available Scripts

- `transcribe_speakers.py` - Main script with all features
- `transcribe_speakers_llm.py` - LLM-enabled version
- `transcribe_No_llm.py` - Basic version without LLM support
+- `transcribe_speakers.py` - Main script with all features (LLM optional via `--enable-llm`)
 - `transcribe_dual_linux.py` - Linux-specific with dual audio support

 ### Common Commands
@@ -97,8 +99,11 @@ python transcribe_speakers.py --language es --output transcript.txt
 # Fast mode (low latency)
 python transcribe_speakers.py --fast-mode --model tiny --interval 3

-# Maximum accuracy with LLM
-python transcribe_speakers.py --model large --enable-llm --output enriched.txt
+# Extract complete sentences from chunks
+python transcribe_speakers.py --sentence-mode --output sentences.txt
+
+# Maximum accuracy with LLM and sentence extraction
+python transcribe_speakers.py --model large --enable-llm --sentence-mode --output enriched.txt

 # Force CPU (avoid GPU issues)
 python transcribe_speakers.py --force-cpu
@@ -119,6 +124,7 @@ python transcribe_speakers.py --force-cpu
 | `--output` | Save to file | None |
 | `--force-cpu` | Disable GPU | False |
 | `--gpu-index` | GPU device index | 0 |
+| `--sentence-mode` | Extract complete sentences from chunks | False |

 ## Model Performance