init
README.md (20 lines changed)
@@ -7,6 +7,7 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 - Real-time transcription of system audio (Windows/Linux)
 - Multiple Whisper model sizes (tiny to large)
 - Multi-language support
+- **Sentence extraction mode** - Stitches audio chunks into complete sentences
 - Optional LLM analysis for fact-checking and question generation (via Ollama)
 - GPU acceleration support
 - Flexible audio device configuration
@@ -17,12 +18,15 @@ Real-time audio transcription using Whisper AI with optional LLM-powered analysi
 # Install dependencies
 pip install -r requirements.txt
 
-# Basic transcription
+# Basic transcription (no LLM)
 python transcribe_speakers.py
 
-# With LLM analysis
+# With LLM analysis (optional)
 python transcribe_speakers.py --enable-llm
 
+# With sentence extraction
+python transcribe_speakers.py --sentence-mode
+
 # List audio devices
 python transcribe_speakers.py --list-devices
 ```
@@ -80,9 +84,7 @@ ollama pull llama3.2
 
 ### Available Scripts
 
-- `transcribe_speakers.py` - Main script with all features
-- `transcribe_speakers_llm.py` - LLM-enabled version
-- `transcribe_No_llm.py` - Basic version without LLM support
+- `transcribe_speakers.py` - Main script with all features (LLM optional via `--enable-llm`)
 - `transcribe_dual_linux.py` - Linux-specific with dual audio support
 
 ### Common Commands
@@ -97,8 +99,11 @@ python transcribe_speakers.py --language es --output transcript.txt
 # Fast mode (low latency)
 python transcribe_speakers.py --fast-mode --model tiny --interval 3
 
-# Maximum accuracy with LLM
-python transcribe_speakers.py --model large --enable-llm --output enriched.txt
+# Extract complete sentences from chunks
+python transcribe_speakers.py --sentence-mode --output sentences.txt
+
+# Maximum accuracy with LLM and sentence extraction
+python transcribe_speakers.py --model large --enable-llm --sentence-mode --output enriched.txt
 
 # Force CPU (avoid GPU issues)
 python transcribe_speakers.py --force-cpu
@@ -119,6 +124,7 @@ python transcribe_speakers.py --force-cpu
 | `--output` | Save to file | None |
 | `--force-cpu` | Disable GPU | False |
 | `--gpu-index` | GPU device index | 0 |
+| `--sentence-mode` | Extract complete sentences from chunks | False |
 
 ## Model Performance
 
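Note on `--sentence-mode`: this diff documents the flag ("Stitches audio chunks into complete sentences") but does not show how it is implemented. The sketch below is one plausible way to do the stitching, not the code in `transcribe_speakers.py`. The class name `SentenceStitcher` and its methods are hypothetical, and it assumes each Whisper chunk arrives as plain text with ordinary sentence punctuation.

```python
import re
from typing import List

# Hypothetical helper for illustration; not taken from transcribe_speakers.py.
# Split on whitespace that follows sentence-ending punctuation.
_SENTENCE_BREAK = re.compile(r"(?<=[.!?])\s+")
_ENDS_SENTENCE = re.compile(r"[.!?]$")


class SentenceStitcher:
    """Buffers transcribed chunks and emits only complete sentences."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> List[str]:
        """Add one transcribed chunk; return the sentences it completes."""
        self._buffer = f"{self._buffer} {chunk.strip()}".strip()
        parts = _SENTENCE_BREAK.split(self._buffer)
        if _ENDS_SENTENCE.search(parts[-1]):
            # Everything buffered so far ends on a sentence boundary.
            complete, self._buffer = parts, ""
        else:
            # Keep the unfinished tail for the next chunk.
            complete, self._buffer = parts[:-1], parts[-1]
        return [s for s in complete if s]

    def flush(self) -> str:
        """Return whatever partial sentence is left (e.g. at shutdown)."""
        rest, self._buffer = self._buffer, ""
        return rest
```

In use, each transcription interval would call `feed()` with its chunk text and write any returned sentences to the output, calling `flush()` once at exit so a trailing fragment is not lost. Punctuation-based splitting will mis-split abbreviations such as "Dr."; a real implementation may need something more robust.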