mike
2025-12-17 22:11:08 +01:00
parent 36852dde18
commit a53c0e2902
8 changed files with 218 additions and 239 deletions

105
QUICK_START.md Normal file

@@ -0,0 +1,105 @@
# Quick Start Guide
## Dutch Language (Nederlands)
### Basic Dutch Transcription
```bash
./RUN_DUTCH.sh
```
- ✅ GPU-accelerated (RTX 4060 Ti)
- ✅ Sentence extraction (complete sentences)
- ✅ Base model (good balance of speed and accuracy)
### Dutch with LLM Analysis
```bash
./RUN_DUTCH_LLM.sh
```
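- ✅ GPU acceleration and sentence extraction, as in the basic version (this script runs the `large` Whisper model rather than `base`)
- ✅ Fact-checking of statements
- ✅ Automatic question generation
- Uses the `llama3.2:latest` model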
### Save to File
```bash
./RUN_DUTCH.sh --output transcript.txt
./RUN_DUTCH_LLM.sh --output enriched.txt
```
---
## English Language
### Basic English Transcription
```bash
./RUN_GPU.sh
```
### English with LLM
```bash
./RUN_GPU.sh --enable-llm
```
---
## Other Languages
### Spanish
```bash
./RUN_GPU.sh --language es
```
### French
```bash
./RUN_GPU.sh --language fr
```
### German
```bash
./RUN_GPU.sh --language de
```
---
## Available Ollama Models
You have these models installed:
- `llama3.2:latest` (2.0 GB) - **Default** - Fast and accurate
- `llama3:8b` (4.7 GB) - More powerful
- `qwen2.5:3b` (1.9 GB) - Fast alternative
- `qwen2.5:7b` (4.7 GB) - Powerful alternative
- `qwen2.5:0.5b` (397 MB) - Very fast, less accurate
To use a different model:
```bash
./RUN_DUTCH_LLM.sh --llm-model "llama3:8b"
```
---
## Tips
### Better Accuracy
Use a larger Whisper model (slower):
```bash
./RUN_DUTCH.sh --model medium # or: large
```
### Faster Processing
Use a smaller model or reduce the interval:
```bash
./RUN_DUTCH.sh --model tiny --interval 3
```
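Both tips rely on how the wrapper scripts forward arguments: each script passes its own defaults first and `"$@"` last, so a user-supplied `--model tiny` arrives after the script's `--model base`. Assuming `transcribe_speakers.py` declares these flags with argparse's default `store` action (the diff shows `parser.add_argument` calls, but not for these particular flags), the last occurrence wins. A minimal sketch with a hypothetical subset of the parser; the defaults and the `float` type for `--interval` are placeholders, not taken from the real script:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical subset of the real parser. Flag names come from this guide;
    # defaults and types are placeholders.
    parser = argparse.ArgumentParser()
    parser.add_argument("--model", default="base")
    parser.add_argument("--interval", type=float, default=5)
    parser.add_argument("--language", default=None)
    return parser


def effective_args(script_defaults: list[str], user_args: list[str]) -> argparse.Namespace:
    # Mirrors the wrapper scripts: script defaults first, then "$@".
    return build_parser().parse_args(script_defaults + user_args)


if __name__ == "__main__":
    defaults = ["--language", "nl", "--model", "base", "--interval", "5"]
    args = effective_args(defaults, ["--model", "tiny", "--interval", "3"])
    print(args.model, args.interval, args.language)
```

If the real parser used `action="append"` for any of these flags, the override would behave differently, so this is worth checking against the actual `add_argument` definitions.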
### Debug LLM Issues
```bash
./RUN_DUTCH_LLM.sh --llm-debug
```
---
## Controls
- **Ctrl+C** to stop transcription
- Speak clearly into your microphone
- Wait ~5 seconds for transcription to appear
- Sentences appear with 📝 emoji

@@ -90,22 +90,30 @@ ollama pull llama3.2
 ### Common Commands
 ```bash
-# Specify device and model
-python transcribe_speakers.py --device "CABLE Output" --model medium
-# Save to file with language
-python transcribe_speakers.py --language es --output transcript.txt
-# Fast mode (low latency)
-python transcribe_speakers.py --fast-mode --model tiny --interval 3
-# Extract complete sentences from chunks
-python transcribe_speakers.py --sentence-mode --output sentences.txt
+# Quick start with GPU (English)
+./RUN_GPU.sh
+# Dutch language
+./RUN_DUTCH.sh
+# Dutch with LLM analysis
+./RUN_DUTCH_LLM.sh
+# With LLM analysis
+./RUN_GPU.sh --enable-llm
+# Save to file
+./RUN_GPU.sh --output transcript.txt
+# Other languages (Spanish, French, German, etc.)
+./RUN_GPU.sh --language es # Spanish
+./RUN_GPU.sh --language fr # French
+./RUN_GPU.sh --language de # German
 # Maximum accuracy with LLM and sentence extraction
 python transcribe_speakers.py --model large --enable-llm --sentence-mode --output enriched.txt
-# Force CPU (avoid GPU issues)
+# Force CPU (if GPU issues)
 python transcribe_speakers.py --force-cpu
 ```

19
RUN_DUTCH.sh Executable file

@@ -0,0 +1,19 @@
#!/bin/bash
# Dutch language transcription with GPU and sentence extraction
cd "$(dirname "$0")"
export LD_LIBRARY_PATH=".venv/lib/python3.13/site-packages/nvidia/cudnn/lib:.venv/lib/python3.13/site-packages/nvidia/cublas/lib:${LD_LIBRARY_PATH}"
echo "Starting Dutch transcription..."
echo "Speak in Dutch into your microphone"
echo "Press Ctrl+C to stop"
echo ""
.venv/bin/python3 transcribe_speakers.py \
--sentence-mode \
--language nl \
--model base \
--interval 5 \
--min-duration 2 \
"$@"

22
RUN_DUTCH_LLM.sh Executable file

@@ -0,0 +1,22 @@
#!/bin/bash
# Dutch transcription with GPU, sentence extraction, and LLM analysis
cd "$(dirname "$0")"
export LD_LIBRARY_PATH=".venv/lib/python3.13/site-packages/nvidia/cudnn/lib:.venv/lib/python3.13/site-packages/nvidia/cublas/lib:${LD_LIBRARY_PATH}"
echo "Starting Dutch transcription with LLM analysis..."
echo "Using model: llama3.2:latest"
echo "Speak in Dutch into your microphone"
echo "Press Ctrl+C to stop"
echo ""
.venv/bin/python3 transcribe_speakers.py \
--sentence-mode \
--language nl \
--model large \
--interval 5 \
--min-duration 2 \
--enable-llm \
--llm-model "llama3.2:latest" \
"$@"

13
RUN_GPU.sh Executable file

@@ -0,0 +1,13 @@
#!/bin/bash
# GPU-accelerated transcription with sentence extraction
cd "$(dirname "$0")"
export LD_LIBRARY_PATH=".venv/lib/python3.13/site-packages/nvidia/cudnn/lib:.venv/lib/python3.13/site-packages/nvidia/cublas/lib:${LD_LIBRARY_PATH}"
.venv/bin/python3 transcribe_speakers.py \
--sentence-mode \
--model large \
--interval 5 \
--min-duration 2 \
"$@"

@@ -1,226 +0,0 @@
[23:31:46] So it helps us get back into a grounded information terrain and then also it requires us.
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 The statement is a vague, nonspecific claim that cannot be verified against any factual evidence.
❓ Questions:
1. What specific processes or actions help us return to a grounded information terrain?
2. In what ways does this approach require us to change our current practices or mindset?
3. How does re-establishing a grounded information terrain impact the overall effectiveness of the project?
======================================================================
[23:31:54] to take the time to pay attention to information, really absorb it properly, and then to make decisions based on that. So we need to bring people into the process of
📊 Fact Check: NOT_FACTUAL (confidence: 0.99)
💡 The statement is an incomplete, nonfactual description of a process, not a verifiable claim.
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:32:02] decision making and at the same time as part of that bring them into a terrain of really
📊 Fact Check: NOT_FACTUAL (confidence: 1.00)
💡 The statement is a nonsensical fragment and does not convey a verifiable factual claim.
❓ Questions:
1. What does the phrase "bring them into a terrain of really" refer to in the context of decision making?
2. How does the process of decision making simultaneously involve "bringing them into a terrain" as mentioned?
3. Can you explain how the concept of "terrain"?
======================================================================
[23:32:10] curing and discerning information properly and then engage in the decision-making process. That's the only way we're actually going to turn this around. It's not going to be good enough to...
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 The fragment is an incomplete quote with no verifiable factual claim.
❓ Questions:
1. What does "curing and discerning information properly" entail in the context of this statement?
2. How does engaging in the decisionmaking process contribute to turning the situation around?
3. Why is simply having information or a plan not sufficient according to the speaker?
======================================================================
[23:32:18] to elect new politicians because the underlying problem of the way we absorb, process and deal with information now remains. And the only way we can do that is actually to do it.
📊 Fact Check: NOT_FACTUAL (confidence: 0.95)
💡 The statement is an opinion about politics and information processing, not a verifiable factual claim.
❓ Questions:
1. What specific aspects of the way we absorb, process, and deal with information are identified as the underlying problem in the statement?
2. How does the statement justify the election of new politicians as a solution to the information-related issue it describes?
3. What practical steps or strategies does the statement imply we should take to "actually do it" in addressing the information problem?
======================================================================
[23:32:26] actually by bringing people in on a mass basis, having huge numbers of citizens, juries around the country on a regular basis for decisions we're making involving the
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 The statement is incomplete and lacks context, making it impossible to verify its factual accuracy.
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:32:35] public that's the only way we're going to be able to turn this around and not just think that okay let's just wait for another Kamala Harris or somebody like that to come along and win an election then
📊 Fact Check: NOT_FACTUAL (confidence: 0.95)
💡 The quoted phrase
❓ Questions:
1. What specific actions does the speaker believe are necessary?
2. Q1: What specific actions does the speaker believe are necessary?
3. What are the implications?
======================================================================
[23:32:42] we'll all be right and we'll be able to turn the clock back. It won't work like that. The problem is far too deep seated than that. So, yes, we are becoming
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 The statement is a vague, incomplete fragment with no verifiable factual claim.
❓ Questions:
1. What specific problem is the speaker implying is "far too deep seated" to be solved by simply "turning the clock back"?
2. How does the speaker's claim that "we'll all be right" relate to the broader context or argument being presented?
3. In what ways might the statement "So, yes, we are becoming" reflect a shift in perspective or identity for the speaker or the audience?
======================================================================
[23:32:50] That's basically what's going on at the moment. But that doesn't mean that we can lose hope, because there are mechanisms in which we can actually turn that around.
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 The statement is a general, nonspecific claim that cannot be verified as true or false.
❓ Questions:
1. What specific situation or issue is being described as "what's going on at the moment"?
2. What mechanisms are being referred to that could help "turn that around"?
3. How does the speaker justify maintaining hope despite the current challenges?
======================================================================
[23:32:58] by actually engaging in the political process ourselves, which would force us to then utilise information in a different way.
📊 Fact Check: DUBIOUS (confidence: 0.70)
💡 The claim is a speculative assertion about how political engagement might change information use, and it cannot be verified as a factual statement.
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:33:06] hopefully in the end come to different conclusions, but be part of that decision-making process too. So it's an important realization. What's happening to us is
📊 Fact Check: NOT_FACTUAL (confidence: 1.00)
💡 The statement is a subjective expression of hope and realization, not a verifiable factual claim.
❓ Questions:
1. What does the speaker mean by "hopefully in the end come to different conclusions" and how does that relate to the decision-making process mentioned?
2. In what ways might being part of the decision-making process influence the outcomes described in the statement?
3. What specific "important realization" is referenced, and how does it connect to "what's happening to us"?
======================================================================
[23:33:25] species in terms of our intelligence but it more importantly gives us a very important call to action. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think. We need to think.
📊 Fact Check: NOT_FACTUAL (confidence: 1.00)
💡 The statement is a nonsensical fragment that does not present any verifiable factual claim.
❓ Questions:
1. What is the main message conveyed by?
2. Q1: What is the main message conveyed by?
3. What are the implications?
======================================================================
[23:33:40] differently about how we govern ourselves going forward if we are to reverse this genuine decline. I hope you like my video. As a psychiatrist who loves politics and economics and philosophy I love to make videos like this and you can really help promote this video to other people and get it on your feed more by liking and commenting and subscribing to the video.
📊 Fact Check: NOT_FACTUAL (confidence: 1.00)
💡 The statement is a personal comment and request for promotion, not a claim that can be verified as true or false.
❓ Questions:
1. What specific strategies does the speaker propose for reversing the "genuine decline" in governance mentioned in the statement?
2. How does the speakers background as a psychiatrist influence their perspective on politics, economics, and philosophy?
3. In what ways does the speaker suggest viewers can effectively promote the video to reach a wider audience?
======================================================================
[23:33:46] well. We have a wonderful community of people here who comment and support each other through this very traumatic period of world history.
📊 Fact Check: DUBIOUS (confidence: 0.60)
💡 The claim is a subjective, unverified assertion about a communitys nature and cannot be confirmed or refuted with available evidence.
❓ Questions:
1. Who are the members of the community mentioned in the statement?
2. Which specific traumatic period of world history is being referred to?
3. In what ways do the community members comment and support each other during this period?
======================================================================
[23:33:54] that we're going through right now. I also hope you consider becoming a subscriber to the channel and also subscribing to my.
📊 Fact Check: NOT_FACTUAL (confidence: 1.00)
💡 The sentence is a fragment expressing a hope, not a verifiable factual claim.
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:34:02] E-newsletter, there's a link in the description, and that way we can stay in touch outside the channel so you can stay, keep a
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 [one sentence]"
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:34:10] rest of all of the content that I'm making on an ongoing basis. The latest of which actually is my latest book called We the People.
📊 Fact Check: DUBIOUS (confidence: 0.50)
💡 The statement is a fragment with no verifiable context or evidence that the speakers latest book is titled *We the People*.
❓ Questions:
1. What are the key points here?
2. What evidence supports this?
3. What are the implications?
======================================================================
[23:34:19] very very proud of this book. It's actually a novel, a fiction book, written by myself and the famous award-winning author T.J. McGregor.
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 No verifiable record exists of a novel coauthored by the user and an awardwinning author
❓ Questions:
1. How did you and T.J. McGregor collaborate on the novel?
2. What inspired you to co-write a fiction book with an awardwinning author?
3. What genre and themes does the novel?
======================================================================
[23:34:27] Together we wrote a book about what the future might look like. Bit of a dystopian novel, but what might happen if autocracy goes to its next stage?
📊 Fact Check: NOT_FACTUAL (confidence: 0.90)
💡 There is no verifiable evidence that the speaker and the other person coauthored a book on future dystopias.
❓ Questions:
1. What core themes and motifs did the book explore to envision the next stage of autocracy?
2. How does the narrative structure of the novel reflect the progression of authoritarian power in a dystopian future?
3. What real-world events or historical patterns inspired the authors to imagine a future where autocracy has evolved beyond its current form?
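
The deleted log above follows a regular layout: a `[HH:MM:SS]` line with the transcribed text, a `Fact Check:` line with a verdict and confidence, an explanation line, numbered questions, and a `=====` separator. For anyone who wants to analyse such a file before removing it, a rough parsing sketch follows. The `Entry` class and `parse_log` function are hypothetical names, not part of this repository, and the layout is inferred only from the entries shown here:

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, Optional

# Patterns inferred from the log entries in this diff, not from the program source.
HEADER = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]\s*(.*)$")
VERDICT = re.compile(r"Fact Check:\s*(\w+)\s*\(confidence:\s*([0-9.]+)\)")
QUESTION = re.compile(r"^\d+\.\s*(.*)$")


@dataclass
class Entry:
    time: str
    text: str
    verdict: Optional[str] = None
    confidence: Optional[float] = None
    questions: list[str] = field(default_factory=list)


def parse_log(lines: Iterable[str]) -> list[Entry]:
    entries: list[Entry] = []
    current: Optional[Entry] = None
    for raw in lines:
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            # A timestamped line starts a new entry.
            current = Entry(header.group(1), header.group(2))
            entries.append(current)
            continue
        if current is None:
            continue  # ignore anything before the first timestamp
        verdict = VERDICT.search(line)
        if verdict:
            current.verdict = verdict.group(1)
            current.confidence = float(verdict.group(2))
            continue
        question = QUESTION.match(line)
        if question:
            current.questions.append(question.group(1))
    return entries
```

Explanation lines and separators are skipped on purpose; malformed explanations such as `[one sentence]"` therefore do not break parsing.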

27
start_transcribe.sh Executable file

@@ -0,0 +1,27 @@
#!/bin/bash
# Simple startup script for transcription with GPU support
cd "$(dirname "$0")"
# Activate virtual environment
source .venv/bin/activate
# Set CUDA library paths for ctranslate2
export LD_LIBRARY_PATH=".venv/lib/python3.13/site-packages/nvidia/cudnn/lib:.venv/lib/python3.13/site-packages/nvidia/cublas/lib:.venv/lib/python3.13/site-packages/nvidia/cuda_runtime/lib:${LD_LIBRARY_PATH}"
# Run transcription with sentence mode and GPU
echo "Starting transcription with:"
echo " - Model: tiny (fast, good for testing)"
echo " - GPU mode (RTX 4060 Ti)"
echo " - Sentence extraction enabled"
echo " - Interval: 5 seconds"
echo ""
echo "Speak into your microphone to see transcription..."
echo "Press Ctrl+C to stop"
echo ""
python3 transcribe_speakers.py \
--sentence-mode \
--model tiny \
--interval 5 \
--min-duration 2
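
All four scripts hard-code the same `LD_LIBRARY_PATH` entries and append `:${LD_LIBRARY_PATH}` unconditionally. When the variable is unset this leaves a trailing colon, and the dynamic loader treats an empty entry as the current working directory. The sketch below shows one way to build the value without that empty entry and to warn about missing directories. The helper names are hypothetical; the package list and the `python3.13` path segment are copied from `start_transcribe.sh`:

```python
import os
from pathlib import Path

# Package directories as listed in start_transcribe.sh; the RUN_*.sh scripts
# use only the first two.
NVIDIA_PACKAGES = ("cudnn", "cublas", "cuda_runtime")


def nvidia_lib_dirs(venv: Path, python_version: str = "3.13") -> list[Path]:
    # Layout assumed from the paths in the scripts.
    site = venv / "lib" / f"python{python_version}" / "site-packages" / "nvidia"
    return [site / pkg / "lib" for pkg in NVIDIA_PACKAGES]


def build_ld_library_path(dirs: list[Path], existing: str = "") -> str:
    parts = [str(d) for d in dirs]
    if existing:
        # Only append the old value when it is non-empty, so no trailing colon.
        parts.append(existing)
    return ":".join(parts)


if __name__ == "__main__":
    dirs = nvidia_lib_dirs(Path(".venv"))
    for d in dirs:
        if not d.is_dir():
            print(f"warning: {d} not found")
    print(build_ld_library_path(dirs, os.environ.get("LD_LIBRARY_PATH", "")))
```

The same guard can be written directly in shell as `${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}`, which expands to nothing when the variable is unset or empty.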

@@ -79,17 +79,28 @@ class WindowsLoopbackAudioCapture:
                 dev['max_input_channels'] > 0):
                 return dev
-        # Auto-detect: look for WASAPI speakers/headphones
+        # Auto-detect: look for WASAPI speakers/headphones (Windows)
         for dev in devices:
             if (dev['max_input_channels'] > 0 and
                 any(x in dev['name'] for x in ['Speakers', 'Headphones', 'Output'])):
                 return dev
-        # Fallback: Stereo Mix or similar
+        # Fallback: Stereo Mix or similar (Windows)
         for dev in devices:
             if 'Stereo Mix' in dev['name']:
                 return dev
+        # Linux fallback: use default input device (pipewire/pulse)
+        try:
+            default_input_idx = sd.default.device[0]  # Default input device
+            if default_input_idx is not None:
+                dev = devices[default_input_idx]
+                if dev['max_input_channels'] > 0:
+                    print("⚠️ Note: Using default input device (microphone). For speaker capture on Linux, use transcribe_dual_linux.py")
+                    return dev
+        except:
+            pass
         return None
     def _audio_callback(self, indata, frames, time_info, status):
@@ -511,8 +522,8 @@ Examples:
                         help="GPU device index to use (default: 0)")
     parser.add_argument("--enable-llm", action="store_true",
                         help="Enable LLM analysis (fact-checking and questions)")
-    parser.add_argument("--llm-model", default="gpt-oss:20b",
-                        help="Ollama model to use for LLM analysis (default: gpt-oss:20b)")
+    parser.add_argument("--llm-model", default="llama3.2:latest",
+                        help="Ollama model to use for LLM analysis (default: llama3.2:latest)")
     parser.add_argument("--llm-debug", action="store_true",
                         help="Show LLM raw responses for debugging")
     parser.add_argument("--sentence-mode", action="store_true",
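
The device auto-detection hunk above tries three sources in order: Windows output-style names, then "Stereo Mix", then the default input device. A standalone sketch of that order is given below, so the fallback can be exercised without audio hardware. `pick_capture_device` is a hypothetical function, not the method in the diff: it takes a plain list of device dicts instead of querying `sounddevice`, it omits the explicit device-name match that precedes this hunk, and it swaps the blanket `except:` for an explicit bounds check.

```python
from typing import Optional

# Name fragments taken from the diff.
WINDOWS_OUTPUT_HINTS = ("Speakers", "Headphones", "Output")


def pick_capture_device(devices: list[dict], default_input: Optional[int]) -> Optional[dict]:
    def can_record(dev: dict) -> bool:
        return dev["max_input_channels"] > 0

    # 1. Windows: loopback-capable output devices exposed as inputs.
    for dev in devices:
        if can_record(dev) and any(hint in dev["name"] for hint in WINDOWS_OUTPUT_HINTS):
            return dev

    # 2. Windows fallback: Stereo Mix or similar.
    for dev in devices:
        if "Stereo Mix" in dev["name"]:
            return dev

    # 3. Linux fallback: the default input device, if the index is usable.
    if default_input is not None and 0 <= default_input < len(devices):
        dev = devices[default_input]
        if can_record(dev):
            return dev

    return None
```

The bounds check covers a missing or negative default index, which is the case the original `try`/`except` appears to guard against, while still letting `KeyboardInterrupt` and unrelated errors propagate.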