Swedish-Style Crossword Puzzle Generator
A high-performance Java-based puzzle generator with theme-based word filtering and daily automated generation.
Features
- Swedish-style crossword puzzles with arrow clues
- Theme-based word filtering using semantic similarity graph
- Daily automated generation via Docker + cron
- JSON export format compatible with web frontends
- Genetic algorithm for optimal grid layouts
- Constraint satisfaction for word placement
Architecture
Components
- SwedishGenerator.java - Core puzzle generation engine
  - Genetic algorithm for mask generation
  - CSP solver for word filling
  - Optimized for Dutch word lists
- ThemeGraph.java - Theme-based word scoring system
  - Predefined theme keywords (news, tech, sports, etc.)
  - Edit distance similarity matching
  - Automatic theme detection
- DailyGenerator.java - Daily puzzle automation
  - Generates themed puzzles
  - JSON output with metadata
  - Index file generation
- ExportFormat.java - Export to standard format
  - Grid cropping and optimization
  - Arrow cell calculation
  - Compatible with existing frontends
Usage
Local Development
# Compile
./compile.sh
# Run Main (interactive)
java -cp ~/dev/.target puzzle.Main --seed 42 --pop 18 --gens 100
# Generate daily puzzles
java -cp ~/dev/.target puzzle.DailyGenerator
Docker Deployment
# Build image
docker build -t puzzle-generator .
# Run with docker-compose
docker-compose up -d puzzle_gen_java
# View logs
docker logs -f puzzle_gen_java
Environment Variables
| Variable | Default | Description |
|---|---|---|
| OUT_DIR | /data/puzzles | Output directory for generated puzzles |
| PUZZLES_PER_DAY | 3 | Number of puzzles to generate daily |
| WORDS_PATH | ./word-list.txt | Path to word list file |
| THEME_FILTER | true | Enable theme-based word filtering |
| THEME_MIN_SCORE | 0.6 | Minimum theme score (0.0-1.0) |
| LM_STUDIO_BASE_URL | - | LM Studio URL (future feature) |
| GENERATE_ON_START | false | Generate puzzles on container startup |
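As an illustration of how these variables and their defaults fit together, here is a minimal sketch that reads them from an environment map. The class name GeneratorConfigSketch and the fromEnv helper are hypothetical and not taken from the project; only the variable names and default values come from the table above.

```java
import java.util.Map;

/** Hypothetical settings holder; not the project's actual configuration class. */
final class GeneratorConfigSketch {
    final String outDir;
    final int puzzlesPerDay;
    final String wordsPath;
    final boolean themeFilter;
    final double themeMinScore;
    final boolean generateOnStart;

    private GeneratorConfigSketch(Map<String, String> env) {
        // Each lookup falls back to the default documented in the table.
        outDir = env.getOrDefault("OUT_DIR", "/data/puzzles");
        puzzlesPerDay = Integer.parseInt(env.getOrDefault("PUZZLES_PER_DAY", "3"));
        wordsPath = env.getOrDefault("WORDS_PATH", "./word-list.txt");
        themeFilter = Boolean.parseBoolean(env.getOrDefault("THEME_FILTER", "true"));
        themeMinScore = Double.parseDouble(env.getOrDefault("THEME_MIN_SCORE", "0.6"));
        generateOnStart = Boolean.parseBoolean(env.getOrDefault("GENERATE_ON_START", "false"));
    }

    /** Builds a config from any map, which keeps the parsing testable without real env vars. */
    static GeneratorConfigSketch fromEnv(Map<String, String> env) {
        return new GeneratorConfigSketch(env);
    }

    public static void main(String[] args) {
        GeneratorConfigSketch cfg = fromEnv(System.getenv());
        System.out.println(cfg.outDir + " x" + cfg.puzzlesPerDay + " minScore=" + cfg.themeMinScore);
    }
}
```

Passing a map rather than calling System.getenv() inside the class is a design choice made for this sketch only; it says nothing about how DailyGenerator actually reads its settings.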
Theme System
Supported Themes
- algemeen - General/common words
- nieuws - News/politics
- technologie - Technology
- sport - Sports
- weer - Weather/nature
- economie - Economy
- gezondheid - Health
Theme Filtering
Words are scored against themes using:
- Direct matching - Word is in theme keyword list (score: 1.0)
- Substring matching - Partial word overlap (score: 0.7)
- Edit distance - Fuzzy matching for variations (score: 0.8-0.9)
Example:
ThemeGraph.filterByTheme(words, "technologie", 0.6);
// Returns: COMPUTER, INTERNET, SOFTWARE, DATA, etc.
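The three scoring rules can be sketched as follows. This is not the ThemeGraph.java implementation: the class ThemeScoreSketch, its method names, and the mapping of edit distance 1 to 0.9 and distance 2 to 0.8 are assumptions chosen to fit the 0.8-0.9 range stated above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical scoring sketch; the real ThemeGraph.java may differ. */
final class ThemeScoreSketch {

    /** Levenshtein distance computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // prev now holds row i
        }
        return prev[b.length()];
    }

    /** Best score of a word against any keyword of a theme. */
    static double score(String word, List<String> keywords) {
        double best = 0.0;
        for (String kw : keywords) {
            int d = editDistance(word, kw);
            double s;
            if (d == 0) s = 1.0;                                      // direct match
            else if (d == 1) s = 0.9;                                 // assumed fuzzy tier
            else if (d == 2) s = 0.8;                                 // assumed fuzzy tier
            else if (word.contains(kw) || kw.contains(word)) s = 0.7; // substring overlap
            else s = 0.0;
            best = Math.max(best, s);
        }
        return best;
    }

    /** Keeps the words whose best score reaches minScore. */
    static List<String> filter(List<String> words, List<String> keywords, double minScore) {
        List<String> kept = new ArrayList<String>();
        for (String w : words) {
            if (score(w, keywords) >= minScore) kept.add(w);
        }
        return kept;
    }
}
```

With keywords COMPUTER and DATABASE and a minimum score of 0.6, this sketch keeps COMPUTER (direct), COMPUTERS (one edit) and DATA (substring), and drops BOOM. Fixed distance thresholds are generous for two- and three-letter words, so a real implementation would likely scale them by word length.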
Output Format
Puzzle JSON
{
"date": "2025-12-19",
"theme": "technologie",
"difficulty": 1,
"rewards": {
"coins": 50,
"stars": 2,
"hints": 1
},
"gridv2": [
"###COMPUTER###",
"#I#O#E#E#O#"
],
"words": [
{
"word": "COMPUTER",
"clue": "COMPUTER",
"startRow": 0,
"startCol": 3,
"direction": "horizontal",
"answer": "COMPUTER",
"arrowRow": 0,
"arrowCol": 2
}
]
}
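In the example above the arrow cell sits immediately to the left of the word's first letter (startCol 3, arrowCol 2). A minimal sketch of that relationship follows; the class ArrowCellSketch is hypothetical, and both the "vertical" direction value and the rule that a vertical word's arrow sits directly above it are assumptions, since the example only shows a horizontal word.

```java
/** Hypothetical helper; ExportFormat.java may compute arrow cells differently. */
final class ArrowCellSketch {

    /** Returns {arrowRow, arrowCol} for a word starting at (startRow, startCol). */
    static int[] arrowCell(int startRow, int startCol, String direction) {
        if ("horizontal".equals(direction)) {
            return new int[] {startRow, startCol - 1}; // cell left of the first letter
        }
        if ("vertical".equals(direction)) {
            return new int[] {startRow - 1, startCol}; // assumed: cell above the first letter
        }
        throw new IllegalArgumentException("Unknown direction: " + direction);
    }

    public static void main(String[] args) {
        int[] cell = arrowCell(0, 3, "horizontal");
        System.out.println(cell[0] + "," + cell[1]);
    }
}
```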
Index JSON
{
"date": "2025-12-19",
"files": [
"crossword_2025-12-19_01_technologie.json",
"crossword_2025-12-19_02_sport.json",
"crossword_2025-12-19_03_nieuws.json"
]
}
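The file names in the index appear to follow the pattern crossword_<date>_<sequence>_<theme>.json with a two-digit sequence number. A small sketch of that naming scheme, inferred from the example only; the class IndexNamesSketch and its methods are hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical naming helper; the pattern is inferred from the example index. */
final class IndexNamesSketch {

    /** Formats one file name; LocalDate prints as ISO yyyy-MM-dd. */
    static String fileName(LocalDate date, int sequence, String theme) {
        return String.format(Locale.ROOT, "crossword_%s_%02d_%s.json", date, sequence, theme);
    }

    /** Numbers the themes from 01 upward in the order given. */
    static List<String> fileNames(LocalDate date, List<String> themes) {
        List<String> names = new ArrayList<String>();
        for (int i = 0; i < themes.size(); i++) {
            names.add(fileName(date, i + 1, themes.get(i)));
        }
        return names;
    }
}
```

A frontend could rebuild the expected names for a given day the same way and compare them with the "files" array before fetching each puzzle.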
Scheduling
Puzzles are generated daily at 3:15 AM (configurable in crontab).
Edit crontab to change schedule:
# Daily at 3:15 AM
15 3 * * * java -cp /app/target puzzle.DailyGenerator
# Every 6 hours
0 */6 * * * java -cp /app/target puzzle.DailyGenerator
# Weekly on Monday at 1 AM
0 1 * * 1 java -cp /app/target puzzle.DailyGenerator
Word List Format
Plain text file, one word per line, uppercase A-Z only, 2-8 characters:
EU
UUR
AUTO
BOOM
COMPUTER
INTERNET
...
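To check a word list against this format before generating, a validation pass along these lines may help. It is a sketch under the rules stated above (uppercase A-Z, 2-8 characters); the class WordListSketch is hypothetical, and trimming surrounding whitespace is an added assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical validator for the word list format described above. */
final class WordListSketch {

    // Whole line must be 2 to 8 uppercase ASCII letters.
    private static final Pattern VALID = Pattern.compile("[A-Z]{2,8}");

    static boolean isValid(String word) {
        return VALID.matcher(word).matches();
    }

    /** Trims each line and keeps only the entries that match the format. */
    static List<String> keepValid(List<String> lines) {
        List<String> kept = new ArrayList<String>();
        for (String line : lines) {
            String word = line.trim();
            if (isValid(word)) kept.add(word);
        }
        return kept;
    }
}
```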
Performance
- Mask generation: ~2-5 seconds (genetic algorithm)
- Word filling: ~5-30 seconds (CSP solver with MRV heuristic)
- Total per puzzle: ~10-40 seconds
Optimizations:
- Positional indexing for fast candidate lookup
- Sorted intersection for constraint checking
- No large array allocations during search
- Progress bar with real-time stats
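The first two optimizations can be illustrated with a small sketch: an index from (word length, position, letter) to a sorted list of word ids, and a two-pointer intersection of those lists to find candidates for a partially filled slot. The class PositionalIndexSketch and the '.' wildcard convention are hypothetical, and unlike the real solver this version allocates lists freely for readability.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration; SwedishGenerator.java may index words differently. */
final class PositionalIndexSketch {
    private final List<String> words;
    // Key "length:position:letter" maps to word ids in ascending order.
    private final Map<String, List<Integer>> index = new HashMap<String, List<Integer>>();

    PositionalIndexSketch(List<String> words) {
        this.words = words;
        for (int id = 0; id < words.size(); id++) {
            String w = words.get(id);
            for (int pos = 0; pos < w.length(); pos++) {
                String k = key(w.length(), pos, w.charAt(pos));
                List<Integer> bucket = index.get(k);
                if (bucket == null) {
                    bucket = new ArrayList<Integer>();
                    index.put(k, bucket);
                }
                bucket.add(id); // ids arrive in increasing order, so buckets stay sorted
            }
        }
    }

    private static String key(int length, int pos, char letter) {
        return length + ":" + pos + ":" + letter;
    }

    /** Two-pointer intersection of two ascending id lists. */
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<Integer>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i), y = b.get(j);
            if (x == y) { out.add(x); i++; j++; }
            else if (x < y) i++;
            else j++;
        }
        return out;
    }

    /** Words matching a pattern such as "B..M", where '.' marks an unfilled cell. */
    List<String> candidates(String pattern) {
        List<Integer> ids = null;
        for (int pos = 0; pos < pattern.length(); pos++) {
            char c = pattern.charAt(pos);
            if (c == '.') continue;
            List<Integer> bucket = index.get(key(pattern.length(), pos, c));
            if (bucket == null) return new ArrayList<String>(); // no word has this letter here
            ids = (ids == null) ? bucket : intersect(ids, bucket);
        }
        List<String> out = new ArrayList<String>();
        if (ids == null) { // no fixed letters: every word of that length qualifies
            for (String w : words) if (w.length() == pattern.length()) out.add(w);
        } else {
            for (int id : ids) out.add(words.get(id));
        }
        return out;
    }
}
```

For the word list AUTO, BOOM, BOOT, BOEK, UUR, the pattern "BO.." intersects the buckets for B at position 0 and O at position 1 and yields BOOM, BOOT, BOEK, while "B..M" narrows to BOOM alone.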
Integration with LM Studio (Future)
The system is prepared for LM Studio integration to generate themed clues:
# Set LM_STUDIO_BASE_URL in docker-compose.yml, then start the container
docker-compose up -d
# The container will query LM Studio for contextual clues based on themes
This will replace the current clues, which simply repeat the answer word, with semantic hints.
Migration from Node.js
The Java version keeps a module-for-module correspondence with the Node.js generator:
| Node.js | Java |
|---|---|
| swedish_generator.js | SwedishGenerator.java |
| export_format.js | ExportFormat.java |
| main.js | Main.java + DailyGenerator.java |
| N/A | ThemeGraph.java (new) |
Volume Management
Puzzles are stored in a Docker volume outside the workspace:
# Default location
/var/lib/puzzle-data
# Custom location
export PUZZLE_OUTPUT_DIR=/path/to/puzzles
docker-compose up -d
# View puzzles
ls -lh /var/lib/puzzle-data/*.json
Troubleshooting
No puzzles generated
- Check word list has enough words (minimum 50)
- Lower THEME_MIN_SCORE if using theme filtering
- Increase PUZZLES_PER_DAY attempts
Container not starting
docker logs puzzle_gen_java
# Check for compilation errors or missing files
Low quality puzzles
- Increase --gens parameter (more genetic iterations)
- Increase --pop parameter (larger population)
- Ensure word list has good variety of lengths 2-8
License
MIT
Authors
Original Node.js version + Java port with theme system