# Swedish-Style Crossword Puzzle Generator

A high-performance, Java-based puzzle generator with theme-based word filtering and daily automated generation.

## Features

- **Swedish-style crossword puzzles** with arrow clues
- **Theme-based word filtering** using a semantic similarity graph
- **Daily automated generation** via Docker + cron
- **JSON export format** compatible with web frontends
- **Genetic algorithm** for optimizing grid layouts
- **Constraint satisfaction** for word placement

## Architecture

### Components

1. **SwedishGenerator.java** - Core puzzle generation engine
   - Genetic algorithm for mask generation
   - CSP solver for word filling
   - Optimized for Dutch word lists

2. **ThemeGraph.java** - Theme-based word scoring system
   - Predefined theme keywords (news, tech, sports, etc.)
   - Edit distance similarity matching
   - Automatic theme detection

3. **Main.java** - Core generator and daily automation
   - Generates themed puzzles
   - JSON output with metadata
   - Index file generation

4. **ExportFormat.java** - Export to standard format
   - Grid cropping and optimization
   - Arrow cell calculation
   - Compatible with existing frontends

## Usage

### Local Development

```bash
# Compile
./compile.sh

# Run Main (interactive)
java -cp ~/dev/.target puzzle.Main --seed 42 --pop 18 --gens 100

# Generate daily puzzles
```

### Docker Deployment

```bash
# Build image
docker build -t puzzle-generator .

# Run with docker-compose
docker-compose up -d puzzle_gen_java

# View logs
docker logs -f puzzle_gen_java
```

### Environment Variables

| Variable             | Default           | Description                            |
|----------------------|-------------------|----------------------------------------|
| `OUT_DIR`            | `/data/puzzles`   | Output directory for generated puzzles |
| `PUZZLES_PER_DAY`    | `3`               | Number of puzzles to generate daily    |
| `WORDS_PATH`         | `./word-list.txt` | Path to word list file                 |
| `THEME_FILTER`       | `true`            | Enable theme-based word filtering      |
| `THEME_MIN_SCORE`    | `0.6`             | Minimum theme score (0.0-1.0)          |
| `LM_STUDIO_BASE_URL` | -                 | LM Studio URL (future feature)         |
| `GENERATE_ON_START`  | `false`           | Generate puzzles on container startup  |
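
As a rough illustration of how these variables and their defaults could be read, here is a minimal sketch. The `GeneratorConfig` record and its `fromEnv` method are hypothetical names introduced for this example and are not taken from the project's source; it assumes Java 16 or newer for records.

```java
import java.util.Map;

/** Hypothetical holder for the settings in the table above; not the project's actual class. */
record GeneratorConfig(String outDir, int puzzlesPerDay, String wordsPath,
                       boolean themeFilter, double themeMinScore, boolean generateOnStart) {

    /** Builds a config from an environment map, falling back to the documented defaults. */
    static GeneratorConfig fromEnv(Map<String, String> env) {
        return new GeneratorConfig(
            env.getOrDefault("OUT_DIR", "/data/puzzles"),
            Integer.parseInt(env.getOrDefault("PUZZLES_PER_DAY", "3")),
            env.getOrDefault("WORDS_PATH", "./word-list.txt"),
            Boolean.parseBoolean(env.getOrDefault("THEME_FILTER", "true")),
            Double.parseDouble(env.getOrDefault("THEME_MIN_SCORE", "0.6")),
            Boolean.parseBoolean(env.getOrDefault("GENERATE_ON_START", "false")));
    }

    public static void main(String[] args) {
        // Prints the effective configuration for the current process environment.
        System.out.println(fromEnv(System.getenv()));
    }
}
```

Taking the environment as a `Map` parameter rather than calling `System.getenv()` inside `fromEnv` keeps the parsing easy to exercise with hand-built maps.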

## Theme System

### Supported Themes

- `algemeen` - General/common words
- `nieuws` - News/politics
- `technologie` - Technology
- `sport` - Sports
- `weer` - Weather/nature
- `economie` - Economy
- `gezondheid` - Health

### Theme Filtering

Each word is scored against a theme using three rules:

1. **Direct matching** - Word is in the theme keyword list (score: 1.0)
2. **Substring matching** - Partial word overlap (score: 0.7)
3. **Edit distance** - Fuzzy matching for variations (score: 0.8-0.9)

Example (the last argument is the minimum score, here equal to the `THEME_MIN_SCORE` default):

```java
ThemeGraph.filterByTheme(words, "technologie", 0.6);
// Returns: COMPUTER, INTERNET, SOFTWARE, DATA, etc.
```
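
The three scoring rules can be sketched as follows. This is an illustration only, not the code in `ThemeGraph.java`: the class name `ThemeScoreSketch`, the rule that the best-scoring keyword wins, and the mapping of edit distance 1 to 0.9 and distance 2 to 0.8 are all assumptions, since the README gives only the 0.8-0.9 range.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative scorer; names and thresholds are assumptions, not ThemeGraph's real code. */
final class ThemeScoreSketch {

    /** Classic dynamic-programming Levenshtein distance using two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    /** Best score of a word against any keyword, using the three tiers listed above. */
    static double score(String word, List<String> keywords) {
        double best = 0.0;
        for (String k : keywords) {
            int d = editDistance(word, k);
            double s;
            if (d == 0) s = 1.0;                                     // direct match
            else if (d == 1) s = 0.9;                                // assumed: distance 1
            else if (d == 2) s = 0.8;                                // assumed: distance 2
            else if (word.contains(k) || k.contains(word)) s = 0.7;  // substring overlap
            else s = 0.0;
            best = Math.max(best, s);
        }
        return best;
    }

    /** Keeps the words whose best score reaches minScore, preserving input order. */
    static List<String> filter(List<String> words, List<String> keywords, double minScore) {
        return words.stream()
            .filter(w -> score(w, keywords) >= minScore)
            .collect(Collectors.toList());
    }
}
```

With a fixed distance cutoff like this, very short words sit within two edits of many keywords, so a real implementation would probably scale the cutoff with word length.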

## Output Format

### Puzzle JSON

```json
{
  "date": "2025-12-19",
  "theme": "technologie",
  "difficulty": 1,
  "rewards": {
    "coins": 50,
    "stars": 2,
    "hints": 1
  },
  "gridv2": [
    "###COMPUTER###",
    "#I#O#E#E#O#"
  ],
  "words": [
    {
      "word": "COMPUTER",
      "clue": "COMPUTER",
      "startRow": 0,
      "startCol": 3,
      "direction": "horizontal",
      "answer": "COMPUTER",
      "arrowRow": 0,
      "arrowCol": 2
    }
  ]
}
```

### Index JSON

```json
{
  "date": "2025-12-19",
  "files": [
    "crossword_2025-12-19_01_technologie.json",
    "crossword_2025-12-19_02_sport.json",
    "crossword_2025-12-19_03_nieuws.json"
  ]
}
```
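
The file names in the index appear to follow the pattern `crossword_<date>_<NN>_<theme>.json`, with a two-digit sequence number. A minimal sketch of producing such names and the index document is shown below; `IndexSketch` and its methods are hypothetical, and the hand-built JSON string assumes theme names need no escaping.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; the real index is written by Main.java in whatever way it chooses. */
final class IndexSketch {

    /** Builds a name such as crossword_2025-12-19_01_technologie.json. */
    static String fileName(LocalDate date, int sequence, String theme) {
        // LocalDate.toString() yields ISO-8601 (yyyy-MM-dd), matching the example above.
        return String.format("crossword_%s_%02d_%s.json", date, sequence, theme);
    }

    /** Renders a compact index document listing one file per theme, numbered from 01. */
    static String indexJson(LocalDate date, List<String> themes) {
        List<String> quoted = new ArrayList<>();
        for (int i = 0; i < themes.size(); i++) {
            quoted.add("\"" + fileName(date, i + 1, themes.get(i)) + "\"");
        }
        return "{\"date\":\"" + date + "\",\"files\":[" + String.join(",", quoted) + "]}";
    }

    public static void main(String[] args) {
        System.out.println(indexJson(LocalDate.now(), List.of("technologie", "sport", "nieuws")));
    }
}
```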

## Scheduling

Puzzles are generated daily at **3:15 AM** (configurable in `crontab`).

To change the schedule, edit `crontab` and keep a single entry, for example one of:
```cron
# Daily at 3:15 AM
15 3 * * * java -cp /app/target puzzle.Main

# Every 6 hours
0 */6 * * * java -cp /app/target puzzle.Main

# Weekly on Monday at 1 AM
0 1 * * 1 java -cp /app/target puzzle.Main
```

## Word List Format

Plain text file, one word per line, uppercase A-Z only, 2-8 characters:

```
EU
UUR
AUTO
BOOM
COMPUTER
INTERNET
...
```
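
The format rules above can be checked with a small filter before a list is handed to the generator. The sketch below is an assumption about how such a check might look, not the project's loader: `WordListSketch` is a hypothetical name, and dropping duplicates is an extra step this example adds.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Illustrative validator for the documented word list format. */
final class WordListSketch {

    // Uppercase A-Z only, 2 to 8 characters, as stated above.
    private static final Pattern VALID = Pattern.compile("[A-Z]{2,8}");

    /** Trims each line, keeps only valid words, and removes duplicates in first-seen order. */
    static List<String> clean(List<String> lines) {
        return lines.stream()
            .map(String::trim)
            .filter(w -> VALID.matcher(w).matches())
            .distinct()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        // Usage: java WordListSketch ./word-list.txt
        List<String> lines = Files.readAllLines(Path.of(args[0]));
        List<String> words = clean(lines);
        System.out.println(words.size() + " valid words out of " + lines.size() + " lines");
    }
}
```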

## Performance

- **Mask generation**: ~2-5 seconds (genetic algorithm)
- **Word filling**: ~5-30 seconds (CSP solver with MRV heuristic)
- **Total per puzzle**: ~10-40 seconds
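
MRV (minimum remaining values) means the solver fills next the slot that has the fewest candidate words left, so dead ends show up early. A minimal sketch of that selection step, assuming slots are identified by integer ids and that candidate counts are already known (`MrvSketch` and `pickSlot` are hypothetical names):

```java
import java.util.Map;

/** Illustrative MRV slot selection; not taken from SwedishGenerator.java. */
final class MrvSketch {

    /**
     * Returns the id of the unfilled slot with the fewest remaining candidates,
     * breaking ties by the lower id, or -1 if there are no unfilled slots.
     */
    static int pickSlot(Map<Integer, Integer> candidateCounts) {
        int best = -1;
        int bestCount = Integer.MAX_VALUE;
        for (Map.Entry<Integer, Integer> e : candidateCounts.entrySet()) {
            int id = e.getKey();
            int count = e.getValue();
            if (count < bestCount || (count == bestCount && id < best)) {
                best = id;
                bestCount = count;
            }
        }
        return best;
    }
}
```

A slot whose count is zero is picked first, which lets the caller backtrack immediately.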

Optimizations:
- Positional indexing for fast candidate lookup
- Sorted intersection for constraint checking
- No large array allocations during search
- Progress bar with real-time stats
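
To make the first two optimizations concrete: a positional index maps each (word length, position, letter) triple to a sorted list of word ids, and the candidates for a partly filled slot are found by intersecting those lists. The sketch below illustrates the idea only; the class name, the string keys, and the `.` wildcard are assumptions. It also uses boxed lists for brevity, whereas an allocation-free solver would more likely work on primitive arrays.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative positional index; not the data structure used in SwedishGenerator.java. */
final class PositionalIndexSketch {
    private final List<String> words;
    private final Map<String, List<Integer>> index = new HashMap<>();

    PositionalIndexSketch(List<String> words) {
        this.words = words;
        // Ids are appended in ascending order, so every bucket stays sorted.
        for (int id = 0; id < words.size(); id++) {
            String w = words.get(id);
            for (int pos = 0; pos < w.length(); pos++) {
                index.computeIfAbsent(key(w.length(), pos, w.charAt(pos)),
                                      k -> new ArrayList<>()).add(id);
            }
        }
    }

    private static String key(int length, int pos, char letter) {
        return length + ":" + pos + ":" + letter;
    }

    /** Two-pointer intersection of two ascending id lists. */
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i), y = b.get(j);
            if (x == y) { out.add(x); i++; j++; }
            else if (x < y) i++;
            else j++;
        }
        return out;
    }

    /** Words matching a pattern such as "B..T", where '.' marks an unfilled cell. */
    List<String> candidates(String pattern) {
        List<Integer> ids = null;
        for (int pos = 0; pos < pattern.length(); pos++) {
            char c = pattern.charAt(pos);
            if (c == '.') continue;
            List<Integer> bucket = index.getOrDefault(key(pattern.length(), pos, c), List.of());
            ids = (ids == null) ? bucket : intersect(ids, bucket);
        }
        List<String> result = new ArrayList<>();
        if (ids == null) {
            // No fixed letters: every word of the right length is a candidate.
            for (String w : words) if (w.length() == pattern.length()) result.add(w);
        } else {
            for (int id : ids) result.add(words.get(id));
        }
        return result;
    }
}
```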

## Integration with LM Studio (Future)

The system is prepared for LM Studio integration to generate themed clues:

```bash
# Set LM_STUDIO_BASE_URL in docker-compose.yml, then start the container
docker-compose up -d
# Container will query LM Studio for contextual clues based on themes
```

This will replace the current clues, which simply repeat the answer word, with semantic hints.

## Migration from Node.js

The Java version mirrors the Node.js generator module for module:

| Node.js                | Java                    |
|------------------------|-------------------------|
| `swedish_generator.js` | `SwedishGenerator.java` |
| `export_format.js`     | `ExportFormat.java`     |
| `main.js`              | `Main.java`             |
| N/A                    | `ThemeGraph.java` (new) |

## Volume Management

Puzzles are stored in a Docker volume outside the workspace:

```bash
# Default location: /var/lib/puzzle-data

# Custom location
export PUZZLE_OUTPUT_DIR=/path/to/puzzles
docker-compose up -d

# View puzzles
ls -lh /var/lib/puzzle-data/*.json
```

## Troubleshooting

### No puzzles generated
- Check that the word list has enough words (minimum 50)
- Lower `THEME_MIN_SCORE` if using theme filtering
- Increase `PUZZLES_PER_DAY` to allow more generation attempts

### Container not starting
```bash
docker logs puzzle_gen_java
# Check for compilation errors or missing files
```

### Low quality puzzles
- Increase the `--gens` parameter (more genetic iterations)
- Increase the `--pop` parameter (larger population)
- Ensure the word list has a good variety of word lengths (2-8)

## License

MIT

## Authors

Original Node.js version + Java port with theme system