initial commit
254
README.md
Normal file
@@ -0,0 +1,254 @@
# Swedish-Style Crossword Puzzle Generator

A high-performance Java-based puzzle generator with theme-based word filtering and daily automated generation.

## Features

- **Swedish-style crossword puzzles** with arrow clues
- **Theme-based word filtering** using a semantic similarity graph
- **Daily automated generation** via Docker + cron
- **JSON export format** compatible with web frontends
- **Genetic algorithm** for optimal grid layouts
- **Constraint satisfaction** for word placement

## Architecture

### Components

1. **SwedishGenerator.java** - Core puzzle generation engine
   - Genetic algorithm for mask generation
   - CSP solver for word filling
   - Optimized for Dutch word lists

2. **ThemeGraph.java** - Theme-based word scoring system
   - Predefined theme keywords (news, tech, sports, etc.)
   - Edit distance similarity matching
   - Automatic theme detection

3. **DailyGenerator.java** - Daily puzzle automation
   - Generates themed puzzles
   - JSON output with metadata
   - Index file generation

4. **ExportFormat.java** - Export to standard format
   - Grid cropping and optimization
   - Arrow cell calculation
   - Compatible with existing frontends

## Usage

### Local Development

```bash
# Compile
./compile.sh

# Run Main (interactive)
java -cp ~/dev/.target puzzle.Main --seed 42 --pop 18 --gens 100

# Generate daily puzzles
java -cp ~/dev/.target puzzle.DailyGenerator
```

### Docker Deployment

```bash
# Build image
docker build -t puzzle-generator .

# Run with docker-compose
docker-compose up -d puzzle_gen_java

# View logs
docker logs -f puzzle_gen_java
```

### Environment Variables

| Variable             | Default           | Description                            |
|----------------------|-------------------|----------------------------------------|
| `OUT_DIR`            | `/data/puzzles`   | Output directory for generated puzzles |
| `PUZZLES_PER_DAY`    | `3`               | Number of puzzles to generate daily    |
| `WORDS_PATH`         | `./word-list.txt` | Path to word list file                 |
| `THEME_FILTER`       | `true`            | Enable theme-based word filtering      |
| `THEME_MIN_SCORE`    | `0.6`             | Minimum theme score (0.0-1.0)          |
| `LM_STUDIO_BASE_URL` | -                 | LM Studio URL (future feature)         |
| `GENERATE_ON_START`  | `false`           | Generate puzzles on container startup  |

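
The defaults in the table above could be resolved at startup roughly as follows. This is a minimal sketch, not the project's actual code: the class `GeneratorSettings` and its field names are hypothetical, and only the variable names and default values come from the table.

```java
import java.util.Map;

/** Hypothetical settings holder; class and field names are illustrative only. */
final class GeneratorSettings {
    final String outDir;
    final int puzzlesPerDay;
    final String wordsPath;
    final boolean themeFilter;
    final double themeMinScore;
    final boolean generateOnStart;

    private GeneratorSettings(Map<String, String> env) {
        // Each lookup falls back to the default documented in the table.
        outDir = env.getOrDefault("OUT_DIR", "/data/puzzles");
        puzzlesPerDay = Integer.parseInt(env.getOrDefault("PUZZLES_PER_DAY", "3"));
        wordsPath = env.getOrDefault("WORDS_PATH", "./word-list.txt");
        themeFilter = Boolean.parseBoolean(env.getOrDefault("THEME_FILTER", "true"));
        themeMinScore = Double.parseDouble(env.getOrDefault("THEME_MIN_SCORE", "0.6"));
        generateOnStart = Boolean.parseBoolean(env.getOrDefault("GENERATE_ON_START", "false"));
    }

    /** Builds settings from any environment-like map, which keeps the parsing testable. */
    static GeneratorSettings fromEnv(Map<String, String> env) {
        return new GeneratorSettings(env);
    }
}
```

In a real entry point this would be called as `GeneratorSettings.fromEnv(System.getenv())`. Taking a map rather than reading `System.getenv()` directly makes it easy to exercise the defaults in a unit test.
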
## Theme System

### Supported Themes

- `algemeen` - General/common words
- `nieuws` - News/politics
- `technologie` - Technology
- `sport` - Sports
- `weer` - Weather/nature
- `economie` - Economy
- `gezondheid` - Health

### Theme Filtering

Words are scored against themes using:
1. **Direct matching** - Word is in theme keyword list (score: 1.0)
2. **Substring matching** - Partial word overlap (score: 0.7)
3. **Edit distance** - Fuzzy matching for variations (score: 0.8-0.9)

Example:
```java
ThemeGraph.filterByTheme(words, "technologie", 0.6);
// Returns: COMPUTER, INTERNET, SOFTWARE, DATA, etc.
```

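
The three scoring rules above can be sketched as a single function. This is an illustration under stated assumptions, not the actual `ThemeGraph` implementation: the class `ThemeScoreSketch` is hypothetical, the mapping of edit distance 1 to 0.9 and edit distance 2 to 0.8 is a guess at how the 0.8-0.9 range is assigned, and taking the best score over all keywords is likewise assumed.

```java
import java.util.List;

/** Hypothetical scorer illustrating the three rules; not the project's ThemeGraph. */
final class ThemeScoreSketch {

    /** Levenshtein distance computed with two rolling rows of the DP table. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the rows
        }
        return prev[b.length()];
    }

    /** Returns the best score of the word against any theme keyword. */
    static double score(String word, List<String> keywords) {
        double best = 0.0;
        for (String kw : keywords) {
            double s;
            if (word.equals(kw)) {
                s = 1.0;                       // direct match
            } else {
                int d = editDistance(word, kw);
                if (d == 1) s = 0.9;           // assumed: one edit away
                else if (d == 2) s = 0.8;      // assumed: two edits away
                else if (word.contains(kw) || kw.contains(word)) s = 0.7; // substring overlap
                else s = 0.0;
            }
            best = Math.max(best, s);
        }
        return best;
    }
}
```

With a `THEME_MIN_SCORE` of 0.6, a word would pass the filter whenever `score(word, keywords) >= 0.6`. Note that very short words sit within two edits of many keywords, so a real scorer would probably scale the distance threshold by word length.
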
## Output Format

### Puzzle JSON

```json
{
  "date": "2025-12-19",
  "theme": "technologie",
  "difficulty": 1,
  "rewards": {
    "coins": 50,
    "stars": 2,
    "hints": 1
  },
  "gridv2": [
    "###COMPUTER###",
    "#I#O#E#E#O#"
  ],
  "words": [
    {
      "word": "COMPUTER",
      "clue": "COMPUTER",
      "startRow": 0,
      "startCol": 3,
      "direction": "horizontal",
      "answer": "COMPUTER",
      "arrowRow": 0,
      "arrowCol": 2
    }
  ]
}
```

### Index JSON

```json
{
  "date": "2025-12-19",
  "files": [
    "crossword_2025-12-19_01_technologie.json",
    "crossword_2025-12-19_02_sport.json",
    "crossword_2025-12-19_03_nieuws.json"
  ]
}
```

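
The file names in the index example appear to follow the pattern `crossword_<date>_<two-digit sequence>_<theme>.json`. The sketch below builds names in that shape. The pattern is inferred from the example only, and the class `IndexNamingSketch` and its methods are hypothetical rather than part of `DailyGenerator`.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper; the naming pattern is inferred from the index example. */
final class IndexNamingSketch {

    /** Formats one file name, e.g. sequence 1 becomes "01". */
    static String fileName(LocalDate date, int sequence, String theme) {
        // LocalDate.toString() yields ISO-8601 (yyyy-MM-dd); Locale.ROOT keeps digits ASCII.
        return String.format(Locale.ROOT, "crossword_%s_%02d_%s.json", date, sequence, theme);
    }

    /** Builds the "files" list for one day, numbering themes from 1 in the order given. */
    static List<String> fileNames(LocalDate date, List<String> themes) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < themes.size(); i++) {
            names.add(fileName(date, i + 1, themes.get(i)));
        }
        return names;
    }
}
```
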
## Scheduling

Puzzles are generated daily at **3:15 AM** (configurable in `crontab`).

Edit `crontab` to change the schedule:
```cron
# Daily at 3:15 AM
15 3 * * * java -cp /app/target puzzle.DailyGenerator

# Every 6 hours
0 */6 * * * java -cp /app/target puzzle.DailyGenerator

# Weekly on Monday at 1 AM
0 1 * * 1 java -cp /app/target puzzle.DailyGenerator
```

## Word List Format

Plain text file, one word per line, uppercase A-Z only, 2-8 characters:

```
EU
UUR
AUTO
BOOM
COMPUTER
INTERNET
...
```

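
The format rules above translate directly into a validation step. The following is a minimal sketch that assumes invalid lines are silently skipped and duplicates dropped; the class `WordListSketch` is hypothetical, and the real loader may handle bad input differently.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

/** Hypothetical word-list filter enforcing the documented format. */
final class WordListSketch {

    // Uppercase ASCII letters only, 2 to 8 characters long.
    private static final Pattern VALID = Pattern.compile("[A-Z]{2,8}");

    /** Keeps lines that match the format, trimming whitespace and dropping duplicates. */
    static List<String> validWords(List<String> lines) {
        return lines.stream()
                .map(String::trim)
                .filter(w -> VALID.matcher(w).matches())
                .distinct()
                .collect(Collectors.toList());
    }
}
```

A loader could feed it the file contents with `WordListSketch.validWords(Files.readAllLines(Path.of("word-list.txt")))`. Lowercase entries are rejected here rather than uppercased, which is the stricter reading of "uppercase A-Z only".
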
## Performance

- **Mask generation**: ~2-5 seconds (genetic algorithm)
- **Word filling**: ~5-30 seconds (CSP solver with MRV heuristic)
- **Total per puzzle**: ~10-40 seconds

Optimizations:
- Positional indexing for fast candidate lookup
- Sorted intersection for constraint checking
- No large array allocations during search
- Progress bar with real-time stats

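
To illustrate the first two optimizations, the sketch below indexes words by (length, position, letter) and finds candidates for a partially filled slot by intersecting sorted id lists with a two-pointer merge. It is a simplified illustration with hypothetical names (`PositionalIndexSketch`, `candidates`, the `.` wildcard); the real solver's data structures are not shown in this README, and this version allocates lists freely, so it does not reflect the allocation-free search mentioned above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical positional index: "length:position:letter" -> ascending word ids. */
final class PositionalIndexSketch {
    private final List<String> words;
    private final Map<String, List<Integer>> index = new HashMap<>();

    PositionalIndexSketch(List<String> words) {
        this.words = List.copyOf(words);
        // Ids are added in increasing order, so every posting list stays sorted.
        for (int id = 0; id < this.words.size(); id++) {
            String w = this.words.get(id);
            for (int pos = 0; pos < w.length(); pos++) {
                index.computeIfAbsent(key(w.length(), pos, w.charAt(pos)),
                        k -> new ArrayList<>()).add(id);
            }
        }
    }

    private static String key(int length, int pos, char letter) {
        return length + ":" + pos + ":" + letter;
    }

    /** Two-pointer intersection of two ascending id lists. */
    static List<Integer> intersect(List<Integer> a, List<Integer> b) {
        List<Integer> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int x = a.get(i), y = b.get(j);
            if (x == y) { out.add(x); i++; j++; }
            else if (x < y) i++;
            else j++;
        }
        return out;
    }

    /** Returns words fitting a pattern such as "BOO.", where '.' marks an empty cell. */
    List<String> candidates(String pattern) {
        List<Integer> ids = null;
        for (int pos = 0; pos < pattern.length(); pos++) {
            char c = pattern.charAt(pos);
            if (c == '.') continue;
            List<Integer> posting = index.getOrDefault(key(pattern.length(), pos, c), List.of());
            ids = (ids == null) ? posting : intersect(ids, posting);
            if (ids.isEmpty()) return List.of();
        }
        List<String> out = new ArrayList<>();
        if (ids == null) { // no fixed letters: every word of that length fits
            for (String w : words) if (w.length() == pattern.length()) out.add(w);
        } else {
            for (int id : ids) out.add(words.get(id));
        }
        return out;
    }
}
```

An MRV-style solver would call `candidates` for each open slot and fill the slot with the fewest matches first.
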
## Integration with LM Studio (Future)

The system is prepared for LM Studio integration to generate themed clues:

```bash
# Set LM_STUDIO_BASE_URL in docker-compose.yml, then start the container
docker-compose up -d
# Container will query LM Studio for contextual clues based on themes
```

This will enhance clues from simple word repetition to semantic hints.

## Migration from Node.js

The Java version maintains module-wise compatibility with the Node.js generator:

| Node.js                | Java                                |
|------------------------|-------------------------------------|
| `swedish_generator.js` | `SwedishGenerator.java`             |
| `export_format.js`     | `ExportFormat.java`                 |
| `main.js`              | `Main.java` + `DailyGenerator.java` |
| N/A                    | `ThemeGraph.java` (new)             |

## Volume Management

Puzzles are stored in a Docker volume outside the workspace:

```bash
# Default location: /var/lib/puzzle-data

# Custom location
export PUZZLE_OUTPUT_DIR=/path/to/puzzles
docker-compose up -d

# View puzzles
ls -lh /var/lib/puzzle-data/*.json
```

## Troubleshooting

### No puzzles generated
- Check that the word list has enough words (minimum 50)
- Lower `THEME_MIN_SCORE` if using theme filtering
- Increase `PUZZLES_PER_DAY` attempts

### Container not starting
```bash
docker logs puzzle_gen_java
# Check for compilation errors or missing files
```

### Low quality puzzles
- Increase `--gens` parameter (more genetic iterations)
- Increase `--pop` parameter (larger population)
- Ensure the word list has a good variety of word lengths (2-8 characters)

## License

MIT

## Authors

Original Node.js version + Java port with theme system