# Troostwijk Auction Scraper

A Java-based web scraper for Dutch auctions on Troostwijk Auctions with **100% free** desktop/email notifications, SQLite persistence, and AI-powered object detection.

## Features

- **Auction Discovery**: Automatically discovers active Dutch auctions
- **Data Scraping**: Fetches detailed lot information via Troostwijk's JSON API
- **SQLite Storage**: Persists auction data, lots, images, and detected objects
- **Image Processing**: Downloads and analyzes lot images using OpenCV YOLO object detection
- **Free Notifications**: Real-time notifications when:
  - Bids change on monitored lots
  - Auctions are closing soon (within 5 minutes)
  - Via desktop notifications (Windows/macOS/Linux system tray) ✅
  - Optionally via email (Gmail SMTP - free) ✅

## Dependencies

All dependencies are managed via Maven (see `pom.xml`):

- **jsoup 1.17.2** - HTML parsing and HTTP client
- **Jackson 2.17.0** - JSON processing
- **SQLite JDBC 3.45.1.0** - Database operations
- **JavaMail 1.6.2** - Email notifications (free)
- **OpenCV 4.9.0** - Image processing and object detection

## Setup

### 1. Notification Options (Choose One)

#### Option A: Desktop Notifications Only ⭐ (Recommended - Zero Setup)

Desktop notifications work out of the box on:
- **Windows**: System tray notifications
- **macOS**: Notification Center
- **Linux**: Desktop environment notifications (GNOME, KDE, etc.)

**No configuration required!** Just run with default settings:
```bash
export NOTIFICATION_CONFIG="desktop"
# Or simply don't set it - desktop is the default
```

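
The README does not show how the desktop notification itself is raised. A minimal sketch, assuming the JDK's built-in `java.awt.SystemTray` API is what sits behind it; the class `DesktopNotifier` and its method are hypothetical names, not taken from the project's source:

```java
import java.awt.AWTException;
import java.awt.GraphicsEnvironment;
import java.awt.SystemTray;
import java.awt.TrayIcon;
import java.awt.image.BufferedImage;

/** Hypothetical helper; an assumption about how a tray notification could be sent. */
final class DesktopNotifier {

    /**
     * Shows a tray notification and returns true, or returns false when the
     * environment has no system tray (for example a headless server).
     */
    static boolean notifyDesktop(String title, String message) {
        if (GraphicsEnvironment.isHeadless() || !SystemTray.isSupported()) {
            return false;
        }
        // Blank 16x16 placeholder icon; a real application would load its own image.
        TrayIcon icon = new TrayIcon(
                new BufferedImage(16, 16, BufferedImage.TYPE_INT_ARGB), "Troostwijk Scraper");
        icon.setImageAutoSize(true);
        try {
            SystemTray.getSystemTray().add(icon);
        } catch (AWTException e) {
            return false; // a tray exists but the icon could not be added
        }
        icon.displayMessage(title, message, TrayIcon.MessageType.INFO);
        return true;
    }
}
```

The boolean return value lets a caller fall back to email when no tray is available. A long-running monitor would normally create the `TrayIcon` once and reuse it rather than adding a new icon per message.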
#### Option B: Desktop + Email Notifications 📧 (Free Gmail)

1. Enable 2-Factor Authentication in your Google Account
2. Go to: **Google Account → Security → 2-Step Verification → App passwords**
3. Generate an app password for "Mail"
4. Set environment variable:
```bash
export NOTIFICATION_CONFIG="smtp:your.email@gmail.com:your_app_password:recipient@example.com"
```

**Format**: `smtp:username:app_password:recipient_email`

**Example**:
```bash
export NOTIFICATION_CONFIG="smtp:john.doe@gmail.com:abcd1234efgh5678:john.doe@gmail.com"
```

**Note**: This is completely free using Gmail's SMTP server. No paid services required!

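
The two `NOTIFICATION_CONFIG` formats above can be parsed with a simple split on `:`. The following is a sketch only: the class `NotificationConfig` and its fields are hypothetical, and it assumes that none of the four fields contains a colon.

```java
/** Hypothetical parser for NOTIFICATION_CONFIG; not taken from the project's source. */
final class NotificationConfig {
    final String smtpUser;     // null when only desktop notifications are configured
    final String appPassword;
    final String recipient;

    private NotificationConfig(String smtpUser, String appPassword, String recipient) {
        this.smtpUser = smtpUser;
        this.appPassword = appPassword;
        this.recipient = recipient;
    }

    boolean emailEnabled() {
        return smtpUser != null;
    }

    /** Accepts null, "", "desktop", or "smtp:username:app_password:recipient_email". */
    static NotificationConfig parse(String raw) {
        if (raw == null || raw.trim().isEmpty() || raw.trim().equals("desktop")) {
            return new NotificationConfig(null, null, null); // desktop is the default
        }
        String[] parts = raw.trim().split(":");
        if (parts.length != 4 || !parts[0].equals("smtp")) {
            throw new IllegalArgumentException(
                    "Expected 'desktop' or 'smtp:username:app_password:recipient_email'");
        }
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty field in NOTIFICATION_CONFIG");
            }
        }
        return new NotificationConfig(parts[1], parts[2], parts[3]);
    }
}
```

Typical use would be `NotificationConfig.parse(System.getenv("NOTIFICATION_CONFIG"))`, which treats an unset variable as desktop-only.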
### 2. OpenCV Native Libraries

Download and install OpenCV native libraries for your platform:

**Windows:**
```bash
# Download from https://opencv.org/releases/
# Extract and add to PATH or use:
java -Djava.library.path="C:\opencv\build\java\x64" -jar scraper.jar
```

**Linux:**
```bash
sudo apt-get install libopencv-dev
```

**macOS:**
```bash
brew install opencv
```

### 3. YOLO Model Files

Download YOLO model files for object detection:

```bash
mkdir models
cd models

# Download YOLOv4 config
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov4.cfg

# Download YOLOv4 weights (245 MB)
wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights

# Download COCO class names
wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/data/coco.names
```

## Building

```bash
mvn clean package
```

This creates:
- `target/troostwijk-scraper-1.0-SNAPSHOT.jar` - Regular JAR
- `target/troostwijk-scraper-1.0-SNAPSHOT-jar-with-dependencies.jar` - Executable JAR with all dependencies

## Running

### Quick Start (Desktop Notifications Only)

```bash
java -Djava.library.path="/path/to/opencv/lib" \
     -jar target/troostwijk-scraper-1.0-SNAPSHOT-jar-with-dependencies.jar
```

### With Email Notifications

```bash
export NOTIFICATION_CONFIG="smtp:your@gmail.com:app_password:your@gmail.com"

java -Djava.library.path="/path/to/opencv/lib" \
     -jar target/troostwijk-scraper-1.0-SNAPSHOT-jar-with-dependencies.jar
```

### Using Maven

```bash
mvn exec:java -Dexec.mainClass="com.auction.scraper.TroostwijkScraper"
```

## System Architecture & Integration Flow

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                     COMPLETE SYSTEM INTEGRATION DIAGRAM                     │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│ PHASE 1: EXTERNAL SCRAPER (Python/Playwright) - ARCHITECTURE-TROOSTWIJK     │
└─────────────────────────────────────────────────────────────────────────────┘
                                     │
        ┌────────────────────────────┼────────────────────────────┐
        ▼                            ▼                            ▼
 [Listing Pages]              [Auction Pages]                [Lot Pages]
/auctions?page=N               /a/auction-id                  /l/lot-id
        │                            │                            │
        │ Extract URLs               │ Parse __NEXT_DATA__        │ Parse __NEXT_DATA__
        ├───────────────────────────▶│ JSON                       │ JSON
        │                            │                            │
        │                            ▼                            ▼
        │                   ┌────────────────┐           ┌────────────────┐
        │                   │ INSERT auctions│           │ INSERT lots    │
        │                   │ to SQLite      │           │ INSERT images  │
        │                   └────────────────┘           │ (URLs only)    │
        │                            │                   └────────────────┘
        │                            │                            │
        └────────────────────────────┼────────────────────────────┘
                                     ▼
                           ┌──────────────────┐
                           │ SQLITE DATABASE  │
                           │ troostwijk.db    │
                           └──────────────────┘
                                     │
                   ┌─────────────────┼─────────────────┐
                   ▼                 ▼                 ▼
           [auctions table]    [lots table]     [images table]
            - auction_id        - lot_id         - id
            - title             - auction_id     - lot_id
            - location          - title          - url
            - lots_count        - current_bid    - local_path
            - closing_time      - bid_count      - downloaded=0
                                - closing_time
                                     │
┌────────────────────────────────────┴────────────────────────────────────────┐
│ PHASE 2: MONITORING & PROCESSING (Java) - THIS PROJECT                      │
└─────────────────────────────────────────────────────────────────────────────┘
                                     │
               ┌─────────────────────┼─────────────────────┐
               ▼                     ▼                     ▼
      [TroostwijkMonitor]    [DatabaseService]   [ScraperDataAdapter]
               │                     │                     │
               │ Read lots           │ Query lots          │ Transform data
               │ every hour          │ Import images       │ TEXT → INTEGER
               │                     │                     │ "€123" → 123.0
               └─────────────────────┼─────────────────────┘
                                     │
           ┌─────────────────────────┼─────────────────────────┐
           ▼                         ▼                         ▼
   [Bid Monitoring]         [Image Processing]         [Closing Alerts]
  Check API every 1h          Download images            Check < 5 min
           │                         │                         │
           │ New bid?                │ Process via             │ Time critical?
           ├─[YES]──────┐            │ ObjectDetection         ├─[YES]────┐
           │            │            │                         │          │
           ▼            │            ▼                         │          │
 [Update current_bid]   │  ┌──────────────────┐                │          │
     in database        │  │ YOLO Detection   │                │          │
                        │  │ OpenCV DNN       │                │          │
                        │  └──────────────────┘                │          │
                        │            │                         │          │
                        │            │ Detect objects          │          │
                        │            ├─[vehicle]               │          │
                        │            ├─[furniture]             │          │
                        │            ├─[machinery]             │          │
                        │            │                         │          │
                        │            ▼                         │          │
                        │   [Save labels to DB]                │          │
                        │    [Estimate value]                  │          │
                        │            │                         │          │
                        │            │                         │          │
                        └────────────┼─────────────────────────┴──────────┘
                                     │
┌────────────────────────────────────┴────────────────────────────────────────┐
│ PHASE 3: NOTIFICATION SYSTEM - USER INTERACTION TRIGGERS                    │
└─────────────────────────────────────────────────────────────────────────────┘
                                     │
                              ┌──────┴──────────────────────────────┐
                              ▼                                     ▼
                    [NotificationService]                [User Decision Points]
                              │                                     │
         ┌────────────────────┼────────────────────┐                │
         ▼                    ▼                    ▼                │
 [Desktop Notify]      [Email Notify]      [Priority Level]         │
  Windows/macOS/         Gmail SMTP            0=Normal             │
   Linux system            (FREE)               1=High              │
       tray                                                         │
         │                    │                    │                │
         └────────────────────┼────────────────────┘                │
                              │                                     │
                              ▼                                     ▼
                    ┌──────────────────┐                  ┌──────────────────┐
                    │ USER INTERACTION │                  │ TRIGGER EVENTS:  │
                    │ NOTIFICATIONS    │                  │                  │
                    └──────────────────┘                  └──────────────────┘
                              │                                     │
         ┌────────────────────┼────────────────────┐                │
         ▼                    ▼                    ▼                │
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐      │
│ 1. BID CHANGE    │ │ 2. OBJECT        │ │ 3. CLOSING       │      │
│                  │ │    DETECTED      │ │    ALERT         │      │
│ "Nieuw bod op    │ │                  │ │                  │      │
│  kavel 12345:    │ │ "Lot contains:   │ │ "Kavel 12345     │      │
│  €150 (was €125)"│ │  - Vehicle       │ │  sluit binnen    │      │
│                  │ │  - Machinery     │ │  5 min."         │      │
│ Priority: NORMAL │ │  Est: €5000"     │ │ Priority: HIGH   │      │
│                  │ │                  │ │                  │      │
│ Action needed:   │ │ Action needed:   │ │ Action needed:   │      │
│ ▸ Place bid?     │ │ ▸ Review item?   │ │ ▸ Place final    │      │
│ ▸ Monitor?       │ │ ▸ Confirm value? │ │   bid?           │      │
│ ▸ Ignore?        │ │ ▸ Add to watch?  │ │ ▸ Let expire?    │      │
└──────────────────┘ └──────────────────┘ └──────────────────┘      │
         │                    │                    │                │
         └────────────────────┼────────────────────┴────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────────────────┐
│                          USER ACTIONS & EXCEPTIONS                          │
├─────────────────────────────────────────────────────────────────────────────┤
│ Additional interaction points:                                              │
│                                                                             │
│ 4. VIEWING DAY QUESTIONS                                                    │
│    "Bezichtiging op [date] - kunt u aanwezig zijn?"                         │
│    Action: ▸ Confirm attendance ▸ Request alternative ▸ Decline             │
│                                                                             │
│ 5. ITEM RECOGNITION CONFIRMATION                                            │
│    "Detected: [object] - Is deze correcte identificatie?"                   │
│    Action: ▸ Confirm ▸ Correct label ▸ Add notes                            │
│                                                                             │
│ 6. VALUE ESTIMATE APPROVAL                                                  │
│    "Geschatte waarde: €X - Akkoord?"                                        │
│    Action: ▸ Accept ▸ Adjust ▸ Request manual review                        │
│                                                                             │
│ 7. EXCEPTION HANDLING                                                       │
│    "Afwijkende sluitingstijd / locatiewijziging / special terms"            │
│    Action: ▸ Acknowledge ▸ Update preferences ▸ Withdraw interest           │
└─────────────────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────────────────┐
│                OBJECT DETECTION & VALUE ESTIMATION PIPELINE                 │
└─────────────────────────────────────────────────────────────────────────────┘

[Downloaded Image] → [ImageProcessingService]
        │                       │
        │                       ▼
        │            [ObjectDetectionService]
        │                       │
        │                       ├─ Load YOLO model
        │                       ├─ Run inference (416x416)
        │                       ├─ Post-process detections
        │                       │  (confidence > 0.5)
        │                       │
        │                       ▼
        │            ┌──────────────────────┐
        │            │ Detected Objects:    │
        │            │   - person           │
        │            │   - car              │
        │            │   - truck            │
        │            │   - furniture        │
        │            │   - machinery        │
        │            │   - electronics      │
        │            │ (80 COCO classes)    │
        │            └──────────────────────┘
        │                       │
        │                       ▼
        │            [Value Estimation Logic]
        │              (Future enhancement)
        │                       │
        │                       ├─ Match objects to auction categories
        │                       ├─ Historical price analysis
        │                       ├─ Condition assessment
        │                       ├─ Market trends
        │                       │
        │                       ▼
        │            ┌──────────────────────┐
        │            │ Estimated Value:     │
        │            │ €X - €Y range        │
        │            │ Confidence: 75%      │
        │            └──────────────────────┘
        │                       │
        └───────────────────────┴─ [Save to DB]
                                        │
                                        ▼
                            [Trigger notification if
                             value > threshold]
```

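
The "Post-process detections (confidence > 0.5)" step in the pipeline can be illustrated without OpenCV. The sketch below assumes the network output has already been decoded into label/confidence pairs; `DetectionFilter` and `Detection` are hypothetical names, and the real `ObjectDetectionService` may work differently.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical post-processing step; not taken from the project's source. */
final class DetectionFilter {

    /** One decoded detection: a class label and its confidence score in [0, 1]. */
    static final class Detection {
        final String label;
        final double confidence;

        Detection(String label, double confidence) {
            this.label = label;
            this.confidence = confidence;
        }
    }

    /** Threshold taken from the pipeline diagram. */
    static final double CONFIDENCE_THRESHOLD = 0.5;

    /** Keeps labels whose confidence exceeds the threshold, without duplicates. */
    static List<String> confidentLabels(List<Detection> detections) {
        return detections.stream()
                .filter(d -> d.confidence > CONFIDENCE_THRESHOLD)
                .map(d -> d.label)
                .distinct()
                .collect(Collectors.toList());
    }
}
```

The resulting label list is the kind of value that could be stored in the `labels` column of the `images` table.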
## Integration Hooks & Timing

| Event | Frequency | Trigger | Notification Type | User Action Required |
|-------|-----------|---------|-------------------|---------------------|
| **New auction discovered** | On scrape | Scraper finds new auction | Desktop + Email (optional) | Review auction |
| **Bid change detected** | Every 1 hour | Monitor detects higher bid | Desktop + Email | Place counter-bid? |
| **Closing soon (< 30 min)** | When detected | Time-based check | Desktop + Email | Review lot |
| **Closing imminent (< 5 min)** | When detected | Time-based check | Desktop + Email (HIGH) | Final bid decision |
| **Object detected** | On image process | YOLO finds objects | Desktop + Email | Confirm identification |
| **Value estimated** | After detection | Estimation complete | Desktop + Email | Approve estimate |
| **Viewing day scheduled** | From lot metadata | Scraper extracts date | Desktop + Email | Confirm attendance |
| **Exception/Change** | On update | Scraper detects change | Desktop + Email (HIGH) | Acknowledge |

## Project Structure

```
src/main/java/com/auction/
├── Main.java                     # Entry point
├── TroostwijkMonitor.java        # Monitoring & orchestration
├── DatabaseService.java          # SQLite operations
├── ScraperDataAdapter.java       # Schema translation (TEXT→INT, €→float)
├── ImageProcessingService.java   # Downloads & processes images
├── ObjectDetectionService.java   # OpenCV YOLO detection
├── NotificationService.java      # Desktop + Email notifications (FREE)
├── Lot.java                      # Domain model for auction lots
├── AuctionInfo.java              # Domain model for auctions
└── Console.java                  # Logging utility
```

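
The `€→float` translation attributed to `ScraperDataAdapter` could look roughly like the sketch below. `BidParser` is a hypothetical name, and the code assumes the scraped text is a euro sign followed by a plain number with an optional decimal point (no thousands separators or decimal commas).

```java
/** Hypothetical converter for scraped bid text; not taken from the project's source. */
final class BidParser {

    /**
     * Converts text such as "€123" into 123.0.
     * Returns 0.0 for null or blank input; throws NumberFormatException
     * when the remaining text is not a plain number.
     */
    static double parseBid(String text) {
        if (text == null) {
            return 0.0;
        }
        // Remove the euro sign (U+20AC) and surrounding whitespace.
        String cleaned = text.replace("\u20AC", "").trim();
        if (cleaned.isEmpty()) {
            return 0.0;
        }
        return Double.parseDouble(cleaned);
    }
}
```

Treating missing text as `0.0` is a design choice for the sketch; an adapter could equally store `NULL` so that "no bid yet" stays distinguishable from a zero bid.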
## Configuration

Edit `TroostwijkScraper.main()` to customize:

- **Database file**: `troostwijk.db` (SQLite database location)
- **YOLO paths**: Model configuration and weights files
- **Monitoring frequency**: Default is every 1 hour
- **Closing alerts**: Default is 5 minutes before closing

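
The closing-alert setting amounts to a time-window check. A minimal sketch using `java.time`, where `ClosingAlert` and `closesSoon` are hypothetical names and the 5-minute window is the documented default:

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical closing-alert check; not taken from the project's source. */
final class ClosingAlert {

    /** Default alert window from the configuration list above. */
    static final Duration ALERT_WINDOW = Duration.ofMinutes(5);

    /** True when the lot is still open and closes in less than the alert window. */
    static boolean closesSoon(Instant closingTime, Instant now) {
        Duration remaining = Duration.between(now, closingTime);
        return !remaining.isNegative() && remaining.compareTo(ALERT_WINDOW) < 0;
    }
}
```

Passing `now` as a parameter keeps the check easy to test. In a monitor loop it would be combined with a per-lot flag (the schema has a `closing_notified` column) so that each lot triggers only one alert.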
## Database Schema

The scraper creates three tables:

**sales**
- `sale_id` (PRIMARY KEY)
- `title`, `location`, `closing_time`

**lots**
- `lot_id` (PRIMARY KEY)
- `sale_id`, `title`, `description`, `manufacturer`, `type`, `year`
- `category`, `current_bid`, `currency`, `url`
- `closing_time`, `closing_notified`

**images**
- `id` (PRIMARY KEY)
- `lot_id`, `url`, `local_path`, `labels` (detected objects)

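
For orientation, the three tables could be created with DDL along the following lines. Only the table and column names come from the list above; every column type, the `DEFAULT`, and the class `SchemaSketch` are assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;

/** Hypothetical schema setup; column types are assumed, not taken from the project. */
final class SchemaSketch {

    static final List<String> DDL = Arrays.asList(
            "CREATE TABLE IF NOT EXISTS sales ("
                    + "sale_id INTEGER PRIMARY KEY, title TEXT, location TEXT, closing_time TEXT)",
            "CREATE TABLE IF NOT EXISTS lots ("
                    + "lot_id INTEGER PRIMARY KEY, sale_id INTEGER, title TEXT, description TEXT, "
                    + "manufacturer TEXT, type TEXT, year INTEGER, category TEXT, "
                    + "current_bid REAL, currency TEXT, url TEXT, closing_time TEXT, "
                    + "closing_notified INTEGER DEFAULT 0)",
            "CREATE TABLE IF NOT EXISTS images ("
                    + "id INTEGER PRIMARY KEY, lot_id INTEGER, url TEXT, "
                    + "local_path TEXT, labels TEXT)");

    /** Runs each statement on an already-open JDBC connection. */
    static void createTables(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : DDL) {
                st.executeUpdate(sql);
            }
        }
    }
}
```

With the SQLite JDBC driver on the classpath, a connection for this would come from `DriverManager.getConnection("jdbc:sqlite:troostwijk.db")`.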
## Notification Examples

Notification text is in Dutch. The bid example below reads "Lot bid update: New bid on lot 12345: €150.00 (was €125.00)", and the closing alert reads "Lot 12345 closes within 5 min."

### Desktop Notification
```
🔔 Kavel bieding update
Nieuw bod op kavel 12345: €150.00 (was €125.00)
```

### Email Notification
```
From: your.email@gmail.com
To: your.email@gmail.com
Subject: [Troostwijk] Kavel bieding update

Nieuw bod op kavel 12345: €150.00 (was €125.00)
```

**High Priority Alerts** (closing soon):
```
⚠️ Lot nearing closure
Kavel 12345 sluit binnen 5 min.
```

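
The bid-change text shown above can be produced with a single format string. In this sketch the class and method names are hypothetical, and `Locale.ROOT` is used so the decimal separator stays a dot regardless of the machine's locale:

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical message builder; not taken from the project's source. */
final class BidMessages {

    /** Returns the notification text, or empty when the bid has not increased. */
    static Optional<String> bidChange(int lotId, double previousBid, double currentBid) {
        if (currentBid <= previousBid) {
            return Optional.empty();
        }
        // \u20AC is the euro sign, escaped so the source file encoding does not matter.
        return Optional.of(String.format(Locale.ROOT,
                "Nieuw bod op kavel %d: \u20AC%.2f (was \u20AC%.2f)",
                lotId, currentBid, previousBid));
    }
}
```

For example, `BidMessages.bidChange(12345, 125.0, 150.0)` yields the message body used in the desktop and email examples.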
## Why This Approach?

- ✅ **100% Free** - No paid services (Twilio, Pushover, etc.)
- ✅ **No External Dependencies** - Desktop notifications built into Java
- ✅ **Works Offline** - Desktop notifications don't need internet
- ✅ **Privacy First** - Your data stays on your machine
- ✅ **Cross-Platform** - Windows, macOS, Linux supported
- ✅ **Optional Email** - Add Gmail notifications if you want

## Troubleshooting

### Desktop Notifications Not Showing

- **Windows**: Check if Java has notification permissions
- **Linux**: Ensure you have a desktop environment running (not headless)
- **macOS**: Check System Preferences → Notifications (System Settings on macOS 13 and later)

### Email Not Sending

1. Verify 2FA is enabled in Google Account
2. Confirm you're using an **App Password** (not your regular Gmail password)
3. You do not need to enable "Less secure app access"; app passwords work with 2FA
4. Verify the SMTP format: `smtp:username:app_password:recipient`

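
When debugging email delivery, it can help to compare the application's settings with the standard Gmail SMTP configuration for JavaMail. The property keys below are JavaMail's documented SMTP settings and the host and port are Gmail's STARTTLS endpoint; the class `GmailSmtp` is a hypothetical wrapper, and whether the project configures its session this way is an assumption.

```java
import java.util.Properties;

/** Hypothetical holder for Gmail SMTP settings; not taken from the project's source. */
final class GmailSmtp {

    /** Builds the session properties for Gmail over STARTTLS on port 587. */
    static Properties properties() {
        Properties props = new Properties();
        props.setProperty("mail.smtp.host", "smtp.gmail.com");
        props.setProperty("mail.smtp.port", "587");
        props.setProperty("mail.smtp.auth", "true");            // log in with the app password
        props.setProperty("mail.smtp.starttls.enable", "true"); // upgrade the connection to TLS
        return props;
    }
}
```

These properties would be passed to `javax.mail.Session.getInstance(...)` together with an authenticator that supplies the Gmail address and app password.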
## Notes

- Desktop notifications require a graphical environment (not headless servers)
- For headless servers, use email-only notifications
- Gmail SMTP is free and has generous limits (500 emails/day)
- OpenCV native libraries must match your platform architecture
- YOLO weights file is ~245 MB

```shell
git add . && git commit -m "all" && git push
```
## License

This is example code for educational purposes.