# Troostwijk Auction Extractor - Run Instructions

## Fixed Warnings

All warnings have been resolved:
- ✅ SLF4J logging configured (`slf4j-simple`)
- ✅ Native access enabled for SQLite JDBC
- ✅ Logging output controlled via `simplelogger.properties`

## Prerequisites

1. **Java 21** installed
2. **Maven** installed
3. **IntelliJ IDEA** (recommended) or command line
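
The first two prerequisites can be checked from a terminal before the first run. The following is a minimal sketch in bash, not part of the project: the helper name `java_major` is hypothetical, it assumes the usual `java -version` banner format (for example `openjdk version "21.0.2" ...`), and it treats any version from 21 upward as acceptable, which is also an assumption.

```bash
# Hypothetical helper: read a `java -version` banner on stdin and print the
# major version number (assumes the usual `version "21.0.2"` banner format).
java_major() {
  sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Report on each tool only if it is actually on the PATH.
if command -v java >/dev/null 2>&1; then
  major=$(java -version 2>&1 | java_major)
  if [ "${major:-0}" -ge 21 ]; then
    echo "Java $major found"
  else
    echo "Java 21 is required (found: ${major:-unknown})"
  fi
else
  echo "java not found on PATH"
fi

if command -v mvn >/dev/null 2>&1; then
  mvn -version | head -n 1
else
  echo "mvn not found on PATH"
fi
```

Note that legacy releases such as Java 8 report themselves as `1.8.x`, so the helper prints `1` for them and the check fails as intended.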

## Setup (First Time Only)

### 1. Install Dependencies

In the IntelliJ terminal or PowerShell:
```bash
# Download dependencies and build the project
mvn clean install

# Install Playwright browser binaries (first time only)
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install"
```

## Running the Application

### Option A: Using IntelliJ IDEA (Easiest)

1. **Add VM Options for native access:**
   - Run → Edit Configurations
   - Select or create a configuration for `TroostwijkAuctionExtractor`
   - In the "VM options" field, add:
     ```
     --enable-native-access=ALL-UNNAMED
     ```

2. **Add Program Arguments (optional):**
   - In the "Program arguments" field, add:
     ```
     --max-visits 3
     ```

3. **Run the application:**
   - Click the green Run button

### Option B: Using Maven (Command Line)

```bash
# Run with the default arguments from pom.xml (3-page limit)
mvn exec:java

# Run with custom arguments (override pom.xml defaults)
mvn exec:java -Dexec.args="--max-visits 5"

# Run without cache
mvn exec:java -Dexec.args="--no-cache --max-visits 2"

# Run with unlimited visits
mvn exec:java -Dexec.args=""
```

### Option C: Using Java Directly

```bash
# Compile first
mvn clean compile

# Run with native access enabled
# (Unix-like shell; on Windows the classpath separator is ';' and /dev/stdout is unavailable)
java --enable-native-access=ALL-UNNAMED \
  -cp "target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q)" \
  com.auction.TroostwijkAuctionExtractor --max-visits 3
```

## Command Line Arguments

```
--max-visits <n>    Limit actual page fetches to n (0 = unlimited, default)
--no-cache          Disable page caching
--help              Show help message
```
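
When running through Maven, these flags are passed together as one `-Dexec.args` string. The sketch below shows one way to assemble that string in bash. The helper name `build_exec_args` is hypothetical and not part of the project; only the flags themselves come from the list above.

```bash
# Hypothetical helper: build the value for -Dexec.args from two settings.
#   $1 = max visits (0 or omitted = unlimited, so the flag is left out)
#   $2 = "true" to add --no-cache (default "false")
build_exec_args() {
  local max_visits="${1:-0}" no_cache="${2:-false}" args=""
  if [ "$no_cache" = "true" ]; then
    args="--no-cache"
  fi
  if [ "$max_visits" -gt 0 ]; then
    # Add a separating space only when args already holds a flag.
    args="${args:+$args }--max-visits $max_visits"
  fi
  printf '%s' "$args"
}

# Print the resulting Maven command instead of running it.
echo "mvn exec:java -Dexec.args=\"$(build_exec_args 5 true)\""
# prints: mvn exec:java -Dexec.args="--no-cache --max-visits 5"
```

Leaving out `--max-visits` for the unlimited case relies on the documented default of 0.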

## Examples

### Test with 3 page visits (cached pages don't count):
```bash
mvn exec:java -Dexec.args="--max-visits 3"
```

### Fresh extraction without cache:
```bash
mvn exec:java -Dexec.args="--no-cache --max-visits 5"
```

### Full extraction (all pages, unlimited):
```bash
mvn exec:java -Dexec.args=""
```

## Expected Output (No Warnings)

```
=== Troostwijk Auction Extractor ===
Max page visits set to: 3

Initializing Playwright browser...
✓ Browser ready
✓ Cache database initialized

Starting auction extraction from https://www.troostwijkauctions.com/auctions

[Page 1] Fetching auctions...
✓ Fetched from website (visit 1/3)
✓ Found 20 auctions

[Page 2] Fetching auctions...
✓ Loaded from cache
✓ Found 20 auctions

[Page 3] Fetching auctions...
✓ Fetched from website (visit 2/3)
✓ Found 20 auctions

✓ Total auctions extracted: 60

=== Results ===
Total auctions found: 60
Dutch auctions (NL): 45
Actual page visits: 2

✓ Browser and cache closed
```

## Cache Management

- Cache is stored in: `cache/page_cache.db`
- Cache expires after: 24 hours (configurable in code)
- To clear the cache: delete the `cache/page_cache.db` file
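
Clearing the cache can be scripted. The following is a minimal bash sketch, assuming it is run from the project root so that the relative path above resolves; the helper name `clear_cache` is hypothetical and not part of the project.

```bash
# Hypothetical helper: delete the cache database if it exists.
#   $1 = path to the cache file (defaults to the location listed above)
clear_cache() {
  local db="${1:-cache/page_cache.db}"
  if [ -f "$db" ]; then
    rm -f -- "$db"
    echo "Removed $db"
  else
    echo "No cache file at $db"
  fi
}

clear_cache
```

To skip the cache for a single run without deleting anything, the `--no-cache` flag is the lighter option.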

## Troubleshooting

### If you still see warnings:

1. **Reload Maven project in IntelliJ:**
   - Right-click `pom.xml` → Maven → Reload project

2. **Verify VM options:**
   - Ensure `--enable-native-access=ALL-UNNAMED` is in the run configuration's VM options

3. **Clean and rebuild:**
   ```bash
   mvn clean install
   ```

### If Playwright fails:

```bash
# Reinstall the Chromium browser binaries
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install chromium"
```