# Troostwijk Auction Extractor - Run Instructions

## Fixed Warnings

All warnings have been resolved:

- ✅ SLF4J logging configured (slf4j-simple)
- ✅ Native access enabled for SQLite JDBC
- ✅ Logging output controlled via simplelogger.properties

## Prerequisites

1. **Java 21** installed
2. **Maven** installed
3. **IntelliJ IDEA** (recommended) or command line

## Setup (First Time Only)

### 1. Install Dependencies

In IntelliJ Terminal or PowerShell:

```bash
# Download dependencies and build the project
mvn clean install

# Install Playwright browser binaries (first time only)
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install"
```

## Running the Application

### Option A: Using IntelliJ IDEA (Easiest)

1. **Add VM Options for native access:**
   - Run → Edit Configurations
   - Select or create configuration for `TroostwijkAuctionExtractor`
   - In "VM options" field, add:
     ```
     --enable-native-access=ALL-UNNAMED
     ```

2. **Add Program Arguments (optional):**
   - In "Program arguments" field, add:
     ```
     --max-visits 3
     ```

3. **Run the application:**
   - Click the green Run button

### Option B: Using Maven (Command Line)

```bash
# Run with the pom.xml defaults (3 page limit)
mvn exec:java

# Run with custom arguments (override pom.xml defaults)
mvn exec:java -Dexec.args="--max-visits 5"

# Run without cache
mvn exec:java -Dexec.args="--no-cache --max-visits 2"

# Run with unlimited visits
mvn exec:java -Dexec.args=""
```

### Option C: Using Java Directly

```bash
# Compile first
mvn clean compile

# Run with native access enabled (Unix shell syntax)
java --enable-native-access=ALL-UNNAMED \
  -cp target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q) \
  com.auction.TroostwijkAuctionExtractor --max-visits 3
```

## Command Line Arguments

```
--max-visits <n>   Limit actual page fetches to n (0 = unlimited, default)
--no-cache         Disable page caching
--help             Show help message
```

## Examples

### Test with 3 page visits (cached pages don't count):

```bash
mvn exec:java -Dexec.args="--max-visits 3"
```

### Fresh extraction without cache:

```bash
mvn exec:java -Dexec.args="--no-cache --max-visits 5"
```

### Full extraction (all pages, unlimited):

```bash
mvn exec:java -Dexec.args=""
```

## Expected Output (No Warnings)

```
=== Troostwijk Auction Extractor ===
Max page visits set to: 3
Initializing Playwright browser...
✓ Browser ready
✓ Cache database initialized
Starting auction extraction from https://www.troostwijkauctions.com/auctions
[Page 1] Fetching auctions...
✓ Fetched from website (visit 1/3)
✓ Found 20 auctions
[Page 2] Fetching auctions...
✓ Loaded from cache
✓ Found 20 auctions
[Page 3] Fetching auctions...
✓ Fetched from website (visit 2/3)
✓ Found 20 auctions
✓ Total auctions extracted: 60
=== Results ===
Total auctions found: 60
Dutch auctions (NL): 45
Actual page visits: 2
✓ Browser and cache closed
```

## Cache Management

- Cache is stored in: `cache/page_cache.db`
- Cache expires after: 24 hours (configurable in code)
- To clear cache: Delete the `cache/page_cache.db` file

## Troubleshooting

### If you still see warnings:

1. **Reload Maven project in IntelliJ:**
   - Right-click `pom.xml` → Maven → Reload project

2. **Verify VM options:**
   - Ensure `--enable-native-access=ALL-UNNAMED` is in VM options

3. **Clean and rebuild:**
   ```bash
   mvn clean install
   ```

### If Playwright fails:

```bash
# Reinstall browser binaries
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install chromium"
```
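
### Optional: clearing the cache from the shell

The manual step from Cache Management (deleting `cache/page_cache.db`) can be wrapped in a small shell function. This is a minimal sketch, not part of the project: the function name `clear_cache` is hypothetical, and the default path is the one listed under Cache Management.

```bash
# Hypothetical helper (not shipped with the project): removes the page cache file.
# Default path is taken from the Cache Management section; pass another path to override it.
clear_cache() {
  local cache_db="${1:-cache/page_cache.db}"
  if [ -f "$cache_db" ]; then
    rm -f -- "$cache_db"
    echo "Removed $cache_db"
  else
    echo "No cache file at $cache_db (nothing to do)"
  fi
}
```

After defining the function in the current shell session, running `clear_cache` from the project root before `mvn exec:java -Dexec.args="--max-visits 3"` should make every page count as an actual fetch, since nothing can be loaded from cache. Unlike `--no-cache`, which disables page caching for a run, this only deletes the stored file.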