auctiora/RUN_INSTRUCTIONS.md
2025-11-28 05:16:51 +01:00

Troostwijk Auction Extractor - Run Instructions

Fixed Warnings

All warnings have been resolved through the following changes:

  • SLF4J logging configured (slf4j-simple)
  • Native access enabled for SQLite JDBC
  • Logging output controlled via simplelogger.properties

Prerequisites

  1. Java 21 installed
  2. Maven installed
  3. IntelliJ IDEA (recommended) or command line

Setup (First Time Only)

1. Install Dependencies

In IntelliJ Terminal or PowerShell:

# Download Maven dependencies and build the project
mvn clean install

# Install Playwright browser binaries (first time only)
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install"

Running the Application

Option A: Using IntelliJ IDEA (Easiest)

  1. Add VM Options for native access:

    • Run → Edit Configurations
    • Select or create configuration for TroostwijkAuctionExtractor
    • In "VM options" field, add:
      --enable-native-access=ALL-UNNAMED
      
  2. Add Program Arguments (optional):

    • In "Program arguments" field, add:
      --max-visits 3
      
  3. Run the application:

    • Click the green Run button

Option B: Using Maven (Command Line)

# Run with the pom.xml defaults (3-page limit)
mvn exec:java

# Run with custom arguments (override pom.xml defaults)
mvn exec:java -Dexec.args="--max-visits 5"

# Run without cache
mvn exec:java -Dexec.args="--no-cache --max-visits 2"

# Run with unlimited visits
mvn exec:java -Dexec.args=""

Option C: Using Java Directly

# Compile first
mvn clean compile

# Run with native access enabled
# (bash/zsh syntax: the ':' classpath separator, $(...) and /dev/stdout are Unix-specific)
java --enable-native-access=ALL-UNNAMED \
  -cp target/classes:$(mvn dependency:build-classpath -Dmdep.outputFile=/dev/stdout -q) \
  com.auction.TroostwijkAuctionExtractor --max-visits 3

Command Line Arguments

--max-visits <n>   Limit actual page fetches to n (default: 0 = unlimited)
--no-cache         Disable page caching
--help             Show help message
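
To make the semantics of the three flags concrete, here is a minimal parsing sketch. The `CliOptions` record and its `parse` method are hypothetical names chosen for illustration; the extractor's real parsing code is not shown in this document and may differ, for example in how it reports invalid input.

```java
/**
 * Hypothetical holder for the three documented options.
 * Illustration only; not the extractor's actual parsing code.
 */
record CliOptions(int maxVisits, boolean useCache, boolean showHelp) {

    /** Parses --max-visits <n>, --no-cache and --help; rejects anything else. */
    static CliOptions parse(String[] args) {
        int maxVisits = 0;        // 0 = unlimited, the documented default
        boolean useCache = true;  // caching stays on unless --no-cache is given
        boolean showHelp = false;

        for (int i = 0; i < args.length; i++) {
            switch (args[i]) {
                case "--max-visits" -> {
                    if (i + 1 >= args.length) {
                        throw new IllegalArgumentException("--max-visits requires a number");
                    }
                    // Consume the value that follows the flag.
                    maxVisits = Integer.parseInt(args[++i]);
                    if (maxVisits < 0) {
                        throw new IllegalArgumentException("--max-visits must be >= 0");
                    }
                }
                case "--no-cache" -> useCache = false;
                case "--help" -> showHelp = true;
                default -> throw new IllegalArgumentException("Unknown argument: " + args[i]);
            }
        }
        return new CliOptions(maxVisits, useCache, showHelp);
    }

    public static void main(String[] args) {
        // Prints the parsed options for whatever arguments were passed in.
        System.out.println(parse(args));
    }
}
```

With no arguments the sketch yields an unlimited visit budget with caching enabled, which matches the defaults listed above.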

Examples

Test with 3 page visits (cached pages don't count):

mvn exec:java -Dexec.args="--max-visits 3"

Fresh extraction without cache:

mvn exec:java -Dexec.args="--no-cache --max-visits 5"

Full extraction (all pages, unlimited):

mvn exec:java -Dexec.args=""

Expected Output (No Warnings)

=== Troostwijk Auction Extractor ===
Max page visits set to: 3

Initializing Playwright browser...
✓ Browser ready
✓ Cache database initialized

Starting auction extraction from https://www.troostwijkauctions.com/auctions

[Page 1] Fetching auctions...
  ✓ Fetched from website (visit 1/3)
  ✓ Found 20 auctions

[Page 2] Fetching auctions...
  ✓ Loaded from cache
  ✓ Found 20 auctions

[Page 3] Fetching auctions...
  ✓ Fetched from website (visit 2/3)
  ✓ Found 20 auctions

✓ Total auctions extracted: 60

=== Results ===
Total auctions found: 60
Dutch auctions (NL): 45
Actual page visits: 2

✓ Browser and cache closed
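
The sample output reports only 2 actual visits for 3 pages because the cached page does not consume the visit budget. The following sketch illustrates that accounting under stated assumptions: `BudgetedFetcher`, `preload`, and the `loader` function (a stand-in for the Playwright fetch) are hypothetical names, and returning an empty result once the budget is spent is an assumed behavior, not necessarily what the extractor does.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch: cache hits are free, only real fetches count as visits. */
class BudgetedFetcher {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> loader; // stand-in for the real browser fetch
    private final int maxVisits;                   // 0 = unlimited
    private int visits = 0;

    BudgetedFetcher(int maxVisits, Function<String, String> loader) {
        this.maxVisits = maxVisits;
        this.loader = loader;
    }

    /** Seeds the cache, as if the page had been stored by an earlier run. */
    void preload(String url, String html) {
        cache.put(url, html);
    }

    /** Returns the page, or empty if it is uncached and the budget is spent (assumed behavior). */
    Optional<String> fetch(String url) {
        String cached = cache.get(url);
        if (cached != null) {
            return Optional.of(cached); // cache hit: no visit counted
        }
        if (maxVisits > 0 && visits >= maxVisits) {
            return Optional.empty();    // budget exhausted
        }
        String html = loader.apply(url);
        visits++;
        cache.put(url, html);
        return Optional.of(html);
    }

    int visits() {
        return visits;
    }

    public static void main(String[] args) {
        BudgetedFetcher fetcher = new BudgetedFetcher(3, url -> "<html>" + url + "</html>");
        fetcher.preload("page-2", "<html>cached</html>");
        for (String url : List.of("page-1", "page-2", "page-3")) {
            fetcher.fetch(url);
        }
        System.out.println("Actual page visits: " + fetcher.visits()); // prints 2
    }
}
```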

Cache Management

  • Cache is stored in: cache/page_cache.db
  • Cache expires after: 24 hours (configurable in code)
  • To clear cache: Delete cache/page_cache.db file
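
As a rough illustration of the 24-hour expiry rule and of clearing the cache programmatically, consider the sketch below. `CacheMaintenance`, `isExpired`, and `clearCache` are hypothetical names; how the extractor actually stores timestamps in `page_cache.db` is not described here, so the expiry check simply compares two instants.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper illustrating cache expiry and cleanup; not the extractor's code. */
final class CacheMaintenance {
    /** Expiry window documented above. */
    static final Duration TTL = Duration.ofHours(24);

    /** Cache location documented above, relative to the working directory. */
    static final Path CACHE_DB = Path.of("cache", "page_cache.db");

    private CacheMaintenance() {
    }

    /** An entry is stale once strictly more than TTL has passed since it was stored. */
    static boolean isExpired(Instant storedAt, Instant now) {
        return Duration.between(storedAt, now).compareTo(TTL) > 0;
    }

    /** Deletes the given cache file; returns false if there was nothing to delete. */
    static boolean clearCache(Path db) throws IOException {
        return Files.deleteIfExists(db);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        // Only demonstrates the expiry check; call clearCache(CACHE_DB) to remove the file.
        System.out.println("25h old entry expired: " + isExpired(now.minus(Duration.ofHours(25)), now));
        System.out.println("1h old entry expired: " + isExpired(now.minus(Duration.ofHours(1)), now));
    }
}
```

Deleting the file by hand, as described above, has the same effect as calling `clearCache(CACHE_DB)` in this sketch.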

Troubleshooting

If you still see warnings:

  1. Reload Maven project in IntelliJ:

    • Right-click pom.xml → Maven → Reload project
  2. Verify VM options:

    • Ensure --enable-native-access=ALL-UNNAMED is in VM options
  3. Clean and rebuild:

    mvn clean install
    

If Playwright fails:

# Reinstall browser binaries
mvn exec:java -e -Dexec.mainClass=com.microsoft.playwright.CLI -Dexec.args="install chromium"