- Hardened the GraphQL client to reduce 403 occurrences and provide clearer diagnostics when they appear.
- Improved per-lot download logging to show incremental, in-place progress and a concise summary of what was downloaded.
### Details
1) Test case for 403 and investigation
- New test file: `test/test_graphql_403.py`.
- Uses `importlib` to load `src/config.py` and `src/graphql_client.py` directly so it’s independent of sys.path quirks.
- Mocks `aiohttp.ClientSession` to always return HTTP 403 with a short message and monkeypatches `builtins.print` to capture logs.
- Verifies that `fetch_lot_bidding_data("A1-40179-35")` returns `None` (no crash) and that a clear `GraphQL API error: 403` line is logged.
- Result: `pytest test/test_graphql_403.py -q` passes locally.
- Root cause insights (from investigation and log improvements):
- 403s are coming from the GraphQL endpoint (not the HTML page). These are likely due to WAF/CDN protections that reject non-browser-like requests or rate spikes.
- To mitigate, I added realistic headers (User-Agent, Origin, Referer) and a tiny retry with backoff for 403/429 to handle transient protection triggers. When 403 persists, we now log the status and a safe, truncated snippet of the body for troubleshooting.
2) Incremental/in-place logging for downloads
- Updated `src/scraper.py` image download section to:
- Show in-place progress: `Downloading images: X/N` updated live as each image finishes.
- After completion, print: `Downloaded: K/N new images`.
- Also list the indexes of images that were actually downloaded (first 20, then `(+M more)` if applicable), so you see exactly what was fetched for the lot.
3) GraphQL client improvements
- Updated `src/graphql_client.py`:
- Added browser-like headers and contextual Referer.
- Added small retry with backoff for 403/429.
- Improved error logs to include status, lot id, and a short body snippet.
### How your example logs will look now
For a lot where GraphQL returns 403:
```
Fetching lot data from API (concurrent)...
GraphQL API error: 403 (lot=A1-40179-35) — Forbidden by WAF
```
For image downloads:
```
Images: 6
Downloading images: 0/6
... 6/6
Downloaded: 6/6 new images
Indexes: 0, 1, 2, 3, 4, 5
```
(When all cached: `All 6 images already cached`)
### Notes
- Full test run surfaced a pre-existing import error in `test/test_scraper.py` (unrelated to these changes). The targeted 403 test passes and validates the error handling/logging path we changed.
- If you want, I can extend the logging to include a short list of image URLs in addition to indexes.
160 lines
4.2 KiB
Markdown
160 lines
4.2 KiB
Markdown
# Dashboard Upgrade Plan
|
|
|
|
## Executive Summary
|
|
**5 new intelligence fields** enable advanced opportunity detection and analytics. Run migrations to activate.
|
|
|
|
---
|
|
|
|
## New Intelligence Fields
|
|
|
|
| Field | Type | Coverage | Value | Use Cases |
|
|
|-------------------------|---------|--------------------------|-------|-----------------------------------------|
|
|
| **followers_count** | INTEGER | 100% future, 0% existing | ⭐⭐⭐⭐⭐ | Popularity tracking, sleeper detection |
|
|
| **estimated_min_price** | REAL | 100% future, 0% existing | ⭐⭐⭐⭐⭐ | Bargain detection, value gap analysis |
|
|
| **estimated_max_price** | REAL | 100% future, 0% existing | ⭐⭐⭐⭐⭐ | Overvaluation alerts, ROI calculation |
|
|
| **lot_condition** | TEXT | ~85% future | ⭐⭐⭐ | Quality filtering, condition scoring |
|
|
| **appearance** | TEXT | ~85% future | ⭐⭐⭐ | Visual assessment, restoration projects |
|
|
|
|
### Key Metrics Enabled
|
|
- Interest-to-bid conversion rate
|
|
- Auction house estimation accuracy
|
|
- Bargain/overvaluation detection
|
|
- Price prediction models
|
|
|
|
---
|
|
|
|
## Data Quality Fixes ✅
|
|
**Orphaned lots:** 16,807 → 13 (99.9% fixed)
|
|
**Auction completeness:** 0% → 100% (lots_count, first_lot_closing_time)
|
|
|
|
---
|
|
|
|
## Dashboard Upgrades
|
|
|
|
### Priority 1: Opportunity Detection (High ROI)
|
|
|
|
**1.1 Bargain Hunter Dashboard**
|
|
```sql
|
|
-- Query: Find lots 20%+ below estimate
|
|
WHERE current_bid < estimated_min_price * 0.80
|
|
AND followers_count > 3
|
|
AND closing_time > NOW()
|
|
```
|
|
**Alert logic:** `value_gap = estimated_min - current_bid`
|
|
|
|
**1.2 Sleeper Lots**
|
|
```sql
|
|
-- Query: High interest, no bids, <24h left
|
|
WHERE followers_count > 10
|
|
AND bid_count = 0
|
|
AND hours_remaining < 24
|
|
```
|
|
|
|
**1.3 Value Gap Heatmap**
|
|
- Great deals: <80% of estimate
|
|
- Fair price: 80-120% of estimate
|
|
- Overvalued: >120% of estimate
|
|
|
|
### Priority 2: Intelligence Analytics
|
|
|
|
**2.1 Enhanced Lot Card**
|
|
```
|
|
Bidding: €500 current | 12 followers | 8 bids | 2.4/hr
|
|
Valuation: €1,200-€1,800 est | €700 value gap | €700-€1,300 potential profit
|
|
Condition: Used - Good | Normal wear
|
|
Timing: 2h 15m left | First: Dec 6 09:15 | Last: Dec 8 12:10
|
|
```
|
|
|
|
**2.2 Auction House Accuracy**
|
|
```sql
|
|
-- Post-auction analysis
|
|
SELECT category,
|
|
AVG(ABS(final - midpoint)/midpoint * 100) as accuracy,
|
|
AVG(final - midpoint) as bias
|
|
FROM lots WHERE final_price IS NOT NULL
|
|
GROUP BY category
|
|
```
|
|
|
|
**2.3 Interest Conversion Rate**
|
|
```sql
|
|
SELECT
|
|
COUNT(*) total,
|
|
COUNT(CASE WHEN followers > 0 THEN 1) as with_followers,
|
|
COUNT(CASE WHEN bids > 0 THEN 1) as with_bids,
|
|
ROUND(with_bids / with_followers * 100, 2) as conversion_rate
|
|
FROM lots
|
|
```
|
|
|
|
### Priority 3: Real-Time Alerts
|
|
|
|
```python
|
|
BARGAIN: current_bid < estimated_min * 0.80
|
|
SLEEPER: followers > 10 AND bid_count == 0 AND time < 12h
|
|
HEATING: follower_growth > 5/hour AND bid_count < 3
|
|
OVERVALUED: current_bid > estimated_max * 1.2
|
|
```
|
|
|
|
### Priority 4: Advanced Analytics
|
|
|
|
**4.1 Price Prediction Model**
|
|
```python
|
|
features = [
|
|
'followers_count',
|
|
'estimated_min_price',
|
|
'estimated_max_price',
|
|
'lot_condition',
|
|
'bid_velocity',
|
|
'category'
|
|
]
|
|
predicted_price = model.predict(features)
|
|
```
|
|
|
|
**4.2 Category Intelligence**
|
|
- Avg followers per category
|
|
- Bid rate vs follower rate
|
|
- Bargain rate by category
|
|
|
|
---
|
|
|
|
## Database Queries
|
|
|
|
### Get Bargains
|
|
```sql
|
|
SELECT lot_id, title, current_bid, estimated_min_price,
|
|
(estimated_min_price - current_bid)/estimated_min_price*100 as bargain_score
|
|
FROM lots
|
|
WHERE current_bid < estimated_min_price * 0.80
|
|
AND LOT>$10,000 in identified opportunities
|
|
```
|
|
|
|
---
|
|
|
|
## Next Steps
|
|
|
|
**Today:**
|
|
```bash
|
|
# Run to activate all features
|
|
python enrich_existing_lots.py # ~2.3 hrs
|
|
python fetch_missing_bid_history.py # ~15 min
|
|
```
|
|
|
|
**This Week:**
|
|
1. Implement Bargain Hunter Dashboard
|
|
2. Add opportunity alerts
|
|
3. Create enhanced lot cards
|
|
|
|
**Next Week:**
|
|
1. Build analytics dashboards
|
|
2. Implement ML price prediction
|
|
3. Set up smart notifications
|
|
|
|
---
|
|
|
|
## Conclusion
|
|
**80%+ intelligence increase** enables:
|
|
- 🎯 Automated bargain detection
|
|
- 📊 Predictive price modeling
|
|
- ⚡ Real-time opportunity alerts
|
|
- 💰 ROI tracking
|
|
|
|
**Run migrations to activate all features.** |