path: root/src/wikiarticle.go
Age | Commit message | Author | Files | Lines
30 hours | fix: decode wiki article names for clean storage | dev | 1 | -3/+6
- wikidata.go: url.PathUnescape SPARQL titles before storing
- wikiarticle.go: PathUnescape on read, PathEscape on send
- DB holds decoded names; URLs are always freshly encoded
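The decode-on-read, encode-on-send rule described in this commit can be sketched as follows. This is a minimal illustration, not the code from wikiarticle.go: the helper names `decodeArticle` and `encodeArticle` are hypothetical, and falling back to the raw value on a malformed escape is an assumption.

```go
package main

import "net/url"

// decodeArticle (hypothetical helper) turns a percent-encoded article name
// into the decoded form that would be stored in the DB.
// Assumption: a name that is not valid percent-encoding is kept unchanged.
func decodeArticle(raw string) string {
	decoded, err := url.PathUnescape(raw)
	if err != nil {
		return raw
	}
	return decoded
}

// encodeArticle (hypothetical helper) percent-encodes a decoded name
// immediately before it is placed in a request URL.
func encodeArticle(name string) string {
	return url.PathEscape(name)
}
```

Because stored names are always decoded, `encodeArticle` is applied exactly once per request, so a `%` in a URL never gets escaped a second time.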
30 hours | fix: avoid double URL-encoding of wiki article names | dev | 1 | -4/+11
- wiki_article values are already URL-encoded in the DB
- Build query URL manually instead of url.Values.Encode()
- Only escape username (not pre-encoded)
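The double-encoding problem this commit addresses can be reproduced in a few lines. The sketch below is an assumption-laden illustration: the function names and the `article` and `user` parameter names are hypothetical, since the log does not show the real query format.

```go
package main

import "net/url"

// buildQueryURL assembles the query string by hand. The article value is
// assumed to be percent-encoded already, so it is inserted verbatim; only
// the username is escaped. Parameter names are hypothetical.
func buildQueryURL(server, encodedArticle, username string) string {
	return server + "?article=" + encodedArticle + "&user=" + url.QueryEscape(username)
}

// buildQueryURLNaive shows the failure mode: url.Values.Encode escapes the
// '%' of an already-encoded value, so %C3 becomes %25C3.
func buildQueryURLNaive(server, encodedArticle, username string) string {
	v := url.Values{}
	v.Set("article", encodedArticle)
	v.Set("user", username)
	return server + "?" + v.Encode()
}
```

For the pre-encoded value `Am%C3%A9lie`, the manual builder keeps `article=Am%C3%A9lie`, while the naive one produces `article=Am%25C3%25A9lie`.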
31 hours | feat: track wiki_status_code and skip 404 entries on rerun | dev | 1 | -23/+47
- queryWikiArticle returns HTTP status code alongside entry data
- Always record wiki_status_code for every request (success or failure)
- Skip entries with wiki_status_code = 404 in future runs
- Only update data fields on HTTP 200; non-200 only records status
- Log line shows updated vs skipped (non-200) counts
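The status-handling rules listed in this commit can be sketched as two small functions. The `entry` type and both function names are hypothetical stand-ins for the real row handling, and only one data field is shown for brevity.

```go
package main

import "net/http"

// entry is a hypothetical in-memory view of one table row.
type entry struct {
	Article        string
	Description    string
	WikiStatusCode int // 0 is assumed to mean "never queried"
}

// shouldQuery reports whether an entry is attempted on this run:
// rows already marked 404 are skipped.
func shouldQuery(e entry) bool {
	return e.WikiStatusCode != http.StatusNotFound
}

// applyResult always records the status code, but copies data fields only
// on HTTP 200. It returns true when data was updated, which lets the
// caller count "updated" versus "skipped (non-200)" for the log line.
func applyResult(e *entry, status int, description string) bool {
	e.WikiStatusCode = status
	if status != http.StatusOK {
		return false
	}
	e.Description = description
	return true
}
```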
35 hours | fix: add 429 retry with exponential backoff and increase rate limit delay | dev | 1 | -9/+32
- Retry up to 5 times on HTTP 429 with 2s/4s/8s/16s backoff
- Move inter-request delay before each request (was after)
- Increase base delay from 1s to 2s between requests
- Fix: only sleep after first request (skip delay on first call)
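The retry and delay behaviour described in this commit can be sketched as below. It assumes that "up to 5 times" means five attempts in total with four waits between them, which matches the four listed delays; the function names are hypothetical, and the request and sleep functions are injected so the logic can run without a network or real waiting.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// backoffDelay returns the wait after failed attempt n (1-based):
// 2s, 4s, 8s, 16s.
func backoffDelay(n int) time.Duration {
	return time.Duration(1<<uint(n)) * time.Second
}

// interRequestDelay returns the pause taken before request i (0-based):
// nothing before the first request, 2s before every later one.
func interRequestDelay(i int) time.Duration {
	if i == 0 {
		return 0
	}
	return 2 * time.Second
}

// doWithRetry calls do until it returns a status other than 429 or
// maxAttempts is reached, sleeping with exponential backoff in between.
func doWithRetry(do func() (int, error), sleep func(time.Duration), maxAttempts int) (int, error) {
	for attempt := 1; ; attempt++ {
		status, err := do()
		if err != nil {
			return 0, err
		}
		if status != http.StatusTooManyRequests {
			return status, nil
		}
		if attempt >= maxAttempts {
			return status, fmt.Errorf("still rate limited after %d attempts", attempt)
		}
		sleep(backoffDelay(attempt))
	}
}
```

In real use, `time.Sleep` would be passed as `sleep` and `do` would wrap the HTTP request.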
35 hours | feat: fetch missing wiki data from custom server and populate imdb table | dev | 1 | -0/+283
- Add wiki_server and wiki_username config fields
- Query custom server for each wiki_article entry
- Extract description, synopsis (Plot), year, poster_url, license, license_url, num_accolades from structured JSON response
- Serial processing with 1 req/s rate limit
- Update only entries missing at least one target column
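The extraction and "missing column" check from this commit might look roughly like the sketch below. The log does not show the server's JSON layout, so the flat object with keys named after the target columns is purely an assumption, as are the `wikiEntry`, `parseEntry`, and `needsUpdate` names.

```go
package main

import "encoding/json"

// wikiEntry mirrors the target columns named in the commit message.
// The JSON keys are hypothetical; the real response layout is not shown.
type wikiEntry struct {
	Description  string `json:"description"`
	Synopsis     string `json:"synopsis"`
	Year         int    `json:"year"`
	PosterURL    string `json:"poster_url"`
	License      string `json:"license"`
	LicenseURL   string `json:"license_url"`
	NumAccolades *int   `json:"num_accolades"` // pointer: nil means missing, 0 is a real value
}

// parseEntry decodes one JSON response body into a wikiEntry.
func parseEntry(body []byte) (wikiEntry, error) {
	var e wikiEntry
	err := json.Unmarshal(body, &e)
	return e, err
}

// needsUpdate reports whether at least one target column is still missing.
// Assumption: an empty string or a zero year stands for a missing value.
func needsUpdate(e wikiEntry) bool {
	return e.Description == "" || e.Synopsis == "" || e.Year == 0 ||
		e.PosterURL == "" || e.License == "" || e.LicenseURL == "" ||
		e.NumAccolades == nil
}
```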