Age | Commit message | Author | Files | Lines
23 hours | feat: add -wiki-only flag to rerun only wiki data extraction | dev | 2 | -11/+46
24 hours | refactor: pipeline SPARQL and wiki data in parallel | dev | 3 | -166/+169
24 hours | . | dev | 1 | -8/+0
25 hours | refactor: decode wiki_article names once in DB, encode on send | dev | 1 | -4/+0
25 hours | fix: decode wiki article names for clean storage | dev | 2 | -4/+7
25 hours | fix: avoid double URL-encoding of wiki article names | dev | 1 | -4/+11
26 hours | feat: track wiki_status_code and skip 404 entries on rerun | dev | 1 | -23/+47
30 hours | fix: add 429 retry with exponential backoff and increase rate limit delay | dev | 1 | -9/+32
30 hours | feat: fetch missing wiki data from custom server and populate imdb table | dev | 3 | -0/+291
30 hours | fix: skip already-classified entries in wikidata query | dev | 1 | -1/+1
31 hours | feat: set has_no_wiki_article flag for entries without Wikipedia article | dev | 1 | -13/+34
31 hours | feat: fetch Wikipedia article titles via Wikidata SPARQL | dev | 2 | -0/+243
3 days | fix: use INSERT IGNORE for imdb_genre to handle re-runs | dev | 1 | -1/+1
3 days | feat: adapt genre code for n:m relation via imdb_genre | dev | 1 | -9/+29
3 days | feat: populate genre table from title.basics.tsv | dev | 1 | -11/+47
3 days | fix: correct TSV parsing — use line-by-line reader and proper column indices | dev | 1 | -30/+57
3 days | chore: delete .gz files after extracting in downloadImdbDatasets | dev | 1 | -0/+3
3 days | move download path | dev | 1 | -1/+1
3 days | feat: fetchAndUpdateImdbData — download IMDB datasets and populate imdb table | dev | 2 | -0/+352
3 days | feat: populate imdb table with unique title IDs from links | dev | 1 | -0/+91
3 days | feat: extract IMDB title IDs from links URLs into param field | dev | 3 | -15/+87
3 days | feat: add AccessToken back to Config struct (json:"-" to exclude from seriali... | dev | 1 | -0/+1
3 days | chore: remove access_token from config (calculated by program) | dev | 2 | -4/+0
3 days | feat: switch config to JSON; add go.mod and config.json.example | dev | 4 | -88/+57
3 days | chore: commit existing config.go changes | dev | 1 | -1/+2
3 days | Initial commit | dev | 2 | -0/+143