| Age | Commit message | Author | Files | Lines | |
|---|---|---|---|---|---|
| 3 days | fix: correct TSV parsing — use line-by-line reader and proper column indices | dev | 1 | -30/+57 | |
| - Replace csv.Reader with bufio.Scanner to avoid quote-parsing issues that skipped ~355 entries (e.g. tt1853728 was on line 4.8M and was lost after csv.Reader hit malformed quoted fields earlier in the file)<br>- Fix column indices: startYear=rec[5], runtimeMinutes=rec[7] (previously rec[4]/rec[5], which map to isAdult/startYear)<br>- Update basics for ALL imdb entries, not just those missing ratings | |||||
| 3 days | chore: delete .gz files after extracting in downloadImdbDatasets | dev | 1 | -0/+3 | |
| 3 days | move download path | dev | 1 | -1/+1 | |
| 3 days | feat: fetchAndUpdateImdbData — download IMDB datasets and populate imdb table | dev | 1 | -0/+348 | |
| - Check for imdb entries with NULL average_rating<br>- Download title.basics.tsv.gz and title.ratings.tsv.gz to imdbdata/<br>- Decompress alongside gzip originals<br>- Parse only rows matching our imdb_ids (memory-efficient)<br>- Update: average_rating, num_votes, title_type, primary_title, original_title, start_year, runtime_minutes<br>- Results: 3394 ratings, 3093 basics updated out of 3448 entries | |||||
