<feed xmlns='http://www.w3.org/2005/Atom'>
<title>hnimdbbot/src/main.go, branch main</title>
<subtitle>Pure Cinema!</subtitle>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/'/>
<entry>
<title>feat: add three-level logging with per-request debug output</title>
<updated>2026-06-26T12:14:52+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-26T12:14:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=06536f57b1fdc76212da6b85fbc9287cc4f0de70'/>
<id>06536f57b1fdc76212da6b85fbc9287cc4f0de70</id>
<content type='text'>
- New --log-level flag: debug, info (default), silent
  debug: every API request logged (method, URL, status, duration)
  info:  normal events (batch progress, entry counts, summaries)
  silent: only warnings and fatal errors
- Replaced all log.Printf/Fatalf calls with level-gated helpers
- API request timing added to queryWikiArticle, queryWikidataBatch, downloadFile
- Retries and backoff logged in debug mode
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- New --log-level flag: debug, info (default), silent
  debug: every API request logged (method, URL, status, duration)
  info:  normal events (batch progress, entry counts, summaries)
  silent: only warnings and fatal errors
- Replaced all log.Printf/Fatalf calls with level-gated helpers
- API request timing added to queryWikiArticle, queryWikidataBatch, downloadFile
- Retries and backoff logged in debug mode
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: add -wiki-only flag to rerun only wiki data extraction</title>
<updated>2026-06-26T01:37:51+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-26T01:37:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=15d06c9802d08037283aa218ccc2f92a9236fcc9'/>
<id>15d06c9802d08037283aa218ccc2f92a9236fcc9</id>
<content type='text'>
- fetchWikiArticlesData is standalone again (re-extracted from consumer)
- -wiki-only flag skips SPARQL pipeline, runs only wiki data fetch
- Default behavior: full pipeline (SPARQL + wiki data in parallel)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- fetchWikiArticlesData is standalone again (re-extracted from consumer)
- -wiki-only flag skips SPARQL pipeline, runs only wiki data fetch
- Default behavior: full pipeline (SPARQL + wiki data in parallel)
</pre>
</div>
</content>
</entry>
<entry>
<title>refactor: pipeline SPARQL and wiki data in parallel</title>
<updated>2026-06-26T01:26:07+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-26T01:26:07+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=8e2d742e59b3923852e1ef6e7a5e2ee1de14ce45'/>
<id>8e2d742e59b3923852e1ef6e7a5e2ee1de14ce45</id>
<content type='text'>
- Merge fetchWikiArticles + fetchWikiArticlesData into one pipeline
- SPARQL producer fetches batches, commits each to DB, forwards resolved articles
- Wiki data consumer runs concurrently, fetching at 2s/request
- Each SPARQL batch commits independently (no global transaction)
- Rate limits respected for both Wikidata SPARQL and wiki server
- No parallel requests to either endpoint
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Merge fetchWikiArticles + fetchWikiArticlesData into one pipeline
- SPARQL producer fetches batches, commits each to DB, forwards resolved articles
- Wiki data consumer runs concurrently, fetching at 2s/request
- Each SPARQL batch commits independently (no global transaction)
- Rate limits respected for both Wikidata SPARQL and wiki server
- No parallel requests to either endpoint
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: fetch missing wiki data from custom server and populate imdb table</title>
<updated>2026-06-25T19:14:39+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-25T19:14:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=e13cbe6a4fd1ebe3f2c3bfc86c54e8dd17c59624'/>
<id>e13cbe6a4fd1ebe3f2c3bfc86c54e8dd17c59624</id>
<content type='text'>
- Add wiki_server and wiki_username config fields
- Query custom server for each wiki_article entry
- Extract description, synopsis (Plot), year, poster_url, license,
  license_url, num_accolades from structured JSON response
- Serial processing with 1 req/s rate limit
- Update only entries missing at least one target column
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Add wiki_server and wiki_username config fields
- Query custom server for each wiki_article entry
- Extract description, synopsis (Plot), year, poster_url, license,
  license_url, num_accolades from structured JSON response
- Serial processing with 1 req/s rate limit
- Update only entries missing at least one target column
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: fetch Wikipedia article titles via Wikidata SPARQL</title>
<updated>2026-06-25T18:07:08+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-25T18:07:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=fa742660190a7d3b7b6f068565ce543d413edbab'/>
<id>fa742660190a7d3b7b6f068565ce543d413edbab</id>
<content type='text'>
- Query Wikidata SPARQL in batches of 30 for entries missing wiki_article
- Store wiki_article title in imdb table
- Respect rate limits with configurable delay and retry on 5xx/429
- Skip entries that have no Wikipedia article
- Removed unique constraint on wiki_article (multiple entries can share one)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Query Wikidata SPARQL in batches of 30 for entries missing wiki_article
- Store wiki_article title in imdb table
- Respect rate limits with configurable delay and retry on 5xx/429
- Skip entries that have no Wikipedia article
- Removed unique constraint on wiki_article (multiple entries can share one)
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: fetchAndUpdateImdbData — download IMDB datasets and populate imdb table</title>
<updated>2026-06-24T01:46:14+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-24T01:46:14+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=86069f011f35e339a30ffb717308990369c5f29f'/>
<id>86069f011f35e339a30ffb717308990369c5f29f</id>
<content type='text'>
- Check for imdb entries with NULL average_rating
- Download title.basics.tsv.gz and title.ratings.tsv.gz to imdbdata/
- Decompress alongside gzip originals
- Parse only rows matching our imdb_ids (memory-efficient)
- Update: average_rating, num_votes, title_type, primary_title,
  original_title, start_year, runtime_minutes
- Results: 3394 ratings, 3093 basics updated out of 3448 entries
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Check for imdb entries with NULL average_rating
- Download title.basics.tsv.gz and title.ratings.tsv.gz to imdbdata/
- Decompress alongside gzip originals
- Parse only rows matching our imdb_ids (memory-efficient)
- Update: average_rating, num_votes, title_type, primary_title,
  original_title, start_year, runtime_minutes
- Results: 3394 ratings, 3093 basics updated out of 3448 entries
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: populate imdb table with unique title IDs from links</title>
<updated>2026-06-24T01:33:12+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-24T01:33:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=eec189de8a5be0a18103a215d369c6135b86e9ff'/>
<id>eec189de8a5be0a18103a215d369c6135b86e9ff</id>
<content type='text'>
- Extract distinct IMDb title IDs from links.param (host=imdb.com)
- Skip IDs already in imdb table and non-title params (nm, ls, etc.)
- Insert 3448 unique title IDs into imdb.imdb_id
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Extract distinct IMDb title IDs from links.param (host=imdb.com)
- Skip IDs already in imdb table and non-title params (nm, ls, etc.)
- Insert 3448 unique title IDs into imdb.imdb_id
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: extract IMDB title IDs from links URLs into param field</title>
<updated>2026-06-24T01:19:26+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-24T01:19:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=163b9bddd68f7ffc8fc4164acee333fe5bff3c7a'/>
<id>163b9bddd68f7ffc8fc4164acee333fe5bff3c7a</id>
<content type='text'>
- Query links table for IMDb title URLs (field=1, host=imdb.com)
- Extract ttIDs via regex and batch-update links.param
- 5662 rows updated successfully
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Query links table for IMDb title URLs (field=1, host=imdb.com)
- Extract ttIDs via regex and batch-update links.param
- 5662 rows updated successfully
</pre>
</div>
</content>
</entry>
<entry>
<title>feat: switch config to JSON; add go.mod and config.json.example</title>
<updated>2026-06-23T23:52:52+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-23T23:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=7af896fce4eac0579076aa15a3e987345dc9f9e8'/>
<id>7af896fce4eac0579076aa15a3e987345dc9f9e8</id>
<content type='text'>
- Replace Viper-based config with encoding/json (config.go)
- Add config.json with sensible defaults (gitignored)
- Add config.json.example with empty values as reference
- Initialize go module (go.mod)
- Update main.go to use LoadConfig()
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Replace Viper-based config with encoding/json (config.go)
- Add config.json with sensible defaults (gitignored)
- Add config.json.example with empty values as reference
- Initialize go module (go.mod)
- Update main.go to use LoadConfig()
</pre>
</div>
</content>
</entry>
<entry>
<title>Initial commit</title>
<updated>2026-06-23T23:41:31+00:00</updated>
<author>
<name>dev</name>
</author>
<published>2026-06-23T23:41:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.iamfabulous.de/hnimdbbot/commit/?id=2e3e5b3efc6a8d9471a73c5553f88fa94e28bd3a'/>
<id>2e3e5b3efc6a8d9471a73c5553f88fa94e28bd3a</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
</feed>
