kb-meta-fetch
Fetch journal articles from Crossref published after a user-specified date and insert them into PostgreSQL `journals` with DOI deduplication. Use when incrementally ingesting journal metadata from `journals_issn` into `journals`.
Source: tiangong-ai/skills
NPX Install
npx skill4agent add tiangong-ai/skills kb-meta-fetch
SKILL.md Content
KB Meta Fetch
Core Goal
- Pull `journal-article` records from Crossref after a given `--from-date`.
- Read ISSN seed rows from `journals_issn` (`journal`, `issn1`).
- Insert rows into `journals` with `ON CONFLICT (doi) DO NOTHING`.
- Keep the implementation aligned with `1_crossref_multi_increment.py`.
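The DOI deduplication described above can be sketched as follows. This is a minimal illustration, not the skill's actual script: it uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server, and the table layout, `make_demo_db`, and `insert_rows` are hypothetical names chosen for the example. With a PostgreSQL driver the placeholders would typically be `%s` rather than `?`.

```python
import sqlite3

# Assumption: column names mirror the fields listed for `journals`.
# sqlite3 stands in for PostgreSQL; the ON CONFLICT clause reads the same.
INSERT_SQL = """
INSERT INTO journals (title, doi, journal, authors, date)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (doi) DO NOTHING
"""


def make_demo_db():
    """Create an in-memory table with a unique DOI column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journals ("
        "title TEXT, doi TEXT UNIQUE, journal TEXT, authors TEXT, date TEXT)"
    )
    return conn


def insert_rows(conn, rows):
    """Insert rows and return how many were actually new in this call."""
    before = conn.total_changes  # per-connection change counter
    conn.executemany(INSERT_SQL, rows)
    conn.commit()
    # Rows skipped by DO NOTHING do not increase the counter.
    return conn.total_changes - before
```

The return value counts only rows added by the current call, which matches the requirement that `total_inserted` reflect this run rather than the cumulative table size.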
Run Workflow
- Set database connection env vars (user-managed keys prefixed with `KB_`):
  - `KB_DB_HOST`
  - `KB_DB_PORT`
  - `KB_DB_NAME`
  - `KB_DB_USER`
  - `KB_DB_PASSWORD`
  - `KB_LOG_DIR` (required, log output directory)
- Run incremental fetch with a required date:
  ```bash
  python3 scripts/crossref_multi_increment.py --from-date 2024-05-01
  ```
  - If executing through an `exec` tool call, set timeout to 1800 seconds (30 minutes).
- Check logs in:
  - `${KB_LOG_DIR}/crossref-YYYYMMDD-HHMMSS.log` (UTC timestamp, one file per run)
- Build user-facing summary strictly from the current run output:
  - Prefer `RUN_SUMMARY_JSON` emitted by `crossref_multi_increment.py`.
  - If JSON is unavailable, parse only this run's `${KB_LOG_DIR}/crossref-YYYYMMDD-HHMMSS.log`.
  - `total_inserted` must mean rows inserted in this run (after DOI dedup), not cumulative rows in table.
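The workflow steps above could be wrapped in a small helper along these lines. It is a sketch under stated assumptions: treating all six `KB_` variables as required, and the exact shape of the summary line (the marker `RUN_SUMMARY_JSON` followed somewhere on the same line by a JSON object) are guesses, since the source does not specify them. The function names are hypothetical.

```python
import json
import os
from datetime import date

REQUIRED_ENV = (
    "KB_DB_HOST", "KB_DB_PORT", "KB_DB_NAME",
    "KB_DB_USER", "KB_DB_PASSWORD", "KB_LOG_DIR",
)
# Assumption: the script prints this marker and a JSON object on one line.
SUMMARY_MARKER = "RUN_SUMMARY_JSON"


def missing_env(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV if not env.get(name)]


def build_command(from_date):
    """Build the argv list for the fetch script, rejecting malformed dates."""
    date.fromisoformat(from_date)  # raises ValueError on an invalid date
    return ["python3", "scripts/crossref_multi_increment.py",
            "--from-date", from_date]


def extract_summary(output):
    """Return the last summary dict found in this run's output, or None."""
    summary = None
    for line in output.splitlines():
        pos = line.find(SUMMARY_MARKER)
        if pos == -1:
            continue
        start = line.find("{", pos)
        if start == -1:
            continue
        try:
            summary = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # ignore lines that mention the marker without valid JSON
    return summary
```

A caller might pass the command to `subprocess.run(build_command("2024-05-01"), capture_output=True, text=True, timeout=1800)` and hand the captured stdout to `extract_summary`, falling back to this run's log file only when the result is `None`.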
Behavior Contract
- Query Crossref endpoint: `https://api.crossref.org/journals/{issn}/works`.
- Filter with `type:journal-article,from-pub-date:<from-date>`.
- Keep only items whose `container-title` equals target journal title (case-insensitive).
- Continue pagination with cursor until no matching items remain.
- Store fields in `journals`: `title`, `doi`, `journal`, `authors`, `date`.
- Reporting/announcement metrics must use current-run log/summary only.
- Do not compute announcement counts via database-wide or time-window SQL such as `WHERE date >= ...`.
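The query, title filter, and cursor loop in this contract might look roughly like the sketch below. It is not the skill's script: the function names, the page size of 100, and the injected `fetch_page` callable (which takes a URL and returns the parsed JSON response) are assumptions made for illustration. Stopping at the first page with no matching items is one reading of "until no matching items remain".

```python
from urllib.parse import urlencode

BASE = "https://api.crossref.org/journals/{issn}/works"


def build_url(issn, from_date, cursor="*", rows=100):
    """Build the works URL with the filter and cursor query parameters."""
    params = {
        "filter": f"type:journal-article,from-pub-date:{from_date}",
        "cursor": cursor,  # "*" starts a new cursor session
        "rows": rows,      # assumed page size
    }
    return BASE.format(issn=issn) + "?" + urlencode(params)


def matches_journal(item, journal):
    """True if any container-title equals the journal title, ignoring case."""
    titles = item.get("container-title") or []
    return any(t.casefold() == journal.casefold() for t in titles)


def fetch_matching(issn, journal, from_date, fetch_page):
    """Follow the cursor and collect items whose container-title matches."""
    cursor, kept = "*", []
    while True:
        message = fetch_page(build_url(issn, from_date, cursor))["message"]
        matched = [it for it in message.get("items", [])
                   if matches_journal(it, journal)]
        if not matched:
            break  # no matching items on this page: stop paginating
        kept.extend(matched)
        cursor = message.get("next-cursor")
        if not cursor:
            break
    return kept
```

Injecting `fetch_page` keeps the loop testable with a stub; a real caller could supply a function that opens the URL with `urllib.request.urlopen` and decodes the body with `json.load`.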
Scope Boundary
- Implement only Crossref incremental fetch + insert into `journals`.
Script
- `scripts/crossref_multi_increment.py`