Literature Retrieval and Push (Cron Skill, Multi-channel Version)
Objectives
Turn the "retrieve literature → generate summary → scheduled push" workflow into a reusable pipeline rather than a one-off ad hoc task.
Partition Data Sources (Stored Locally)
Use the local integrated table first:
- references/JCR2024_FQBJCR2025_merged.csv: integrates the 2024 JCR and 2025 CAS Partition tables (journals are matched by name; the original fields from both sources are retained)
Consult the source tables when necessary:
- references/FQBJCR2025.csv: CAS Partition (includes major/minor category partitions, Top, OA status, WoS indexing, etc.)
- : JCR IF, IF Quartile, IF Rank (can be used as reference for SCI/JCR partitions)
Upstream supplementary sources:
- GitHub: https://github.com/yongqianxiao/share_repo/tree/master/JCR
- Local manifest: references/upstream/share_repo_JCR_manifest.txt
- Synchronization script: scripts/sync_share_repo_jcr.sh
(pulls upstream to )
EasyScholar Open API Supplement:
- API: https://www.easyscholar.cc/open/getPublicationRank
- Parameters: secretKey, publicationName
- Example: .../getPublicationRank?secretKey=<EASYSCHOLAR_SECRET_KEY>&publicationName=Nature
- Current configured secretKey: <configure privately on the local machine; do not write it to the repository>
- Key return fields ():
- (SCI/JCR Partition)
- (latest IF)
- (5-year IF)
- // (CAS Basic/Enhanced version related partition fields)
- Supplementary fields such as , ,
EasyScholar Usage Rules:
- Default order: EasyScholar API → Local integrated table → share_repo upstream files (supplementary verification)
- Request method: GET
- Request parameters:
- secretKey (required)
- publicationName (required, journal name)
- Success criteria: and
- Failure example: (Key error)
- After calling the API, the output must be marked with "Source: EasyScholar API"
- secretKey is treated as sensitive information: responses may show it only in masked form (e.g., ); displaying the full key is prohibited
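The request and masking rules above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the endpoint and the two parameter names come from the example URL in this section, while `build_rank_url`, `mask_secret`, and the "keep the first four characters" masking format are hypothetical, since the source does not show its own masked form.

```python
from urllib.parse import urlencode

# Endpoint taken from this document.
EASYSCHOLAR_URL = "https://www.easyscholar.cc/open/getPublicationRank"


def build_rank_url(secret_key: str, publication_name: str) -> str:
    """Build the GET URL with both required parameters, URL-encoded."""
    query = urlencode({"secretKey": secret_key, "publicationName": publication_name})
    return f"{EASYSCHOLAR_URL}?{query}"


def mask_secret(secret_key: str, visible: int = 4) -> str:
    """Hypothetical masking rule: keep the first `visible` characters, star the rest."""
    if len(secret_key) <= visible:
        return "*" * len(secret_key)
    return secret_key[:visible] + "*" * (len(secret_key) - visible)


if __name__ == "__main__":
    key = "abcd1234efgh"  # placeholder value, not a real key
    url = build_rank_url(key, "Nature")
    # Never log the full key: replace it with the masked form before printing.
    print(url.replace(key, mask_secret(key)))
    print("Source: EasyScholar API")
```

Keeping the masking in one helper makes it easier to guarantee that any URL or log line shown in a response has passed through it.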
EasyScholar Return Parsing (Must Follow):
- : Official data
- : All available official grading fields
- : Dataset fields selected by the user on the extended end
- : Custom data
- : Custom dataset definition (including , , ...)
- : Array elements are in the format
- Split by to get and
- Use to find the corresponding dataset () in
- Map to ~
- Final display format:
Official Field Priority Display Recommendations:
- Core partitions: , , , , ,
- Core indicators: , , ,
- Warning information:
- China Pharmaceutical University grading: (displayed by default)
- Other fields (e.g., swufe/cufe/ccf/cssci/ahci) are displayed on demand; full output is not mandatory
Partition Display Strategy (Must Execute):
- Display the "latest edition" values first, based on what EasyScholar returns.
- Provide a summary of "highest partition" at the same time:
- CAS: Take the best result among sciBase/sciUp/sciUpSmall/sciUpTop and note the source field.
- SCI/JCR: Take the optimal Q partition result from (Q1 is the highest).
- If there are conflicts between multiple sources, retain the conflict description instead of silently overwriting.
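One way to implement the "highest partition" summary and the conflict rule is sketched below. It assumes each field value is a string containing a single partition digit from 1 to 4 (for example "Q1"); the real EasyScholar value format is not shown in this document, so that parsing rule, the sample values, and the helper names `best_partition` and `conflict_note` are assumptions.

```python
import re
from typing import Optional


def best_partition(values: dict[str, Optional[str]]) -> Optional[tuple[str, str]]:
    """Return (source_field, value) for the best partition; a lower digit is better.

    Assumption: each usable value contains one digit in 1-4 (e.g. "Q1").
    Empty or digit-free values are skipped.
    """
    best: Optional[tuple[int, str, str]] = None
    for field, value in values.items():
        match = re.search(r"[1-4]", value or "")
        if match is None:
            continue
        rank = int(match.group())
        if best is None or rank < best[0]:
            best = (rank, field, value)
    return None if best is None else (best[1], best[2])


def conflict_note(by_source: dict[str, str]) -> str:
    """Describe disagreements between sources instead of silently overwriting."""
    if len(set(by_source.values())) <= 1:
        return ""
    details = "; ".join(f"{source}: {value}" for source, value in by_source.items())
    return f"Sources disagree ({details})"


if __name__ == "__main__":
    cas = {"sciBase": "Zone 2", "sciUp": "Zone 1", "sciUpSmall": None, "sciUpTop": ""}
    print(best_partition(cas))
    print(conflict_note({"EasyScholar API": "Q1", "Local integrated table": "Q2"}))
```

Returning the source field together with the value covers the "mark the source field" requirement without a second lookup.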
Applicable Scenarios
- User says "Refer to MDRGNB standards to create daily/weekly/monthly reports on a certain topic"
- User issues a request in natural language: "Help me push xx literature to xx at xx:xx"
- User requests "Push to Feishu/WeCom/QQ/Telegram at xx:xx every day"
- User requests adding fields: journal name, CAS Partition, SCI Partition
- User reports "Task shows delivered but no full text received"
Standard Process (Must Follow Order)
- Confirm Parameters
  - Topic (e.g., MDRGNB, Liver Cancer/HCC)
  - Frequency (daily/weekly/monthly) and time zone (default Asia/Shanghai)
  - Push time (if the user does not specify one, default to 08:00 for daily, weekly, and monthly reports)
  - Push channel and target ID (feishu/wecom/qqbot/telegram)
  - Output template (daily/weekly/monthly report)
- Create/Update Cron Task
  - sessionTarget: "isolated"
  - payload.kind: "agentTurn"
  - (default)
  - Default linkage rule (must execute): when a daily report task is first created successfully, automatically create the corresponding weekly and monthly report tasks for the same topic (unless the user explicitly disables this).
  - Independent activation: weekly/monthly reports can be created and run on their own, without first enabling daily reports.
- Three-stage Pipeline (Referenced from daily-paper-skills)
  - Daily report:
    - Phase A (Fetch): Retrieve from PubMed, then run initial screening and deduplication
    - Phase B (Review): Generate structured TopN, 10 Questions, Four-dimensional Score, Evidence Level
    - Phase C (Deliver): Push according to channel rules and record sending results
  - Weekly/Monthly report:
    - Phase A (Collect): Collect the results of published daily reports within the corresponding cycle (read the daily report archives first)
    - Phase B (Synthesize): Based on the daily reports, run aggregate statistics, merge topics, and analyze trends and changes in evidence strength
    - Phase C (Deliver): Push according to channel rules and record sending results
- Standardize "Retrieval + Output Structure"
  - Daily report retrieval source: PubMed (default)
  - The daily report must automatically expand the user's spoken topic "xx literature" into an executable PubMed search query (Boolean logic + synonyms + abbreviations)
  - By default, weekly/monthly reports do not re-run a full retrieval; they summarize the daily report archives first, and supplementary retrieval is triggered only when daily reports are missing
  - Time window: default last 24 hours (daily report)/last 7 days (weekly report)/last 30 days (monthly report); supplementary window defaults to last 7 days (daily report) or last 30 days (weekly/monthly report)
  - TopN: default Top3 for daily reports (use the actual number if fewer are available); default Top10 for weekly reports; default Top20 for monthly reports (all can be changed on user request)
  - It is recommended to retain historical deduplication files (e.g., or topic_history.json) to avoid pushing the same article on consecutive days
- Standardize "Push Reliability"
  - The payload prompt must require actively sending via the message tool
  - WeCom long texts must be sent in 2-3 segments (each segment <=1200 characters)
  - Feishu/QQ/Telegram default to a single message; if the text is too long, split it into segments as well
  - Segment markers: , ,
  - If is explicitly included in the payload, set cron to
- Verification
  - Manually trigger once
  - Check and via
  - User confirms whether the full text is visible on the target terminal
- Weekly/Monthly Report Sample Coverage Verification (Must Execute)
  - Weekly report: Count the number of days covered by daily reports within the window, default threshold >=4 days; if insufficient, mark "Insufficient samples, conclusions are for reference only" at the top of the report.
  - Monthly report: Count the number of days covered by daily reports within the window, default threshold >=18 days; if the threshold is not met, add the same "Insufficient samples" notice.
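The sample coverage check can be sketched as below. The thresholds (4 of 7 days, 18 of 30 days) and the notice text come from the rules above; the function name `coverage_notice` and the idea of passing in the set of dates that have an archived daily report are assumptions for illustration.

```python
from datetime import date, timedelta

NOTICE = "Insufficient samples, conclusions are for reference only"


def coverage_notice(report_dates: set[date], window_end: date,
                    window_days: int, min_days: int) -> str:
    """Return the notice to place at the top of the report, or "" if coverage is sufficient."""
    # The window is the last `window_days` days ending on `window_end`, inclusive.
    window = {window_end - timedelta(days=offset) for offset in range(window_days)}
    covered = len(window & report_dates)
    return NOTICE if covered < min_days else ""


if __name__ == "__main__":
    end = date(2025, 3, 9)
    archived = {date(2025, 3, 9), date(2025, 3, 7), date(2025, 3, 5)}
    # Weekly report: 3 covered days is below the default threshold of 4.
    print(coverage_notice(archived, end, window_days=7, min_days=4))
```

A monthly report would call the same function with `window_days=30, min_days=18`.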
PubMed Configuration (Local Execution)
- API Key:
- Call suggestion: Attach to PubMed requests to improve quota and stability.
- Security constraints: This key is only used for local operation, not displayed externally, and not written to public repositories.
PubMed Search Query Expansion Rules (Must Execute)
When the user says "Push xx literature", do not directly search for "xx" as is; first expand the search query:
- Topic normalization: Convert Chinese topics to English medical subject headings (MeSH if necessary).
- Synonym expansion: Add common aliases/abbreviations/full names (connected with OR).
- Disease + scenario constraints: Add research scenario words (e.g., resistance, therapy, prognosis) if necessary.
- Time constraints: Daily reports default to the last 24 hours; weekly reports default to aggregating the last 7 days of daily reports; monthly reports default to aggregating the last 30 days of daily reports.
- Result quality priority: Prioritize clinically relevant literature with higher evidence levels.
Search query structure example:
(Main topic synonym1 OR synonym2 OR abbreviation) AND (Research scenario word1 OR scenario word2)
Attach a line "This search query: ..." at the beginning of the output.
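The query structure above can be assembled mechanically once the synonym and scenario lists exist. The sketch below only covers the string assembly; choosing the synonyms (for example via MeSH) is outside its scope, and the helper name `build_pubmed_query` and the sample term lists are illustrative assumptions.

```python
from typing import Optional, Sequence


def _quote(term: str) -> str:
    """Wrap multi-word terms in double quotes so they are searched as phrases."""
    term = term.strip()
    return f'"{term}"' if " " in term else term


def _or_group(terms: Sequence[str]) -> str:
    return "(" + " OR ".join(_quote(term) for term in terms) + ")"


def build_pubmed_query(topic_terms: Sequence[str],
                       scenario_terms: Optional[Sequence[str]] = None) -> str:
    """Combine topic synonyms (OR) with optional scenario words (OR), joined by AND."""
    if not topic_terms:
        raise ValueError("at least one topic term is required")
    query = _or_group(topic_terms)
    if scenario_terms:
        query += " AND " + _or_group(scenario_terms)
    return query


if __name__ == "__main__":
    query = build_pubmed_query(
        ["hepatocellular carcinoma", "HCC", "liver cancer"],
        ["therapy", "prognosis"],
    )
    # Prints: This search query: ("hepatocellular carcinoma" OR HCC OR "liver cancer") AND (therapy OR prognosis)
    print(f"This search query: {query}")
```

The time window is deliberately left out of the string here; it is usually simpler to pass it as a separate date filter on the request than to embed it in the Boolean expression.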
Enhanced Rules (Migrated from dailypaper Pipeline)
- Minimum Quantity Top-up
  - When the number of high-quality new articles is less than TopN (default 3), automatically top up from the last 7-day window until TopN is reached or the candidates are exhausted.
  - Top-up articles must not duplicate those already selected for the day.
- Quality Priority Sorting
  - Candidate sorting priority: Evidence Level > Clinical Relevance > Method Innovation > Recency.
  - For candidates with the same score, prioritize those with higher journal partitions (judged jointly from the SCI/JCR quartile and the CAS Partition).
- Historical Backfill and Interruption Prevention
  - If the number of retrieval results for the current period (day/week/month) is extremely small, allow backfilling from the historical candidate pool so that daily/weekly/monthly reports are not interrupted.
  - Backfilled articles must be explicitly marked with "Backfill/Supplementary" to avoid confusion with "daily new articles".
- Standardized Archive Structure (Strongly Recommended)
  - After each daily report output, save a structured record (JSON): topic/date/pmid/title/journal/sci/cas/cpu/evidence/score/summary/url.
  - Weekly/monthly reports read this structured archive first instead of parsing natural language text directly.
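The sorting, top-up, and backfill-marking rules above might be combined as in the sketch below. It reads the priority "Evidence Level > Clinical Relevance > Method Innovation > Recency" as a lexicographic sort with journal partition as the final tie-break, which is one possible interpretation; the `Candidate` fields, their numeric scales, and `select_top_n` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    pmid: str
    evidence: int    # higher is better (hypothetical numeric scale)
    clinical: int    # higher is better
    innovation: int  # higher is better
    recency: int     # higher is more recent
    partition: int   # 1 is the best partition; used only as a tie-break


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Sort by the documented priority order, then by journal partition."""
    return sorted(
        candidates,
        key=lambda c: (-c.evidence, -c.clinical, -c.innovation, -c.recency, c.partition),
    )


def select_top_n(new: list[Candidate], backlog: list[Candidate],
                 already_pushed: set[str], top_n: int = 3) -> list[tuple[Candidate, str]]:
    """Pick up to top_n articles: new ones first, then backfill from the backlog.

    Each pick is labelled "new" or "Backfill/Supplementary"; PMIDs that were
    already pushed or already picked are skipped.
    """
    picked: list[tuple[Candidate, str]] = []
    used = set(already_pushed)
    for pool, label in ((new, "new"), (backlog, "Backfill/Supplementary")):
        for candidate in rank(pool):
            if len(picked) == top_n:
                break
            if candidate.pmid in used:
                continue
            used.add(candidate.pmid)
            picked.append((candidate, label))
    return picked
```

Here `already_pushed` would typically be loaded from the historical deduplication file, and `backlog` from the last 7-day window.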
Recommended Output Template (General for MDRGNB/HCC)
- Number of new additions in this issue (daily report = today, weekly report = this week, monthly report = this month)
- TopN (default 3 for daily reports / 10 for weekly reports / 20 for monthly reports; use actual number if insufficient)
- Each article includes:
- Title
- Journal name (Journal)
- PMID
- Link
- CAS Partition (latest edition + highest partition; write "To be verified" if not found)
- SCI/JCR Partition (latest edition + highest partition; write "To be verified" if not found)
- China Pharmaceutical University grading (CPU)
- 80-120 character Chinese abstract
- One-sentence clinical/research value
- 10 Question Points (1-10)
- Four-dimensional Score (Engineering Application Value/Architecture Innovation/Theoretical Contribution/Result Reliability, 1-10) + Overall Evaluation
- Evidence Level (Meta/RCT/Observation/In vitro/Review/Other)
- Conclusions of this issue
- Suggestions for next issue (1-2 items)
Channel Mapping (Default)
- Feishu: , or
- WeCom: ,
- QQ: ,
target=qqbot:c2c:<openid>
or
- Telegram: ,
Cron Template (General for Multi-channel)
json
{
  "action": "add",
  "job": {
    "name": "<daily|weekly|monthly>-<topic>-<channel>-<HHMM>",
    "schedule": { "kind": "cron", "expr": "<m> <h> * * *", "tz": "Asia/Shanghai" },
    "sessionTarget": "isolated",
    "wakeMode": "now",
    "payload": {
      "kind": "agentTurn",
      "model": "openai-codex/gpt-5.3-codex",
      "timeoutSeconds": 1200,
      "message": "先检索文献并生成结构化快报,再使用 message 工具发送到指定渠道/目标(channel=<feishu|wecom|qqbot|telegram>, target=<ID>)。若正文过长则按2-3段发送(每段<=1200字),最后仅输出固定完成语。"
    },
    "delivery": { "mode": "none" }
  }
}
Common Expressions (default push time is 08:00, can be modified):
- Daily report:
- Weekly report (last day of the week): (8:00 AM Sunday)
- Monthly report (last day of the month): + in-task verification "Check if the next day crosses the month, only send on the day before the month change"
Default Linkage Creation Strategy:
- When creating a daily report, create corresponding weekly and monthly reports for the same topic at the same time.
- If the user does not specify a time: daily/weekly/monthly reports all default to 08:00; if the user specifies a time, overwrite the default value with the user's time.
- Weekly reports default to "8:00 AM on the last day of the week", monthly reports default to "8:00 AM on the last day of the month"; both allow independent modification and can be started/stopped independently.
- If the corresponding weekly/monthly report already exists, skip creation and only update the configuration (avoid duplicate tasks).
- Monthly report "end-of-month push" must undergo secondary verification: only send when "tomorrow crosses the month" to avoid multiple triggers on 28-31.
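The in-task "tomorrow crosses the month" check can be sketched as follows. It assumes the cron schedule fires on several candidate days near the end of the month and that the task itself decides whether to send; `is_last_day_of_month` and `today_in` are hypothetical helper names, and `today_in` relies on the system time zone database being available.

```python
from datetime import date, datetime, timedelta
from zoneinfo import ZoneInfo


def is_last_day_of_month(today: date) -> bool:
    """True only when tomorrow falls in a different month."""
    return (today + timedelta(days=1)).month != today.month


def today_in(tz_name: str = "Asia/Shanghai") -> date:
    """Current calendar date in the task's time zone (not the server's local zone)."""
    return datetime.now(ZoneInfo(tz_name)).date()


if __name__ == "__main__":
    for day in (date(2024, 2, 28), date(2024, 2, 29), date(2024, 4, 30)):
        print(day, "send" if is_last_day_of_month(day) else "skip")
```

Evaluating the date in Asia/Shanghai rather than the host's zone matters when the job runs at 08:00 local time on a server configured for UTC.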
Troubleshooting
- Symptom: the task shows as delivered but the user cannot see the full text
  - Force to use in payload for segmented sending
  - First send a short test message to verify that the target is reachable
  - Re-test full text with
- Symptom: Push failed or timed out
  - Fallback order: Full text in segments → Short summary (Top3 + links) → Alert message (including the failure reason and retry suggestions)
  - Record each failure in the run summary (for tracking)
- Symptom: Inconsistent content between QQ and WeCom
  - Use the same "content generation template", only replace
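The segmented sending referred to in these troubleshooting steps (with the 1200-character limit per segment stated in the cron template message) could be implemented along these lines. This is a sketch: it splits on line breaks and hard-splits any single line that exceeds the limit, but it does not enforce the "2-3 segments" target, and the `(i/n)` marker format is hypothetical because the document's own segment markers are not shown.

```python
def split_segments(text: str, max_chars: int = 1200) -> list[str]:
    """Split text into segments of at most max_chars, preferring line boundaries."""
    segments: list[str] = []
    current = ""
    for line in text.split("\n"):
        # A single over-long line is cut into fixed-size pieces.
        while len(line) > max_chars:
            if current:
                segments.append(current)
                current = ""
            segments.append(line[:max_chars])
            line = line[max_chars:]
        candidate = f"{current}\n{line}" if current else line
        if len(candidate) <= max_chars:
            current = candidate
        else:
            segments.append(current)
            current = line
    if current:
        segments.append(current)
    return segments


def label_segments(segments: list[str]) -> list[str]:
    """Prefix each segment with a hypothetical (i/n) marker."""
    total = len(segments)
    return [f"({index}/{total}) {segment}" for index, segment in enumerate(segments, start=1)]


if __name__ == "__main__":
    body = "\n".join(["a" * 1000, "b" * 1000, "c" * 2500])
    # Reserve a few characters for the marker so labelled segments stay within 1200.
    parts = label_segments(split_segments(body, max_chars=1190))
    print([len(part) for part in parts])
```

Using the same splitter for every channel also helps keep QQ and WeCom output structurally identical.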
Constraints
- Do not output only a single placeholder word (e.g., )
- When partition information is uncertain, mark it as "To be verified" (as in the output template); do not fabricate values
- 10 Question Points, Four-dimensional Score, and Evidence Level are fixed output items and cannot be omitted
- By default, weekly/monthly reports must be summarized automatically from published daily reports (aggregation/induction/trend analysis); re-running retrieval must not replace this by default
- When the user requests "keep consistent", the text structure of QQ and WeCom must be consistent
- Each selected article in Weekly Top10 and Monthly Top20 must include a 1-line "selection reason" (e.g., high evidence level/high partition/high clinical value)