gs-search

Google Scholar Basic Search

Search Google Scholar for papers using keyword(s). Returns structured result list via DOM scraping.

Arguments

$ARGUMENTS contains the search keyword(s).

Steps

1. Navigate

Use `mcp__chrome-devtools__navigate_page`:
  • url: `https://scholar.google.com/scholar?q={URL_ENCODED_KEYWORDS}&hl=en&num=10`
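
The `{URL_ENCODED_KEYWORDS}` placeholder can be filled in along the following lines. This is a minimal sketch: `buildScholarUrl` is a hypothetical helper name, not part of the skill, and clamping `num` to the 1 to 20 range is an assumption based on the per-page maximum listed in the Notes.

```javascript
// Hypothetical helper: builds the search URL used in the Navigate step.
function buildScholarUrl(keywords, num = 10) {
  // Assumption: keep num within 1..20, the per-page maximum noted in this skill.
  const perPage = Number.isFinite(num)
    ? Math.min(20, Math.max(1, Math.trunc(num)))
    : 10;
  // encodeURIComponent escapes spaces, '&', '+', and non-ASCII characters.
  const query = encodeURIComponent(keywords.trim());
  return `https://scholar.google.com/scholar?q=${query}&hl=en&num=${perPage}`;
}

console.log(buildScholarUrl('graph neural networks'));
// https://scholar.google.com/scholar?q=graph%20neural%20networks&hl=en&num=10
```

Encoding the whole keyword string at once keeps characters such as `&` or `+` inside a query from being read as URL syntax.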

2. Extract results (evaluate_script)

Wait for results to load, check for CAPTCHA, then scrape the DOM:
```javascript
async () => {
  // Wait for results or CAPTCHA
  for (let i = 0; i < 20; i++) {
    if (document.querySelector('#gs_res_ccl') || document.querySelector('#gs_captcha_ccl')) break;
    await new Promise(r => setTimeout(r, 500));
  }

  // CAPTCHA check
  if (document.querySelector('#gs_captcha_ccl') || document.body.innerText.includes('unusual traffic')) {
    return { error: 'captcha', message: 'Google Scholar requires CAPTCHA verification. Please complete it in your browser, then tell me to continue.' };
  }

  const items = document.querySelectorAll('#gs_res_ccl .gs_r.gs_or.gs_scl');
  const results = Array.from(items).map((item, i) => {
    const titleEl = item.querySelector('.gs_rt a');
    const meta = item.querySelector('.gs_a')?.textContent || '';
    // Parse "Author1, Author2 - Journal, Year - publisher"
    const parts = meta.split(' - ');
    const authors = parts[0]?.trim() || '';
    const journalYear = parts[1]?.trim() || '';
    const citedByEl = item.querySelector('.gs_fl a[href*="cites"]');
    const relatedEl = item.querySelector('.gs_fl a[href*="related"]');
    const versionsEl = item.querySelector('.gs_fl a[href*="cluster"]');

    return {
      n: i + 1,
      title: titleEl?.textContent?.trim() || item.querySelector('.gs_rt')?.textContent?.trim() || '',
      href: titleEl?.href || '',
      authors,
      journalYear,
      citedBy: citedByEl?.textContent?.match(/\d+/)?.[0] || '0',
      citedByUrl: citedByEl?.href || '',
      dataCid: item.getAttribute('data-cid') || '',
      fullTextUrl: (item.querySelector('.gs_ggs a') || item.querySelector('.gs_or_ggsm a'))?.href || '',
      snippet: item.querySelector('.gs_rs')?.textContent?.trim()?.substring(0, 200) || '',
      relatedUrl: relatedEl?.href || '',
      versionsUrl: versionsEl?.href || '',
      versions: versionsEl?.textContent?.match(/\d+/)?.[0] || ''
    };
  });

  const totalText = document.querySelector('#gs_ab_md')?.textContent?.trim() || '';
  const currentUrl = window.location.href;
  return { total: totalText, resultCount: results.length, currentUrl, results };
}
```
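
The metadata parsing inside the script can be looked at in isolation. The sketch below is illustrative only: `parseMeta` is a hypothetical name, the `publisher` and `year` fields are additions the skill itself does not return, and the non-breaking-space normalization assumes the `.gs_a` text may contain such characters around the separators.

```javascript
// Hypothetical standalone version of the ".gs_a" parsing step.
// Input shape assumed: "Author1, Author2 - Journal, Year - publisher"
function parseMeta(meta) {
  // Assumption: normalize non-breaking spaces so ' - ' splits reliably.
  const parts = meta.replace(/\u00a0/g, ' ').split(' - ').map(p => p.trim());
  const authors = parts[0] || '';
  const journalYear = parts[1] || '';
  const publisher = parts[2] || '';
  // Pick a plausible four-digit publication year if one is present.
  const yearMatch = journalYear.match(/\b(?:19|20)\d{2}\b/);
  return { authors, journalYear, publisher, year: yearMatch ? Number(yearMatch[0]) : null };
}

const sample = 'A Author, B Author - Journal of Examples, 2020 - example.org';
console.log(parseMeta(sample));
```

Splitting on `' - '` is positional, so a venue name that itself contains a spaced hyphen would shift the fields. Treat `journalYear` as best-effort text rather than a validated citation.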

3. Report

Present results as a numbered list:
```
Searched Google Scholar for "$ARGUMENTS": {total}

1. {title}
   Authors: {authors} | {journalYear}
   Cited by: {citedBy} | [Full text]({fullTextUrl})
   Data-CID: {dataCid}

2. ...
```
Always show the `dataCid`: it is the unique identifier used for citation export and "cited by" tracking.
If `fullTextUrl` is available, highlight it (it indicates an open-access PDF or HTML copy).
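
The report template can be rendered from the extraction result roughly as follows. This is a sketch under stated assumptions: `formatReport` is a hypothetical helper, and dropping the "Full text" link when `fullTextUrl` is empty is a design choice of this example, not something the skill prescribes.

```javascript
// Hypothetical formatter: turns the evaluate_script result into the report text.
function formatReport(query, data) {
  const lines = [`Searched Google Scholar for "${query}": ${data.total}`, ''];
  for (const r of data.results) {
    // Only show the full-text link when the scraper found one.
    const fullText = r.fullTextUrl ? ` | [Full text](${r.fullTextUrl})` : '';
    lines.push(`${r.n}. ${r.title}`);
    lines.push(`   Authors: ${r.authors} | ${r.journalYear}`);
    lines.push(`   Cited by: ${r.citedBy}${fullText}`);
    lines.push(`   Data-CID: ${r.dataCid}`);
    lines.push('');
  }
  return lines.join('\n').trimEnd();
}

// Made-up sample data, for illustration only.
const demo = {
  total: 'About 1 result',
  results: [{ n: 1, title: 'An Example Paper', authors: 'A Author', journalYear: 'Journal, 2020',
              citedBy: '5', dataCid: 'EXAMPLE_CID', fullTextUrl: '' }]
};
console.log(formatReport('example query', demo));
```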

4. Follow-up

When the user wants to:
  • See more results: use `gs-navigate-pages` to go to the next page
  • See who cited a paper: use `gs-cited-by` with the data-cid
  • Export to Zotero: use `gs-export` with the data-cid(s)

CAPTCHA Handling

If the result contains `{error: 'captcha'}`:
  1. Tell the user: "Google Scholar is requesting CAPTCHA verification. Please complete it in your browser."
  2. Wait for user confirmation
  3. Retry the evaluate_script extraction
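
The three steps above amount to a small retry loop. The sketch below is one possible shape, not the skill's actual control flow: `runExtraction` and `askUser` are hypothetical callbacks standing in for the `evaluate_script` call and the user prompt, and the limit of three attempts is an arbitrary assumption.

```javascript
// Hypothetical retry wrapper around the extraction step.
// runExtraction: async () => result object from evaluate_script (assumed).
// askUser: async (message) => resolves once the user confirms (assumed).
async function extractWithCaptchaRetry(runExtraction, askUser, maxAttempts = 3) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await runExtraction();
    // Anything other than a CAPTCHA error is returned as-is.
    if (result?.error !== 'captcha') return result;
    // Prompt only if another attempt will follow.
    if (attempt < maxAttempts) await askUser(result.message);
  }
  return { error: 'captcha', message: 'CAPTCHA still present after retries.' };
}
```

Capping the attempts keeps the loop from repeatedly hitting Google Scholar when the verification cannot be completed.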

Notes

  • This skill uses 2 tool calls: `navigate_page` + `evaluate_script`
  • Google Scholar has NO public API, so all data extraction is via DOM scraping
  • `data-cid` is the primary identifier (cluster ID), used across all GS skills
  • Keep request frequency low to avoid triggering CAPTCHA
  • Default `num=10` results per page (max 20)
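
One way to keep request frequency low is to enforce a minimum gap between searches. The following is a minimal sketch: `createThrottle` is a hypothetical helper, and the interval passed to it is a value you would choose yourself, since this document does not state any threshold at which Google Scholar triggers CAPTCHA.

```javascript
// Hypothetical throttle: the returned function resolves only after at least
// minIntervalMs has passed since the previous call completed.
// now and sleep are injectable so the behavior can be checked without real waiting.
function createThrottle(minIntervalMs, now = Date.now,
                        sleep = (ms) => new Promise((r) => setTimeout(r, ms))) {
  let last = -Infinity; // no previous request yet
  return async function throttle() {
    const wait = last + minIntervalMs - now();
    if (wait > 0) await sleep(wait);
    last = now();
  };
}

// Usage sketch: call `await throttle()` before each navigate_page request.
// The 5000 ms value is an arbitrary illustration, not a documented limit.
const throttle = createThrottle(5000);
```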