gemini-research-browser-use

Gemini Research Browser Use

Overview

Perform research or queries using Google Gemini via Chrome DevTools Protocol (CDP). This method reuses the user's existing Chrome login session to interact with the Gemini web interface (https://gemini.google.com/).

Prerequisites

  1. Python + websockets. Verify:
    bash
    python3 --version
    python3 -m pip show websockets
    Install if missing:
    bash
    python3 -m pip install websockets
  2. Google Chrome. Verify:
    bash
    "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" --version
  3. CDP port availability. Verify that Chrome is listening (after the launch in Step 2):
    bash
    curl -s http://localhost:9222/json | python3 -m json.tool
  4. Non-default user data directory (required by Chrome). Chrome CDP requires a non-default profile path. Use a cloned profile so you keep login state.
    bash
    rm -rf /tmp/chrome-gemini-profile
    rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/
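
The first two checks can also be scripted. The following is a minimal sketch, assuming the macOS Chrome path used in this guide; `check_prerequisites` is a hypothetical helper name, not part of any library.

```python
import importlib.util
import os

# Default macOS location used throughout this guide (adjust for other systems).
CHROME_PATH = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

def check_prerequisites(chrome_path=CHROME_PATH):
    """Report which prerequisites are present, without modifying anything."""
    return {
        # find_spec returns None when the package cannot be imported
        "websockets": importlib.util.find_spec("websockets") is not None,
        # the Chrome binary must exist and be executable
        "chrome": os.path.isfile(chrome_path) and os.access(chrome_path, os.X_OK),
    }

if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

The function only inspects the environment, so it is safe to run before anything is installed or launched.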

Method Comparison

| Method | Pros | Cons | Recommended |
| --- | --- | --- | --- |
| Chrome Remote Debugging (CDP) | Uses existing login, full automation, reliable | Requires Chrome restart with debugging flag | Yes |
| `browser-use --browser real` | Simple CLI | Opens new session without login | ❌ No |
| `browser_subagent` | Visual feedback | Rate limited, may fail | ❌ No |

✅ Recommended Method: Chrome Remote Debugging (CDP)

This is the most reliable of the three methods: it drives your installed Chrome through a cloned profile, so the existing Google login carries over.

Prerequisites

  1. Python 3 with the websockets library
  2. Google Chrome installed at /Applications/Google Chrome.app/
  3. User logged into Google in Chrome

Step 1: Install websockets (if needed)

bash
pip3 install websockets
# Or in a virtual environment:
python3 -m venv .venv && ./.venv/bin/pip install websockets

Step 2: Launch Chrome with Remote Debugging (Non-default profile)

Important: Close any existing Chrome windows first, or use a different debugging port.
bash
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" &
Parameters explained:
  • --remote-debugging-port=9222: Enables CDP on port 9222
  • --user-data-dir: Points to the cloned Chrome profile from the prerequisites (a copy that carries your login session), not to the default profile directory
  • The trailing URL opens Gemini directly
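
The same launch can be driven from Python. This is a sketch under the assumptions of this guide (macOS Chrome path, port 9222, cloned profile in /tmp); `build_chrome_command` and `launch_chrome` are hypothetical helper names.

```python
import subprocess

CHROME_PATH = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

def build_chrome_command(port=9222, profile_dir="/tmp/chrome-gemini-profile",
                         url="https://gemini.google.com/", chrome_path=CHROME_PATH):
    """Return the argv list equivalent to the shell command above."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
        url,
    ]

def launch_chrome(**kwargs):
    """Start Chrome in the background and return the Popen handle."""
    return subprocess.Popen(
        build_chrome_command(**kwargs),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Passing the command as a list avoids shell quoting problems with the space in the Chrome path, and the returned handle can later be used to terminate only this Chrome instance.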

Step 3: Verify Connection (CDP)

bash
curl -s http://localhost:9222/json | python3 -m json.tool
Look for the Gemini page entry:
json
{
  "title": "Google Gemini",
  "url": "https://gemini.google.com/app",
  "webSocketDebuggerUrl": "ws://localhost:9222/devtools/page/XXXXXXXX"
}
Note: If the URL shows /app instead of just /, it means you're logged in.
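
The lookup can be done in Python without shelling out to curl. A minimal sketch, assuming the /json entry shape shown above; the function names are hypothetical, and the /app check is only the heuristic from the note.

```python
import json
import urllib.parse
import urllib.request

def find_gemini_target(targets):
    """Return the first CDP target that is a Gemini page, or None."""
    for target in targets:
        if target.get("type") == "page" and "gemini.google.com" in target.get("url", ""):
            return target
    return None

def is_logged_in(target):
    """Heuristic from the note above: a logged-in session lands on /app."""
    path = urllib.parse.urlparse(target.get("url", "")).path
    return path.startswith("/app")

def fetch_targets(port=9222):
    """Read the target list from Chrome's /json endpoint (needs Chrome running)."""
    with urllib.request.urlopen(f"http://localhost:{port}/json", timeout=5) as resp:
        return json.load(resp)
```

With Chrome running, `find_gemini_target(fetch_targets())` returns the entry whose `webSocketDebuggerUrl` the later steps connect to.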

Step 4: Send Query to Gemini

Save this as gemini_query.py or run inline:
python
import asyncio
import websockets
import json
import subprocess
import sys

async def query_gemini(query_text, wait_seconds=30):
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found. Make sure Chrome is open with Gemini.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}")
    
    async with websockets.connect(ws_url) as ws:
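        # Step 1: Input the query. json.dumps() turns the text into a quoted
        # JavaScript string literal, so quotes, backticks, or ${...} in the
        # query cannot break the script. The arrow-function wrapper keeps
        # `editor` out of the page's global scope, so the script can be
        # evaluated again on the same page without a redeclaration error.
        input_js = f'''
        (() => {{
            const editor = document.querySelector('div[contenteditable="true"]');
            if (!editor) return 'editor not found';
            editor.focus();
            document.execCommand('insertText', false, {json.dumps(query_text)});
            editor.dispatchEvent(new Event('input', {{bubbles: true}}));
            return 'success';
        }})()
        '''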
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": input_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Input result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 2: Click send button
        await asyncio.sleep(1)
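        # The aria-label is locale-specific: "傳送訊息" is the send button in a
        # Traditional Chinese UI. Use the label your own Gemini UI shows.
        click_js = '''
        (() => {
            const btn = document.querySelector('button[aria-label="傳送訊息"]');
            if (!btn) return 'button not found';
            btn.click();
            return 'clicked';
        })()
        '''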
        
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": click_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Click result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 3: Wait for response
        print(f"Waiting {wait_seconds} seconds for Gemini to respond...")
        await asyncio.sleep(wait_seconds)
        
        # Step 4: Extract the response
        extract_js = '''
        (() => {
            const markdownEls = document.querySelectorAll('.markdown');
            if (markdownEls.length === 0) return 'No response found';
            return markdownEls[markdownEls.length - 1].innerText;
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    query = sys.argv[1] if len(sys.argv) > 1 else "Example question: what is blockchain? Please answer in Traditional Chinese."
    result = asyncio.run(query_gemini(query, wait_seconds=30))
    print("\n" + "="*50)
    print("GEMINI RESPONSE:")
    print("="*50)
    print(result)
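
The script reads the value from each reply with chained `.get()` calls, which silently turns a JavaScript exception into 'unknown'. A minimal sketch of a stricter reader follows, assuming the standard CDP reply shape for Runtime.evaluate; `EvaluateError` and `evaluate_value` are hypothetical names.

```python
import json

class EvaluateError(RuntimeError):
    """Raised when the evaluated JavaScript threw or CDP reported an error."""

def evaluate_value(raw_message):
    """Extract the returned value from a Runtime.evaluate reply.

    raw_message is the JSON text received from the WebSocket.
    """
    reply = json.loads(raw_message)
    if "error" in reply:  # protocol-level failure (bad method, bad params)
        raise EvaluateError(reply["error"].get("message", "CDP error"))
    result = reply.get("result", {})
    if "exceptionDetails" in result:  # the page script itself threw
        details = result["exceptionDetails"]
        description = details.get("exception", {}).get("description")
        raise EvaluateError(description or details.get("text", "JavaScript exception"))
    return result.get("result", {}).get("value")
```

Calling `evaluate_value(await ws.recv())` in place of the chained lookups surfaces selector or syntax problems as soon as they happen.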

Step 5: Run the Query

bash
python3 gemini_query.py "Example question: your query here"
Or inline for simple queries:
bash
python3 << 'EOF'
import asyncio
import websockets
import json

async def send_to_gemini():
    # Get WebSocket URL
    import subprocess
    result = subprocess.run(["curl", "-s", "http://localhost:9222/json"], capture_output=True, text=True)
    pages = json.loads(result.stdout)
    ws_url = next(p["webSocketDebuggerUrl"] for p in pages if "gemini.google.com" in p.get("url", ""))
    
    async with websockets.connect(ws_url) as ws:
        # Input query
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": '''
                (() => {
                    const editor = document.querySelector('div[contenteditable="true"]');
                    editor.focus();
                    document.execCommand('insertText', false, 'Example question: analyze the future price trend of Bitcoin');
                    editor.dispatchEvent(new Event('input', {bubbles: true}));
                })()
            '''}
        }))
        await ws.recv()
        
        # Click send
        await asyncio.sleep(1)
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": '''document.querySelector('button[aria-label="傳送訊息"]').click()'''}
        }))
        await ws.recv()
        
        # Wait and extract
        await asyncio.sleep(30)
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": '''
                document.querySelectorAll('.markdown')[document.querySelectorAll('.markdown').length - 1].innerText
            '''}
        }))
        response = await ws.recv()
        print(json.loads(response)['result']['result']['value'])

asyncio.run(send_to_gemini())
EOF
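
Both scripts assume that the next WebSocket message is the reply to the command just sent. That holds while no CDP domains are enabled, but a sketch that matches replies by `id` stays correct if events start arriving. `cdp_call` is a hypothetical helper; `ws` is assumed to be any object with async `send` and `recv` methods, such as a connection from `websockets.connect()`.

```python
import asyncio
import itertools
import json

_next_id = itertools.count(1)

async def cdp_call(ws, method, params=None):
    """Send one CDP command and return the reply whose id matches it."""
    call_id = next(_next_id)
    await ws.send(json.dumps({"id": call_id, "method": method, "params": params or {}}))
    while True:
        message = json.loads(await ws.recv())
        # Events carry no "id"; replies to other commands carry a different one.
        if message.get("id") == call_id:
            return message
```

A call such as `await cdp_call(ws, "Runtime.evaluate", {"expression": "document.title"})` then replaces each send/recv pair.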


Alternative Method: browser-use CLI

This method is simpler but does not use your existing Chrome login. You'll need to log in manually each time.

Prerequisites

bash
# Create virtual environment
python3 -m venv .venv
# Install browser-use
./.venv/bin/pip install browser-use

Workflow

1) Open Gemini

bash
./.venv/bin/browser-use --browser real open "https://gemini.google.com/"

2) Get Page State

bash
./.venv/bin/browser-use --browser real state
Look for:
  • The input textbox: contenteditable=true role=textbox
  • The send button: aria-label=傳送訊息 (the label is locale-specific; this one comes from a Traditional Chinese UI)

3) Input Text via JavaScript eval

bash
./.venv/bin/browser-use --browser real eval "const editor = document.querySelector('div[contenteditable=\"true\"]'); editor.focus(); document.execCommand('insertText', false, 'YOUR QUERY HERE'); editor.dispatchEvent(new Event('input', {bubbles: true}));"

4) Click Send Button

bash
# Get current state to find button index
./.venv/bin/browser-use --browser real state
# Click the send button (replace INDEX with actual number)
./.venv/bin/browser-use --browser real click INDEX

5) Close Session

bash
./.venv/bin/browser-use close


Troubleshooting

Chrome Remote Debugging Issues

| Problem | Cause | Solution |
| --- | --- | --- |
| `curl: (7) Failed to connect` | Chrome not running with debugging | Restart Chrome with `--remote-debugging-port=9222` |
| WebSocket connection refused | Page ID changed | Re-fetch `/json` to get new WebSocket URL |
| "editor not found" | Page not fully loaded | Wait a few seconds before running script |
| "button not found" | Send button not visible | Check if text was actually input first |
| Login page instead of app | Wrong user-data-dir path | Verify path: `"$HOME/Library/Application Support/Google/Chrome"` |
| `DevTools remote debugging requires a non-default data directory` | Chrome disallows default profile for CDP | Launch with a cloned profile: `/tmp/chrome-gemini-profile` |
| `curl` shows connection refused even though Chrome is running | CDP not listening due to profile path | Ensure `--user-data-dir` is not default and the port is free |
| `No Gemini page found via CDP` | Gemini not loaded or not logged in | Open `https://gemini.google.com/` in the launched Chrome and wait for `/app` |
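
Several rows above come down to whether anything is listening on the debugging port. A small sketch for checking that from Python, assuming the default port 9222 on the local machine; `is_port_open` is a hypothetical helper name.

```python
import socket

def is_port_open(port=9222, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising
        return sock.connect_ex((host, port)) == 0
```

A result of `False` points at the launch flags or the profile path; `True` with no Gemini entry in `/json` points at the page itself.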

browser-use Issues

| Problem | Cause | Solution |
| --- | --- | --- |
| Not logged in | browser-use creates isolated session | Use Chrome Remote Debugging method instead |
| `Unknown key: "請"` error | CLI doesn't support Unicode | Use `eval` with JavaScript `execCommand` |
| Click doesn't work | Element index changed | Re-run `state` before each click |


Best Practices

  1. Always use Chrome Remote Debugging for queries requiring authentication
  2. Wait 30+ seconds for complex queries (Gemini's "Deep Think" mode takes longer)
  3. Check for .markdown elements to verify the response is complete
  4. Use inline Python for one-off queries; use the full script for automation
  5. Close the Chrome debugging session when done to avoid port conflicts
  6. Keep the profile cloned in /tmp/chrome-gemini-profile to avoid CDP blocking the default profile
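
Items 2 and 3 can be combined: instead of sleeping for a fixed time, poll the last `.markdown` element until its text stops changing. This is a sketch of that idea, not part of the scripts in this guide; `wait_until_stable` is a hypothetical helper, and `fetch_text` is assumed to be an async callable that runs the extraction step and returns the current text. Stable text is a heuristic for completeness, not a guarantee.

```python
import asyncio
import time

async def wait_until_stable(fetch_text, interval=2.0, stable_rounds=3, timeout=120.0):
    """Poll fetch_text() until its non-empty result is unchanged across
    stable_rounds consecutive re-reads, or until timeout seconds pass.

    Returns the last text seen (possibly incomplete if the timeout hit).
    """
    deadline = time.monotonic() + timeout
    last, unchanged = None, 0
    while time.monotonic() < deadline:
        text = await fetch_text()
        if text and text == last:
            unchanged += 1
            if unchanged >= stable_rounds:
                return text
        else:
            # Text is empty or still growing: restart the stability count.
            last, unchanged = text, 0
        await asyncio.sleep(interval)
    return last
```

Short answers return after a few polls, while long ones are bounded by `timeout` rather than by a guessed sleep.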


Complete Example: Crypto Price Analysis

Complete Workflow

bash
# Step 1: Prepare a copy of the Chrome profile (works around the CDP default-directory restriction)
rm -rf /tmp/chrome-gemini-profile
rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/

# Step 2: Launch Chrome in remote debugging mode
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" > /dev/null 2>&1 &

# Step 3: Wait for the page to load and verify the connection
sleep 8
curl -s http://localhost:9222/json | python3 -c "import sys, json; pages = json.load(sys.stdin); gemini = [p for p in pages if p.get('type') == 'page' and 'gemini.google.com' in p.get('url', '')]; print(f\"Gemini page found: {gemini[0]['url'] if gemini else 'not found'}\")"

Method 1: Full Query Script (query_gemini.py)

Save the following as query_gemini.py:
python
import asyncio
import websockets
import json
import subprocess
import sys

async def query_gemini(query_text, wait_seconds=60):
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found. Make sure Chrome is open with Gemini.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}")
    
    async with websockets.connect(ws_url) as ws:
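        # Step 1: Input the query. json.dumps() turns the text into a quoted
        # JavaScript string literal, so quotes, backticks, or ${...} in the
        # query cannot break the script. The arrow-function wrapper keeps
        # `editor` out of the page's global scope, so the script can be
        # evaluated again on the same page without a redeclaration error.
        input_js = f'''
        (() => {{
            const editor = document.querySelector('div[contenteditable="true"]');
            if (!editor) return 'editor not found';
            editor.focus();
            document.execCommand('insertText', false, {json.dumps(query_text)});
            editor.dispatchEvent(new Event('input', {{bubbles: true}}));
            return 'success';
        }})()
        '''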
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": input_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Input result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 2: Click send button. The aria-label is locale-specific:
        # "傳送訊息" matches a Traditional Chinese UI.
        await asyncio.sleep(1)
        click_js = '''
        (() => {
            const btn = document.querySelector('button[aria-label="傳送訊息"]');
            if (!btn) return 'button not found';
            btn.click();
            return 'clicked';
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 2,
            "method": "Runtime.evaluate",
            "params": {"expression": click_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        print(f"Click result: {result.get('result', {}).get('result', {}).get('value', 'unknown')}")
        
        # Step 3: Wait for response
        print(f"Waiting {wait_seconds} seconds for Gemini to respond...")
        await asyncio.sleep(wait_seconds)
        
        # Step 4: Extract the response - try to get complete content
        extract_js = '''
        (() => {
            const markdownEls = document.querySelectorAll('.markdown');
            if (markdownEls.length === 0) return 'No response found';
            const lastMarkdown = markdownEls[markdownEls.length - 1];
            // Fall back to textContent when innerText is empty
            return lastMarkdown.innerText || lastMarkdown.textContent || 'Empty response';
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 3,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    query = """Example question: please analyze the price forecast trends for BTC and ETH in detail.
Include relevant technical indicators, and answer in Traditional Chinese."""
    result = asyncio.run(query_gemini(query, wait_seconds=60))
    print("\n" + "="*50)
    print("GEMINI RESPONSE:")
    print("="*50)
    print(result)

**How to run:**

```bash
python3 query_gemini.py
```

Method 2: Extract an Existing Response (get_gemini_response.py)

If the Gemini page already shows a response, this script extracts it directly:
python
import asyncio
import websockets
import json
import subprocess

async def get_all_gemini_content():
    # Get the Gemini page WebSocket URL
    result = subprocess.run(
        ["curl", "-s", "http://localhost:9222/json"],
        capture_output=True, text=True
    )
    pages = json.loads(result.stdout)
    
    # Find Gemini page
    gemini_page = None
    for page in pages:
        if page.get("type") == "page" and "gemini.google.com" in page.get("url", ""):
            gemini_page = page
            break
    
    if not gemini_page:
        print("Error: Gemini page not found.")
        return None
    
    ws_url = gemini_page["webSocketDebuggerUrl"]
    print(f"Connecting to: {ws_url}\n")
    
    async with websockets.connect(ws_url) as ws:
        # Extract all markdown content from the page
        extract_js = '''
        (function() {
            const markdownEls = document.querySelectorAll('.markdown');
            console.log('Found markdown elements:', markdownEls.length);
            
            if(markdownEls.length === 0) {
                return 'No markdown elements found';
            }
            
            // Get the last two markdown elements (user query and AI response)
            const responses = [];
            const startIdx = Math.max(0, markdownEls.length - 2);
            
            for(let i = startIdx; i < markdownEls.length; i++) {
                const text = markdownEls[i].innerText || markdownEls[i].textContent || '';
                if(text.trim()) {
                    responses.push(`[Response ${i+1}]:\\n${text}`);
                }
            }
            
            return responses.join('\\n\\n' + '='.repeat(80) + '\\n\\n');
        })()
        '''
        
        await ws.send(json.dumps({
            "id": 1,
            "method": "Runtime.evaluate",
            "params": {"expression": extract_js, "returnByValue": True}
        }))
        response = await ws.recv()
        result = json.loads(response)
        content = result.get('result', {}).get('result', {}).get('value', 'No content')
        
        return content

# Main execution
if __name__ == "__main__":
    result = asyncio.run(get_all_gemini_content())
    print("="*80)
    print("GEMINI CONVERSATION:")
    print("="*80)
    print(result)

**How to run:**

```bash
python3 get_gemini_response.py
```

Practical Usage Example

bash
# Complete flow
rm -rf /tmp/chrome-gemini-profile && \
rsync -a "$HOME/Library/Application Support/Google/Chrome/" /tmp/chrome-gemini-profile/ && \
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir="/tmp/chrome-gemini-profile" \
  "https://gemini.google.com/" > /dev/null 2>&1 &

# Wait, then run the query
sleep 8 && python3 query_gemini.py

Resource Cleanup

After finishing your queries, clean up temporary files and resources:
bash
# 1. Close the Chrome debugging session
#    (force-kills every process whose name matches "Google Chrome", including regular windows)
pkill -9 "Google Chrome"

# 2. Remove the temporary profile (optional, frees disk space)
rm -rf /tmp/chrome-gemini-profile

# 3. Remove temporary scripts and output files created during testing
rm -f query_gemini.py get_gemini_response.py get_all_gemini_content.py
rm -f gemini_response.txt gemini_full_response.txt

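**Best practices:**

1. **Close Chrome after each use** - frees port 9222
2. **Clean up the temporary profile regularly** - `/tmp/chrome-gemini-profile` can take up several hundred MB
3. **Keep the working directory tidy** - delete test scripts and fold frequently used ones into your project
4. **Use the full script** - save the `query_gemini.py` above as a project file instead of recreating it each time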

---

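Notes

  1. Wait time - for complex queries (such as in-depth analysis), use wait_seconds=60 or longer
  2. Truncated responses - if a response is very long, you may need to extract it several times or use the get_all_gemini_content.py approach
  3. Login state - make sure the Chrome profile is logged into a Google account
  4. Network stability - the CDP connection requires a stable network environment
  5. Concurrency - avoid running multiple Chrome debugging sessions on the same port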