Search Results: content-extraction

Found 102 Skills

Marketing & Growth | bencium/bencium-claude-co...

bencium-aeo

Generate AEO-optimized content (Answer Engine Optimization) for AI search visibility - ChatGPT, Claude, Gemini, AI Overviews. Use when optimizing websites for AI citations, creating FAQ schemas, evidence panels, or analyzing content for LLM extraction readiness.

Document Processing | aidenwu0209/paddleocr-ski...

paddleocr-doc-parsing

Advanced document parsing with PaddleOCR. Returns complete document structure including text, tables, formulas, charts, and layout information. Claude extracts relevant content based on user needs.

5 scripts/Checked

Backend Development | answerzhao/agent-skills

web-reader

Implement web page content extraction capabilities using the z-ai-web-dev-sdk. Use this skill when the user needs to scrape web pages, extract article content, retrieve page metadata, or build applications that process web content. Supports automatic content extraction with title, HTML, and publication time retrieval.

1 script/Checked

Automation | badlogic/pi-skills

browser-tools

Interactive browser automation via Chrome DevTools Protocol. Use when you need to interact with web pages, test frontends, or when user interaction with a visible browser is required.

8 scripts/Attention

Tools & Utilities | merit-systems/agentcash-s...

web-research

Neural web search and content extraction using x402-protected APIs. Better than WebSearch for deep research and WebFetch for blocked sites.

USE FOR:
- Deep web research and investigation
- Finding similar pages to a reference URL
- Extracting clean text from web pages
- Scraping sites that block standard fetchers
- Getting direct answers to factual questions
- Research requiring multiple sources

TRIGGERS:
- "research", "investigate", "deep dive", "find sources"
- "similar to", "pages like", "more like this"
- "scrape", "extract content from", "get the text from"
- "blocked site", "can't access", "paywall"
- "what is", "explain", "answer this"

Use `npx agentcash fetch` for stableenrich.dev endpoints. Prefer Exa for semantic/neural search, Firecrawl for direct scraping.

Tools & Utilities | tavily-ai/skills

tavily-extract

Extract clean markdown or text content from specific URLs via the Tavily CLI. Use this skill when the user has one or more URLs and wants their content, says "extract", "grab the content from", "pull the text from", "get the page at", "read this webpage", or needs clean text from web pages. Handles JavaScript-rendered pages, returns LLM-optimized markdown, and supports query-focused chunking for targeted extraction. Can process up to 20 URLs in a single call.

Document Processing | rookie-ricardo/erduo-skil...

web-to-markdown

Convert a web URL into cleaned Markdown with deterministic routing. Use when Codex needs to read article-like content from links and should apply source-aware fetch strategies: default to r.jina.ai for general pages (including X/Twitter), use defuddle.md for YouTube links, and use browser-impersonated extraction for WeChat/Zhihu/Feishu pages with Mozilla Readability cleanup.

4 scripts/Attention

Data Processing | creminiai/cremini-skills

web-fetch

Fetch web page content via Chrome DevTools Protocol (CDP). Full JS rendering, handles redirects (including Google News). Use when you need to read the text content of a web page, scrape articles, or extract information from URLs. Zero dependencies — Python 3 stdlib only. Cross-platform (Mac, Windows, Linux).

1 script/Checked

Tools & Utilities | googleworkspace/cli

recipe-draft-email-from-doc

Read content from a Google Doc and use it as the body of a Gmail message.

Tools & Utilities | tavily-ai/skills

tavily-cli

Web search, content extraction, crawling, and deep research via the Tavily CLI. Use this skill whenever the user wants to search the web, find articles, research a topic, look something up online, extract content from a URL, grab text from a webpage, crawl documentation, download a site's pages, discover URLs on a domain, or conduct in-depth research with citations. Also use when they say "fetch this page", "pull the content from", "get the page at https://", "find me articles about", or reference extracting data from external websites. This provides LLM-optimized web search, content extraction, site crawling, URL discovery, and AI-powered deep research — capabilities beyond what agents can do natively. Do NOT trigger for local file operations, git commands, deployments, or code editing tasks.

Tools & Utilities | tavily-ai/skills

tavily-crawl

Crawl websites and extract content from multiple pages via the Tavily CLI. Use this skill when the user wants to crawl a site, download documentation, extract an entire docs section, bulk-extract pages, save a site as local markdown files, or says "crawl", "get all the pages", "download the docs", "extract everything under /docs", "bulk extract", or needs content from many pages on the same domain. Supports depth/breadth control, path filtering, semantic instructions, and saving each page as a local markdown file.

Document Processing | tw93/waza

read

Use when fetching a URL, web page, or PDF as Markdown. Not for local files already in the repo.
