Search Results: data-collection

Found 56 Skills

Data Processing | orientpine/honeypot

data-collection-guide

Chapter 2: Data collection quality standards and verification methods

Data Processing | pauljbernard/content

learning-data-collection

Data collection for evidence-based learning research and evaluation.

Data Processing | firecrawl/firecrawl-workf...

firecrawl-dashboard-reporting

Pull metrics from analytics dashboards and internal web tools with Firecrawl browser. Use when the user needs dashboard reporting, cross-platform metric summaries, authenticated analytics extraction, date-range reports, or structured metrics from web dashboards.

547

Data Processing | tavily-ai/skills

crawl

Crawl any website and save pages as local markdown files. Use when you need to download documentation, knowledge bases, or web content for offline access or analysis. No code required - just provide a URL.

107

1 scripts/Attention

Automation | affaan-m/everything-claud...

data-scraper-agent

Build a fully automated AI-powered data collection agent for any public source — job boards, prices, news, GitHub, sports, anything. Scrapes on a schedule, enriches data with a free LLM (Gemini Flash), stores results in Notion/Sheets/Supabase, and learns from user feedback. Runs 100% free on GitHub Actions. Use when the user wants to monitor, collect, or track any public data automatically.

Tools & Utilities | tavily-ai/skills

tavily-crawl

Crawl websites and extract content from multiple pages via the Tavily CLI. Use this skill when the user wants to crawl a site, download documentation, extract an entire docs section, bulk-extract pages, save a site as local markdown files, or says "crawl", "get all the pages", "download the docs", "extract everything under /docs", "bulk extract", or needs content from many pages on the same domain. Supports depth/breadth control, path filtering, semantic instructions, and saving each page as a local markdown file.

AI & Machine Learning | psylch/better-skills

skill-publish

Package an agent skill into a complete GitHub repository ready for distribution via skills.sh. Generates README, LICENSE, plugin.json, marketplace.json, .gitignore, and the proper directory structure. Optionally initializes a git repo and creates a GitHub repository. This skill should be used when publishing a skill, packaging a skill for distribution, preparing a skill repo, or when the user says 'publish skill', 'package skill', 'release skill', '发布技能', '打包 skill'.

1 scripts/Checked

Data Processing | agentic-reserve/blockint-...

blockchain-spider-toolkit

Points to the BlockchainSpider open-source Python/Scrapy toolkit for collecting on-chain data—transfer subgraphs around an address or tx, EVM and Solana block/transaction ingestion, receipts/logs, and optional label plugins. Use when the user wants to build datasets, offline traces, or research pipelines alongside blockchain-analytics-operations and solana-tracing-specialist—not as a substitute for RPC provider ToS, rate limits, or legal review of sensitive crawls.

Backend Development | mohitmishra786/low-level-...

pgo

Profile-guided optimisation skill for C/C++ with GCC and Clang. Use when squeezing maximum runtime performance after standard optimisation plateaus, implementing two-stage PGO builds, collecting profile data, or applying BOLT for post-link optimisation. Activates on queries about PGO, profile-guided optimization, fprofile-generate, fprofile-use, instrumented builds, or BOLT.

Data Processing | sales-skills/sales

sales-lobstr

Lobstr.io platform help — no-code web scraping platform with 50+ ready-made scrapers for Google Maps, LinkedIn Sales Navigator, Twitter, YouTube, and more. Features cookie-based login sync, scheduled automation, multi-threading, and a full API with Python SDK and MCP Server. Use when configuring a Lobstr scraper, exporting data to Google Sheets or S3, setting up scheduled scraping, working with the Lobstr API or Python SDK, or managing credits. Do NOT use for general prospect list strategy (use /sales-prospect-list), cross-platform enrichment strategy (use /sales-enrich), or integration strategy (use /sales-integration).

Data Processing | asgard-ai-platform/skills

algo-seo-crawl

Implement a web crawler pipeline covering URL discovery, fetching, parsing, and storage. Use this skill when the user needs to build a site crawler, audit website structure, or collect web data systematically — even if they say 'scrape a website', 'crawl all pages', or 'site audit spider'.

Backend Development | laravel/agent-skills

configure-nightwatch

Configures Laravel Nightwatch data collection, sampling rates, filtering rules, and redaction policies. Use when setting up Nightwatch, managing data volume, protecting sensitive data (PII), or optimizing event collection for production workloads.
