Total 30,511 skills; Data Processing has 1,462 skills
Showing 12 of 1,462 skills
Extract Udemy course content to markdown. Use when user asks to scrape/crawl Udemy course pages.
Collect app events via evalpkgs into SQLite, then filter and report capture_results to Feishu Bitable with retry-safe writeback. Use for collect-start/collect-stop/filter/report/retry-reset workflows.
Retrieve paper metadata from arXiv using keyword queries and save results as JSONL (`papers/papers_raw.jsonl`). **Trigger**: arXiv, arxiv, paper search, metadata retrieval, literature search, paper retrieval, fetch metadata, offline import. **Use when**: you need an initial paper set (Stage C1 of a survey/snapshot) sourced from arXiv, via online search or offline import of an export. **Skip if**: a usable `papers/papers_raw.jsonl` already exists, or the data source is not arXiv. **Network**: online search requires network access; offline `--input <export.*>` does not. **Guardrail**: metadata only; do not write long prose in `output/`.
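As a rough illustration of the metadata-to-JSONL step described in the arXiv entry above, the sketch below builds a keyword query URL for the public arXiv API and converts an Atom response into JSONL lines. It is not the skill's actual implementation: the function names (`build_query_url`, `parse_atom`, `to_jsonl`) and the record fields (`id`, `title`, `authors`, `published`) are assumptions chosen for the example, and the real schema of `papers/papers_raw.jsonl` is not stated in the listing.

```python
import json
import urllib.parse
import xml.etree.ElementTree as ET

# Atom namespace used by arXiv API responses.
ATOM = "{http://www.w3.org/2005/Atom}"
API = "http://export.arxiv.org/api/query"


def build_query_url(keywords, max_results=50):
    """Build an arXiv API URL for a keyword search across all fields."""
    params = {
        "search_query": f"all:{keywords}",
        "start": 0,
        "max_results": max_results,
    }
    return API + "?" + urllib.parse.urlencode(params)


def parse_atom(xml_text):
    """Extract metadata only (no abstracts or full text) from an Atom feed.

    The record fields are hypothetical, picked for illustration.
    """
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.iter(ATOM + "entry"):
        records.append({
            "id": entry.findtext(ATOM + "id", default="").strip(),
            # Titles may wrap across lines in the feed; collapse whitespace.
            "title": " ".join(entry.findtext(ATOM + "title", default="").split()),
            "authors": [
                a.findtext(ATOM + "name", default="").strip()
                for a in entry.findall(ATOM + "author")
            ],
            "published": entry.findtext(ATOM + "published", default="").strip(),
        })
    return records


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)
```

For an online run, the response body would come from fetching `build_query_url(...)` (for example with `urllib.request.urlopen`) and the output of `to_jsonl` would be written to `papers/papers_raw.jsonl`; an offline import would pass the contents of a saved export to `parse_atom` instead, assuming the export is in Atom format.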
High-performance Rust web crawler with stealth mode, LLM-ready Markdown export, multi-format output, sitemap discovery, and robots.txt support. Optimized for content extraction, site mapping, structure analysis, and LLM/RAG pipelines.
Design and document statistical algorithms with pseudocode and complexity analysis
Six-phase protocol for adapting methods across research domains
Numerical algorithms and computational techniques for statistics
Structured methodology for constructing and verifying mathematical proofs in statistical research
Comprehensive guide for MDAnalysis - the Python library for analyzing molecular dynamics trajectories. Use for trajectory loading, RMSD/RMSF calculations, distance/angle/dihedral analysis, atom selections, hydrogen bonds, solvent accessible surface area, protein structure analysis, membrane analysis, and integration with Biopython. Essential for MD simulation analysis.
A Pythonic interface to the HDF5 binary data format. It allows you to store huge amounts of numerical data and easily manipulate that data from NumPy. Features a hierarchical structure similar to a file system. Use for storing datasets larger than RAM, organizing complex scientific data hierarchically, storing numerical arrays with high-speed random access, keeping metadata attached to data, sharing data between languages, and reading/writing large datasets in chunks.
A Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics. Great for exploring relationships between variables and visualizing distributions. Use for statistical data visualization, exploratory data analysis (EDA), relationship plots, distribution plots, categorical comparisons, regression visualization, heatmaps, cluster maps, and creating publication-quality statistical graphics from Pandas DataFrames.
Protein Dynamics, Evolution, and Structure analysis. Specialized in Normal Mode Analysis (NMA) using Anisotropic (ANM) and Gaussian Network Models (GNM). Features tools for structural ensemble analysis, PCA, and co-evolutionary analysis (Evol). Use for protein flexibility prediction, collective motions, structural ensemble comparison, hinge region identification, binding site analysis, MD trajectory filtering, and evolutionary analysis.