Total 30,488 skills; Data Processing has 1,462 skills
Showing 12 of 1,462 skills
Distributed computing for larger-than-RAM pandas/NumPy workflows. Use when you need to scale existing pandas/NumPy code beyond memory or across clusters. Best for parallel file processing, distributed ML, and integration with existing pandas code. For out-of-core analytics on a single machine use vaex; for in-memory speed use polars.
Comprehensive molecular biology toolkit. Use for sequence manipulation, file parsing (FASTA/GenBank/PDB), phylogenetics, and programmatic NCBI/PubMed access (Bio.Entrez). Best for batch processing, custom bioinformatics pipelines, BLAST automation. For quick lookups use gget; for multi-service integration use bioservices.
Professional stock price tracking, fundamental analysis, and financial reporting tool. Supports global markets (US, KR, etc.), crypto, and forex with real-time data. Features: (1) real-time quotes, (2) valuation metrics (PE, EPS, ROE), (3) earnings calendar and consensus, (4) high-quality candlestick and line charts with technical indicators (MA5/20/60).
Macro liquidity monitoring and risk early-warning system. It tracks 4 core indicators (Fed net liquidity, the Secured Overnight Financing Rate (SOFR), the MOVE Treasury volatility index, and yen carry trade signals) to assess liquidity conditions in the global financial system in real time, and outputs liquidity ratings and risk response recommendations. Use this skill when users mention topics such as liquidity, Fed balance sheet reduction (QT), the TGA, the overnight reverse repo facility (ON RRP), the SOFR rate, the MOVE index, Treasury volatility, the yen carry trade, USDJPY and interest rate differentials, the impact of QT on markets, whether money is tight, liquidity inflection points, or tightening financial conditions. Trigger it even for broad questions such as "how is liquidity right now" or "is the Fed draining or injecting liquidity," so that the answer follows a structured analytical framework.
Bitcoin bottom-timing judgment model. It tracks 6 core indicators (RSI technical oversold, volume dry-up, MVRV ratio, social media fear index, miner shutdown price, and long-term holder behavior) to evaluate whether Bitcoin has entered a bottom-fishing zone, and outputs a bottom-fishing rating and position-building recommendations. Use this skill when users mention topics such as Bitcoin bottom-fishing, whether BTC has bottomed out, Bitcoin being oversold, MVRV, miner shutdown price, long-term holders (LTH), the Bitcoin fear index, whether to buy Bitcoin, BTC position entry timing, crypto market bottom signals, or the Bitcoin cycle bottom. Trigger it even for simple questions such as "Can I buy the dip on Bitcoin now?" or "Has BTC finished dropping?", so that the answer follows a structured analysis framework.
Cheminformatics toolkit for fine-grained molecular control. SMILES/SDF parsing, descriptors (MW, LogP, TPSA), fingerprints, substructure search, 2D/3D generation, similarity, reactions. For standard workflows with simpler interface, use datamol (wrapper around RDKit). Use rdkit for advanced control, custom sanitization, specialized algorithms.
Query the CELLxGENE Census (61M+ cells) programmatically. Use when you need expression data across tissues, diseases, or cell types from the largest curated single-cell atlas. Best for population-scale queries, reference atlas comparisons. For analyzing your own data use scanpy or scvi-tools.
Lightweight whole-slide image (WSI) tile extraction and preprocessing. Use for basic slide processing: tissue detection, tile extraction, and stain normalization for H&E images. Best for simple pipelines, dataset preparation, and quick tile-based analysis. For advanced spatial proteomics, multiplexed imaging, or deep learning pipelines use pathml.
Standard single-cell RNA-seq analysis pipeline. Use for QC, normalization, dimensionality reduction (PCA/UMAP/t-SNE), clustering, differential expression, and visualization. Best for exploratory scRNA-seq analysis with established workflows. For deep learning models use scvi-tools; for data format questions use anndata.
Complete mass spectrometry analysis platform. Use for proteomics workflows: feature detection, peptide identification, protein quantification, and complex LC-MS/MS pipelines. Supports extensive file formats and algorithms. Best for proteomics and comprehensive MS data processing. For simple spectral comparison and metabolite ID use matchms.
Spectral similarity and compound identification for metabolomics. Use for comparing mass spectra, computing similarity scores (cosine, modified cosine), and identifying unknown compounds from spectral libraries. Best for metabolite identification, spectral matching, library searching. For full LC-MS/MS proteomics pipelines use pyopenms.
Data structure for annotated matrices in single-cell analysis. Use when working with .h5ad files or integrating with the scverse ecosystem. This is the data format skill. For analysis workflows use scanpy; for probabilistic models use scvi-tools; for population-scale queries use cellxgene-census.