feature-engineering-kit

Feature Engineering Kit

Automated feature engineering with encodings, scaling, and transformations.

Features

  • Encodings: One-hot, label, target encoding
  • Scaling: Standard, min-max, robust scaling
  • Polynomial Features: Generate polynomial and interaction terms
  • Binning: Discretize continuous features
  • Date Features: Extract time-based features
  • Text Features: TF-IDF, word counts
  • Missing Value Handling: Imputation strategies

CLI Usage

bash
python feature_engineering.py --data train.csv --output engineered.csv --config config.json
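
The command above takes three flags. As a rough sketch of how a script with that interface could be wired together, the example below parses the same flags with `argparse` and runs a trivial transformation. The config schema is not documented here, so the `drop_columns` key is purely hypothetical, as are the function names and the assumption that all three flags are required; the real `feature_engineering.py` may behave differently.

```python
import argparse
import json
import tempfile
from pathlib import Path

import pandas as pd


def build_parser() -> argparse.ArgumentParser:
    # Flag names match the documented command; treating them as
    # required is an assumption of this sketch.
    parser = argparse.ArgumentParser(description="Feature engineering CLI (sketch)")
    parser.add_argument("--data", required=True, help="input CSV path")
    parser.add_argument("--output", required=True, help="output CSV path")
    parser.add_argument("--config", required=True, help="JSON config path")
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)
    with open(args.config, encoding="utf-8") as fh:
        config = json.load(fh)
    df = pd.read_csv(args.data)
    # Hypothetical config key, invented for this sketch only.
    df = df.drop(columns=config.get("drop_columns", []))
    df.to_csv(args.output, index=False)


# Demo run against temporary files so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    pd.DataFrame({"id": [1, 2], "x": [0.5, 1.5]}).to_csv(
        tmp_dir / "train.csv", index=False
    )
    (tmp_dir / "config.json").write_text(
        json.dumps({"drop_columns": ["id"]}), encoding="utf-8"
    )
    main(
        [
            "--data", str(tmp_dir / "train.csv"),
            "--output", str(tmp_dir / "engineered.csv"),
            "--config", str(tmp_dir / "config.json"),
        ]
    )
    result = pd.read_csv(tmp_dir / "engineered.csv")
```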

Dependencies

  • scikit-learn>=1.3.0
  • pandas>=2.0.0
  • numpy>=1.24.0