Use when "Polars", "fast dataframe", "lazy evaluation", "Arrow backend", or asking about "pandas alternative", "parallel dataframe", "large CSV processing", "ETL pipeline", "expression API"
npx skill4agent add eyadsibai/ltk polars

| Mode | Function | Executes | Use Case |
|---|---|---|---|
| Eager | `read_*()` | Immediately | Small data, exploration |
| Lazy | `scan_*()` + `collect()` | On `collect()` | Large data, pipelines |
| Operation | Purpose |
|---|---|
| `select()` | Choose columns |
| `filter()` | Choose rows by condition |
| `with_columns()` | Add/modify columns |
| `drop()` | Remove columns |
| `head()` / `tail()` | First/last n rows |
| Operation | Purpose |
|---|---|
| `group_by().agg()` | Group and aggregate |
| `pivot()` | Reshape wide |
| `unpivot()` | Reshape long |
| `unique()` | Distinct values |
| Join Type | Description |
|---|---|
| inner | Matching rows only |
| left | All left + matching right |
| outer | All rows from both |
| cross | Cartesian product |
| semi | Left rows with match |
| anti | Left rows without match |
| Expression | Purpose |
|---|---|
| `pl.col("name")` | Reference column |
| `pl.lit(value)` | Literal value |
| `pl.all()` | All columns |
| `pl.exclude("name")` | All except |
| Category | Methods |
|---|---|
| Aggregation | `sum()`, `mean()`, `min()`, `max()`, `count()` |
| String | `str.contains()`, `str.replace()`, `str.to_lowercase()` |
| DateTime | `dt.year()`, `dt.month()`, `dt.day()` |
| Conditional | `when().then().otherwise()` |
| Window | `over()` |
| Pandas | Polars |
|---|---|
| `df[df.x > 0]` | `df.filter(pl.col("x") > 0)` |
| `df.assign(y=...)` | `df.with_columns(...)` |
| `df.groupby("a").agg(...)` | `df.group_by("a").agg(...)` |
| `df["x"].apply(f)` | `df["x"].map_elements(f)` |
| `df.rename(columns={...})` | `df.rename({...})` |
| Format | Read | Write | Notes |
|---|---|---|---|
| CSV | `read_csv` / `scan_csv` | `write_csv` | Human readable |
| Parquet | `read_parquet` / `scan_parquet` | `write_parquet` | Fast, compressed |
| JSON | `read_ndjson` | `write_ndjson` | Newline-delimited |
| IPC/Arrow | `read_ipc` / `scan_ipc` | `write_ipc` | Zero-copy |
| Tip | Why |
|---|---|
| Use lazy mode | Query optimization |
| Use Parquet | Column-oriented, compressed |
| Select columns early | Projection pushdown |
| Filter early | Predicate pushdown |
| Avoid Python UDFs | Breaks parallelism |
| Use expressions | Vectorized operations |
| Set dtypes on read | Avoid inference overhead |
| Tool | Best For | Limitations |
|---|---|---|
| Polars | 1-100GB, speed critical | Must fit in RAM |
| Pandas | Small data, ecosystem | Slow, memory hungry |
| Dask | Larger than RAM | More complex API |
| Spark | Cluster computing | Infrastructure overhead |
| DuckDB | SQL interface | Different API style |