Loading...
Loading...
Queries data warehouse and answers business questions about data. Handles questions requiring database/warehouse queries including "who uses X", "how many Y", "show me Z", "find customers", "what is the count", data lookups, metrics, trends, or SQL analysis.
npx skill4agent add astronomer/agents analyzing-datacurl -LsSf https://astral.sh/uv/install.sh | shuv run scripts/cli.py pattern lookup "<user's question>"Analysis Progress:
- [ ] Step 1: pattern lookup (check for cached strategy)
- [ ] Step 2: concept lookup (check for known tables)
- [ ] Step 3: Search codebase for table definitions (Grep)
- [ ] Step 4: Read SQL file to get table/column names
- [ ] Step 5: Execute query via kernel (run_sql)
- [ ] Step 6: learn_concept (ALWAYS before presenting results)
- [ ] Step 7: learn_pattern (ALWAYS if discovery required)
- [ ] Step 8: record_pattern_outcome (if you used a pattern in Step 1)
- [ ] Step 9: Present findings to useruv run scripts/cli.py warehouse list # List available warehouses
uv run scripts/cli.py start # Start kernel with default warehouse
uv run scripts/cli.py start -w my_pg # Start with specific warehouse
uv run scripts/cli.py exec "..." # Execute Python code
uv run scripts/cli.py status # Check kernel status
uv run scripts/cli.py restart # Restart kernel
uv run scripts/cli.py stop # Stop kernel
uv run scripts/cli.py install plotly # Install additional packages# Look up a concept
uv run scripts/cli.py concept lookup customers
# Learn a new concept
uv run scripts/cli.py concept learn customers HQ.MART_CUST.CURRENT_ASTRO_CUSTS -k ACCT_ID
# List all concepts
uv run scripts/cli.py concept list
# Import concepts from warehouse.md
uv run scripts/cli.py concept import -p /path/to/warehouse.md# Look up patterns for a question
uv run scripts/cli.py pattern lookup "who uses operator X"
# Learn a new pattern
uv run scripts/cli.py pattern learn operator_usage \
-q "who uses X operator" \
-q "which customers use X" \
-s "1. Query TASK_RUNS for operator_class" \
-s "2. Join with ORGS on org_id" \
-t "HQ.MODEL_ASTRO.TASK_RUNS" \
-t "HQ.MODEL_ASTRO.ORGANIZATIONS" \
-g "TASK_RUNS is huge - always filter by date"
# Record pattern outcome
uv run scripts/cli.py pattern record operator_usage --success
# List all patterns
uv run scripts/cli.py pattern list
# Delete a pattern
uv run scripts/cli.py pattern delete operator_usage# Look up cached table schema
uv run scripts/cli.py table lookup HQ.MART_CUST.CURRENT_ASTRO_CUSTS
# Cache a table schema
uv run scripts/cli.py table cache DB.SCHEMA.TABLE -c '[{"name":"id","type":"INT"}]'
# List all cached tables
uv run scripts/cli.py table list
# Delete from cache
uv run scripts/cli.py table delete DB.SCHEMA.TABLE# View cache statistics
uv run scripts/cli.py cache status
# Clear all caches
uv run scripts/cli.py cache clear
# Clear only stale entries (older than 90 days)
uv run scripts/cli.py cache clear --stale-only# 1. Check for existing patterns
uv run scripts/cli.py pattern lookup "how many customers"
# 2. Check for known concepts
uv run scripts/cli.py concept lookup customers
# 3. Execute query
uv run scripts/cli.py exec "df = run_sql('SELECT COUNT(*) FROM HQ.MART_CUST.CURRENT_ASTRO_CUSTS')"
uv run scripts/cli.py exec "print(df)"
# 4. Cache what we learned
uv run scripts/cli.py concept learn customers HQ.MART_CUST.CURRENT_ASTRO_CUSTS -k ACCT_ID| Function | Description |
|---|---|
| Execute SQL, return Polars DataFrame |
| Execute SQL, return Pandas DataFrame |
| Polars library (imported) |
| Pandas library (imported) |
Grep pattern="<concept>" glob="**/*.sql"| Repo Type | Where to Look |
|---|---|
| Gusty | |
| dbt | |
| Concept | Table | Key Column | Date Column |
|---|---|---|---|
| customers | HQ.MART_CUST.CURRENT_ASTRO_CUSTS | ACCT_ID | - |
| organizations | HQ.MODEL_ASTRO.ORGANIZATIONS | ORG_ID | CREATED_TS |
| deployments | HQ.MODEL_ASTRO.DEPLOYMENTS | DEPLOYMENT_ID | CREATED_TS |
| task_runs | HQ.MODEL_ASTRO.TASK_RUNS | - | START_TS |
| dag_runs | HQ.MODEL_ASTRO.DAG_RUNS | - | START_TS |
| users | HQ.MODEL_ASTRO.USERS | USER_ID | - |
| accounts | HQ.MODEL_CRM.SF_ACCOUNTS | ACCT_ID | - |
METRICS_*MART_*AGG_*| Operation | Snowflake | PostgreSQL | BigQuery |
|---|---|---|---|
| Date subtract | | | |
| Case-insensitive | | | |