ODPS Information Schema
This skill is for Information Schema (IS) metadata queries ONLY. If the user's question is about DDL/DML, listing tables, or general MaxCompute usage (not IS views), do NOT use this skill — use MCP tools (list_tables, get_table_schema) or odpscmd instead.
Query MaxCompute metadata through INFORMATION_SCHEMA views for storage, cost, permission, task, and governance analysis.
Prerequisites <a name="prerequisites"></a>
MANDATORY: Every IS query MUST set namespace flag. Without it, ALL queries fail with "Table not found".
- MCP:
hints={"odps.namespace.schema":"true"}
in
- odpscmd:
SET odps.namespace.schema=true;
before each query
- No exceptions. Applies to ALL
SYSTEM_CATALOG.INFORMATION_SCHEMA.*
queries.
IS views require tenant-level permission. If you get access errors, the user needs tenant-level role — see references/ram-policies.md for Policy template.
Data freshness: History views (TASKS_HISTORY, TUNNELS_HISTORY) have ~5 min delay, realtime views ~3 hours. For yesterday's data, query after 06:00 to ensure completeness.
Tenant-level vs Project-level IS: MaxCompute has two IS levels.
Tenant-level (
SYSTEM_CATALOG.INFORMATION_SCHEMA.*
) is the default — it covers all projects under the same metadata center and is
recommended.
Project-level (
) is per-project only, requires
install package Information_Schema.systables
, and is
being deprecated (since 2024-03 new projects no longer auto-install). Key differences: (1) project-level has fewer views (no CATALOGS, VOLUMES, FOREIGN_SERVERS, SCHEMAS, PARTITION_ACCESS_INFO, TABLE_ACCESS_INFO, QUOTA_USAGE; has SCHEMA_PRIVILEGES which tenant lacks); (2) project-level TASKS_HISTORY has
while tenant-level does NOT; (3) project-level
is always
while tenant-level is the actual project name. See
Project-level IS Adaptation for transformation rules.
For MCP configuration, see references/mcp-tools-reference.md.
Execution Channels <a name="channels"></a>
MCP preferred when
mcp__maxcompute-catalog__*
tools are available. Fall back to odpscmd on connection/auth errors.
| Channel | Use For | Key Detail |
|---|
| MCP (tenant-level) | DQL, metadata, search | + hints={"odps.namespace.schema":"true"}
; sync limit 1000 rows; supports IS views (verified 2026-04) |
| MCP (project-level) | DQL, metadata, search | + (no namespace flag); view prefix: |
| odpscmd (tenant-level) | DDL/DML, large results, MCP unavailable | SET odps.namespace.schema=true;
prefix required |
| odpscmd (project-level) | DDL/DML, large results, MCP unavailable | No namespace flag; view prefix: |
See references/mcp-tools-reference.md for 15 MCP tools with routing guide.
Important Rules <a name="rules"></a>
- Always set namespace flag — every tenant-level IS query, no exceptions. Project-level IS queries do NOT need this flag
- Filter by — TASKS_HISTORY / TUNNELS_HISTORY are partitioned; always add filter to avoid full scan
- No SELECT * — use explicit column names
- Cross-metadata-center NOT supported — each region is independent
- last_access_time is NULL for partitioned tables — use
COALESCE(last_access_time, last_modified_time)
or check PARTITIONS view. Also: not collected for ALGO jobs or Hologres direct reads; up to 24h delay from actual access.
- status values — TASKS_HISTORY: (normal), , (rare). Never count Terminated as failure.
- operate_type values — TUNNELS_HISTORY: , , , ,
- Views without time fields — COLUMNS has no time column. TABLE_PRIVILEGES/COLUMN_PRIVILEGES have no time column, only . These views support static snapshot only, not time-series.
- cost_cpu / cost_mem are DOUBLE — unit: 100×core×seconds / MB×seconds. Convert to CU-hours:
- Duration — use
DATEDIFF(end_time, start_time, 'ss')
(seconds). No column exists.
- Non-existent fields trap — see Critical Column Reference below
- JOIN IS views requires 3-field key — when joining any two IS views, the ON condition MUST include , , AND . Missing any one causes incorrect results in multi-catalog environments
Project-level IS Adaptation <a name="project-level-adaptation"></a>
All SQL templates in this skill default to
tenant-level syntax (
SYSTEM_CATALOG.INFORMATION_SCHEMA.*
+ namespace flag). If the environment only supports
project-level IS, apply these mechanical transformations to every generated SQL:
| Transform | Tenant-level (default) | Project-level |
|---|
| View prefix | SYSTEM_CATALOG.INFORMATION_SCHEMA.
| |
| Namespace flag (MCP) | hints={"odps.namespace.schema":"true"}
| (remove flag) |
| Namespace flag (odpscmd) | SET odps.namespace.schema=true;
| Remove entirely |
| Scope | All projects in metadata center | Current project only |
| Views unavailable | — | CATALOGS, VOLUMES, FOREIGN_SERVERS, SCHEMAS, PARTITION_ACCESS_INFO, TABLE_ACCESS_INFO, QUOTA_USAGE |
| View exclusive to this level | — | SCHEMA_PRIVILEGES |
| TASKS_HISTORY extra column | — | (project name; tenant-level lacks this) |
| value | Actual project name | Fixed |
Example transformation:
-- Tenant-level (default):
SET odps.namespace.schema=true;
SELECT table_name, data_length FROM SYSTEM_CATALOG.INFORMATION_SCHEMA.TABLES WHERE ...
-- Project-level (after transformation):
SELECT table_name, data_length FROM Information_Schema.tables WHERE ...
When to switch: If a tenant-level query fails with
(and namespace flag is correctly set), or if the user explicitly says they only have project-level IS, apply the transformation rules above to all subsequent queries.
Critical Column Name Reference <a name="column-reference"></a>
| Concept | Correct | Wrong |
|---|
| Table size | | size_bytes, size |
| Task instance | | task_id |
| Task submitter | | task_owner |
| Task project | (tenant-level) | project_name, task_schema (project-level IS only) |
| Task error | | error_message |
| Task duration | DATEDIFF(end_time, start_time, 'ss')
| duration_ms |
| Task status | | task_status |
| Task input size | | scan_bytes, processed_bytes |
| Table comment | | comment |
| Column comment | | comment |
| Privilege grantee | , | grantee |
| Privilege time | | grant_time |
| Resource size | | size_bytes |
| Tunnel session | | tunnel_id |
| Tunnel data size | | size_bytes |
| User identity | | — |
| Timestamp type | | TIMESTAMP |
| Table modified | | last_ddl_time |
| cost_cpu type | | BIGINT |
For verified query examples using these columns, see references/verified-queries.md.
Routing Index <a name="routing"></a>
SKILL.md contains critical column names and namespace rules. Load sub-files only when needed:
- If multiple rows match, load ALL matched files. E.g., a non-English term causal query needs both terminology.md and playbooks+causal-templates.
- If SKILL.md inline info (tables below) is sufficient, do NOT load extra files.
- NOT about IS views? → This skill is not applicable. Use MCP tools (list_tables, get_table_schema, execute_sql) or odpscmd for DDL/DML/general queries.
| Query Type | When | Load Extra File |
|---|
| NOT an IS query | DDL/DML, list tables, run SQL, general ODPS | None — use MCP tools or odpscmd instead |
| Single-view query | One IS view, no JOIN | None — SKILL.md only |
| Live monitoring | TASKS / QUOTA_USAGE | None — SKILL.md only |
| 2+ IS view JOIN | Combining views | references/joins.md |
| Named metric/template | "comment coverage", "CU trend", "zombie table detection" | references/verified-queries.md + references/metrics.md |
| Multi-step diagnosis | "Why did CU spike?", root-cause analysis | references/playbooks.md + references/causal-templates.md |
| Non-English synonyms | "cpu时间", "作业时长", "存储占用", or any CJK/localized terms | references/terminology.md (or use inline mapping below) |
| Schema/field lookup | "What columns does X have?" | references/views-reference.md |
| Access denied error | Permission denied on IS view | references/ram-policies.md |
| Troubleshooting | Table not found, timeout, etc. | references/TROUBLESHOOTING.md |
Anti-pattern: Do NOT load extra files for these
| User says | Looks like | Actually is | Load |
|---|
| "storage pressure, list top 20 tables" | Diagnostics | Single-view | SKILL.md only |
| "permission audit, who has SELECT on X" | Playbook | Single-view | SKILL.md only |
| "cost attribution by owner" | Causal | Single-view | SKILL.md only |
<!-- SYNC: derived from references/joins.md paths #1 #2 #3 #4 #5 #8 #10 -->
Inline Join Conditions (for 2+ view JOINs)
When joining IS views, you MUST include
,
, AND
in join conditions. Common join paths:
| Left | Right | Join Condition |
|---|
| TABLES | COLUMNS | t.table_catalog = c.table_catalog AND t.table_schema = c.table_schema AND t.table_name = c.table_name
|
| TABLES | PARTITIONS | t.table_catalog = p.table_catalog AND t.table_schema = p.table_schema AND t.table_name = p.table_name
|
| TABLES | TABLE_PRIVILEGES | t.table_catalog = p.table_catalog AND t.table_schema = p.table_schema AND t.table_name = p.table_name
|
| TABLES | TABLE_ACCESS_INFO | t.table_catalog = a.table_catalog AND t.table_schema = a.table_schema AND t.table_name = a.table_name
|
| TABLES | TABLE_LABELS | t.table_catalog = l.table_catalog AND t.table_schema = l.table_schema AND t.table_name = l.table_name
|
| USERS | USER_ROLES | |
| COLUMNS | COLUMN_LABELS | c.table_catalog = l.table_catalog AND c.table_schema = l.table_schema AND c.table_name = l.table_name AND c.column_name = l.column_name
|
For all 16 join paths, see references/joins.md. The 7 most common paths are inlined below.
<!-- SYNC: derived from references/terminology.md — metric terms: storage usage, task CPU consumption, task execution duration, CU-hours, queue wait; dimension terms: zombie tables; metric: task failure rate -->
Inline Terminology Mapping (common non-English terms)
| Non-English term | English equivalent | Correct column/source | Common mistake |
|---|
| cpu时间 / CPU消耗 | CPU time / CPU consumption | (DOUBLE), ÷100÷3600 = CU·hour | cpu_time |
| 作业时长 / 任务耗时 | Task duration / task elapsed time | DATEDIFF(end_time, start_time, 'ss')
| duration_ms |
| 存储占用 / 表大小 | Storage usage / table size | (÷1073741824 = GB) | size_bytes |
| 僵尸表 | Zombie table | TABLES + TABLE_ACCESS_INFO | — |
| 排队时间 | Queue wait time | NOT available in IS views | — |
| CU时 / CU消耗 | CU-hours / CU consumption | SUM(cost_cpu) / 100.0 / 3600
| — |
| 任务失败率 | Task failure rate | ratio in TASKS_HISTORY | — |
For all 59 terms, see references/terminology.md.
Error Recovery <a name="error-recovery"></a>
| Error Signal | Root Cause | Fix |
|---|
| on IS view | Missing namespace flag | Add SET odps.namespace.schema=true;
/ hints={"odps.namespace.schema":"true"}
. Verify with Q30 smoke test |
| / on IS view | Missing tenant-level role | Verify access with check_access(include_grants=true)
. User needs tenant-level role — load references/ram-policies.md for Policy template |
on SYSTEM_CATALOG.INFORMATION_SCHEMA.*
(namespace flag correctly set) | Environment only supports project-level IS | Apply Project-level IS Adaptation transformation rules to all subsequent queries: switch prefix to , remove namespace flag |
Information_Schema not found
/ | Project-level IS not installed in this project | User must run install package Information_Schema.systables
as project owner or Super_Administrator. After install, query as Information_Schema.view_name
(no namespace flag). Note: project-level IS is being deprecated — prefer tenant-level |
Object 'Information_Schema' not found
on new project | New projects (since 2024-03) don't auto-install project-level IS | Switch to tenant-level IS (SYSTEM_CATALOG.INFORMATION_SCHEMA.*
) or manually install package |
| TASKS_HISTORY query slow/expensive | No filter | Add WHERE ds >= TO_CHAR(DATEADD(GETDATE(), -14, 'dd'), 'yyyymmdd')
|
| MCP returns exactly 1000 rows | Sync limit truncation | Re-run with , or add tighter WHERE/LIMIT |
| Used non-existent column | Check Critical Column Reference above — common: →, → |
| TUNNELS_HISTORY sync timeout (>30s) | Tunnel record volume much larger than TASKS_HISTORY | Use + , or reduce ds to 1 day |
| Async timeout (>30s) | Large scan | Use first; add filter; split query |
| IS view shows no recent data | ~5 min delay for history views | Query yesterday's data after 06:00 |
| odpscmd query hangs | Large result set or full-table scan | Use odps_is_query.sh -t <seconds>
to set timeout (default 300s); add filter |
| Namespace flag set but still | Other causes (wrong project, typo, schema issue) | Load references/TROUBLESHOOTING.md for T1–T7 scenarios |
Core Views <a name="core-views"></a>
| View | Purpose | Key Columns |
|---|
| Table metadata | table_name, owner_name, data_length, table_type, lifecycle, last_modified_time |
| Column metadata | column_name, data_type, column_comment, is_partition_key |
| Partition metadata | partition_name, data_length, create_time, last_modified_time |
| Running jobs (live, seconds delay) | inst_id, task_name, owner_name, status, cpu_usage (core×100), mem_usage (MB) |
| Query history | inst_id, task_name, owner_name, status, task_type, start_time, end_time, result, cost_cpu, input_bytes, ds |
| Tunnel history | session_id, object_name, operate_type, data_size, owner_name, ds |
| Table permissions | table_name, user_name, privilege_type, expired |
| ⚠️ | Table access stats | table_name, access_count, access_bytes, last_access_time |
| Subscription quota monitoring | name, cpu_elastic_quota_max, cpu_elastic_quota_used, mem_elastic_quota_max, mem_elastic_quota_used |
| Project users | user_name, user_id, identity_provider |
| User-role mapping | user_name, role_name, user_role_catalog |
| ⚠️ | Project list | catalog_name, status, owner_name, region |
For all 31 views with complete field definitions, see references/views-reference.md. Views marked ⚠️ are tenant-level only (not available in project-level IS).
Additional Resources <a name="resources"></a>
- references/views-reference.md — Complete field definitions for all 31 IS views
- references/verified-queries.md — 30 pre-validated SQL query templates (including smoke test)
- references/entities.md — Entity-to-table mapping
- references/metrics.md — Metric definitions with SQL expressions
- references/joins.md — Join paths between views
- references/playbooks.md — 23 diagnostic scenario playbooks
- references/causal-templates.md — Root-cause analysis templates
- references/terminology.md — 59-term synonym dictionary for NL2SQL
- references/ram-policies.md — Tenant permission setup and Policy template
- references/mcp-tools-reference.md — 15 MCP tools with routing guide + MCP setup + installation
- scripts/odps_is_query.sh — CLI query tool (16 query types + custom, including smoke-test). Supports for timeout (default 300s), for date, for project. Custom mode only allows SELECT (DDL/DML rejected).
- references/TROUBLESHOOTING.md — 7 error scenarios with fix templates (T1–T7)
Official Documentation <a name="docs"></a>