Loading...
Loading...
SQL optimization best practices for OceanBase database (MySQL & Oracle modes). Covers query optimization, index usage, execution plan analysis, slow query tuning, and performance optimization techniques. Activates for SQL optimization, query performance, index design, execution plan, slow query, database performance.
npx skill4agent add amber-moe/oceanbase-doc-skills oceanbase-sql-optimization| Aspect | MySQL Mode | Oracle Mode |
|---|---|---|
| EXPLAIN Syntax | | |
| System Views | | |
| Data Types | | |
| Time Functions | | |
| Schema Access | | |
| Limit Syntax | | |
| String Concatenation | | |
obclient [SALES_DB]> EXPLAIN SELECT * FROM order_table WHERE order_date >= '2024-01-01';obclient [SALES_DB]> EXPLAIN SELECT * FROM order_table WHERE order_date >= DATE '2024-01-01';+------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------+
| =========================================== |
| |ID|OPERATOR |NAME |EST. ROWS|COST | |
| ------------------------------------------- |
| |0 |TABLE SCAN |order_table|1000 |1000 | |
| |1 | TABLE GET |order_table|1000 |1000 | |
+=========================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| filter([order_table.order_date >= '2024-01-01']), |
| access([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| partitions(p0) |
+------------------------------------------------------------------------------------------------+TABLE SCANTABLE GETpartitions(p1)falsetrue-- ✅ GOOD: Index on WHERE clause column
obclient [SALES_DB]> CREATE INDEX idx_order_date ON order_table(order_date);
-- ✅ GOOD: Composite index for multi-column queries (column order matters)
obclient [SALES_DB]> CREATE INDEX idx_customer_date ON order_table(customer_id, order_date);
-- ✅ GOOD: Unique index for unique constraints
obclient [SALES_DB]> CREATE UNIQUE INDEX idx_order_number ON order_table(order_number);
-- ✅ GOOD: Index with STORING clause for covering index
obclient [SALES_DB]> CREATE INDEX idx_customer_date ON order_table(customer_id, order_date)
STORING (total_amount, status);
-- ❌ BAD: Index on rarely queried columns
obclient [SALES_DB]> CREATE INDEX idx_notes ON order_table(notes);CREATE INDEX idx_expr ON t1((c1 + c2))-- ✅ GOOD: Select only needed columns
obclient [SALES_DB]> SELECT order_id, customer_id, total_amount
FROM order_table
WHERE order_date >= '2024-01-01';
-- ❌ BAD: Select all columns
obclient [SALES_DB]> SELECT * FROM order_table WHERE order_date >= '2024-01-01';-- ✅ GOOD: Use indexed column in WHERE
obclient [SALES_DB]> SELECT * FROM order_table WHERE order_id = 12345;
-- ✅ GOOD: Use range queries on indexed columns
obclient [SALES_DB]> SELECT * FROM order_table
WHERE order_date BETWEEN '2024-01-01' AND '2024-12-31';
-- ❌ BAD: Function on indexed column prevents index usage
obclient [SALES_DB]> SELECT * FROM order_table WHERE YEAR(order_date) = 2024;
-- ✅ GOOD: Rewrite to use index
obclient [SALES_DB]> SELECT * FROM order_table
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';partitions-- ✅ GOOD: Partition key in WHERE clause enables partition pruning
obclient [SALES_DB]> CREATE TABLE order_table (
order_id INT,
customer_id INT,
order_date DATE
) PARTITION BY HASH(order_id) PARTITIONS 5;
obclient [SALES_DB]> SELECT * FROM order_table WHERE order_id = 12345;+------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------+
| ================================================= |
| |ID|OPERATOR |NAME |EST. ROWS|EST.TIME(us)| |
| ------------------------------------------------- |
| |0 |TABLE SCAN|order_table|990 |383 | |
| ================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| filter(nil), |
| access([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| partitions(p1) -- Only one partition accessed! |
+------------------------------------------------------------------------------------------------+-- ❌ BAD: Missing partition key prevents pruning (scans all partitions)
obclient [SALES_DB]> SELECT * FROM order_table WHERE customer_id = 1001;
-- Check execution plan: partitions(p0, p1, p2, p3, p4) - all partitions scanned-- ✅ GOOD: Range query matches partition boundaries
obclient [SALES_DB]> CREATE TABLE order_table (
order_id INT,
customer_id INT,
order_date DATE
) PARTITION BY RANGE(order_date) (
PARTITION p0 VALUES LESS THAN('2024-01-01'),
PARTITION p1 VALUES LESS THAN('2024-02-01'),
PARTITION p2 VALUES LESS THAN('2024-03-01')
);
obclient [SALES_DB]> SELECT * FROM order_table
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';+------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------+
| ================================================= |
| |ID|OPERATOR |NAME |EST. ROWS|EST.TIME(us)| |
| ------------------------------------------------- |
| |0 |TABLE SCAN|order_table|1 |46 | |
| ================================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| filter([order_table.order_date >= '2024-01-01'], [order_table.order_date < '2024-02-01']),
| access([order_table.order_id], [order_table.customer_id], [order_table.order_date]), |
| partitions(p1) -- Only partition p1 accessed! |
+------------------------------------------------------------------------------------------------+-- ❌ BAD: Function prevents partition pruning
obclient [SALES_DB]> SELECT * FROM order_table WHERE MONTH(order_date) = 1;
-- Check execution plan: partitions(p0, p1, p2) - all partitions scanned
-- ✅ GOOD: Rewrite to use partition pruning
obclient [SALES_DB]> SELECT * FROM order_table
WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';/*+ USE_NL(table1, table2) */-- ✅ GOOD: NLJ with indexed join column
obclient [SALES_DB]> CREATE INDEX idx_customer_id ON order_table(customer_id);
obclient [SALES_DB]> EXPLAIN SELECT /*+USE_NL(c, o)*/ o.order_id, c.customer_name
FROM customer_table c
INNER JOIN order_table o ON c.customer_id = o.customer_id
WHERE c.customer_type = 'VIP';+------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------+
| =========================================== |
| |ID|OPERATOR |NAME |EST. ROWS|EST.TIME(us)| |
| ------------------------------------------- |
| |0 |NESTED-LOOP JOIN| |990 |37346 | |
| |1 | TABLE SCAN |c |999 |669 | |
| |2 | TABLE GET |o |1 |36 | -- Index lookup! |
| =========================================== |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([o.order_id], [c.customer_name]), filter(nil), |
| conds(nil), nl_params_([c.customer_id]) |
| 1 - output([c.customer_id], [c.customer_name]), filter([c.customer_type = 'VIP']), |
| access([c.customer_id], [c.customer_name]), partitions(p0) |
| 2 - output([o.order_id]), filter(nil), |
| access([o.order_id]), partitions(p0), |
| range_key([o.customer_id]), range([? = o.customer_id]) -- Index used! |
+------------------------------------------------------------------------------------------------+/*+ USE_HASH(table1, table2) */-- ✅ GOOD: Hash Join for large table joins
obclient [SALES_DB]> EXPLAIN SELECT /*+USE_HASH(o, c)*/ o.order_id, c.customer_name
FROM order_table o
INNER JOIN customer_table c ON o.customer_id = c.customer_id;+------------------------------------------------------------------------------------------------+
| Query Plan |
+------------------------------------------------------------------------------------------------+
| ======================================= |
| |ID|OPERATOR |NAME|EST. ROWS|EST.TIME(us)| |
| --------------------------------------- |
| |0 |HASH JOIN | |98010000 |66774608 | |
| |1 | TABLE SCAN|o |100000 |68478 | |
| |2 | TABLE SCAN|c |100000 |68478 | |
| ======================================= |
| Outputs & filters: |
| ------------------------------------- |
| 0 - output([o.order_id], [c.customer_name]), filter(nil), |
| equal_conds([o.customer_id = c.customer_id]), other_conds(nil) |
+------------------------------------------------------------------------------------------------+/*+ USE_MERGE(table1, table2) */-- ✅ GOOD: Merge Join when data is sorted
obclient [SALES_DB]> EXPLAIN SELECT /*+USE_MERGE(o, c)*/ o.order_id, c.customer_name
FROM order_table o
INNER JOIN customer_table c ON o.customer_id = c.customer_id
ORDER BY o.customer_id;-- ✅ GOOD: Control join order explicitly
obclient [SALES_DB]> SELECT /*+LEADING(c, o)*/ o.order_id, c.customer_name
FROM customer_table c
INNER JOIN order_table o ON c.customer_id = o.customer_id
WHERE c.customer_type = 'VIP'
AND o.order_date >= '2024-01-01';
-- ❌ BAD: Let optimizer choose suboptimal order
obclient [SALES_DB]> SELECT o.order_id, c.customer_name
FROM order_table o
INNER JOIN customer_table c ON o.customer_id = c.customer_id
WHERE c.customer_type = 'VIP';-- ✅ GOOD: Index on join column enables efficient joins
obclient [SALES_DB]> CREATE INDEX idx_customer_id ON order_table(customer_id);
-- Then join queries can use index (NLJ with TABLE GET)
obclient [SALES_DB]> SELECT o.*, c.customer_name
FROM order_table o
INNER JOIN customer_table c ON o.customer_id = c.customer_id;-- ✅ GOOD: COUNT(*) is optimized
obclient [SALES_DB]> SELECT COUNT(*) FROM order_table WHERE order_date >= '2024-01-01';
-- ✅ GOOD: COUNT(column) when you need non-NULL count
obclient [SALES_DB]> SELECT COUNT(customer_id) FROM order_table;
-- ❌ BAD: COUNT(DISTINCT) on large datasets can be slow
obclient [SALES_DB]> SELECT COUNT(DISTINCT customer_id) FROM order_table;
-- ✅ GOOD: Use GROUP BY with LIMIT (MySQL) or FETCH FIRST (Oracle) for top N
-- MySQL Mode:
obclient [SALES_DB]> SELECT customer_id, SUM(total_amount) AS total
FROM order_table
GROUP BY customer_id
ORDER BY total DESC
LIMIT 10;
-- Oracle Mode:
obclient [SALES_DB]> SELECT customer_id, SUM(total_amount) AS total
FROM order_table
GROUP BY customer_id
ORDER BY total DESC
FETCH FIRST 10 ROWS ONLY;-- ❌ BAD: Correlated subquery (executes for each row)
obclient [SALES_DB]> SELECT o.order_id, o.total_amount
FROM order_table o
WHERE o.total_amount > (
SELECT AVG(total_amount)
FROM order_table
WHERE customer_id = o.customer_id
);
-- ✅ GOOD: Convert to JOIN
obclient [SALES_DB]> SELECT o.order_id, o.total_amount
FROM order_table o
INNER JOIN (
SELECT customer_id, AVG(total_amount) as avg_amount
FROM order_table
GROUP BY customer_id
) avg_orders ON o.customer_id = avg_orders.customer_id
WHERE o.total_amount > avg_orders.avg_amount;-- ✅ GOOD: EXISTS stops at first match
obclient [SALES_DB]> SELECT * FROM order_table o
WHERE EXISTS (
SELECT 1 FROM customer_table c
WHERE c.customer_id = o.customer_id
AND c.customer_type = 'VIP'
);
-- ⚠️ OK: IN works but may be slower for large lists
obclient [SALES_DB]> SELECT * FROM order_table
WHERE customer_id IN (SELECT customer_id FROM customer_table WHERE customer_type = 'VIP');-- Enable SQL Audit
obclient> ALTER SYSTEM SET enable_sql_audit = true;
-- Set SQL Audit memory percentage (default: 3%, range: [0,80])
obclient> SET GLOBAL ob_sql_audit_percentage = 3;oceanbase.GV$OB_SQL_AUDITTIME_TO_USEC(NOW())GV$OB_SQL_AUDITSYSDATE - INTERVAL '30' MINUTE-- Find TOP SQL by CPU usage in last 30 minutes
obclient> SELECT sql_id, COUNT(*) AS executions, SUM(execute_time) AS tot_cpu_time,
AVG(execute_time) AS avg_cpu_time,
SUM(execute_time)/(30*60*1000*1000) AS cpu_cnt, query_sql
FROM oceanbase.GV$OB_SQL_AUDIT -- MySQL: add 'oceanbase.' prefix
WHERE tenant_id = 'mysql001' -- MySQL: string, Oracle: number
AND request_time BETWEEN (TIME_TO_USEC(NOW())-30*60*1000*1000) AND TIME_TO_USEC(NOW()) -- MySQL
-- Oracle: BETWEEN (SYSDATE - INTERVAL '30' MINUTE) AND SYSDATE
AND is_executor_rpc = 0
GROUP BY sql_id
HAVING COUNT(*) > 1
ORDER BY cpu_cnt DESC
LIMIT 10; -- MySQL: LIMIT, Oracle: FETCH FIRST 10 ROWS ONLYusec_to_time()TO_CHAR()-- Find queries with execution time > 100ms
obclient> SELECT request_id,
usec_to_time(request_time) AS request_time, -- MySQL
-- TO_CHAR(request_time, 'YYYY-MM-DD HH24:MI:SS.FF') AS request_time, -- Oracle
elapsed_time, queue_time, execute_time, flt_trace_id, query_sql
FROM oceanbase.v$ob_sql_audit -- MySQL: add 'oceanbase.' prefix
WHERE elapsed_time > 100000 -- > 100ms in microseconds
ORDER BY elapsed_time DESC
LIMIT 10; -- MySQL: LIMIT, Oracle: FETCH FIRST 10 ROWS ONLYoceanbase.gv$ob_sql_auditGV$OB_SQL_AUDIT-- Get detailed statistics for a specific SQL
obclient> SELECT sql_id, query_sql, executions, elapsed_time, execute_time,
queue_time, return_rows, affected_rows, partition_cnt,
table_scan, is_hit_plan, plan_id
FROM oceanbase.gv$ob_sql_audit -- MySQL: add 'oceanbase.' prefix
WHERE sql_id = 'your_sql_id'
ORDER BY elapsed_time DESC;oceanbase.GV$OB_PLAN_CACHE_PLAN_STATGV$OB_PLAN_CACHE_PLAN_STAT-- Get execution plan from plan cache
obclient> SELECT tenant_id, svr_ip, svr_port, sql_id, plan_id,
last_active_time, first_load_time, outline_data
FROM oceanbase.GV$OB_PLAN_CACHE_PLAN_STAT -- MySQL: add 'oceanbase.' prefix
WHERE tenant_id = 1002 AND sql_id = 'your_sql_id'
AND svr_ip = 'xxx.xxx.xxx.xxx' AND svr_port = 35046;
-- Get plan details
obclient> SELECT operator, name, rows, cost
FROM oceanbase.GV$OB_PLAN_CACHE_PLAN_EXPLAIN -- MySQL: add 'oceanbase.' prefix
WHERE tenant_id = 1002 AND plan_id = 741
AND svr_ip = 'xxx.xxx.xxx.xxx' AND svr_port = 35046;obclient> SELECT user, tenant, sql_id, concat(time, 's') AS time, info,
svr_ip, svr_port, trace_id
FROM oceanbase.GV$OB_PROCESSLIST
WHERE state = 'ACTIVE' ORDER BY time DESC LIMIT 10;obclient> SELECT user_name, tenant_name, sql_id, time || 's' AS time, info,
svr_ip, svr_port, trace_id
FROM GV$OB_PROCESSLIST
WHERE state = 'ACTIVE' ORDER BY time DESC FETCH FIRST 10 ROWS ONLY;obclient> SELECT sql_id, COUNT(*) AS executions, AVG(execute_time) AS avg_cpu_time
FROM oceanbase.GV$OB_SQL_AUDIT
WHERE tenant_id = 'mysql001'
AND request_time BETWEEN (TIME_TO_USEC(NOW())-30*60*1000*1000) AND TIME_TO_USEC(NOW())
GROUP BY sql_id
ORDER BY SUM(execute_time) DESC
LIMIT 1;obclient> EXPLAIN SELECT ... FROM v_tt01 WHERE COL001 IN (20017476);TBL3(IDX_TBL3_COL170)range(MIN,MIN ; MAX,MAX)always truefilter([20017476 = cast(cast(TBL3.COL170, VARCHAR2(20 BYTE)), NUMBER)])V_TT01TO_NUMBERCOL001TBL3.COL170VARCHAR2(20)NUMBER-- Before: View had TO_NUMBER conversion
CREATE VIEW V_TT01_OLD AS
SELECT To_number("tt01"."col001") AS "COL001", ...
-- After: Remove TO_NUMBER (COL001 is already NUMBER)
CREATE VIEW V_TT01 AS
SELECT "tt01"."col001" AS "COL001", ...-- Before: TBL3.COL170 was VARCHAR2(20)
CREATE TABLE TBL3 (
COL170 VARCHAR2(20) NOT NULL,
...
);
-- After: Change to NUMBER(20) to match join condition
ALTER TABLE TBL3 MODIFY COL170 NUMBER(20) NOT NULL;-- Add hash partitioning on join keys
ALTER TABLE TBL1 PARTITION BY HASH(COL001) PARTITIONS 8;
ALTER TABLE TBL2 PARTITION BY HASH(COL004) PARTITIONS 8;
ALTER TABLE TBL3 PARTITION BY HASH(COL170) PARTITIONS 8;
ALTER TABLE TT01 PARTITION BY HASH(COL001) PARTITIONS 8;
-- Create table group to avoid cross-partition joins
CREATE TABLEGROUP order_tg PARTITION BY HASH(COL001) PARTITIONS 8;
ALTER TABLE TBL1 SET TABLEGROUP order_tg;
ALTER TABLE TBL2 SET TABLEGROUP order_tg;
ALTER TABLE TBL3 SET TABLEGROUP order_tg;
ALTER TABLE TT01 SET TABLEGROUP order_tg;PX PARTITION ITERATORTABLE SCANINDEX_TBL3_COL170-- ❌ BAD: Multiple queries in loop
-- Instead, use JOIN or batch queries-- ❌ BAD: SELECT * returns unnecessary data
SELECT * FROM order_table;
-- ✅ GOOD: Select only needed columns
SELECT order_id, customer_id, total_amount FROM order_table;-- ❌ BAD: Prevents index usage
WHERE UPPER(customer_name) = 'JOHN'
-- ✅ GOOD: Store normalized data or use function-based index
WHERE customer_name = 'JOHN'-- ❌ BAD: String to number conversion
WHERE order_id = '12345'
-- ✅ GOOD: Match data types
WHERE order_id = 12345-- Check table statistics
obclient [SALES_DB]> ANALYZE TABLE order_table;
-- View table size and row count
obclient [SALES_DB]> SELECT
table_name,
table_rows,
data_length,
index_length
FROM information_schema.tables
WHERE table_schema = 'SALES_DB';
-- Check index usage
obclient [SALES_DB]> SHOW INDEX FROM order_table;| Operator | Meaning | Optimization |
|---|---|---|
| TABLE SCAN / TABLE FULL SCAN | Full table scan | Add index or use partition key |
| TABLE RANGE SCAN | Range scan on table | ✅ Good, but consider index |
| TABLE GET | Direct row access via primary key | ✅ Best |
| HASH JOIN | Hash join algorithm | ✅ Good for large equal joins |
| NESTED-LOOP JOIN | Nested loop join | ✅ Good for small outer table with indexed inner |
| MERGE JOIN | Merge join algorithm | ✅ Good for pre-sorted data |
| SORT | Sorting operation | Add ORDER BY index or use sorted index |
| AGGREGATE | Aggregation | Consider materialized views |
| PX PARTITION ITERATOR | Parallel partition iterator | ✅ Good, partition pruning working |
| EXCHANGE OUT DISTR | Distributed exchange | Indicates distributed execution |