cap-apps-performance
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChinesePerformance Optimization - Critical Best Practices
性能优化 - 关键最佳实践
NEVER Fetch Full Datasets
绝对不要获取完整数据集
CRITICAL RULE: Never fetch entire datasets, especially large ones (100k+ rows). Always use targeted queries with only the columns needed for each visualization.
关键规则: 绝对不要获取完整数据集,尤其是大型数据集(10万行以上)。始终使用针对性查询,仅获取每个可视化所需的列。
Performance Impact
性能影响
typescript
// ❌ TERRIBLE - Fetches 300k rows × 30 columns = 9M data points
const allData = await new Query()
.select(['col1', 'col2', 'col3', ...]) // All columns
.fetch('dataset');
// ✅ GOOD - Fetches 300k rows × 2 columns = 600k data points (93% reduction)
const totals = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
// Then aggregate client-side
// ✅ BEST - Server-side aggregation, returns only aggregated results
const totals = await new Query()
.groupBy('Merchant Category', { 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// Returns only ~10-20 rows (one per category)typescript
// ❌ 极差 - 获取30万行×30列=900万个数据点
const allData = await new Query()
.select(['col1', 'col2', 'col3', ...]) // 所有列
.fetch('dataset');
// ✅ 良好 - 获取30万行×2列=60万个数据点(减少93%)
const totals = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
// 然后在客户端进行聚合
// ✅ 最优 - 服务端聚合,仅返回聚合结果
const totals = await new Query()
.groupBy('Merchant Category', { 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// 仅返回约10-20行(每个类别一行)Query Optimization Strategy
查询优化策略
1. Use Server-Side Aggregation When Possible
1. 尽可能使用服务端聚合
Priority order:
- Best: with aggregations (server-side)
.groupBy() - Good: with only needed columns, then aggregate client-side
.select() - Acceptable: with multiple columns if aggregation isn't possible
.select() - Never: Fetch all columns or entire dataset
typescript
// ✅ BEST - Server aggregates, returns minimal data
const salesByRegion = await new Query()
.groupBy('region', { 'Sales_Amount': 'sum', 'Order_Count': 'count' })
.fetch('sales');
// ✅ GOOD - Only fetch columns needed, aggregate client-side
const salesData = await new Query()
.select(['region', 'Sales_Amount'])
.fetch('sales');
const totalSales = salesData.reduce((sum, row) => sum + row.Sales_Amount, 0);
// ❌ BAD - Fetches all columns
const allData = await new Query().fetch('sales');优先级顺序:
- 最优: 带聚合操作的(服务端)
.groupBy() - 良好: 仅选择所需列的,然后在客户端聚合
.select() - 可接受: 如果无法聚合,使用选择多列
.select() - 绝对不要: 获取所有列或完整数据集
typescript
// ✅ 最优 - 服务端聚合,返回最少数据
const salesByRegion = await new Query()
.groupBy('region', { 'Sales_Amount': 'sum', 'Order_Count': 'count' })
.fetch('sales');
// ✅ 良好 - 仅获取所需列,在客户端聚合
const salesData = await new Query()
.select(['region', 'Sales_Amount'])
.fetch('sales');
const totalSales = salesData.reduce((sum, row) => sum + row.Sales_Amount, 0);
// ❌ 糟糕 - 获取所有列
const allData = await new Query().fetch('sales');2. One Query Per Visualization
2. 每个可视化对应一个查询
Each visualization should have its own optimized query:
typescript
// ✅ CORRECT - Separate optimized queries
const quickStats = await fetchQuickStats(); // Only Account Key, Account Status
const riskMetrics = await fetchRiskMetrics(); // Only Transactions, KYC Status, etc.
const categories = await fetchCategories(); // Grouped by Merchant Category
// ❌ WRONG - One query for everything
const allData = await fetchAllData();
const quickStats = calculateFromAll(allData);
const riskMetrics = calculateFromAll(allData);
const categories = calculateFromAll(allData);每个可视化都应有自己的优化查询:
typescript
// ✅ 正确 - 独立的优化查询
const quickStats = await fetchQuickStats(); // 仅获取Account Key、Account Status
const riskMetrics = await fetchRiskMetrics(); // 仅获取Transactions、KYC Status等
const categories = await fetchCategories(); // 按Merchant Category分组
// ❌ 错误 - 一个查询获取所有数据
const allData = await fetchAllData();
const quickStats = calculateFromAll(allData);
const riskMetrics = calculateFromAll(allData);
const categories = calculateFromAll(allData);3. Column Selection Strategy
3. 列选择策略
Always specify columns explicitly:
- Only select columns needed for the specific calculation
- Don't use without arguments (fetches all columns)
.select() - For aggregations, only select the grouping column + aggregated columns
typescript
// ✅ CORRECT - Only 2 columns
const data = await new Query()
.select(['Account Key', 'Account Status'])
.fetch('dataset');
// ❌ WRONG - Fetches all columns
const data = await new Query().fetch('dataset');始终明确指定列:
- 仅选择特定计算所需的列
- 不要不带参数使用(会获取所有列)
.select() - 对于聚合操作,仅选择分组列和聚合列
typescript
// ✅ 正确 - 仅2列
const data = await new Query()
.select(['Account Key', 'Account Status'])
.fetch('dataset');
// ❌ 错误 - 获取所有列
const data = await new Query().fetch('dataset');Common Patterns
常见模式
Pattern 1: Totals Without Grouping
模式1:无分组的总计
When you need totals but no grouping column:
typescript
// Option A: Select only needed columns, sum client-side
const totalsData = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
const totalTransactions = totalsData.reduce((sum, row) => sum + (row.Transactions || 0), 0);
const totalVolume = totalsData.reduce((sum, row) => sum + (row['Total Amount (USD)'] || 0), 0);
// Option B: Use a constant grouping column (if dataset has one)
// Not always possible, but more efficient if available当你需要总计但不需要分组列时:
typescript
// 选项A:仅选择所需列,在客户端求和
const totalsData = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
const totalTransactions = totalsData.reduce((sum, row) => sum + (row.Transactions || 0), 0);
const totalVolume = totalsData.reduce((sum, row) => sum + (row['Total Amount (USD)'] || 0), 0);
// 选项B:使用常量分组列(如果数据集有)
// 并非总是可行,但如果可用则效率更高Pattern 2: Counts with Filters
模式2:带过滤的计数
typescript
// ✅ CORRECT - Only fetch the column needed for counting
const kycData = await new Query()
.select(['KYC Status'])
.fetch('dataset');
const approvedCount = kycData.filter(row => row['KYC Status'] === 'APPROVED').length;
// Better if possible: Use server-side count aggregation
// (See Query API limitations below)typescript
// ✅ 正确 - 仅获取计数所需的列
const kycData = await new Query()
.select(['KYC Status'])
.fetch('dataset');
const approvedCount = kycData.filter(row => row['KYC Status'] === 'APPROVED').length;
// 如果可能更好:使用服务端计数聚合
// (见下文Query API限制)Pattern 3: Unique Values
模式3:唯一值
typescript
// ✅ CORRECT - Only fetch the column needed
const statesData = await new Query()
.select(['State'])
.fetch('dataset');
const uniqueStates = new Set(statesData.map(row => row.State)).size;typescript
// ✅ 正确 - 仅获取所需列
const statesData = await new Query()
.select(['State'])
.fetch('dataset');
const uniqueStates = new Set(statesData.map(row => row.State)).size;Query API Limitations
Query API限制
.aggregate()
Doesn't Work
.aggregate().aggregate()
无法工作
.aggregate()CRITICAL: The method shown in some documentation does not work in practice. It causes error: .
.aggregate()DA0057: An alias list was provided but it could not be parsedtypescript
// ❌ DOES NOT WORK - Causes DA0057 error
const totals = await new Query()
.aggregate({ 'Transactions': 'sum', 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// ✅ WORKAROUND - Use .groupBy() with a grouping column
const totals = await new Query()
.groupBy('some_column', { 'Transactions': 'sum', 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// ✅ ALTERNATIVE - Select columns and aggregate client-side
const totals = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
// Then sum client-side关键提示: 某些文档中展示的方法在实际中无法工作。会导致错误:。
.aggregate()DA0057: An alias list was provided but it could not be parsedtypescript
// ❌ 无法工作 - 导致DA0057错误
const totals = await new Query()
.aggregate({ 'Transactions': 'sum', 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// ✅ 替代方案 - 使用带分组列的.groupBy()
const totals = await new Query()
.groupBy('some_column', { 'Transactions': 'sum', 'Total Amount (USD)': 'sum' })
.fetch('dataset');
// ✅ 另一种替代方案 - 选择列并在客户端聚合
const totals = await new Query()
.select(['Transactions', 'Total Amount (USD)'])
.fetch('dataset');
// 然后在客户端求和.groupBy()
Requires a Grouping Column
.groupBy().groupBy()
需要分组列
.groupBy()You cannot use with only aggregations - you must provide a grouping column:
.groupBy()typescript
// ❌ DOES NOT WORK - No grouping column
.groupBy({ 'Transactions': 'sum' })
// ✅ WORKS - Has grouping column
.groupBy('region', { 'Transactions': 'sum' })
// ✅ WORKS - Multiple groupBy calls
.groupBy('region')
.groupBy({ 'Transactions': 'sum' })你不能仅使用聚合操作而不提供分组列来调用:
.groupBy()typescript
// ❌ 无法工作 - 无分组列
.groupBy({ 'Transactions': 'sum' })
// ✅ 可行 - 有分组列
.groupBy('region', { 'Transactions': 'sum' })
// ✅ 可行 - 多次调用groupBy
.groupBy('region')
.groupBy({ 'Transactions': 'sum' })Performance Monitoring
性能监控
Always check:
- Network tab - How much data is being transferred?
- Console - Any warnings about large queries?
- Response times - Are queries taking too long?
If a query is slow or returns too much data:
- Reduce columns selected
- Add filters to reduce rows
- Use server-side aggregation instead of client-side
- Consider pagination for large result sets
始终检查:
- 网络标签页 - 传输了多少数据?
- 控制台 - 有没有关于大型查询的警告?
- 响应时间 - 查询是否耗时过长?
如果查询缓慢或返回数据过多:
- 减少选择的列数
- 添加过滤条件以减少行数
- 使用服务端聚合替代客户端聚合
- 考虑对大型结果集使用分页