bigquery-analytics

Usage

All scripts can be executed using Node.js. Replace <param_name> and <param_value> with actual values.
Bash:
node <skill_dir>/scripts/<script_name>.js '{"<param_name>": "<param_value>"}'
PowerShell:
node <skill_dir>/scripts/<script_name>.js '{\"<param_name>\": \"<param_value>\"}'
Note: The scripts automatically load environment variables from various .env files. Do not ask the user to set variables unless a skill execution fails because an environment variable is missing.
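
The two commands above differ only in quoting, because the whole parameter object travels as a single JSON argument. As a minimal sketch of one way to avoid shell quoting altogether, the snippet below builds that argument with JSON.stringify and returns an argument vector. The helper name buildInvocation, the skill directory ./bigquery-analytics, and the table and column names are all hypothetical placeholders, not part of the skill.

```javascript
// Hypothetical helper: builds the command and argument vector for one script call.
// It only mirrors the usage pattern above; it is not shipped with the skill.
function buildInvocation(skillDir, scriptName, params) {
  // Forward slashes are accepted by Node.js on Windows as well as on POSIX systems.
  const scriptPath = `${skillDir}/scripts/${scriptName}.js`;
  // The scripts take their parameters as one JSON-encoded argument.
  return { command: "node", args: [scriptPath, JSON.stringify(params)] };
}

const invocation = buildInvocation("./bigquery-analytics", "forecast", {
  history_data: "my-gcp-project.my_dataset.my_table", // placeholder table ID
  timestamp_col: "ts", // placeholder column name
  data_col: "value", // placeholder column name
});

console.log(invocation.command, invocation.args.join(" "));
```

Passing invocation.command and invocation.args to execFileSync from node:child_process would run the script without going through a shell, so neither the Bash nor the PowerShell escaping rules apply.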

Scripts

analyze_contribution

Use this skill to analyze what contributes to changes in key metrics in multi-dimensional data.

Parameters

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| input_data | string | The data that contains the test and control data to analyze. Can be a fully qualified BigQuery table ID or a SQL query. | Yes | |
| contribution_metric | string | The expression used to calculate the metric to analyze. For a summable metric, the expression must be in the form SUM(metric_column_name), where metric_column_name is a numeric data type. For a summable ratio metric, the expression must be in the form SUM(numerator_metric_column_name)/SUM(denominator_metric_column_name), where numerator_metric_column_name and denominator_metric_column_name are numeric data types. For a summable by category metric, the expression must be in the form SUM(metric_sum_column_name)/COUNT(DISTINCT categorical_column_name); the summed column must be a numeric data type, and the categorical column must have type BOOL, DATE, DATETIME, TIME, TIMESTAMP, STRING, or INT64. | Yes | |
| is_test_col | string | The name of the column that identifies whether a row is in the test or control group. | Yes | |
| dimension_id_cols | array | An array of column names that uniquely identify each dimension. | No | |
| top_k_insights_by_apriori_support | integer | The number of top insights to return, ranked by apriori support. | No | 30 |
| pruning_method | string | The method to use for pruning redundant insights. Can be 'NO_PRUNING' or 'PRUNE_REDUNDANT_INSIGHTS'. | No | PRUNE_REDUNDANT_INSIGHTS |
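
Because contribution_metric accepts only three expression shapes, it can help to check an expression before sending it. The sketch below approximates those shapes with regular expressions. It is an illustration only: the kind labels are hypothetical, the patterns accept plain unquoted column names and nothing else, and the actual validation done by BigQuery or by the script may differ.

```javascript
// Sketch only: approximates the three documented forms of contribution_metric.
// Backtick-quoted or qualified column names are deliberately not handled.
const IDENT = "[A-Za-z_][A-Za-z0-9_]*";
const SUM = `SUM\\s*\\(\\s*${IDENT}\\s*\\)`;
const COUNT_DISTINCT = `COUNT\\s*\\(\\s*DISTINCT\\s+${IDENT}\\s*\\)`;

// The "kind" labels are hypothetical names chosen for this example.
const METRIC_FORMS = [
  { kind: "summable", pattern: new RegExp(`^\\s*${SUM}\\s*$`, "i") },
  { kind: "summable_ratio", pattern: new RegExp(`^\\s*${SUM}\\s*/\\s*${SUM}\\s*$`, "i") },
  { kind: "summable_by_category", pattern: new RegExp(`^\\s*${SUM}\\s*/\\s*${COUNT_DISTINCT}\\s*$`, "i") },
];

// Returns the matching kind, or null when the expression fits none of the forms.
function classifyContributionMetric(expression) {
  const match = METRIC_FORMS.find(({ pattern }) => pattern.test(expression));
  return match ? match.kind : null;
}

// Placeholder table and column names; only the required parameters are set.
const contributionParams = {
  input_data: "my-gcp-project.my_dataset.my_table",
  contribution_metric: "SUM(sales) / COUNT(DISTINCT store_id)",
  is_test_col: "is_test",
};

console.log(classifyContributionMetric(contributionParams.contribution_metric));
```

An expression such as AVG(revenue) returns null here, which signals that it matches none of the three documented forms.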

ask_data_insights

Use this skill to perform data analysis, get insights, or answer complex questions about the contents of specific BigQuery tables.

Parameters

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| user_query_with_context | string | The user's question, potentially including conversation history and system instructions for context. | Yes | |
| table_references | string | A JSON string of a list of BigQuery tables to use as context. Each object in the list must contain 'projectId', 'datasetId', and 'tableId'. Example: '[{"projectId": "my-gcp-project", "datasetId": "my_dataset", "tableId": "my_table"}]'. | Yes | |
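
Since table_references is typed as a string rather than an array, the table list ends up JSON-encoded twice: once for the field and once for the script argument. The sketch below shows that double encoding. The identifiers come from the example in the table above, and the question text is purely illustrative.

```javascript
// Placeholder identifiers taken from the example in the parameter description.
const tables = [
  { projectId: "my-gcp-project", datasetId: "my_dataset", tableId: "my_table" },
];

const insightParams = {
  // Illustrative question; any user text would go here.
  user_query_with_context: "Which product category grew fastest last quarter?",
  // table_references is a string parameter, so the list is JSON-encoded first.
  table_references: JSON.stringify(tables),
};

// The script takes one JSON argument, so the table list is now encoded twice.
const insightArgument = JSON.stringify(insightParams);
console.log(insightArgument);
```

Decoding reverses the two steps: parse the argument, then parse its table_references field to recover the list.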

forecast

Use this skill to forecast time series data.

Parameters

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| history_data | string | The table ID of, or a query that returns, the historical time series data. | Yes | |
| timestamp_col | string | The name of the time series timestamp column. | Yes | |
| data_col | string | The name of the time series data column. | Yes | |
| id_cols | array | An array of the time series ID column names. | No | [] |
| horizon | integer | The number of forecasting steps. | No | 10 |
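
To make the effective request visible, the documented defaults can be spelled out before the parameters are serialized. The helper below is a hypothetical sketch, not part of the skill: it checks the three required fields and fills in id_cols and horizon with the defaults listed in the table. The query, table, and column names are placeholders.

```javascript
// Required parameter names, as listed in the table above.
const FORECAST_REQUIRED = ["history_data", "timestamp_col", "data_col"];

// Hypothetical helper: rejects payloads missing a required field, then applies
// the documented defaults (id_cols: [], horizon: 10) under any caller overrides.
function buildForecastParams(overrides) {
  const missing = FORECAST_REQUIRED.filter((name) => !overrides[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required parameter(s): ${missing.join(", ")}`);
  }
  // A fresh defaults object per call avoids sharing the id_cols array.
  return { id_cols: [], horizon: 10, ...overrides };
}

// history_data given as a query rather than a table ID; all names are placeholders.
const forecastParams = buildForecastParams({
  history_data: "SELECT ts, store_id, sales FROM `my-gcp-project.my_dataset.sales`",
  timestamp_col: "ts",
  data_col: "sales",
  id_cols: ["store_id"],
});

console.log(JSON.stringify(forecastParams));
```

Here horizon falls back to 10 because the caller did not set it, while id_cols keeps the caller's value.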

search_catalog

Use this skill to find tables, views, models, routines or connections.

Parameters

| Name | Type | Description | Required | Default |
| --- | --- | --- | --- | --- |
| prompt | string | Prompt representing search intention. Do not rewrite the prompt. | Yes | |
| datasetIds | array | Array of dataset IDs. | No | [] |
| projectIds | array | Array of project IDs. | No | [] |
| types | array | Array of data types to filter by. | No | [] |
| pageSize | integer | Number of results in the search page. | No | 5 |
