json-and-csv-data-transformation

Compare original and translation side by side

🇺🇸

Original

English
🇨🇳

Translation

Chinese

JSON and CSV Data Transformation

JSON与CSV数据转换

Transform data between JSON, CSV, and other formats. Filter, map, flatten nested objects, and reshape data for analysis, reporting, and API integration.
在JSON、CSV和其他格式之间转换数据,支持过滤、映射、扁平化嵌套对象,可重塑数据结构以满足分析、报表和API集成需求。

When to use

适用场景

  • Use case 1: When the user asks to convert data between JSON and CSV formats
  • Use case 2: When you need to filter, extract, or transform specific fields from data
  • Use case 3: For flattening nested JSON structures into tabular format
  • Use case 4: When processing API responses for analysis or reporting
  • 场景1:当用户需要在JSON和CSV格式之间转换数据时
  • 场景2:当你需要过滤、提取或转换数据中的特定字段时
  • 场景3:将嵌套的JSON结构扁平化为表格格式
  • 场景4:处理API响应用于分析或生成报表时

Required tools / APIs

所需工具/API

  • jq — Command-line JSON processor (essential for JSON manipulation)
  • csvkit — Suite of CSV tools (csvjson, csvcut, csvgrep, etc.)
  • No external API required
Install options:
bash
undefined
  • jq — 命令行JSON处理器(JSON操作的核心工具)
  • csvkit — CSV工具套件(包含csvjson、csvcut、csvgrep等工具)
  • 无需调用外部API
安装选项:
bash
undefined

Ubuntu/Debian

Ubuntu/Debian

sudo apt-get install -y jq csvkit
sudo apt-get install -y jq csvkit

macOS

macOS

brew install jq csvkit
brew install jq csvkit

Node.js (native support, no packages needed for basic operations)

Node.js(原生支持,基础操作无需安装额外包)

For advanced CSV parsing: npm install csv-parse csv-stringify

高级CSV解析需要安装:npm install csv-parse csv-stringify

undefined
undefined

Skills

功能技能

json_to_csv

json_to_csv

Convert JSON array to CSV format.
bash
undefined
将JSON数组转换为CSV格式。
bash
undefined

Simple JSON array to CSV

简单JSON数组转CSV

echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' | jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)'
echo '[{"name":"Alice","age":30},{"name":"Bob","age":25}]' | jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)'

JSON file to CSV file

JSON文件转CSV文件

jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)' data.json > output.csv
jq -r '(.[0] | keys_unsorted) as $keys | $keys, (map([.[ $keys[] ]]) | .[] | @csv)' data.json > output.csv

JSON to CSV with specific fields

JSON转CSV并指定输出字段

jq -r '.[] | [.id, .name, .email] | @csv' users.json
jq -r '.[] | [.id, .name, .email] | @csv' users.json

Using csvkit (simpler syntax)

使用csvkit实现(语法更简单)

cat data.json | in2csv -f json > output.csv

**Node.js:**

```javascript
function jsonToCSV(jsonArray) {
  if (!Array.isArray(jsonArray) || jsonArray.length === 0) {
    return '';
  }
  
  // Get headers from first object
  const headers = Object.keys(jsonArray[0]);
  
  // Escape CSV values
  const escape = (val) => {
    if (val === null || val === undefined) return '';
    const str = String(val);
    if (str.includes(',') || str.includes('"') || str.includes('\n')) {
      return `"${str.replace(/"/g, '""')}"`;
    }
    return str;
  };
  
  // Build CSV
  const headerRow = headers.join(',');
  const dataRows = jsonArray.map(obj =>
    headers.map(header => escape(obj[header])).join(',')
  );
  
  return [headerRow, ...dataRows].join('\n');
}

// Usage
// const data = [
//   { name: 'Alice', age: 30, city: 'New York' },
//   { name: 'Bob', age: 25, city: 'San Francisco' }
// ];
// console.log(jsonToCSV(data));
cat data.json | in2csv -f json > output.csv

**Node.js实现:**

```javascript
function jsonToCSV(jsonArray) {
  if (!Array.isArray(jsonArray) || jsonArray.length === 0) {
    return '';
  }
  
  // 从第一个对象获取表头
  const headers = Object.keys(jsonArray[0]);
  
  // 转义CSV特殊值
  const escape = (val) => {
    if (val === null || val === undefined) return '';
    const str = String(val);
    if (str.includes(',') || str.includes('"') || str.includes('\n')) {
      return `"${str.replace(/"/g, '""')}"`;
    }
    return str;
  };
  
  // 构建CSV内容
  const headerRow = headers.join(',');
  const dataRows = jsonArray.map(obj =>
    headers.map(header => escape(obj[header])).join(',')
  );
  
  return [headerRow, ...dataRows].join('\n');
}

// 使用示例
// const data = [
//   { name: 'Alice', age: 30, city: 'New York' },
//   { name: 'Bob', age: 25, city: 'San Francisco' }
// ];
// console.log(jsonToCSV(data));

csv_to_json

csv_to_json

Convert CSV to JSON array.
bash
undefined
将CSV转换为JSON数组。
bash
undefined

CSV to JSON

CSV转JSON

csvjson data.csv
csvjson data.csv

CSV to JSON with pretty printing

CSV转JSON并格式化输出

csvjson data.csv | jq '.'
csvjson data.csv | jq '.'

CSV to JSON array of objects

CSV转JSON对象数组

csvjson --stream data.csv
csvjson --stream data.csv

CSV file to JSON file

CSV文件转JSON文件

csvjson input.csv > output.json
csvjson input.csv > output.json

Using pure jq (if headers are in first row)

纯jq实现(假设首行是表头)

jq -Rsn '[inputs | split(",") | {name: .[0], age: .[1], city: .[2]}]' < data.csv

**Node.js:**

```javascript
function csvToJSON(csvString) {
  const lines = csvString.trim().split('\n');
  if (lines.length < 2) return [];
  
  // Parse CSV value (handle quotes)
  const parseCSVValue = (val) => {
    val = val.trim();
    if (val.startsWith('"') && val.endsWith('"')) {
      return val.slice(1, -1).replace(/""/g, '"');
    }
    return val;
  };
  
  // Split CSV line (basic implementation)
  const splitCSVLine = (line) => {
    const result = [];
    let current = '';
    let inQuotes = false;
    
    for (let i = 0; i < line.length; i++) {
      const char = line[i];
      
      if (char === '"') {
        inQuotes = !inQuotes;
        current += char;
      } else if (char === ',' && !inQuotes) {
        result.push(parseCSVValue(current));
        current = '';
      } else {
        current += char;
      }
    }
    result.push(parseCSVValue(current));
    return result;
  };
  
  const headers = splitCSVLine(lines[0]);
  const data = lines.slice(1).map(line => {
    const values = splitCSVLine(line);
    const obj = {};
    headers.forEach((header, i) => {
      obj[header] = values[i] || '';
    });
    return obj;
  });
  
  return data;
}

// Usage
// const csv = `name,age,city
// Alice,30,New York
// Bob,25,"San Francisco"`;
// console.log(JSON.stringify(csvToJSON(csv), null, 2));
jq -Rsn '[inputs | split(",") | {name: .[0], age: .[1], city: .[2]}]' < data.csv

**Node.js实现:**

```javascript
function csvToJSON(csvString) {
  const lines = csvString.trim().split('\n');
  if (lines.length < 2) return [];
  
  // 解析CSV值(处理引号)
  const parseCSVValue = (val) => {
    val = val.trim();
    if (val.startsWith('"') && val.endsWith('"')) {
      return val.slice(1, -1).replace(/""/g, '"');
    }
    return val;
  };
  
  // 拆分CSV行(基础实现)
  const splitCSVLine = (line) => {
    const result = [];
    let current = '';
    let inQuotes = false;
    
    for (let i = 0; i < line.length; i++) {
      const char = line[i];
      
      if (char === '"') {
        inQuotes = !inQuotes;
        current += char;
      } else if (char === ',' && !inQuotes) {
        result.push(parseCSVValue(current));
        current = '';
      } else {
        current += char;
      }
    }
    result.push(parseCSVValue(current));
    return result;
  };
  
  const headers = splitCSVLine(lines[0]);
  const data = lines.slice(1).map(line => {
    const values = splitCSVLine(line);
    const obj = {};
    headers.forEach((header, i) => {
      obj[header] = values[i] || '';
    });
    return obj;
  });
  
  return data;
}

// 使用示例
// const csv = `name,age,city
// Alice,30,New York
// Bob,25,"San Francisco"`;
// console.log(JSON.stringify(csvToJSON(csv), null, 2));

filter_and_extract_json

filter_and_extract_json

Filter and extract specific fields from JSON.
bash
undefined
过滤并提取JSON中的特定字段。
bash
undefined

Extract specific fields

提取特定字段

jq '.[] | {name: .name, email: .email}' users.json
jq '.[] | {name: .name, email: .email}' users.json

Filter by condition

按条件过滤

jq '.[] | select(.age > 25)' users.json
jq '.[] | select(.age > 25)' users.json

Filter and extract

过滤并提取

jq '[.[] | select(.active == true) | {id: .id, name: .name}]' data.json
jq '[.[] | select(.active == true) | {id: .id, name: .name}]' data.json

Extract nested fields

提取嵌套字段

jq '.[] | {name: .name, street: .address.street, city: .address.city}' data.json
jq '.[] | {name: .name, street: .address.street, city: .address.city}' data.json

Get array of single field

获取单个字段组成的数组

jq '.[].name' users.json
jq '.[].name' users.json

Filter with multiple conditions

多条件过滤

jq '.[] | select(.age > 20 and .country == "USA")' users.json
jq '.[] | select(.age > 20 and .country == "USA")' users.json

Map and transform values

映射并转换值

jq '.[] | .price = (.price * 1.1)' products.json

**Node.js:**

```javascript
function filterAndExtractJSON(data, options) {
  const { filter, extract } = options;
  
  let result = Array.isArray(data) ? data : [data];
  
  // Apply filter function
  if (filter) {
    result = result.filter(filter);
  }
  
  // Extract specific fields
  if (extract) {
    result = result.map(item => {
      const extracted = {};
      extract.forEach(field => {
        // Support nested fields with dot notation
        const value = field.split('.').reduce((obj, key) => obj?.[key], item);
        extracted[field] = value;
      });
      return extracted;
    });
  }
  
  return result;
}

// Usage
// const users = [
//   { id: 1, name: 'Alice', age: 30, address: { city: 'NYC' } },
//   { id: 2, name: 'Bob', age: 25, address: { city: 'SF' } },
//   { id: 3, name: 'Charlie', age: 35, address: { city: 'LA' } }
// ];
// 
// const result = filterAndExtractJSON(users, {
//   filter: user => user.age > 25,
//   extract: ['name', 'age', 'address.city']
// });
// console.log(result);
jq '.[] | .price = (.price * 1.1)' products.json

**Node.js实现:**

```javascript
function filterAndExtractJSON(data, options) {
  const { filter, extract } = options;
  
  let result = Array.isArray(data) ? data : [data];
  
  // 应用过滤函数
  if (filter) {
    result = result.filter(filter);
  }
  
  // 提取特定字段
  if (extract) {
    result = result.map(item => {
      const extracted = {};
      extract.forEach(field => {
        // 支持点分隔的嵌套字段
        const value = field.split('.').reduce((obj, key) => obj?.[key], item);
        extracted[field] = value;
      });
      return extracted;
    });
  }
  
  return result;
}

// 使用示例
// const users = [
//   { id: 1, name: 'Alice', age: 30, address: { city: 'NYC' } },
//   { id: 2, name: 'Bob', age: 25, address: { city: 'SF' } },
//   { id: 3, name: 'Charlie', age: 35, address: { city: 'LA' } }
// ];
// 
// const result = filterAndExtractJSON(users, {
//   filter: user => user.age > 25,
//   extract: ['name', 'age', 'address.city']
// });
// console.log(result);

flatten_nested_json

flatten_nested_json

Flatten nested JSON objects into flat structure.
bash
undefined
将嵌套的JSON对象扁平化为单层结构。
bash
undefined

Flatten nested JSON with jq

使用jq扁平化嵌套JSON

jq '[.[] | {id: .id, name: .name, street: .address.street, city: .address.city, zip: .address.zip}]' users.json
jq '[.[] | {id: .id, name: .name, street: .address.street, city: .address.city, zip: .address.zip}]' users.json

Flatten all nested fields with custom separator

使用自定义分隔符扁平化所有嵌套字段

jq '[.[] | to_entries | map({key: .key, value: .value}) | from_entries]' data.json
jq '[.[] | to_entries | map({key: .key, value: .value}) | from_entries]' data.json

Flatten deeply nested structure

扁平化深度嵌套结构

jq 'recurse | select(type != "object" and type != "array")' complex.json

**Node.js:**

```javascript
function flattenJSON(obj, prefix = '', separator = '.') {
  const flattened = {};
  
  for (const key in obj) {
    const value = obj[key];
    const newKey = prefix ? `${prefix}${separator}${key}` : key;
    
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // Recursively flatten nested objects
      Object.assign(flattened, flattenJSON(value, newKey, separator));
    } else if (Array.isArray(value)) {
      // Convert arrays to string or flatten each item
      flattened[newKey] = JSON.stringify(value);
    } else {
      flattened[newKey] = value;
    }
  }
  
  return flattened;
}

// Usage
// const nested = {
//   id: 1,
//   name: 'Alice',
//   address: {
//     street: '123 Main St',
//     city: 'NYC',
//     coordinates: { lat: 40.7, lon: -74.0 }
//   },
//   tags: ['user', 'active']
// };
// console.log(flattenJSON(nested));
// Output: {
//   id: 1,
//   name: 'Alice',
//   'address.street': '123 Main St',
//   'address.city': 'NYC',
//   'address.coordinates.lat': 40.7,
//   'address.coordinates.lon': -74.0,
//   tags: '["user","active"]'
// }
jq 'recurse | select(type != "object" and type != "array")' complex.json

**Node.js实现:**

```javascript
function flattenJSON(obj, prefix = '', separator = '.') {
  const flattened = {};
  
  for (const key in obj) {
    const value = obj[key];
    const newKey = prefix ? `${prefix}${separator}${key}` : key;
    
    if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
      // 递归扁平化嵌套对象
      Object.assign(flattened, flattenJSON(value, newKey, separator));
    } else if (Array.isArray(value)) {
      // 将数组转为字符串或扁平化每个元素
      flattened[newKey] = JSON.stringify(value);
    } else {
      flattened[newKey] = value;
    }
  }
  
  return flattened;
}

// 使用示例
// const nested = {
//   id: 1,
//   name: 'Alice',
//   address: {
//     street: '123 Main St',
//     city: 'NYC',
//     coordinates: { lat: 40.7, lon: -74.0 }
//   },
//   tags: ['user', 'active']
// };
// console.log(flattenJSON(nested));
// 输出: {
//   id: 1,
//   name: 'Alice',
//   'address.street': '123 Main St',
//   'address.city': 'NYC',
//   'address.coordinates.lat': 40.7,
//   'address.coordinates.lon': -74.0,
//   tags: '["user","active"]'
// }

transform_csv_data

transform_csv_data

Transform and manipulate CSV data.
bash
undefined
转换和操作CSV数据。
bash
undefined

Select specific columns

选择指定列

csvcut -c name,email,age users.csv
csvcut -c name,email,age users.csv

Filter rows by value

按值过滤行

csvgrep -c age -r "^[3-9][0-9]$" users.csv # age >= 30
csvgrep -c age -r "^[3-9][0-9]$" users.csv # 年龄 >= 30

Sort CSV

排序CSV

csvsort -c age -r users.csv # reverse sort by age
csvsort -c age -r users.csv # 按年龄倒序排序

Remove duplicate rows

去重行

csvcut -c name,email users.csv | uniq
csvcut -c name,email users.csv | uniq

Combine: filter, select columns, sort

组合操作:过滤、选列、排序

csvgrep -c country -m "USA" users.csv | csvcut -c name,age | csvsort -c age
csvgrep -c country -m "USA" users.csv | csvcut -c name,age | csvsort -c age

Add calculated column (requires csvpy or awk)

添加计算列(需要csvpy或awk)

awk -F',' 'BEGIN{OFS=","} NR==1{print $0,"total"} NR>1{print $0,$2*$3}' data.csv
awk -F',' 'BEGIN{OFS=","} NR==1{print $0,"total"} NR>1{print $0,$2*$3}' data.csv

Merge two CSV files by column

按列合并两个CSV文件

csvjoin -c id users.csv orders.csv

**Node.js:**

```javascript
function transformCSV(csvData, transformations) {
  const { selectColumns, filterRows, sortBy } = transformations;
  
  // Parse CSV to objects
  const data = csvToJSON(csvData);
  
  let result = data;
  
  // Filter rows
  if (filterRows) {
    result = result.filter(filterRows);
  }
  
  // Select columns
  if (selectColumns) {
    result = result.map(row => {
      const selected = {};
      selectColumns.forEach(col => {
        selected[col] = row[col];
      });
      return selected;
    });
  }
  
  // Sort
  if (sortBy) {
    const { column, reverse } = sortBy;
    result.sort((a, b) => {
      const aVal = a[column];
      const bVal = b[column];
      const comparison = aVal > bVal ? 1 : aVal < bVal ? -1 : 0;
      return reverse ? -comparison : comparison;
    });
  }
  
  // Convert back to CSV
  return jsonToCSV(result);
}

// Usage
// const csv = `name,age,country
// Alice,30,USA
// Bob,25,Canada
// Charlie,35,USA`;
//
// const transformed = transformCSV(csv, {
//   filterRows: row => row.country === 'USA',
//   selectColumns: ['name', 'age'],
//   sortBy: { column: 'age', reverse: true }
// });
// console.log(transformed);
csvjoin -c id users.csv orders.csv

**Node.js实现:**

```javascript
function transformCSV(csvData, transformations) {
  const { selectColumns, filterRows, sortBy } = transformations;
  
  // 将CSV解析为对象
  const data = csvToJSON(csvData);
  
  let result = data;
  
  // 过滤行
  if (filterRows) {
    result = result.filter(filterRows);
  }
  
  // 选择列
  if (selectColumns) {
    result = result.map(row => {
      const selected = {};
      selectColumns.forEach(col => {
        selected[col] = row[col];
      });
      return selected;
    });
  }
  
  // 排序
  if (sortBy) {
    const { column, reverse } = sortBy;
    result.sort((a, b) => {
      const aVal = a[column];
      const bVal = b[column];
      const comparison = aVal > bVal ? 1 : aVal < bVal ? -1 : 0;
      return reverse ? -comparison : comparison;
    });
  }
  
  // 转换回CSV格式
  return jsonToCSV(result);
}

// 使用示例
// const csv = `name,age,country
// Alice,30,USA
// Bob,25,Canada
// Charlie,35,USA`;
//
// const transformed = transformCSV(csv, {
//   filterRows: row => row.country === 'USA',
//   selectColumns: ['name', 'age'],
//   sortBy: { column: 'age', reverse: true }
// });
// console.log(transformed);

aggregate_and_group_json

aggregate_and_group_json

Aggregate and group JSON data (similar to SQL GROUP BY).
bash
undefined
聚合和分组JSON数据(类似SQL的GROUP BY功能)。
bash
undefined

Group by field and count

按字段分组并计数

jq 'group_by(.country) | map({country: .[0].country, count: length})' users.json
jq 'group_by(.country) | map({country: .[0].country, count: length})' users.json

Sum values by group

按分组求和

jq 'group_by(.category) | map({category: .[0].category, total: map(.price) | add})' products.json
jq 'group_by(.category) | map({category: .[0].category, total: map(.price) | add})' products.json

Average by group

按分组求平均值

jq 'group_by(.department) | map({department: .[0].department, avg_salary: (map(.salary) | add / length)})' employees.json
jq 'group_by(.department) | map({department: .[0].department, avg_salary: (map(.salary) | add / length)})' employees.json

Multiple aggregations

多维度聚合

jq 'group_by(.region) | map({ region: .[0].region, count: length, total_sales: map(.sales) | add, avg_sales: (map(.sales) | add / length) })' sales.json

**Node.js:**

```javascript
function groupAndAggregate(data, groupBy, aggregations) {
  // Group data
  const grouped = {};
  data.forEach(item => {
    const key = item[groupBy];
    if (!grouped[key]) grouped[key] = [];
    grouped[key].push(item);
  });
  
  // Apply aggregations
  return Object.entries(grouped).map(([key, items]) => {
    const result = { [groupBy]: key };
    
    aggregations.forEach(agg => {
      if (agg.type === 'count') {
        result[agg.name] = items.length;
      } else if (agg.type === 'sum') {
        result[agg.name] = items.reduce((sum, item) => sum + (item[agg.field] || 0), 0);
      } else if (agg.type === 'avg') {
        const sum = items.reduce((s, item) => s + (item[agg.field] || 0), 0);
        result[agg.name] = items.length > 0 ? sum / items.length : 0;
      } else if (agg.type === 'min') {
        result[agg.name] = Math.min(...items.map(item => item[agg.field] || Infinity));
      } else if (agg.type === 'max') {
        result[agg.name] = Math.max(...items.map(item => item[agg.field] || -Infinity));
      }
    });
    
    return result;
  });
}

// Usage
// const sales = [
//   { region: 'East', product: 'A', amount: 100 },
//   { region: 'East', product: 'B', amount: 200 },
//   { region: 'West', product: 'A', amount: 150 },
//   { region: 'West', product: 'B', amount: 250 }
// ];
//
// const result = groupAndAggregate(sales, 'region', [
//   { name: 'count', type: 'count' },
//   { name: 'total_amount', type: 'sum', field: 'amount' },
//   { name: 'avg_amount', type: 'avg', field: 'amount' }
// ]);
// console.log(result);
jq 'group_by(.region) | map({ region: .[0].region, count: length, total_sales: map(.sales) | add, avg_sales: (map(.sales) | add / length) })' sales.json

**Node.js实现:**

```javascript
function groupAndAggregate(data, groupBy, aggregations) {
  // 分组数据
  const grouped = {};
  data.forEach(item => {
    const key = item[groupBy];
    if (!grouped[key]) grouped[key] = [];
    grouped[key].push(item);
  });
  
  // 应用聚合规则
  return Object.entries(grouped).map(([key, items]) => {
    const result = { [groupBy]: key };
    
    aggregations.forEach(agg => {
      if (agg.type === 'count') {
        result[agg.name] = items.length;
      } else if (agg.type === 'sum') {
        result[agg.name] = items.reduce((sum, item) => sum + (item[agg.field] || 0), 0);
      } else if (agg.type === 'avg') {
        const sum = items.reduce((s, item) => s + (item[agg.field] || 0), 0);
        result[agg.name] = items.length > 0 ? sum / items.length : 0;
      } else if (agg.type === 'min') {
        result[agg.name] = Math.min(...items.map(item => item[agg.field] || Infinity));
      } else if (agg.type === 'max') {
        result[agg.name] = Math.max(...items.map(item => item[agg.field] || -Infinity));
      }
    });
    
    return result;
  });
}

// 使用示例
// const sales = [
//   { region: 'East', product: 'A', amount: 100 },
//   { region: 'East', product: 'B', amount: 200 },
//   { region: 'West', product: 'A', amount: 150 },
//   { region: 'West', product: 'B', amount: 250 }
// ];
//
// const result = groupAndAggregate(sales, 'region', [
//   { name: 'count', type: 'count' },
//   { name: 'total_amount', type: 'sum', field: 'amount' },
//   { name: 'avg_amount', type: 'avg', field: 'amount' }
// ]);
// console.log(result);

Rate limits / Best practices

速率限制/最佳实践

  • Stream large files — Use jq with
    -c
    flag and process line by line for large datasets
  • Validate data — Check JSON/CSV format before transformation
  • Handle missing fields — Use default values for null/undefined fields
  • Memory management — For files >100MB, use streaming parsers
  • Type conversion — Be aware of number/string conversions in CSV
  • Preserve data types — JSON maintains types, CSV converts everything to strings
  • ⚠️ Character encoding — Ensure UTF-8 encoding for international characters
  • ⚠️ Quote escaping — Properly escape quotes in CSV values
  • 大文件采用流式处理 — 处理大型数据集时,搭配
    -c
    参数使用jq逐行处理
  • 数据校验 — 转换前先检查JSON/CSV格式是否合法
  • 缺失字段处理 — 为空/未定义的字段设置默认值
  • 内存管理 — 大于100MB的文件使用流式解析器
  • 类型转换注意事项 — 注意CSV中数字/字符串的自动转换问题
  • 保留数据类型 — JSON原生支持多种数据类型,CSV会将所有值转为字符串
  • ⚠️ 字符编码 — 确保多语言字符使用UTF-8编码
  • ⚠️ 引号转义 — CSV值中的引号需要正确转义

Agent prompt

Agent提示词

text
You have JSON and CSV data transformation capability. When a user asks to transform data:

1. Identify the input format:
   - JSON: Look for {...} or [...]
   - CSV: Look for comma-separated values with headers

2. For JSON to CSV:
   - Use jq with @csv filter: `jq -r '... | @csv'`
   - Or csvkit: `in2csv -f json`
   - Node.js: Convert array of objects to CSV string

3. For CSV to JSON:
   - Use csvjson from csvkit: `csvjson file.csv`
   - Node.js: Parse CSV headers and data rows into objects

4. For filtering/extracting:
   - Use jq select(): `jq '.[] | select(.age > 25)'`
   - Use csvkit csvgrep: `csvgrep -c column -m value`
   - Node.js: Use Array.filter() and map()

5. For flattening:
   - Flatten nested JSON objects into dot notation
   - Convert nested structures to tabular format
   - Handle arrays by stringifying or creating separate rows

6. For aggregation:
   - Use jq group_by(): `jq 'group_by(.field) | map({...})'`
   - CSV: Convert to JSON, aggregate, convert back
   - Node.js: Implement grouping and aggregation functions

Always:
- Preserve data integrity (no data loss)
- Handle edge cases (empty values, special characters)
- Validate output format matches expected structure
- For large files (>100MB), recommend streaming approaches
text
你具备JSON和CSV数据转换能力。当用户要求转换数据时:

1. 识别输入格式:
   - JSON:查找{...}或[...]结构
   - CSV:查找带表头的逗号分隔值结构

2. JSON转CSV场景:
   - 使用jq的@csv过滤器:`jq -r '... | @csv'`
   - 或使用csvkit:`in2csv -f json`
   - Node.js:将对象数组转换为CSV字符串

3. CSV转JSON场景:
   - 使用csvkit的csvjson工具:`csvjson file.csv`
   - Node.js:解析CSV表头和数据行生成对象

4. 过滤/提取场景:
   - 使用jq的select()方法:`jq '.[] | select(.age > 25)'`
   - 使用csvkit的csvgrep:`csvgrep -c column -m value`
   - Node.js:使用Array.filter()和map()方法

5. 扁平化场景:
   - 将嵌套JSON对象扁平化为点分隔的键名格式
   - 将嵌套结构转为表格格式
   - 数组可通过序列化或拆分多行处理

6. 聚合场景:
   - 使用jq的group_by()方法:`jq 'group_by(.field) | map({...})'`
   - CSV场景:先转JSON,聚合后再转回CSV
   - Node.js:实现分组和聚合函数

始终遵守以下规则:
- 保证数据完整性(无数据丢失)
- 处理边界情况(空值、特殊字符)
- 校验输出格式符合预期结构
- 大于100MB的文件推荐使用流式处理方案

Troubleshooting

问题排查

Error: "parse error: Invalid numeric literal"
  • Symptom: jq fails to parse JSON
  • Solution: Validate JSON format with
    jq empty file.json
    , fix syntax errors
CSV columns not aligned:
  • Symptom: Data appears in wrong columns after transformation
  • Solution: Check for unescaped commas in data, ensure quotes are properly escaped
Empty output from jq:
  • Symptom: jq returns no results
  • Solution: Check filter expression syntax, verify data structure matches filter
Special characters broken in CSV:
  • Symptom: Non-ASCII characters appear garbled
  • Solution: Ensure UTF-8 encoding:
    iconv -f UTF-8 -t UTF-8 file.csv
Memory error with large files:
  • Symptom: Process runs out of memory
  • Solution: Use streaming mode:
    jq -c
    or Node.js streams for line-by-line processing
JSON doesn't convert to flat CSV:
  • Symptom: Nested objects create complex CSV structure
  • Solution: Flatten JSON first before converting to CSV
错误:"parse error: Invalid numeric literal"
  • 症状:jq无法解析JSON
  • 解决方案:使用
    jq empty file.json
    校验JSON格式,修复语法错误
CSV列不对齐:
  • 症状:转换后数据出现在错误的列中
  • 解决方案:检查数据中是否存在未转义的逗号,确保引号正确转义
jq返回空输出:
  • 症状:jq执行后没有返回结果
  • 解决方案:检查过滤表达式语法,确认数据结构与过滤器匹配
CSV中特殊字符乱码:
  • 症状:非ASCII字符显示为乱码
  • 解决方案:确认使用UTF-8编码:
    iconv -f UTF-8 -t UTF-8 file.csv
大文件处理内存溢出:
  • 症状:进程内存不足退出
  • 解决方案:使用流式模式:
    jq -c
    或Node.js流逐行处理
JSON无法转为扁平CSV:
  • 症状:嵌套对象生成的CSV结构复杂
  • 解决方案:转换为CSV前先扁平化JSON结构

See also

参考链接

  • ../database-query-and-export/SKILL.md — Export database results as JSON/CSV
  • ../web-search-api/SKILL.md — Transform API responses to desired format
  • ../using-web-scraping/SKILL.md — Process scraped data into structured formats
  • ../database-query-and-export/SKILL.md — 将数据库查询结果导出为JSON/CSV格式
  • ../web-search-api/SKILL.md — 将API响应转换为所需格式
  • ../using-web-scraping/SKILL.md — 将爬取的数据处理为结构化格式