sample-text-processor

Sample Text Processor

Name: sample-text-processor
Tier: BASIC
Category: Text Processing
Dependencies: None (Python Standard Library Only)
Author: Claude Skills Engineering Team
Version: 1.0.0
Last Updated: 2026-02-16

Description

The Sample Text Processor is a simple skill designed to demonstrate the basic structure and functionality expected in the claude-skills ecosystem. This skill provides fundamental text processing capabilities including word counting, character analysis, and basic text transformations.

This skill serves as a reference implementation for BASIC tier requirements and can be used as a template for creating new skills. It demonstrates proper file structure, documentation standards, and implementation patterns that align with ecosystem best practices.

The skill processes text files and provides statistics and transformations in both human-readable and JSON formats, showcasing the dual output requirement for skills in the claude-skills repository.

Features

Core Functionality

Word Count Analysis: Count total words, unique words, and word frequency
Character Statistics: Analyze character count, line count, and special characters
Text Transformations: Convert text to uppercase, lowercase, or title case
File Processing: Process single text files or batch process directories
Dual Output Formats: Generate results in both JSON and human-readable formats
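
The statistics listed above can be sketched with the standard library alone. This is a minimal illustration, not the skill's actual code: the function name `analyze_text` and the tokenization rule (lowercased runs of word characters and apostrophes) are assumptions, while the result keys mirror the JSON example in this document.

```python
import re
from collections import Counter


def analyze_text(text: str) -> dict:
    """Return basic word, character, and line statistics for a string."""
    # Assumed tokenization: lowercase runs of word characters and apostrophes.
    words = re.findall(r"[\w']+", text.lower())
    frequency = Counter(words)
    most_common = frequency.most_common(1)
    return {
        "total_words": len(words),
        "unique_words": len(frequency),
        "total_characters": len(text),
        "lines": len(text.splitlines()),
        # None when the text contains no words at all.
        "most_frequent": (
            {"word": most_common[0][0], "count": most_common[0][1]}
            if most_common
            else None
        ),
    }


stats = analyze_text("The cat saw the dog.\nThe dog ran.")
print(stats)
```

A different tokenization rule (for example, splitting on whitespace only) would change the word counts, so the rule is worth documenting alongside any reported numbers.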

Technical Features

Command-line interface with comprehensive argument parsing
Error handling for common file and processing issues
Progress reporting for batch operations
Configurable output formatting and verbosity levels
Cross-platform compatibility, with dependencies limited to the standard library

Usage

Basic Text Analysis

```bash
python text_processor.py analyze document.txt
python text_processor.py analyze document.txt --output results.json
```

Text Transformation

```bash
python text_processor.py transform document.txt --mode uppercase
python text_processor.py transform document.txt --mode title --output transformed.txt
```

Batch Processing

```bash
python text_processor.py batch text_files/ --output results/
python text_processor.py batch text_files/ --format json --output batch_results.json
```
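
What a batch run might do internally can be sketched as a loop over a directory. The function name `batch_word_counts`, the restriction to top-level `*.txt` files, and the progress line format are all assumptions made for illustration; the skill's real batch logic is not shown in this document.

```python
from pathlib import Path


def batch_word_counts(directory: str, encoding: str = "utf-8") -> dict:
    """Map each .txt file name in `directory` to its whitespace-separated word count."""
    results = {}
    # Assumption: only *.txt files at the top level of the directory are processed.
    files = sorted(Path(directory).glob("*.txt"))
    for index, path in enumerate(files, start=1):
        text = path.read_text(encoding=encoding)
        results[path.name] = len(text.split())
        # Simple progress report, one line per processed file.
        print(f"[{index}/{len(files)}] {path.name}")
    return results
```

Sorting the file list keeps the processing order, and therefore the progress output, deterministic across platforms.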

Examples

Example 1: Basic Word Count

```bash
$ python text_processor.py analyze sample.txt
=== TEXT ANALYSIS RESULTS ===
File: sample.txt
Total words: 150
Unique words: 85
Total characters: 750
Lines: 12
Most frequent word: "the" (8 occurrences)
```

Example 2: JSON Output

```bash
$ python text_processor.py analyze sample.txt --format json
{
  "file": "sample.txt",
  "statistics": {
    "total_words": 150,
    "unique_words": 85,
    "total_characters": 750,
    "lines": 12,
    "most_frequent": {
      "word": "the",
      "count": 8
    }
  }
}
```
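
One way to produce both output formats from a single statistics dictionary is sketched below. The function name `format_results` and its signature are hypothetical; the report banner, field labels, and JSON layout are copied from Examples 1 and 2 in this document.

```python
import json


def format_results(file_name: str, stats: dict, output_format: str = "text") -> str:
    """Render one statistics dictionary as JSON or as a plain-text report."""
    if output_format == "json":
        return json.dumps({"file": file_name, "statistics": stats}, indent=2)
    if output_format != "text":
        raise ValueError(f"unsupported format: {output_format!r}")
    lines = [
        "=== TEXT ANALYSIS RESULTS ===",
        f"File: {file_name}",
        f"Total words: {stats['total_words']}",
        f"Unique words: {stats['unique_words']}",
        f"Total characters: {stats['total_characters']}",
        f"Lines: {stats['lines']}",
    ]
    top = stats.get("most_frequent")
    if top:
        lines.append(f'Most frequent word: "{top["word"]}" ({top["count"]} occurrences)')
    return "\n".join(lines)


# Sample values taken from the examples in this document.
sample = {
    "total_words": 150,
    "unique_words": 85,
    "total_characters": 750,
    "lines": 12,
    "most_frequent": {"word": "the", "count": 8},
}
print(format_results("sample.txt", sample, "text"))
print(format_results("sample.txt", sample, "json"))
```

Keeping the statistics in one dictionary and formatting only at the end means the two outputs cannot drift apart.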

Example 3: Text Transformation

```bash
$ python text_processor.py transform sample.txt --mode title
Original: "hello world from the text processor"
Transformed: "Hello World From The Text Processor"
```
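
The three transformations map directly onto built-in string methods. The sketch below is an assumption about how the modes could be dispatched: the function name `transform_text` is hypothetical, and while `uppercase` and `title` appear as `--mode` values in this document, the value `lowercase` is inferred from the feature list.

```python
def transform_text(text: str, mode: str) -> str:
    """Apply one of the supported case transformations to `text`."""
    # Mode names: "uppercase" and "title" come from the usage examples;
    # "lowercase" is an assumed spelling for the third mode.
    modes = {
        "uppercase": str.upper,
        "lowercase": str.lower,
        "title": str.title,
    }
    if mode not in modes:
        raise ValueError(f"unknown mode: {mode!r}")
    return modes[mode](text)


print(transform_text("hello world from the text processor", "title"))
```

Note that `str.title()` treats apostrophes as word boundaries, so "don't" becomes "Don'T"; `string.capwords` is a standard-library alternative if that matters for the input.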

Installation

This skill requires only Python 3.7 or later with the standard library. No external dependencies are required.

Clone or download the skill directory
Navigate to the scripts directory
Run the text processor directly with Python

```bash
cd scripts/
python text_processor.py --help
```

Configuration

The text processor supports various configuration options through command-line arguments:

`--format`: Output format (json, text)
`--verbose`: Enable verbose output and progress reporting
`--output`: Specify output file or directory
`--encoding`: Specify text file encoding (default: utf-8)
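
These options, together with the `analyze`, `transform`, and `batch` subcommands used earlier, could be wired up with `argparse` roughly as follows. This is a sketch under assumptions: the function name `build_parser`, the positional argument names, the `text` default for `--format`, and the `lowercase` mode value are not confirmed by this document.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with analyze, transform, and batch subcommands."""
    parser = argparse.ArgumentParser(
        prog="text_processor.py",
        description="Analyze and transform text files.",
    )

    # Options shared by every subcommand, attached through a parent parser.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument("--format", choices=["json", "text"], default="text",
                        help="output format (default assumed to be text)")
    common.add_argument("--verbose", action="store_true",
                        help="enable verbose output and progress reporting")
    common.add_argument("--output", help="output file or directory")
    common.add_argument("--encoding", default="utf-8",
                        help="text file encoding (default: utf-8)")

    # required=True is available from Python 3.7 onward.
    subparsers = parser.add_subparsers(dest="command", required=True)

    analyze = subparsers.add_parser("analyze", parents=[common])
    analyze.add_argument("path", help="text file to analyze")

    transform = subparsers.add_parser("transform", parents=[common])
    transform.add_argument("path", help="text file to transform")
    transform.add_argument("--mode", required=True,
                           choices=["uppercase", "lowercase", "title"])

    batch = subparsers.add_parser("batch", parents=[common])
    batch.add_argument("directory", help="directory of text files")

    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Using a parent parser keeps the four shared options defined once while still letting them appear after the subcommand, as in `analyze sample.txt --format json`.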

Architecture

The skill follows a simple modular architecture:

TextProcessor Class: Core processing logic and statistics calculation
OutputFormatter Class: Handles dual output format generation
FileManager Class: Manages file I/O operations and batch processing
CLI Interface: Command-line argument parsing and user interaction

Error Handling

The skill includes comprehensive error handling for:

File not found or permission errors
Invalid encoding or corrupted text files
Memory limitations for very large files
Output directory creation and write permissions
Invalid command-line arguments and parameters
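
Several of the file-related cases above correspond to specific built-in exceptions. The sketch below shows one possible mapping; the function name `read_text_file`, the error messages, and the choice to return `None` on failure are assumptions, not the skill's documented behavior.

```python
import sys
from typing import Optional


def read_text_file(path: str, encoding: str = "utf-8") -> Optional[str]:
    """Return the file's text, or None after reporting a readable error."""
    try:
        with open(path, encoding=encoding) as handle:
            return handle.read()
    except FileNotFoundError:
        print(f"Error: file not found: {path}", file=sys.stderr)
    except PermissionError:
        print(f"Error: permission denied: {path}", file=sys.stderr)
    except UnicodeDecodeError as exc:
        # Raised when the bytes are not valid in the requested encoding.
        print(f"Error: cannot decode {path} as {encoding}: {exc.reason}", file=sys.stderr)
    except LookupError:
        # Raised by open() when the encoding name itself is unknown.
        print(f"Error: unknown encoding: {encoding}", file=sys.stderr)
    except OSError as exc:
        # Any other I/O failure, for example the path being a directory.
        print(f"Error: cannot read {path}: {exc}", file=sys.stderr)
    return None
```

The specific handlers come before `OSError` because `FileNotFoundError` and `PermissionError` are subclasses of it and would otherwise never be reached.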

Performance Considerations

Efficient memory usage for large text files through streaming
Optimized word counting using dictionary lookups
Batch processing with progress reporting for large datasets
Configurable encoding detection for international text
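
The first two points can be illustrated together: reading a file line by line keeps only one line in memory at a time, and `collections.Counter` (a dictionary subclass) accumulates the counts. The function name `stream_word_counts` and the whitespace-based tokenization are assumptions made for this sketch.

```python
from collections import Counter


def stream_word_counts(path: str, encoding: str = "utf-8") -> Counter:
    """Count lowercase, whitespace-separated words without loading the whole file."""
    counts = Counter()
    with open(path, encoding=encoding) as handle:
        # Iterating over the file object yields one line at a time.
        for line in handle:
            counts.update(line.lower().split())
    return counts
```

This only bounds memory when the file contains line breaks; a very large file consisting of a single line would still be read in one piece, and reading fixed-size chunks would be needed instead.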

Contributing

This skill serves as a reference implementation, and contributions that demonstrate best practices are welcome:

Follow PEP 8 coding standards
Include comprehensive docstrings
Add test cases with sample data
Update documentation for any new features
Ensure backward compatibility

Limitations

As a BASIC tier skill, some advanced features are intentionally omitted:

Complex text analysis (sentiment, language detection)
Advanced file format support (PDF, Word documents)
Database integration or external API calls
Parallel processing for very large datasets

This skill demonstrates the essential structure and quality standards required for BASIC tier skills in the claude-skills ecosystem while remaining simple and focused on core functionality.
