powershell-utf8-fixer
PowerShell UTF-8 Fixer
Problem
Windows PowerShell (the 5.1 edition that ships with Windows) needs scripts containing non-ASCII characters (Korean, Chinese, Japanese, emoji, etc.) to be saved as UTF-8 with BOM. Without the BOM (Byte Order Mark), it decodes the file with the system's default ANSI code page (typically CP949 on Korean Windows), so the characters display incorrectly.
Symptoms:
- Korean text appears as `???` or garbled characters
- Emoji and special characters don't display correctly
- `Write-Host` output shows corrupted text
- One script works fine while another with identical code shows garbled text
Root cause: File encoding mismatch
- ✓ UTF-8 with BOM (EF BB BF): PowerShell reads correctly
- ✗ UTF-8 without BOM: PowerShell uses system default encoding → garbled text
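The mismatch can be reproduced outside PowerShell. The following is a minimal Node.js sketch, not part of the tool itself: it decodes the same UTF-8 bytes twice, once correctly and once as a single-byte legacy encoding. Latin-1 is used here only as a stand-in for a code page such as CP1252; the variable names are illustrative.

```javascript
// Bytes of "é" (U+00E9) as an editor would save them in UTF-8: C3 A9.
const utf8Bytes = Buffer.from('\u00e9', 'utf8');

// Correct: decode the bytes as UTF-8.
const correct = utf8Bytes.toString('utf8');

// Wrong: decode the same bytes as a single-byte legacy encoding.
// Each UTF-8 byte becomes its own character, which is the garbling
// described above.
const garbled = utf8Bytes.toString('latin1');

console.log(correct); // é
console.log(garbled); // Ã©
```

The file content never changes; only the decoder's assumption does, which is why two scripts with identical code can behave differently.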
Quick Fix
When encountering encoding issues in PowerShell scripts:
1. **Check encoding:**
```bash
node scripts/check_powershell_encoding.js <file_or_directory>
```
Or with npm:
```bash
npm run check <file_or_directory>
```
2. **Fix encoding:**
```bash
node scripts/fix_powershell_encoding.js <file_or_directory>
```
Or with npm:
```bash
npm run fix <file_or_directory>
```
Workflow
When Creating New PowerShell Scripts
After creating any `.ps1` file with non-ASCII characters:
```bash
node scripts/fix_powershell_encoding.js script.ps1
```
This ensures the file is saved with a UTF-8 BOM from the start.
When Diagnosing Display Issues
If a PowerShell script shows garbled text:
1. Check the encoding:
```bash
node scripts/check_powershell_encoding.js problematic_script.ps1
```
2. If it shows "UTF-8 without BOM", fix it:
```bash
node scripts/fix_powershell_encoding.js problematic_script.ps1
```
3. Test the script again; the text should now display correctly
Batch Processing
To check/fix all PowerShell scripts in a directory:
```bash
# Check all scripts
node scripts/check_powershell_encoding.js scripts/windows/

# Fix all scripts that need it
node scripts/fix_powershell_encoding.js scripts/windows/
```
Prevention
To prevent encoding issues in the future:
- Before committing: Run the checker on modified `.ps1` files
- In CI/CD: Add encoding validation to your pipeline
- Editor settings: Configure your editor to save `.ps1` files as UTF-8 with BOM
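For a pipeline or pre-commit step, the validation could look roughly like the sketch below. It is an illustration under assumptions, not the bundled checker: the helper names `findPs1Files` and `filesMissingBom` are hypothetical, and it only tests for the `EF BB BF` prefix rather than classifying other encodings.

```javascript
const fs = require('fs');
const path = require('path');

// Recursively collect every .ps1 file under a directory.
function findPs1Files(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...findPs1Files(full));
    } else if (entry.isFile() && entry.name.toLowerCase().endsWith('.ps1')) {
      found.push(full);
    }
  }
  return found;
}

// Return the .ps1 files whose first three bytes are not EF BB BF.
function filesMissingBom(dir) {
  return findPs1Files(dir).filter((file) => {
    const head = fs.readFileSync(file).subarray(0, 3);
    const hasBom =
      head.length === 3 && head[0] === 0xef && head[1] === 0xbb && head[2] === 0xbf;
    return !hasBom;
  });
}

// Example use in a CI step (the directory name is a placeholder):
//   const missing = filesMissingBom('scripts/windows');
//   missing.forEach((f) => console.error(`missing BOM: ${f}`));
//   process.exitCode = missing.length > 0 ? 1 : 0;
```

Setting a non-zero exit code when any file is missing the BOM lets the pipeline fail the build, mirroring the exit-code convention the checker script documents.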
Editor Configuration Examples
VS Code (settings.json):
```json
{
  "[powershell]": {
    "files.encoding": "utf8bom"
  }
}
```
Cursor (settings.json):
```json
{
  "[powershell]": {
    "files.encoding": "utf8bom"
  }
}
```
Scripts
check_powershell_encoding.js
Diagnoses encoding issues without modifying files.
Usage:
```bash
node scripts/check_powershell_encoding.js <file_or_directory>
```
Output:
- ✓ UTF-8 with BOM: File is correctly encoded
- ⚠ UTF-8 without BOM: File needs fixing
- ⚠ UTF-16: File uses UTF-16 encoding
- ✗ Unknown: Unable to detect encoding
Exit codes:
- 0: All files have UTF-8 BOM
- 1: Some files need fixing or have errors
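The four output categories can be derived from a file's leading bytes. The sketch below shows one plausible way to do it; the function name `detectEncoding` and the exact rules are assumptions, since the script's real logic is not shown here. It treats any byte sequence that decodes as strict UTF-8 (including plain ASCII) as "UTF-8 without BOM".

```javascript
// Classify a file's bytes into the categories the checker reports.
// Hypothetical helper: the real script may use different rules.
function detectEncoding(buf) {
  // UTF-8 BOM: EF BB BF
  if (buf.length >= 3 && buf[0] === 0xef && buf[1] === 0xbb && buf[2] === 0xbf) {
    return 'UTF-8 with BOM';
  }
  // UTF-16 BOM: FF FE (little endian) or FE FF (big endian)
  if (
    buf.length >= 2 &&
    ((buf[0] === 0xff && buf[1] === 0xfe) || (buf[0] === 0xfe && buf[1] === 0xff))
  ) {
    return 'UTF-16';
  }
  // No BOM: accept the file only if every byte sequence is valid UTF-8.
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(buf);
    return 'UTF-8 without BOM';
  } catch {
    return 'Unknown';
  }
}

console.log(detectEncoding(Buffer.from([0xef, 0xbb, 0xbf, 0x41]))); // UTF-8 with BOM
console.log(detectEncoding(Buffer.from('Write-Host "hi"', 'utf8'))); // UTF-8 without BOM
```

A file saved in a legacy code page such as CP949 will usually contain byte sequences that are invalid UTF-8, so under these rules it falls into "Unknown" rather than being misreported as fixable.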
fix_powershell_encoding.js
Adds UTF-8 BOM to PowerShell files that don't have it.
Usage:
```bash
node scripts/fix_powershell_encoding.js <file_or_directory>
```
Behavior:
- Reads file content as UTF-8
- Writes the content back with a UTF-8 BOM prepended (the equivalent of Python's `utf-8-sig` codec)
- Skips files that already have UTF-8 BOM
- Processes `.ps1` files recursively in directories
Exit codes:
- 0: All files processed successfully
- 1: Errors occurred during processing
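The core of the fix is small. Here is a hedged sketch of the idea, assuming the input is already valid UTF-8; `withUtf8Bom` and `fixFile` are illustrative names, not the script's actual functions. Unlike the behavior listed above, this version works on raw bytes rather than decoding to a string first, which avoids altering any content.

```javascript
const fs = require('fs');

// The UTF-8 byte order mark.
const BOM = Buffer.from([0xef, 0xbb, 0xbf]);

// Return the same buffer if it already starts with the BOM,
// otherwise a new buffer with the BOM prepended.
function withUtf8Bom(buf) {
  const hasBom = buf.length >= 3 && buf.subarray(0, 3).equals(BOM);
  return hasBom ? buf : Buffer.concat([BOM, buf]);
}

// Rewrite a file in place only when it lacks the BOM.
// Returns true if the file was modified, false if it was skipped.
function fixFile(filePath) {
  const original = fs.readFileSync(filePath);
  const fixed = withUtf8Bom(original);
  if (fixed === original) return false; // already has a BOM
  fs.writeFileSync(filePath, fixed);
  return true;
}
```

Running `fixFile` twice on the same file is safe: the second call finds the BOM and leaves the file untouched.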
Technical Details
UTF-8 BOM: The byte sequence `EF BB BF` at the start of a file signals UTF-8 encoding to PowerShell and other Windows applications.
Why PowerShell needs BOM:
- Without BOM, Windows PowerShell decodes the script with the system's ANSI code page, `[System.Text.Encoding]::Default` (often CP949/CP1252)
- With BOM, PowerShell correctly identifies the file as UTF-8
- This is specific to Windows PowerShell's file reading behavior
Alternative workarounds (not recommended):
- Adding encoding commands to each script (verbose, error-prone)
- Using `-Encoding UTF8` parameters (doesn't help with file reading)
- Avoiding non-ASCII characters (limits usability)
The proper solution is to save files with UTF-8 BOM.
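As a quick illustration of where `EF BB BF` comes from, the short Node.js sketch below encodes the character U+FEFF as UTF-8 and shows how two standard decoders treat it. The sample script text is a placeholder.

```javascript
// U+FEFF encoded as UTF-8 yields the three BOM bytes.
const bomBytes = Buffer.from('\uFEFF', 'utf8');
console.log(bomBytes.toString('hex')); // efbbbf

// A BOM-prefixed file, built in memory for the example.
const fileBytes = Buffer.concat([bomBytes, Buffer.from('Write-Host "hi"', 'utf8')]);

// Buffer#toString keeps the BOM as the first character of the string.
const kept = fileBytes.toString('utf8');
console.log(kept.charCodeAt(0) === 0xfeff); // true

// TextDecoder strips a leading BOM by default.
const stripped = new TextDecoder('utf-8').decode(fileBytes);
console.log(stripped.startsWith('Write-Host')); // true
```

This matters when writing tooling around the BOM: code that reads a file into a string and writes it back has to decide explicitly whether the BOM is preserved.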