OpenCCNET 1.1.0
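- .NET CLI: dotnet add package OpenCCNET --version 1.1.0
- Package Manager: NuGet\Install-Package OpenCCNET -Version 1.1.0
- PackageReference: <PackageReference Include="OpenCCNET" Version="1.1.0" />
- Central Package Management (Directory.Packages.props): <PackageVersion Include="OpenCCNET" Version="1.1.0" />
- Central Package Management (project file): <PackageReference Include="OpenCCNET" />
- Paket CLI: paket add OpenCCNET --version 1.1.0
- Script & Interactive: #r "nuget: OpenCCNET, 1.1.0"
- File-based apps: #:package OpenCCNET@1.1.0
- Cake addin: #addin nuget:?package=OpenCCNET&version=1.1.0
- Cake tool: #tool nuget:?package=OpenCCNET&version=1.1.0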
OpenCC.NET
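Introduction
OpenCC.NET is an unofficial C# port of OpenCC (Open Chinese Convert, 开放中文转换). It supports phrase-level conversion between Simplified and Traditional Chinese, as well as conversion of regional character variants and vocabulary.
Features
- Strictly distinguishes "one simplified character to many traditional characters" from "one simplified character to many variant characters"
- Fully compatible with variant characters
- Strictly reviewed one-simplified-to-many-traditional entries, following the principle "if forms can be distinguished, do not merge them"
- Supports Hong Kong/Taiwan variant character conversion, as well as Mainland China/Taiwan common vocabulary conversion
- Fully compatible with the native OpenCC dictionaries, which can be freely modified, imported, and extended
- Supports custom segmentation logic
- Built on .NET Standard 2.0, so it supports .NET Framework 4.6.1 and later as well as .NET Core 2.0 and later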
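To illustrate why the one-to-many distinction calls for word-level conversion, here is a minimal, self-contained sketch. It does not use OpenCC.NET: the `OneToManyDemo` class and its two tiny tables are hypothetical and only show that a phrase entry can pick the right traditional form where a character-level fallback has to settle on a single target.

```csharp
using System.Collections.Generic;
using System.Text;

public static class OneToManyDemo
{
    // Hypothetical phrase table: each word fixes the correct traditional form.
    private static readonly Dictionary<string, string> Phrases = new Dictionary<string, string>
    {
        ["头发"] = "頭髮",
        ["发现"] = "發現",
        ["干燥"] = "乾燥",
        ["干部"] = "幹部",
    };

    // Hypothetical character fallback: only one target per character can be stored,
    // even though 发 may become 發 or 髮, and 干 may become 幹, 乾, or stay 干.
    private static readonly Dictionary<char, char> Chars = new Dictionary<char, char>
    {
        ['头'] = '頭',
        ['发'] = '發',
        ['现'] = '現',
        ['干'] = '幹',
    };

    // Word-level conversion: try the phrase table first, then fall back to characters.
    public static string ConvertWord(string word)
    {
        return Phrases.TryGetValue(word, out var hit) ? hit : ConvertByChar(word);
    }

    // Character-level conversion: maps each character independently.
    public static string ConvertByChar(string word)
    {
        var sb = new StringBuilder(word.Length);
        foreach (var c in word)
        {
            sb.Append(Chars.TryGetValue(c, out var t) ? t : c);
        }
        return sb.ToString();
    }
}
```

With these tables, `OneToManyDemo.ConvertByChar("头发")` produces 頭發 (the wrong form for "hair"), while `OneToManyDemo.ConvertWord("头发")` produces 頭髮.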
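Getting Started
Installation
Search for OpenCCNET on NuGet and install it, then import the OpenCCNET namespace in your code. The NuGet package ships with Dictionary (the dictionary files) and JiebaResource (the dictionaries and other data files that jieba.NET needs at runtime). By default, both are copied to the program's output directory.
Usage
Call ZhConverter.Initialize() before use. It takes four parameters, all of which have default values:
- dictionaryDirectory: path to the dictionary files (default "Dictionary")
- jiebaResourceDirectory: path to the jieba.NET resources (default "JiebaResource")
- isParallelEnabled: whether to enable parallel processing (default false)
- segmentMode: segmentation mode (default: Jieba segmentation)
// Default initialization (uses Jieba segmentation)
ZhConverter.Initialize();
// Or specify a segmentation mode (for example, OpenCC's original maximum matching algorithm)
ZhConverter.Initialize(segmentMode: SegmentMode.MaxMatch);
OpenCC.NET provides two styles of API:
ZhConverter static class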
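| Method | Description | Notes |
|---|---|---|
| HansToHant(string) | Simplified Chinese => Traditional Chinese (OpenCC standard) | |
| HansToTW(string, bool=false) | Simplified Chinese => Traditional Chinese (Taiwan) | The bool parameter controls whether to convert to vocabulary commonly used in Taiwan |
| HansToHK(string) | Simplified Chinese => Traditional Chinese (Hong Kong) | |
| HantToHans(string) | Traditional Chinese => Simplified Chinese | |
| HantToTW(string, bool=false) | Traditional Chinese => Traditional Chinese (Taiwan) | The bool parameter controls whether to convert to vocabulary commonly used in Taiwan |
| HantToHK(string) | Traditional Chinese => Traditional Chinese (Hong Kong) | |
| TWToHans(string, bool=false) | Traditional Chinese (Taiwan) => Simplified Chinese | The bool parameter controls whether to convert to vocabulary commonly used in Mainland China |
| TWToHant(string, bool=false) | Traditional Chinese (Taiwan) => Traditional Chinese (OpenCC standard) | The bool parameter controls whether to convert to vocabulary commonly used in Mainland China |
| HKToHans(string) | Traditional Chinese (Hong Kong) => Simplified Chinese | |
| HKToHant(string) | Traditional Chinese (Hong Kong) => Traditional Chinese (OpenCC standard) | |
| KyuuToShin(string) | Japanese (Kyūjitai) => Japanese (Shinjitai) | |
| ShinToKyuu(string) | Japanese (Shinjitai) => Japanese (Kyūjitai) | |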
var input = "为我的电脑换了新的内存,开启电脑后感觉看网络视频更加流畅了";
// 爲我的電腦換了新的內存,開啓電腦後感覺看網絡視頻更加流暢了
Console.WriteLine(ZhConverter.HansToHant(input));
// 為我的電腦換了新的內存,開啟電腦後感覺看網絡視頻更加流暢了
Console.WriteLine(ZhConverter.HansToTW(input));
// 為我的電腦換了新的記憶體,開啟電腦後感覺看網路影片更加流暢了
Console.WriteLine(ZhConverter.HansToTW(input, true));
// 為我的電腦換了新的內存,開啓電腦後感覺看網絡視頻更加流暢了
Console.WriteLine(ZhConverter.HansToHK(input));
// 沖繩縣內の學校
Console.WriteLine(ZhConverter.ShinToKyuu("沖縄県内の学校"));
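string extension methods
| Method | Description | Notes |
|---|---|---|
| ToHantFromHans() | Simplified Chinese => Traditional Chinese (OpenCC standard) | |
| ToTWFromHans(bool=false) | Simplified Chinese => Traditional Chinese (Taiwan) | The bool parameter controls whether to convert to vocabulary commonly used in Taiwan |
| ToHKFromHans() | Simplified Chinese => Traditional Chinese (Hong Kong) | |
| ToHansFromHant() | Traditional Chinese => Simplified Chinese | |
| ToTWFromHant(bool=false) | Traditional Chinese => Traditional Chinese (Taiwan) | The bool parameter controls whether to convert to vocabulary commonly used in Taiwan |
| ToHKFromHant() | Traditional Chinese => Traditional Chinese (Hong Kong) | |
| ToHansFromTW(bool=false) | Traditional Chinese (Taiwan) => Simplified Chinese | The bool parameter controls whether to convert to vocabulary commonly used in Mainland China |
| ToHantFromTW(bool=false) | Traditional Chinese (Taiwan) => Traditional Chinese (OpenCC standard) | The bool parameter controls whether to convert to vocabulary commonly used in Mainland China |
| ToHansFromHK() | Traditional Chinese (Hong Kong) => Simplified Chinese | |
| ToHantFromHK() | Traditional Chinese (Hong Kong) => Traditional Chinese (OpenCC standard) | |
| ToShinFromKyuu() | Japanese (Kyūjitai) => Japanese (Shinjitai) | |
| ToKyuuFromShin() | Japanese (Shinjitai) => Japanese (Kyūjitai) | |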
var input = "為我的電腦換了新的記憶體,開啟電腦後感覺看網路影片更加流暢了";
// 爲我的電腦換了新的記憶體,開啓電腦後感覺看網路影片更加流暢了
Console.WriteLine(input.ToHantFromTW());
// 为我的电脑换了新的记忆体,开启电脑后感觉看网路影片更加流畅了
Console.WriteLine(input.ToHansFromTW());
// 为我的电脑换了新的内存,打开电脑后感觉看网络视频更加流畅了
Console.WriteLine(input.ToHansFromTW(true));
// 独逸連邦共和国
Console.WriteLine("獨逸聯邦共和國".ToShinFromKyuu());
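Customization
Segmentation modes
OpenCC.NET supports three segmentation modes, which can be switched as needed:
1. Jieba segmentation mode (Jieba), the default
Uses jieba.NET for Chinese word segmentation. The jieba.NET resource path defaults to "JiebaResource" and can be set to another location.
// Specify at initialization
ZhConverter.Initialize(segmentMode: SegmentMode.Jieba);
// Or switch at runtime
ZhConverter.ZhSegment.SetMode(SegmentMode.Jieba);
2. Maximum matching mode (MaxMatch)
Uses OpenCC's original maximum matching segmentation algorithm.
// Specify at initialization
ZhConverter.Initialize(segmentMode: SegmentMode.MaxMatch);
// Or switch at runtime
ZhConverter.ZhSegment.SetMode(SegmentMode.MaxMatch);
3. Custom segmentation mode (Custom)
Uses a user-defined segmentation algorithm: the text is segmented once, and the segmentation result is reused throughout the conversion chain.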
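// Option 1: assign the segmentation delegate directly (switches to Custom mode automatically; compatible with older versions)
ZhConverter.ZhSegment.Segment = input =>
{
// Custom segmentation logic, for example splitting on spaces.
// The char[] overload of Split is available on every supported target framework.
return input.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
};
// Option 2: use the SetCustomSegment method
ZhConverter.ZhSegment.SetCustomSegment(input =>
{
// Custom segmentation logic, for example splitting into single characters (requires using System.Linq)
return input.Select(c => c.ToString());
});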
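As a slightly more realistic example of custom segmentation logic, the sketch below builds a forward maximum matching segmenter over a user-supplied word list. It is self-contained and does not reference OpenCC.NET. The `VocabularySegmenter` class and its `Create` method are hypothetical names introduced here, and the sketch assumes, as the lambdas above suggest, that the segmentation delegate takes a string and returns a sequence of strings. It is an illustration of the general technique, not the implementation behind SegmentMode.MaxMatch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VocabularySegmenter
{
    // Hypothetical helper: builds a segmenter from a user-supplied word list.
    public static Func<string, IEnumerable<string>> Create(IEnumerable<string> words)
    {
        var vocabulary = new HashSet<string>(words.Where(w => !string.IsNullOrEmpty(w)));
        var maxLength = vocabulary.Count == 0 ? 1 : vocabulary.Max(w => w.Length);
        return input => Segment(input, vocabulary, maxLength);
    }

    // Forward maximum matching: at each position, take the longest vocabulary word
    // that starts there; if none matches, emit a single UTF-16 code unit.
    private static IEnumerable<string> Segment(string input, HashSet<string> vocabulary, int maxLength)
    {
        var result = new List<string>();
        if (string.IsNullOrEmpty(input))
        {
            return result;
        }

        var position = 0;
        while (position < input.Length)
        {
            var length = Math.Min(maxLength, input.Length - position);
            // Shrink the window until a vocabulary word matches or one character remains.
            while (length > 1 && !vocabulary.Contains(input.Substring(position, length)))
            {
                length--;
            }
            result.Add(input.Substring(position, length));
            position += length;
        }
        return result;
    }
}
```

Under the assumption above, it could be plugged in with `var segment = VocabularySegmenter.Create(new[] { "内存", "网络视频" });` followed by `ZhConverter.ZhSegment.SetCustomSegment(input => segment(input));`. Because the sketch works on UTF-16 code units, characters outside the Basic Multilingual Plane would need extra handling.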
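Customizing Jieba segmentation
By default, OpenCC.NET uses jieba.NET for segmentation, through a static JiebaSegmenter defined in the project:
public static JiebaSegmenter Jieba = new JiebaSegmenter();
You can therefore customize it through ZhConverter.ZhSegment.Jieba; see jieba.NET for details.
Calling ResetSegment() switches back to Jieba segmentation and resets the Jieba parameters.
Parallel processing
When converting large amounts of text, you can enable parallel processing to improve performance:
// Enable at initialization
ZhConverter.Initialize(isParallelEnabled: true);
// Or set at runtime
ZhConverter.IsParallelEnabled = true;
References
OpenCC
BYVoid/OpenCC provides the dictionaries.
jieba.NET
anderscui/jieba.NET provides the segmentation functionality.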
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
| .NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen40 was computed. tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
Dependencies
- .NETStandard 2.0
  - jieba.NET (>= 0.42.2)
| Version | Downloads | Last Updated | |
|---|---|---|---|
| 1.1.0 | 263 | 11/15/2025 | |
| 1.0.3 | 4,126 | 3/21/2025 | |
| 1.0.2 | 139,402 | 5/5/2022 | |
| 1.0.1 | 628 | 4/26/2022 | |
| 1.0.0 | 616 | 4/25/2022 | |
| 0.2.2 | 620 | 1/18/2022 | |
| 0.2.1 | 692 | 1/18/2022 | |
| 0.2.0 | 777 | 12/3/2021 | |
| 0.1.4 | 449 | 11/29/2021 | |
| 0.1.3 | 421 | 11/23/2021 | |
| 0.1.2 | 406 | 11/23/2021 | |
| 0.1.1 | 1,214 | 6/10/2021 | |
| 0.1.0 | 515 | 6/10/2021 | |