ptr727.LanguageTags 1.0.46


LanguageTags

C# .NET library for ISO 639-2, ISO 639-3, RFC 5646 / BCP 47 language tags.

Build Status

Code and pipeline are on GitHub.

NuGet Package

Packages published on NuGet

Introduction

This project serves two primary purposes:

  • Publishing ISO 639-2, ISO 639-3, and RFC 5646 language tag records in JSON and C# format.
  • Code for IETF BCP 47 language tag construction and parsing per the RFC 5646 semantic rules.

Terminology clarification:

  • An IETF BCP 47 language tag is a standardized code that is used to identify human languages on the Internet.
  • The tag structure is standardized by the Internet Engineering Task Force (IETF) in Best Current Practice (BCP) 47.
  • RFC 5646 defines the BCP 47 language tag syntax and semantic rules.
  • The subtags are maintained in the Internet Assigned Numbers Authority (IANA) Language Subtag Registry.
  • ISO 639 is a standard for classifying languages and language groups, and is maintained by the International Organization for Standardization (ISO).
  • RFC 5646 incorporates ISO 639, ISO 15924, ISO 3166, and UN M.49 codes as the foundation for its language tags.

Note that the implemented language tag parsing and normalization logic may be incomplete or inaccurate.

Refer to Language Tag Libraries for other known implementations.
Refer to References for specification details.

Build Artifacts

The build tool downloads language tag data files, converts them into JSON files for easy consumption, and generates C# classes with all the tags for direct use in code.

The data files are updated weekly using a scheduled actions job.

Usage

Tag Format

Refer to RFC 5646 Section 2.1 for complete language tag syntax and rules.

IETF language tags are constructed from subtags in the form of:

[Language]-[Extended language]-[Script]-[Region]-[Variant]-[Extension]-[Private Use]

Examples:

  • zh: [Language]
  • zh-yue: [Language]-[Extended language]
  • zh-yue-hk: [Language]-[Extended language]-[Region]
  • hy-latn-it-arevela: [Language]-[Script]-[Region]-[Variant]
  • en-a-bbb-x-a-ccc: [Language]-[Extension]-[Private Use]
  • en-latn-gb-boont-r-extended-sequence-x-private: [Language]-[Script]-[Region]-[Variant]-[Extension]-[Private Use]
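
The labelling used in the examples above can be approximated from subtag shape and position alone. The following is a minimal sketch under that assumption, not the library's parser: DescribeTag is a hypothetical helper, it performs no validation or registry lookup, and it ignores grandfathered tags.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the library: label subtags by shape and position only.
static string DescribeTag(string tag)
{
    var labels = new List<string>();
    string group = ""; // Becomes "Extension" or "Private Use" once a singleton is seen
    bool extLangOk = false, scriptOk = false, regionOk = false;

    foreach (string subtag in tag.Split('-'))
    {
        bool alpha = subtag.All(char.IsAsciiLetter);
        bool digits = subtag.All(char.IsAsciiDigit);
        string label;

        if (group == "Private Use")
            label = group; // Everything after the x singleton is private use
        else if (subtag.Length == 1)
            label = group = subtag.Equals("x", StringComparison.OrdinalIgnoreCase) ? "Private Use" : "Extension";
        else if (group == "Extension")
            label = group;
        else if (labels.Count == 0)
        {
            label = "Language";
            extLangOk = subtag.Length <= 3; // Extended language only follows a 2-3 letter language
            scriptOk = regionOk = true;
        }
        else if (extLangOk && alpha && subtag.Length == 3)
            label = "Extended language";
        else if (scriptOk && alpha && subtag.Length == 4)
            label = "Script";
        else if (regionOk && ((alpha && subtag.Length == 2) || (digits && subtag.Length == 3)))
            label = "Region";
        else
            label = "Variant";

        // A later position closes the earlier ones
        if (label != "Language" && label != "Extended language") extLangOk = false;
        if (label is "Script" or "Region" or "Variant") scriptOk = false;
        if (label is "Region" or "Variant") regionOk = false;

        // Collapse runs so an extension or private use group gets a single label
        if (labels.Count == 0 || labels[^1] != label) labels.Add(label);
    }

    return string.Join("-", labels.Select(l => $"[{l}]"));
}

Console.WriteLine(DescribeTag("zh-yue-hk"));          // [Language]-[Extended language]-[Region]
Console.WriteLine(DescribeTag("hy-latn-it-arevela")); // [Language]-[Script]-[Region]-[Variant]
Console.WriteLine(DescribeTag("en-a-bbb-x-a-ccc"));   // [Language]-[Extension]-[Private Use]
```

Because the sketch only looks at length and character class, a malformed tag still receives labels; use the library's parser and Validate() for real input.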

Tag Lookup

Tag records can be constructed by calling Create(), loaded from data files with LoadData(), or loaded from JSON with LoadJson(). The records and record collections are immutable and can safely be reused and shared across threads.

Each class implements a Find(string languageTag, bool includeDescription) method that will search all tags in all records for a matching tag.
This is mostly a convenience function and specific use cases should use specific tags.

// ISO 639-2: search tags only, then tags and descriptions
Iso6392Data iso6392 = Iso6392Data.Create();
Iso6392Data.Record record = iso6392.Find("afr", false);
// record.Part2B = afr
// record.RefName = Afrikaans
record = iso6392.Find("zulu", true);
// record.Part2B = zul
// record.RefName = Zulu

// ISO 639-3: loaded from a data file
Iso6393Data iso6393 = Iso6393Data.LoadData("iso6393");
Iso6393Data.Record record = iso6393.Find("zh", false);
// record.Id = zho
// record.Part1 = zh
// record.RefName = Chinese
record = iso6393.Find("yue chinese", true);
// record.Id = yue
// record.RefName = Yue Chinese

// RFC 5646: loaded from a JSON file
Rfc5646 rfc5646 = Rfc5646.LoadJson("rfc5646.json");
Rfc5646.Record record = rfc5646.Find("de", false);
// record.SubTag = de
// record.Description = German
record = rfc5646.Find("zh-cmn-Hant", false);
// record.Tag = zh-cmn-Hant
// record.Description = Mandarin Chinese (Traditional)
record = rfc5646.Find("Inuktitut in Canadian", true);
// record.Tag = iu-Cans
// record.Description = Inuktitut in Canadian Aboriginal Syllabic script

Tag Conversion

Tags can be converted between ISO 639 and IETF forms using GetIetfFromIso() and GetIsoFromIetf().
Tag lookup will use the user-defined Overrides map, then the tag record lists, then the local system CultureInfo.
If a match is not found, the undetermined und tag will be returned.

LanguageLookup languageLookup = new();
languageLookup.GetIetfFromIso("afr"); // af
languageLookup.GetIetfFromIso("zho"); // zh

LanguageLookup languageLookup = new();
languageLookup.GetIsoFromIetf("af"); // afr
languageLookup.GetIsoFromIetf("zh-cmn-Hant"); // chi
languageLookup.GetIsoFromIetf("cmn-Hant"); // chi
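
The lookup order described above (overrides, then records, then CultureInfo, then und) can be sketched as a simple fallback chain. This is an illustration only, not the library's implementation: the two dictionaries and the GetIetfFromIsoSketch name are hypothetical stand-ins, and the CultureInfo step depends on the cultures available on the local system.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-ins for the user override map and the ISO 639 tag records
var overrides = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { ["chi"] = "zh" };
var records = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase) { ["afr"] = "af", ["zho"] = "zh" };

string GetIetfFromIsoSketch(string iso)
{
    // 1. User overrides win
    if (overrides.TryGetValue(iso, out var tag)) return tag;

    // 2. Then the tag records
    if (records.TryGetValue(iso, out tag)) return tag;

    // 3. Then whatever the local system knows about
    foreach (CultureInfo culture in CultureInfo.GetCultures(CultureTypes.NeutralCultures))
    {
        if (culture.ThreeLetterISOLanguageName.Equals(iso, StringComparison.OrdinalIgnoreCase))
            return culture.Name;
    }

    // 4. No match: undetermined
    return "und";
}

Console.WriteLine(GetIetfFromIsoSketch("afr")); // af
Console.WriteLine(GetIetfFromIsoSketch("chi")); // zh
Console.WriteLine(GetIetfFromIsoSketch("qqq")); // und
```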

Tag Matching

Tag matching can be used to select content based on preferred vs. available languages.
E.g. the HTTP Accept-Language and Content-Language headers, or the Matroska media stream LanguageIETF element.

IETF language tags are in the form of [Language]-[Extended language]-[Script]-[Region]-[Variant]-[Extension]-[Private Use], and sub-tag matching happens left to right until a match is found.

Examples:

  • pt will match pt Portuguese, or pt-BR Brazilian Portuguese, or pt-PT European Portuguese.
  • pt-BR will only match pt-BR Brazilian Portuguese.
  • zh will match zh Chinese, or zh-Hans simplified Chinese, or zh-Hant traditional Chinese, and other variants.
  • zh-Hans will only match zh-Hans simplified Chinese.

LanguageLookup languageLookup = new();
languageLookup.IsMatch("en", "en-US"); // true
languageLookup.IsMatch("zh", "zh-cmn-Hant"); // true
languageLookup.IsMatch("sr-Latn", "sr-Latn-RS"); // true
languageLookup.IsMatch("zha", "zh-Hans"); // false
languageLookup.IsMatch("zh-Hant", "zh-Hans"); // false
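
The left-to-right matching described above behaves like a case-insensitive prefix comparison on whole subtags. A minimal sketch, assuming that reading; IsPrefixMatch is a hypothetical helper and may differ from what IsMatch() does internally:

```csharp
using System;
using System.Linq;

// Hypothetical helper: the prefix matches if it equals the tag,
// or if the tag continues with a subtag separator right after the prefix.
static bool IsPrefixMatch(string prefix, string tag) =>
    tag.Equals(prefix, StringComparison.OrdinalIgnoreCase)
    || tag.StartsWith(prefix + "-", StringComparison.OrdinalIgnoreCase);

Console.WriteLine(IsPrefixMatch("en", "en-US"));        // True
Console.WriteLine(IsPrefixMatch("zha", "zh-Hans"));     // False
Console.WriteLine(IsPrefixMatch("zh-Hant", "zh-Hans")); // False

// Pick the first available stream that satisfies a preferred language
var available = new[] { "pt-BR", "zh-Hans", "en-US" };
var selected = available.FirstOrDefault(tag => IsPrefixMatch("zh", tag));
Console.WriteLine(selected); // zh-Hans
```

Appending the separator before comparing is what keeps zha from matching zh-Hans.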

Tag Builder

The LanguageTagBuilder class supports fluent builder style tag construction, and will return a constructed LanguageTag class through the final Build() or Normalize() methods.

The Build() method will construct the tag, but will not perform any correctness validation or normalization.
Use the Validate() method to test for shape correctness. See Tag Validation for details.

The Normalize() method will build the tag and perform validation and normalization. See Tag Normalization for details.

LanguageTag languageTag = new LanguageTagBuilder()
    .Language("en")
    .Script("latn")
    .Region("gb")
    .Variant("boont")
    .ExtensionsPrefix('r')
    .ExtensionsAdd("extended")
    .ExtensionsAdd("sequence")
    .PrivateUseAdd("private")
    .Build();
languageTag.ToString(); // en-latn-gb-boont-r-extended-sequence-x-private

LanguageTag languageTag = new LanguageTagBuilder()
    .PrivateUseAddRange(["private", "use"])
    .Build();
languageTag.ToString(); // x-private-use

LanguageTag languageTag = new LanguageTagBuilder()
    .Language("ar")
    .ExtendedLanguage("arb")
    .Script("latn")
    .Region("de")
    .Variant("nedis")
    .Normalize();
languageTag.ToString(); // arb-Latn-DE-nedis

Tag Parser

The LanguageTagParser class Parse() method will parse the text form language tag and return a constructed LanguageTag class, or null in case of parsing failure.

Parsing will validate all subtags for correctness in type, length, and position, but not value, and case will not be modified.

Grandfathered tags will be converted to their current preferred form and parsed as such.
E.g. en-gb-oed -> en-GB-oxendict, i-klingon -> tlh.

The Normalize() method will parse the text tag, and perform validation and normalization. See Tag Normalization for details.

LanguageTag languageTag = new LanguageTagParser()
    .Parse("en-latn-gb-boont-r-extended-sequence-x-private");
// languageTag.Language = en
// languageTag.Script = latn
// languageTag.Region = gb
// languageTag.VariantList = [ boont ]
// languageTag.ExtensionList = [ Prefix: r, TagList: [ extended, sequence ] ]
// languageTag.PrivateUse = [ Prefix: x, TagList: [ private ] ]
languageTag.ToString(); // en-latn-gb-boont-r-extended-sequence-x-private

LanguageTag languageTag = new LanguageTagParser()
    .Parse("en-gb-oed"); // Grandfathered
// languageTag.Language = en
// languageTag.Region = gb
// languageTag.VariantList = [ oxendict ]
languageTag.ToString(); // en-gb-oxendict

Tag Normalization

The LanguageTagParser class Normalize() method will convert tags to their canonical form.
See RFC 5646 Section 4.5 for details.

Normalization includes the following:

  • Replace language subtags with their preferred values.
    • E.g. iw -> he, in -> id
  • Replace extended language subtags with their preferred language subtag values.
    • E.g. ar-afb -> afb, zh-yue -> yue
  • Remove or replace redundant tags with their preferred values.
    • E.g. zh-cmn-Hant -> cmn-Hant, zh-gan -> gan, sgn-CO -> csn
  • Remove redundant script subtags.
    • E.g. af-Latn -> af, en-Latn -> en
  • Normalize case.
    • All subtags lowercase.
    • Script title case, e.g. Latn.
    • Region uppercase, e.g. GB.
  • Sort subtags.
    • Sort variant subtags by value.
    • Sort extension subtags by prefix and subtag values.
    • Sort private use subtags by value.

LanguageTag languageTag = new LanguageTagBuilder()
    .Language("en")
    .ExtensionAdd('b', ["ccc"]) // Add b before a to force a sort
    .ExtensionAdd('a', ["bbb", "aaa"]) // Add bbb before aaa to force a sort
    .PrivateUseAddRange(["ccc", "a"]) // Add ccc before a to force a sort
    .Normalize();
languageTag.ToString(); // en-a-aaa-bbb-b-ccc-x-a-ccc

LanguageTag languageTag = new LanguageTagParser()
    .Normalize("en-latn-gb-boont-r-sequence-extended-x-private");
languageTag.ToString(); // en-GB-boont-r-extended-sequence-x-private

LanguageTag languageTag = new LanguageTagParser()
    .Parse("ar-arb-latn-de-nedis-foobar");
languageTag.ToString(); // ar-arb-latn-de-nedis-foobar

LanguageTag normalizeTag = new LanguageTagParser()
    .Normalize(languageTag);
normalizeTag.ToString(); // arb-Latn-DE-foobar-nedis
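
The case rules listed above follow the RFC 5646 Section 2.1.1 convention: everything is lowercase, except two-letter and four-letter subtags that are neither first in the tag nor placed after a singleton. A minimal sketch of the case step only, assuming a well-formed tag; NormalizeCase is a hypothetical helper and does none of the replacement, removal, or sorting steps:

```csharp
using System;
using System.Linq;

// Hypothetical helper: applies only the case conventions, nothing else.
static string NormalizeCase(string tag)
{
    string[] parts = tag.ToLowerInvariant().Split('-');
    for (int i = 0; i < parts.Length; i++)
    {
        string part = parts[i];

        // A singleton starts an extension or private use group; the rest stays lowercase
        if (part.Length == 1) break;

        // The first subtag is the language, and digits have no case
        if (i == 0 || !part.All(char.IsAsciiLetter)) continue;

        if (part.Length == 4)
            parts[i] = char.ToUpperInvariant(part[0]) + part[1..]; // Script: title case
        else if (part.Length == 2)
            parts[i] = part.ToUpperInvariant();                    // Region: uppercase
    }
    return string.Join("-", parts);
}

Console.WriteLine(NormalizeCase("EN-LATN-gb-Boont-R-Extended-X-Private"));
// en-Latn-GB-boont-r-extended-x-private
```

Unlike the library's Normalize(), this sketch keeps the Latn script subtag, because suppressing redundant scripts needs registry data.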

Tag Validation

The Validate() method of the LanguageTagParser and LanguageTag classes will verify subtags for correctness.
See RFC 5646 Section 2.1 and RFC 5646 Section 2.2.9 for details. Refer to Tag Format for a summary.

Note that LanguageTag objects created by Parse() or Normalize() are already verified for form correctness during parsing, and Validate() is primarily of use when using the LanguageTagBuilder Build() method directly.

Validation includes the following:

  • Subtag shape correctness, see Tag Format for a summary.
  • No duplicate variants, extension prefixes, extension tags, or private tags.
  • No missing subtags.
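
As an illustration of the first two checks, the sketch below tests a list of variant subtags for shape and for duplicates. It assumes the RFC 5646 Section 2.1 variant shape (5 to 8 alphanumeric characters, or 4 characters starting with a digit); the helper names are hypothetical and this is not the library's Validate() implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: variant shape per RFC 5646 Section 2.1
static bool IsVariantShape(string subtag) =>
    subtag.All(char.IsAsciiLetterOrDigit)
    && ((subtag.Length >= 5 && subtag.Length <= 8)
        || (subtag.Length == 4 && char.IsAsciiDigit(subtag[0])));

// Hypothetical helper: subtags are compared case-insensitively
static bool HasDuplicates(IReadOnlyList<string> subtags) =>
    subtags.Distinct(StringComparer.OrdinalIgnoreCase).Count() != subtags.Count;

static bool ValidateVariants(IReadOnlyList<string> variants) =>
    variants.All(IsVariantShape) && !HasDuplicates(variants);

Console.WriteLine(ValidateVariants(new[] { "boont", "1996" }));  // True
Console.WriteLine(ValidateVariants(new[] { "boont", "BOONT" })); // False, duplicate
Console.WriteLine(ValidateVariants(new[] { "abcd" }));           // False, wrong shape
```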

Testing

The BCP47 language subtag lookup site offers convenient tag parsing and validation capabilities.

Refer to unit tests for code validation.
Note that testing attests to the desired behavior in code, but the implemented functionality may not be complete or accurate per the RFC 5646 specification.

References

Language Tag Libraries

3rd Party Tools

License

Licensed under the MIT License

Target Frameworks

  • net9.0 is included in the package.
  • The net9.0 platform targets (android, browser, ios, maccatalyst, macos, tvos, windows), net10.0, and the same net10.0 platform targets are computed as compatible.

Dependencies

  • net9.0: No dependencies.


Version Downloads Last Updated
1.0.46 175 8/14/2025
1.0.45-g3185f6be67 122 8/14/2025
1.0.44 179 8/11/2025
1.0.43-gcfcbaf6aaa 121 8/11/2025
1.0.42-gb52ba3d4ff 119 8/11/2025
1.0.40 268 7/26/2025
1.0.39-g88d004516d 249 7/26/2025
1.0.38-gf999e9bb85 253 7/26/2025
1.0.37-g9ecfb29655 22 7/19/2025
1.0.35 120 7/17/2025
1.0.35-g93861ef687 114 7/17/2025
1.0.34-gd7ad864d88 112 7/17/2025
1.0.33-g56e9a89989 108 7/17/2025
1.0.32 114 7/17/2025
1.0.32-g8305691a24 109 7/17/2025
1.0.31-gc13a26c4f6 109 7/17/2025
1.0.29-gd8e72ab853 112 7/17/2025
1.0.26 179 7/15/2025
1.0.24 263 7/11/2025
1.0.23-gbb8ffe5bb2 130 7/11/2025
1.0.22-ga1ac78f1df 133 7/11/2025
1.0.21 161 6/30/2025
1.0.21-g9302ac7f90 134 7/10/2025
1.0.20-gd3feb5712d 135 6/30/2025
1.0.19 174 6/23/2025
1.0.19-gde8fbe4b5f 133 6/30/2025
1.0.18-g54f8783ae2 137 6/23/2025
1.0.17-g7df622bb81 134 6/23/2025
1.0.16-g81e118866b 133 6/16/2025
1.0.15-g53ab3f4809 125 6/15/2025
1.0.14 158 6/15/2025
1.0.13 132 6/15/2025
1.0.13-g17063ec750 133 6/15/2025
1.0.12-g52f15572a7 131 6/15/2025
1.0.11-g70c3a3028c 127 6/15/2025
1.0.10-g22cc5cba88 142 6/14/2025
1.0.9-g7483a8ee5e 141 6/14/2025
1.0.8-gcb3125b986 142 6/14/2025
1.0.7-g3d65ad190a 116 6/14/2025
1.0.5-g78c1403494 119 6/14/2025
1.0.4-g5f05a80e82 147 6/14/2025
1.0.3-g7b075aa3eb 152 6/14/2025
1.0.2 148 6/14/2025
1.0.1 124 6/14/2025
1.0.0-pre 147 6/14/2025