Gzip.TextClassifier 1.0.1

.NET CLI:
dotnet add package Gzip.TextClassifier --version 1.0.1

Package Manager:
NuGet\Install-Package Gzip.TextClassifier -Version 1.0.1
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="Gzip.TextClassifier" Version="1.0.1" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add Gzip.TextClassifier --version 1.0.1

Script & Interactive:
#r "nuget: Gzip.TextClassifier, 1.0.1"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

Cake:
// Install Gzip.TextClassifier as a Cake Addin
#addin nuget:?package=Gzip.TextClassifier&version=1.0.1

// Install Gzip.TextClassifier as a Cake Tool
#tool nuget:?package=Gzip.TextClassifier&version=1.0.1

Gzip Text Classifier

This project is a C# implementation of the text classification algorithm that combines Normalized Compression Distance (NCD), computed with Gzip compression, with k-nearest-neighbor search. The algorithm is described in this paper.
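For orientation, NCD between two texts x and y is usually defined as (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(t) is the compressed length of t. The following is a minimal sketch of that formula using the standard GZipStream class. It is not taken from this package's source: the NcdSketch class, its method names, the UTF-8 encoding, and joining the two texts with a single space are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, not part of Gzip.TextClassifier.
public static class NcdSketch
{
    // Length in bytes of the gzip-compressed UTF-8 encoding of the text.
    public static int CompressedLength(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using var buffer = new MemoryStream();
        // leaveOpen keeps the MemoryStream usable after the gzip stream is
        // disposed; disposing flushes the final compressed bytes.
        using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
        {
            gzip.Write(raw, 0, raw.Length);
        }
        return (int)buffer.Length;
    }

    // NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)).
    // Assumption: the two texts are concatenated with a single space.
    public static double Distance(string x, string y)
    {
        int cx = CompressedLength(x);
        int cy = CompressedLength(y);
        int cxy = CompressedLength(x + " " + y);
        return (double)(cxy - Math.Min(cx, cy)) / Math.Max(cx, cy);
    }
}
```

Texts that share a lot of content compress well together, so their distance is close to 0, while unrelated texts give a value close to 1.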

The original repository, written in Python, can be found at this link.

Also available as a NuGet package.

Usage

Predict a CSV test file:


string trainFile = @"C:\Users\Chus\Downloads\ag_news_train.csv";
string testFile = @"C:\Users\Chus\Downloads\ag_news_test.csv";

GzipClassifierOptions gzipClassifierOptions = new()
{
    TrainFile = trainFile,          // Path to the CSV training file
    ParallelismOnCalc = true,       // Use parallelism for the distance calculation. Default: true
    ParallelismOnTestFile = false,  // Use parallelism for each test. Default: false
    K = 2,                          // Value of K in k-nearest-neighbor. Default: 3
    TextColumn = 0,                 // Text column number in the CSV file. Default: 0
    LabelColumn = 1,                // Label column number in the CSV file. Default: 1
    HasHeaderRecord = true,         // The CSV file has a header record. Default: true
    ConsoleOutput = true,           // Print output to the console during file prediction. Default: true
};

GzipClassifier gzipClassifier = new(gzipClassifierOptions);
double result = gzipClassifier.PredictFile(testFile);
Console.WriteLine(result);

Single text prediction:

string text = "Socialites unite dolphin groups Dolphin groups, or \"pods\", rely on socialites to keep them from collapsing, scientists claim.";
var prediction = gzipClassifier.Predict(text);
Console.WriteLine(prediction);
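
To illustrate what the K option controls, here is a minimal sketch of a k-nearest-neighbor vote over NCD distances. It does not reproduce the package's internals: the KnnSketch class, the tuple-based training set, and the tie-breaking rule (the label of the closest neighbor wins) are assumptions chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of Gzip.TextClassifier.
public static class KnnSketch
{
    // Length in bytes of the gzip-compressed UTF-8 encoding of the text.
    static int CompressedLength(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using var buffer = new MemoryStream();
        using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
        {
            gzip.Write(raw, 0, raw.Length);
        }
        return (int)buffer.Length;
    }

    // Normalized Compression Distance; texts are joined with a space (assumption).
    static double Ncd(string x, string y)
    {
        int cx = CompressedLength(x);
        int cy = CompressedLength(y);
        int cxy = CompressedLength(x + " " + y);
        return (double)(cxy - Math.Min(cx, cy)) / Math.Max(cx, cy);
    }

    // Returns the most frequent label among the k training samples closest to text.
    // Throws InvalidOperationException if the training set is empty.
    public static string Predict(
        string text,
        IReadOnlyList<(string Text, string Label)> train,
        int k)
    {
        return train
            .OrderBy(sample => Ncd(text, sample.Text))   // nearest first
            .Take(k)                                      // keep the k nearest
            .GroupBy(sample => sample.Label)              // vote by label
            .OrderByDescending(group => group.Count())    // stable sort: ties keep nearest-first order
            .First()
            .Key;
    }
}
```

With an even K such as 2, the two nearest neighbors can disagree, so some tie-breaking rule is needed. This sketch relies on LINQ's stable ordering to prefer the label of the closest neighbor.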

Compatible and additional computed target framework versions

.NET: net6.0 is compatible. The following were computed: net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows.

NuGet packages

This package is not used by any NuGet packages.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version  Downloads  Last updated
1.0.1    162        7/19/2023
1.0.0    126        7/18/2023