Toxy is a .NET data and text extraction framework, similar to Apache Tika for Java. It supports many popular formats, including docx, xlsx, xls, pdf, csv, txt, epub, and html.
Install-Package Toxy -Version 1.6.1
dotnet add package Toxy --version 1.6.1
<PackageReference Include="Toxy" Version="1.6.1" />
paket add Toxy --version 1.6.1
#r "nuget: Toxy, 1.6.1"
// Install Toxy as a Cake Addin
#addin nuget:?package=Toxy&version=1.6.1
// Install Toxy as a Cake Tool
#tool nuget:?package=Toxy&version=1.6.1
1. Update the PDF extraction license to a commercial one (TextSharp license)
2. Support .msg file extraction (Windows only)
3. Support RTF extraction with HTML content
1. Fix PDF extraction issue
2. Fix some Word extraction issues
3. Excel and Word document streams are not closed after being opened by WorkbookFactory
This package is not used by any NuGet packages.
GitHub repositories (1)
Showing the top popular GitHub repository that depends on Toxy:
.NET-based web crawler