TikaOnDotNet 1.17.1
dotnet add package TikaOnDotNet --version 1.17.1
NuGet\Install-Package TikaOnDotNet -Version 1.17.1
<PackageReference Include="TikaOnDotNet" Version="1.17.1" />
paket add TikaOnDotNet --version 1.17.1
#r "nuget: TikaOnDotNet, 1.17.1"
// Install TikaOnDotNet as a Cake Addin
#addin nuget:?package=TikaOnDotNet&version=1.17.1
// Install TikaOnDotNet as a Cake Tool
#tool nuget:?package=TikaOnDotNet&version=1.17.1
Bare-bones IKVM Java-to-.NET port of Apache Tika. You'll want to install TikaOnDotNet.TextExtractor.
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET Framework | net is compatible. |
- IKVM (>= 8.1.5717)
NuGet packages (4)
Showing the top 4 NuGet packages that depend on TikaOnDotNet:
Package | Description |
---|---|
TikaOnDotnet.TextExtractor | Classes for running Apache Tika through TikaOnDotNet. Just use TextExtractor.Extract() and you'll be on your way. |
DevelopmentHelpers.FileContentReader | Combines several open-source packages behind one interface for reading many types of content files; for example, it uses Open XML to read .docx files. |
Skybrud.Umbraco.Search.DocumentIndexer | Makes it possible to index and search a wide variety of file types in Umbraco, including .pdf and .docx. |
Jetsons.JetPack.Text | A wrapper library that provides smart extension methods to convert document formats to high-quality text. |
GitHub repositories (1)
Showing the top popular GitHub repository that depends on TikaOnDotNet:
Repository | Description |
---|---|
vivami/SauronEye | Search tool to find specific files containing specific words, i.e. files containing passwords. |
Version | Downloads | Last updated |
---|---|---|
1.17.1 | 547,168 | 4/3/2018 |
1.17.0 | 40,671 | 2/15/2018 |
1.16.0 | 171,846 | 7/30/2017 |
1.15.0 | 14,930 | 7/30/2017 |
1.14.2 | 123,984 | 4/22/2017 |
1.14.2-pre | 4,761 | 4/15/2017 |
1.14.1 | 329,453 | 1/13/2017 |
1.14.0 | 10,042 | 12/8/2016 |
1.13.1 | 12,541 | 8/16/2016 |
1.13.0 | 8,735 | 6/30/2016 |
1.12.2 | 43,786 | 4/12/2016 |
1.12.1 | 7,630 | 4/12/2016 |
1.12.0 | 9,223 | 4/11/2016 |
1.7.0 | 20,201 | 2/6/2015 |
1.6.4.51427 | 8,139 | 1/16/2015 |
1.6.3 | 8,768 | 9/27/2014 |
1.6.2.1 | 6,616 | 6/5/2014 |
1.6.0 | 4,072 | 6/5/2014 |
1.5.2 | 3,819 | 5/30/2014 |
1.5.0 | 4,546 | 3/5/2014 |
1.4.0.51459 | 5,230 | 7/12/2013 |
- Added new overloads of `TextExtractor.Extract` that let users provide their own extraction result assemblers. Example:
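```cs
using System.Collections.Generic;
using System.Linq;
using FluentAssertions;
using NUnit.Framework;
using org.apache.tika.metadata;
using TikaOnDotNet.TextExtraction;

public class CustomResult
{
    public string Text { get; set; }
    public IDictionary<string, string[]> Metadata { get; set; }
}

// The fixture class name is illustrative; the original snippet showed only the members.
[TestFixture]
public class CustomResultTests
{
    // Assembler passed to Extract: builds a CustomResult from the extracted text and the Tika metadata.
    public static CustomResult CreateCustomResult(string text, Metadata metadata)
    {
        var metaDataDictionary = metadata.names().ToDictionary(name => name, metadata.getValues);
        return new CustomResult
        {
            Metadata = metaDataDictionary,
            Text = text,
        };
    }

    [Test]
    public void should_extract_author_list_from_pdf()
    {
        var textExtractionResult = new TextExtractor().Extract("file_with_authors.pdf", CreateCustomResult);
        textExtractionResult.Metadata["meta:author"].Should().ContainInOrder("Fred Jones, M. D.", "Donald Evans D. M.");
    }
}
```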