GroupDocs.Parser
24.2.0
There is a newer version of this package available. See the version list below for details.
dotnet add package GroupDocs.Parser --version 24.2.0
NuGet\Install-Package GroupDocs.Parser -Version 24.2.0
<PackageReference Include="GroupDocs.Parser" Version="24.2.0" />
paket add GroupDocs.Parser --version 24.2.0
#r "nuget: GroupDocs.Parser, 24.2.0"
// Install GroupDocs.Parser as a Cake Addin
#addin nuget:?package=GroupDocs.Parser&version=24.2.0

// Install GroupDocs.Parser as a Cake Tool
#tool nuget:?package=GroupDocs.Parser&version=24.2.0
Document Parser .NET API
Product Page | Docs | Demos | API Reference | Examples | Blog | Releases | Free Support | Temporary License
This on-premise text parser API searches and extracts both formatted and raw text from documents in a wide range of supported file formats.
Document Parser Processing Features
- Parse documents by user-defined templates.
- Extract plain and structured text.
- Extract text areas with coordinates, text styles, and other information.
- Search text by a keyword or regular expression; extract text around that word.
- Extract HTML or Markdown (MD) formatted text for a fast preview.
- Increase performance by extracting raw text.
- Extract formatted text, metadata, images, containers, and attachments.
- Extract table of contents for some supported document formats.
- Parse form data from PDF documents.
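The keyword search feature listed above can be sketched roughly as follows. This is a minimal illustration, not the official sample: the file name "sample.pdf" and the keyword "invoice" are placeholders, and the snippet assumes a C# 9+ project using top-level statements with the GroupDocs.Parser package referenced.

```csharp
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;

// "sample.pdf" and "invoice" are placeholder values (assumptions for illustration)
using (Parser parser = new Parser("sample.pdf"))
{
    // Search returns null when the document format does not support text search
    IEnumerable<SearchResult> results = parser.Search("invoice");
    if (results == null)
    {
        Console.WriteLine("Search isn't supported");
        return;
    }

    // print the position and the matched text of each occurrence
    foreach (SearchResult result in results)
    {
        Console.WriteLine($"At {result.Position}: {result.Text}");
    }
}
```

Checking the result for null before iterating mirrors the pattern used in the image extraction example on this page.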
Parse Document by Template
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF, TXT
Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, XLA, XLAM, NUMBERS
Presentation: PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP
Portable: PDF
Extract Text (Accurate)
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF
Spreadsheet: XLS, XLT, XLSX, XLSM, XLSB, XLTX, XLTM, ODS, OTS, CSV, XLA, XLAM, NUMBERS
Presentation: PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP
Email: EML, EMLX, MSG
Markup: XHTML, MHTML, MD, XML
eBook: CHM, EPUB, FB2
Portable: PDF
OneNote: ONE
Databases: Databases are supported via ADO.NET. To work with the corresponding database format install its database provider.
Extract Text (Raw)
Spreadsheet: XLS, XLT, XLSX, XLSM, XLTX, XLTM, XLA, XLAM
Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM
Portable: PDF
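For the formats listed under raw text extraction, raw mode can be requested through the text options. The sketch below assumes a placeholder file "sample.pdf" and a C# 9+ project with top-level statements; treat it as an illustration rather than the definitive usage.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options;

// "sample.pdf" is a placeholder path (assumption for illustration)
using (Parser parser = new Parser("sample.pdf"))
{
    // passing true asks the parser to use raw mode when the format supports it
    TextOptions options = new TextOptions(true);

    // GetText returns null when text extraction isn't supported for the document
    using (TextReader reader = parser.GetText(options))
    {
        Console.WriteLine(reader == null
            ? "Text extraction isn't supported"
            : reader.ReadToEnd());
    }
}
```

Raw mode trades layout accuracy for speed, which matches the "increase performance by extracting raw text" feature described earlier.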
Extract Structured Text and Formatted Text
Word Processing: DOC, DOT, DOCX, DOCM, DOTX, DOTM, ODT, OTT, RTF
Spreadsheet: XLS, XLT, XLSX, XLSM, XLTX, XLTM, XLA, XLAM
Presentation: PPT, PPS, POT, PPTX, PPTM, POTX, POTM, PPSX, PPSM, ODP, OTP
Email: EML, EMLX, MSG
Markup: MD (formatted text is not supported)
eBook: CHM, EPUB, FB2
Please visit the Supported Document Formats for more details.
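As a rough sketch of formatted text extraction for one of the formats listed above, the snippet below requests HTML output. The file name "sample.docx" is a placeholder, and the code assumes a C# 9+ project with top-level statements.

```csharp
using System;
using System.IO;
using GroupDocs.Parser;
using GroupDocs.Parser.Options;

// "sample.docx" is a placeholder path (assumption for illustration)
using (Parser parser = new Parser("sample.docx"))
{
    // request the document text formatted as HTML
    FormattedTextOptions options = new FormattedTextOptions(FormattedTextMode.Html);

    // GetFormattedText returns null when the format doesn't support formatted text
    using (TextReader reader = parser.GetFormattedText(options))
    {
        Console.WriteLine(reader == null
            ? "Formatted text extraction isn't supported"
            : reader.ReadToEnd());
    }
}
```

Swapping FormattedTextMode.Html for FormattedTextMode.Markdown should produce Markdown output instead.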
Platform Independence
GroupDocs.Parser for .NET does not require any external software or third-party tool to be installed. It supports any 32-bit or 64-bit operating system where the .NET or Mono framework is installed. The other details are as follows:
Microsoft Windows: Microsoft Windows Desktop (x86, x64) (XP & up), Microsoft Windows Server (x86, x64) (2000 & up), Windows Azure
Mac OS: Mac OS X
Linux: Linux (Ubuntu, OpenSUSE, CentOS and others)
Development Environments: Microsoft Visual Studio (2010 & up), Xamarin.Android, Xamarin.iOS, Xamarin.Mac, MonoDevelop 2.4 and later.
Supported Frameworks: GroupDocs.Parser for .NET supports .NET and Mono frameworks.
Get Started
Are you ready to give GroupDocs.Parser for .NET a try? Simply execute Install-Package GroupDocs.Parser from the Package Manager Console in Visual Studio to fetch and reference the GroupDocs.Parser assembly in your project. If you already have GroupDocs.Parser for .NET and want to upgrade it, please execute Update-Package GroupDocs.Parser to get the latest version.
Please check the GitHub Repository for other common usage scenarios.
Extract all Images and Save them in PNG Format via C# Code
using System;
using System.Collections.Generic;
using GroupDocs.Parser;
using GroupDocs.Parser.Data;
using GroupDocs.Parser.Options;

// create an instance of the Parser class
// (Constants.SampleZip is assumed to be a sample file path defined elsewhere in the project)
using (Parser parser = new Parser(Constants.SampleZip))
{
    // extract images from the document
    IEnumerable<PageImageArea> images = parser.GetImages();

    // check if image extraction is supported
    if (images == null)
    {
        Console.WriteLine("Page images extraction isn't supported");
        return;
    }

    // create the options to save images in PNG format
    ImageOptions options = new ImageOptions(ImageFormat.Png);
    int imageNumber = 0;

    // iterate over images
    foreach (PageImageArea image in images)
    {
        // save the image to a PNG file named after its index
        image.Save(imageNumber.ToString() + ".png", options);
        imageNumber++;
    }
}
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.0
  - Microsoft.Extensions.DependencyModel (>= 2.1.0)
  - Microsoft.Win32.Registry (>= 4.7.0)
  - SkiaSharp (>= 2.88.6)
  - SkiaSharp.NativeAssets.Linux.NoDependencies (>= 2.88.6)
  - System.Diagnostics.PerformanceCounter (>= 4.5.0)
  - System.Drawing.Common (>= 5.0.3)
  - System.Reflection.Emit (>= 4.7.0)
  - System.Reflection.Emit.ILGeneration (>= 4.7.0)
  - System.Security.Cryptography.Pkcs (>= 6.0.4)
  - System.Security.Permissions (>= 4.5.0)
  - System.Security.Principal.Windows (>= 5.0.0)
  - System.Text.Encoding.CodePages (>= 6.0.0)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
24.10.0 | 1,075 | 11/1/2024 |
24.9.0 | 2,230 | 9/30/2024 |
24.8.0 | 29,459 | 8/30/2024 |
24.7.0 | 1,534 | 7/24/2024 |
24.6.0 | 2,710 | 6/29/2024 |
24.5.0 | 5,471 | 5/31/2024 |
24.4.0 | 5,862 | 4/23/2024 |
24.2.1 | 7,201 | 3/13/2024 |
24.2.0 | 1,309 | 2/29/2024 |
23.12.0 | 134,040 | 12/23/2023 |
23.11.0 | 36,739 | 11/24/2023 |
23.10.0 | 13,560 | 10/21/2023 |
23.8.0 | 65,528 | 8/18/2023 |
23.5.0 | 84,970 | 5/31/2023 |
23.3.0 | 16,093 | 3/31/2023 |
23.2.0 | 22,868 | 3/1/2023 |
22.11.1 | 25,281 | 1/17/2023 |
22.11.0 | 38,896 | 11/29/2022 |
22.8.0 | 74,431 | 8/12/2022 |
22.6.0 | 31,446 | 6/7/2022 |
22.2.0 | 37,310 | 2/25/2022 |
21.5.0 | 63,323 | 5/31/2021 |
21.2.0 | 50,949 | 2/22/2021 |
20.12.0 | 24,429 | 12/30/2020 |
20.10.0 | 169,301 | 10/27/2020 |
20.8.0 | 49,008 | 8/19/2020 |
20.6.1 | 47,470 | 6/30/2020 |
20.6.0 | 20,077 | 6/19/2020 |
20.5.0 | 35,188 | 5/8/2020 |
20.3.0 | 48,429 | 3/19/2020 |
20.1.0 | 35,725 | 1/31/2020 |
19.12.0 | 33,537 | 12/27/2019 |
19.11.0 | 28,458 | 11/22/2019 |
19.9.0 | 2,809 | 9/27/2019 |
19.5.0 | 3,039 | 5/29/2019 |
18.12.0 | 3,214 | 12/11/2018 |
18.11.0 | 2,701 | 11/8/2018 |
18.10.0 | 2,785 | 10/10/2018 |
18.9.0 | 2,772 | 9/5/2018 |
18.8.0 | 2,841 | 8/7/2018 |
18.7.0 | 2,791 | 7/3/2018 |
18.5.0 | 3,013 | 5/23/2018 |