ExcelDataReader.DataSet 3.7.0

dotnet add package ExcelDataReader.DataSet --version 3.7.0                
NuGet\Install-Package ExcelDataReader.DataSet -Version 3.7.0                
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="ExcelDataReader.DataSet" Version="3.7.0" />                
For projects that support PackageReference, copy this XML node into the project file to reference the package.
paket add ExcelDataReader.DataSet --version 3.7.0                
#r "nuget: ExcelDataReader.DataSet, 3.7.0"                
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
// Install ExcelDataReader.DataSet as a Cake Addin
#addin nuget:?package=ExcelDataReader.DataSet&version=3.7.0

// Install ExcelDataReader.DataSet as a Cake Tool
#tool nuget:?package=ExcelDataReader.DataSet&version=3.7.0                

ExcelDataReader

NuGet

Lightweight and fast library written in C# for reading Microsoft Excel files (2.0-2021, 365).

Please feel free to fork and submit pull requests to the develop branch.

If you are reporting an issue it is really useful if you can supply an example Excel file as this makes debugging much easier and without it we may not be able to resolve any problems.

Continuous integration

Branch Build status
develop Build status
master Build status

Supported file formats and versions

File Type Container Format File Format Excel Version(s)
.xlsx ZIP, CFB+ZIP OpenXml 2007 and newer
.xlsb ZIP, CFB OpenXml 2007 and newer
.xls CFB BIFF8 97, 2000, XP, 2003<br>98, 2001, v.X, 2004 (Mac)
.xls CFB BIFF5 5.0, 95
.xls - BIFF4 4.0
.xls - BIFF3 3.0
.xls - BIFF2 2.0, 2.2
.csv - CSV (All)

Finding the binaries

It is recommended to use NuGet through the VS Package Manager Console Install-Package <package> or using the VS "Manage NuGet Packages..." extension.

As of ExcelDataReader version 3.0, the project was split into multiple packages:

Install the ExcelDataReader base package to use the "low level" reader interface. Compatible with net462, netstandard2.0 and netstandard2.1.

Install the ExcelDataReader.DataSet extension package to use the AsDataSet() method to populate a System.Data.DataSet. This will also pull in the base package. Compatible with net462, netstandard2.0 and netstandard2.1.

How to use

using (var stream = File.Open(filePath, FileMode.Open, FileAccess.Read))
{
    // Auto-detect format, supports:
    //  - Binary Excel files (2.0-2003 format; *.xls)
    //  - OpenXml Excel files (2007 format; *.xlsx, *.xlsb)
    using (var reader = ExcelReaderFactory.CreateReader(stream))
    {
        // Choose one of either 1 or 2:

        // 1. Use the reader methods
        do
        {
            while (reader.Read())
            {
                // reader.GetDouble(0);
            }
        } while (reader.NextResult());

        // 2. Use the AsDataSet extension method
        var result = reader.AsDataSet();

        // The result of each spreadsheet is in result.Tables
    }
}

Reading .CSV files

Use ExcelReaderFactory.CreateCsvReader instead of CreateReader to parse a stream of plain text with comma separated values.

See also the configuration options FallbackEncoding and AutodetectSeparators.

The input CSV is always parsed once completely to set FieldCount, RowCount, Encoding, Separator (or twice if the CSV lacks BOM and is not UTF8), and then parsed once again while iterating the row records. Throws System.Text.DecoderFallbackException if the input cannot be parsed with the specified encoding.

The reader returns all CSV field values as strings and makes no attempts to convert the data to numbers or dates. This caller is responsible for interpreting the CSV data.

Using the reader methods

The AsDataSet() extension method is a convenient helper for quickly getting the data, but is not always available or desirable to use. IExcelDataReader extends the System.Data.IDataReader and IDataRecord interfaces to navigate and retrieve data at a lower level. The most important reader methods and properties:

  • Read() reads a row from the current sheet.
  • NextResult() advances the cursor to the next sheet.
  • ResultsCount returns the number of sheets in the current workbook.
  • Name returns the name of the current sheet.
  • CodeName returns the VBA code name identifier of the current sheet.
  • FieldCount returns the number of columns in the current sheet.
  • RowCount returns the number of rows in the current sheet. This includes terminal empty rows which are otherwise excluded by AsDataSet(). Throws InvalidOperationException on CSV files when used with AnalyzeInitialCsvRows.
  • HeaderFooter returns an object with information about the headers and footers, or null if there are none.
  • MergeCells returns an array of merged cell ranges in the current sheet.
  • RowHeight returns the visual height of the current row in points. May be 0 if the row is hidden.
  • GetColumnWidth() returns the width of a column in character units. May be 0 if the column is hidden.
  • GetFieldType() returns the type of a value in the current row. Always one of the types supported by Excel: double, int, bool, DateTime, TimeSpan, string, or null if there is no value.
  • IsDBNull() checks if a value in the current row is null.
  • GetValue() returns a value from the current row as an object, or null if there is no value.
  • GetDouble(), GetInt32(), GetBoolean(), GetDateTime(), GetString() return a value from the current row cast to their respective type.
  • GetNumberFormatString() returns a string containing the formatting codes for a value in the current row, or null if there is no value. See also the Formatting section below.
  • GetNumberFormatIndex() returns the number format index for a value in the current row. Index values below 164 refer to built-in number formats, otherwise indicate a custom number format.
  • GetCellStyle() returns an object containing style information for a cell in the current row: indent, horizontal alignment, hidden, locked.
  • The typed Get*() methods throw InvalidCastException unless the types match exactly.

CreateReader() configuration options

The ExcelReaderFactory.CreateReader(), CreateBinaryReader(), CreateOpenXmlReader(), CreateCsvReader() methods accept an optional configuration object to modify the behavior of the reader:

var reader = ExcelReaderFactory.CreateReader(stream, new ExcelReaderConfiguration()
{
    // Gets or sets the encoding to use when the input XLS lacks a CodePage
    // record, or when the input CSV lacks a BOM and does not parse as UTF8. 
    // Default: cp1252 (XLS BIFF2-5 and CSV only)
    FallbackEncoding = Encoding.GetEncoding(1252),

    // Gets or sets the password used to open password protected workbooks.
    Password = "password",

    // Gets or sets an array of CSV separator candidates. The reader 
    // autodetects which best fits the input data. Default: , ; TAB | # 
    // (CSV only)
    AutodetectSeparators = new char[] { ',', ';', '\t', '|', '#' },

    // Gets or sets a value indicating whether to leave the stream open after
    // the IExcelDataReader object is disposed. Default: false
    LeaveOpen = false,

    // Gets or sets a value indicating the number of rows to analyze for
    // encoding, separator and field count in a CSV. When set, this option
    // causes the IExcelDataReader.RowCount property to throw an exception.
    // Default: 0 - analyzes the entire file (CSV only, has no effect on other
    // formats)
    AnalyzeInitialCsvRows = 0,
});

AsDataSet() configuration options

The AsDataSet() method accepts an optional configuration object to modify the behavior of the DataSet conversion:

var result = reader.AsDataSet(new ExcelDataSetConfiguration()
{
    // Gets or sets a value indicating whether to set the DataColumn.DataType 
    // property in a second pass.
    UseColumnDataType = true,

    // Gets or sets a callback to determine whether to include the current sheet
    // in the DataSet. Called once per sheet before ConfigureDataTable.
    FilterSheet = (tableReader, sheetIndex) => true,

    // Gets or sets a callback to obtain configuration options for a DataTable. 
    ConfigureDataTable = (tableReader) => new ExcelDataTableConfiguration()
    {
        // Gets or sets a value indicating the prefix of generated column names.
        EmptyColumnNamePrefix = "Column",

        // Gets or sets a value indicating whether to use a row from the 
        // data as column names.
        UseHeaderRow = false,

        // Gets or sets a callback to determine which row is the header row. 
        // Only called when UseHeaderRow = true.
        ReadHeaderRow = (rowReader) => {
            // F.ex skip the first row and use the 2nd row as column headers:
            rowReader.Read();
        },

        // Gets or sets a callback to determine whether to include the 
        // current row in the DataTable.
        FilterRow = (rowReader) => {
            return true;
        },

        // Gets or sets a callback to determine whether to include the specific
        // column in the DataTable. Called once per column after reading the 
        // headers.
        FilterColumn = (rowReader, columnIndex) => {
            return true;
        }
    }
});

Setting up AsDataSet() configuration, use the FilterRow callback to implement a "progress indicator" while loading, e.g.:

var result = reader.AsDataSet(new ExcelDataSetConfiguration()
{
    ConfigureDataTable = (tableReader) => new ExcelDataTableConfiguration()
    {
        FilterRow = (rowReader) => {
            int progress = (int)Math.Ceiling((decimal)rowReader.Depth / (decimal)rowReader.RowCount * (decimal)100);
            // progress is in the range 0..100
            return true;
        }
    }
});

Formatting

ExcelDataReader does not support formatting directly. Users may retreive the number format string for a cell through IExcelDataReader.GetNumberFormatString(i) and use the third party ExcelNumberFormat library for formatting purposes.

Example helper method using ExcelDataReader and ExcelNumberFormat to format a value:

string GetFormattedValue(IExcelDataReader reader, int columnIndex, CultureInfo culture)
{
    var value = reader.GetValue(columnIndex);
    var formatString = reader.GetNumberFormatString(columnIndex);
    if (formatString != null)
    {
        var format = new NumberFormat(formatString);
        return format.Format(value, culture);
    }
    return Convert.ToString(value, culture);
}

See also:

Important note when upgrading from ExcelDataReader 2.x

ExcelDataReader 3 had some breaking changes, and older code may produce error messages similar to:

'IExcelDataReader' does not contain a definition for 'AsDataSet'...
'IExcelDataReader' does not contain a definition for 'IsFirstRowAsColumnNames'...

To fix:

  1. Make sure to rename any Excel namespace references in the code to the new namespace ExcelDataReader

  2. Make sure the project has a reference to the ExcelDataReader.DataSet package to use AsDataSet()

  3. Remove the line of code with IsFirstRowAsColumnNames and change the call to AsDataSet() to something like this:

var result = reader.AsDataSet(new ExcelDataSetConfiguration()
{
    ConfigureDataTable = (_) => new ExcelDataTableConfiguration()
    {
        UseHeaderRow = true
    }
});

Important note on .NET Core

By default, ExcelDataReader throws a NotSupportedException "No data is available for encoding 1252." on .NET Core and .NET 5.0 or later.

To fix, add a dependency to the package System.Text.Encoding.CodePages and then add code to register the code page provider during application initialization (f.ex in Startup.cs):

System.Text.Encoding.RegisterProvider(System.Text.CodePagesEncodingProvider.Instance);

This is required to parse strings in binary BIFF2-5 Excel documents encoded with DOS-era code pages. These encodings are registered by default in the full .NET Framework, but not on .NET Core and .NET 5.0 or later.

Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 was computed.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 was computed.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 is compatible. 
.NET Framework net461 was computed.  net462 is compatible.  net463 was computed.  net47 was computed.  net471 was computed.  net472 was computed.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages (128)

Showing the top 5 NuGet packages that depend on ExcelDataReader.DataSet:

Package Downloads
Reo.Core.DataImport

Package Description

eXpandExcelImporter

ExcelImporter Module / eXpandFramework. To debug in VS enable Source Server support under Tools/Options/Debugging.

ExcelProvider

Generate typed read-only access for a sheet in an Excel workbook containing data in columns

MudBlazor.Extensions

MudBlazor.Extensions is a small extension library for MudBlazor from https://mudblazor.com/

BusAppCore.Util

Common Utility Functions

GitHub repositories (17)

Showing the top 5 popular GitHub repositories that depend on ExcelDataReader.DataSet:

Repository Stars
mini-software/MiniExcel
Fast, Low-Memory, Easy Excel .NET helper to import/export/template spreadsheet (support Linux, Mac)
SpecFlowOSS/SpecFlow
#1 .NET BDD Framework. SpecFlow automates your testing & works with your existing code. Find Bugs before they happen. Behavior Driven Development helps developers, testers, and business representatives to get a better understanding of their collaboration
open-rpa/openrpa
Free Open Source Enterprise Grade RPA
neil3d/excel2json
把Excel表转换成json对象,并保存到一个文本文件中。
paillave/Etl.Net
Mass processing data with a complete ETL for .net developers
Version Downloads Last updated
3.7.0 974,478 6/22/2024
3.7.0-develop00415 1,734 6/13/2024
3.7.0-develop00411 2,851 6/3/2024
3.7.0-develop00385 130,782 8/16/2023
3.7.0-develop00371 14,829 6/18/2023
3.7.0-develop00365 7,757 5/17/2023
3.7.0-develop00310 707,066 8/9/2020
3.7.0-develop00306 18,722 8/2/2020
3.6.0 30,509,883 5/2/2019
3.5.0 621,795 3/1/2019
3.4.2 1,821,619 10/14/2018
3.4.1 430,309 8/29/2018
3.4.0 1,763,222 2/4/2018
3.3.0 520,194 11/24/2017
3.2.0 323,557 9/27/2017
3.1.0 234,701 7/23/2017
3.0.0 39,231 7/16/2017
3.0.0-develop00086 2,144 7/1/2017
3.0.0-develop00019 3,537 6/5/2017