PommaLabs.HtmlArk 1.1.0

HtmlArk embeds images, fonts, CSS and JavaScript into an HTML file.

Package Manager:
Install-Package PommaLabs.HtmlArk -Version 1.1.0

.NET CLI:
dotnet add package PommaLabs.HtmlArk --version 1.1.0

PackageReference (for projects that support it, copy this XML node into the project file):
<PackageReference Include="PommaLabs.HtmlArk" Version="1.1.0" />

Paket CLI:
paket add PommaLabs.HtmlArk --version 1.1.0

Script & Interactive (F# Interactive, C# scripting and .NET Interactive):
#r "nuget: PommaLabs.HtmlArk, 1.1.0"

Cake:
// Install PommaLabs.HtmlArk as a Cake Addin
#addin nuget:?package=PommaLabs.HtmlArk&version=1.1.0

// Install PommaLabs.HtmlArk as a Cake Tool
#tool nuget:?package=PommaLabs.HtmlArk&version=1.1.0

HtmlArk

Embeds images, fonts, CSS and JavaScript into an HTML file. Resources are embedded using data URIs.
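To illustrate what embedding a resource as a data URI means, here is a minimal sketch that is not taken from HtmlArk's source. The helper name `ToDataUri` and the sample style sheet are assumptions made up for this example. It only shows the base64 form of a data URI, where the resource bytes replace the external link.

```csharp
using System;
using System.Text;

// Hypothetical helper: builds a base64 data URI from raw bytes and a MIME type.
static string ToDataUri(byte[] content, string mimeType)
{
    string base64 = Convert.ToBase64String(content);
    return $"data:{mimeType};base64,{base64}";
}

// A tiny style sheet stands in for a resource that would normally be downloaded.
byte[] css = Encoding.UTF8.GetBytes("body{margin:0}");
string dataUri = ToDataUri(css, "text/css");

// Before: <link rel="stylesheet" href="style.css">
// After:  the href carries the style sheet itself, so no extra request is needed.
Console.WriteLine($"<link rel=\"stylesheet\" href=\"{dataUri}\">");
```

Because the resource travels inside the HTML itself, the resulting page can be opened without access to the original files.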

This project is a .NET rewrite of the Python project of the same name, from which the command line interface has been copied in order to ease interoperability.

Most of the disclaimers that apply to the original library apply here too:

  • ⚠️ HtmlArk should be used with trusted HTML pages only or in a sandboxed environment. Untrusted HTML pages might contain resource links that HtmlArk considers valid but that pose a serious security risk to your organization.
  • HtmlArk works with static HTML pages only. If an image or other resource is loaded with JavaScript, HtmlArk won't even know it exists.
  • Most browsers support data URIs, but as usual IE support might be less than ideal. Check data URI compatibility on Can I use.

HtmlArk can be used to "pack" web pages into single HTML files. However, HtmlArk is not a crawler, so it must be paired with one in order to pack entire websites.

💡 If you plan to serve packed web pages, please remember to turn on GZIP compression. It usually yields good results and it helps to reduce download size.
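The sketch below gives a rough sense of why compression matters for packed pages. Base64 encoding, which data URIs rely on, grows each resource by about one third, and GZIP typically recovers a good part of that overhead. The `GzipSize` helper and the pseudo-random "image" are assumptions for illustration only. Real resources and real servers will give different numbers.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: returns the size in bytes of the GZIP-compressed input.
static long GzipSize(byte[] data)
{
    using var buffer = new MemoryStream();
    // Dispose the GZipStream first so that all compressed bytes are flushed.
    using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal, leaveOpen: true))
    {
        gzip.Write(data, 0, data.Length);
    }
    return buffer.Length;
}

// Stand-in for an embedded image: 30,000 pseudo-random bytes.
var imageBytes = new byte[30_000];
new Random(42).NextBytes(imageBytes);

// Embed the bytes the way a packed page would, as a base64 data URI.
string packedHtml =
    "<img src=\"data:image/png;base64," + Convert.ToBase64String(imageBytes) + "\">";
byte[] packedBytes = Encoding.UTF8.GetBytes(packedHtml);

long plainSize = packedBytes.Length;
long gzipSize = GzipSize(packedBytes);

Console.WriteLine($"Raw resource: {imageBytes.Length} bytes");
Console.WriteLine($"Packed HTML:  {plainSize} bytes");
Console.WriteLine($"GZIP-encoded: {gzipSize} bytes");
```

In practice the compression is enabled in the web server or reverse proxy rather than in application code. The snippet only measures the effect.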

Install

The NuGet package PommaLabs.HtmlArk is available for download:

dotnet add package PommaLabs.HtmlArk

The HtmlArk .NET tool can be installed with the following command:

dotnet tool install PommaLabs.HtmlArk.Tool

Usage

Library

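As a library, HtmlArk can be included with the following using directives in your class file (the second one provides NullLogger, which the example that follows relies on):

using PommaLabs.HtmlArk;
using Microsoft.Extensions.Logging.Abstractions;

It can then be used like this, for example: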

IHtmlArchiver htmlArchiver = new HtmlArchiver(NullLogger<HtmlArchiver>.Instance);
string archivedHtml = await htmlArchiver.ArchiveAsync(new Uri("https://www.example.com/"));

If you use dependency injection, it can be registered this way:

services.AddHtmlArchiver(); // Maps IHtmlArchiver to HtmlArchiver as a singleton.

Tool

The HtmlArk .NET tool accepts the following command line arguments:

  -M, --http-client-max-resource-size    How many bytes can be downloaded for each resource.

  -T, --http-client-timeout              Timeout of the internal HTTP client.

  -A, --ignore-audios                    Ignores audios during archival.

  -C, --ignore-css                       Ignores style sheets during archival.

  -E, --ignore-errors                    Ignores unreadable resources.

  -I, --ignore-images                    Ignores images during archival.

  -J, --ignore-js                        Ignores external JavaScript during archival.

  -V, --ignore-videos                    Ignores videos during archival.

  -m, --minify                           Minifies output HTML.

  -o, --output                           Output file path. If not specified, output will be written to STDOUT.

  -v, --verbose                          Prints detailed information during HTML archival.

  --help                                 Display this help screen.

  --version                              Display version information.

  input (pos. 0)                         Required. Input URI or file path.

The interface is modeled after the original Python project, so switching between the two should be straightforward.

Maintainers

@pomma89.

Contributing

PRs accepted.

Small note: If editing the README, please conform to the standard-readme specification.

License

MIT © 2020-2021 Alessio Parma

Release Notes

https://gitlab.com/pommalabs/htmlark/-/releases

Version History

Version  Downloads  Last updated
1.1.0    46         8/13/2021
1.0.0    41         8/12/2021
0.2.0    58         4/25/2021
0.1.1    231        11/15/2020
0.1.0    117        11/15/2020