Florence2 25.12.63049

.NET CLI
dotnet add package Florence2 --version 25.12.63049

Package Manager
NuGet\Install-Package Florence2 -Version 25.12.63049
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference
<PackageReference Include="Florence2" Version="25.12.63049" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Central Package Management
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
Directory.Packages.props:
<PackageVersion Include="Florence2" Version="25.12.63049" />
Project file:
<PackageReference Include="Florence2" />

Paket CLI
paket add Florence2 --version 25.12.63049

Script & Interactive
#r "nuget: Florence2, 25.12.63049"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

File-based apps
#:package Florence2@25.12.63049
The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

Cake
#addin nuget:?package=Florence2&version=25.12.63049
Install as a Cake Addin
#tool nuget:?package=Florence2&version=25.12.63049
Install as a Cake Tool

Florence2 — C# Wrapper for Microsoft’s Florence-2 Vision Model

A lightweight, easy-to-use C# library that provides access to Microsoft’s Florence-2-base models for advanced image understanding tasks — including captioning, OCR, object detection, and phrase grounding.

This project gives .NET developers a clean API to run Florence-2 locally without needing Python or the original reference implementation.

📦 NuGet: https://www.nuget.org/packages/Florence2


✨ Features

  • Image Captioning: Generate concise or richly detailed descriptions of images.

  • Optical Character Recognition (OCR): Extract text from entire images or specific regions.

  • Region-based OCR: Provide bounding boxes and retrieve text only from selected areas.

  • Object Detection: Detect and label objects with bounding boxes.

  • Phrase Grounding (optional): Highlight image regions relevant to a given phrase or textual query.

  • Local Model Execution: Automatically downloads and loads the Florence-2-base ONNX models.


🚀 Quick Start

1. Install the package

dotnet add package Florence2

Or get it on NuGet: https://www.nuget.org/packages/Florence2


2. Example Usage

using System;
using System.IO;
using System.Text.Json;
using Florence2;

// Download models if needed
var modelSource = new FlorenceModelDownloader("./models");
await modelSource.DownloadModelsAsync();

// Create model instance
var model = new Florence2Model(modelSource);

// Load an image stream
using var imgStream = File.OpenRead("car.jpg");

// Optional text input, used by text-prompted tasks such as phrase grounding (may be null)
string? phrase = "the red car";

// Choose a task from the TaskTypes values listed under Supported Tasks,
// e.g. CAPTION, OCR, OCR_WITH_REGION, OD, CAPTION_TO_PHRASE_GROUNDING
var task = TaskTypes.OCR_WITH_REGION;

// Run inference
var results = model.Run(task, imgStream, textInput: phrase);

// View results as indented JSON
Console.WriteLine(JsonSerializer.Serialize(results, new JsonSerializerOptions { WriteIndented = true }));

📚 Supported Tasks

Task | Description
TaskTypes.OCR | Optical Character Recognition: Extracts all text recognized in the image.
TaskTypes.OCR_WITH_REGION | Extracts all text from the image and provides the bounding box (quad-box) for each detected text region.
TaskTypes.CAPTION | Generates a brief caption describing the entire image.
TaskTypes.DETAILED_CAPTION | Generates a detailed description of the image, covering more elements than the standard caption.
TaskTypes.MORE_DETAILED_CAPTION | Generates a highly comprehensive and lengthy description of the image contents.
TaskTypes.OD | Object Detection: Detects objects in the image and provides their bounding boxes and class labels.
TaskTypes.DENSE_REGION_CAPTION | Detects a large number of regions (densely packed) and provides a caption/label for each bounding box.
TaskTypes.CAPTION_TO_PHRASE_GROUNDING | Phrase Grounding: Highlights/localizes regions (bounding boxes) that correspond to specific phrases provided in a text input.
TaskTypes.REGION_TO_SEGMENTATION | Generates a segmentation mask for an object defined by a provided bounding box.
TaskTypes.OPEN_VOCABULARY_DETECTION | Detects objects matching a provided text prompt (similar to phrase grounding, but often used to detect specific classes).
TaskTypes.REGION_TO_CATEGORY | Classifies the object contained within a specific provided bounding box.
TaskTypes.REGION_TO_DESCRIPTION | Generates a description or caption for a specific region defined by a provided bounding box.
TaskTypes.REGION_TO_OCR | Extracts text specifically from a region defined by a provided bounding box.
TaskTypes.REGION_PROPOSAL | Identifies and outputs bounding boxes for salient regions or potential objects in the image without labels.
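
OCR_WITH_REGION is described as returning a quad-box for each text region, while cropping and drawing code usually wants an axis-aligned rectangle. The following is a minimal sketch of that conversion. It assumes a quad-box arrives as eight floats in x1, y1, x2, y2, x3, y3, x4, y4 order; that layout and the QuadBoxHelper name are illustrative assumptions, not part of the Florence2 package, so check the actual result type before relying on it.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Florence2 package.
public static class QuadBoxHelper
{
    // Converts a quad-box (four corners, assumed layout x1, y1, ..., x4, y4)
    // into the smallest axis-aligned rectangle containing all four corners.
    public static (float X, float Y, float Width, float Height) ToBoundingRect(float[] quad)
    {
        if (quad is null || quad.Length != 8)
            throw new ArgumentException("A quad-box needs exactly 8 coordinates.", nameof(quad));

        float minX = float.MaxValue, minY = float.MaxValue;
        float maxX = float.MinValue, maxY = float.MinValue;

        // Even indices hold x values, odd indices hold y values.
        for (int i = 0; i < 8; i += 2)
        {
            minX = Math.Min(minX, quad[i]);
            maxX = Math.Max(maxX, quad[i]);
            minY = Math.Min(minY, quad[i + 1]);
            maxY = Math.Max(maxY, quad[i + 1]);
        }

        return (minX, minY, maxX - minX, maxY - minY);
    }
}
```

For a slightly rotated text line with corners (10, 20), (110, 25), (108, 60), (8, 55), the helper returns a rectangle at (8, 20) with width 102 and height 40.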

📦 Model Files

Models are downloaded automatically via FlorenceModelDownloader, but you can also supply your own model directory. The library expects Florence-2-base ONNX models compatible with Microsoft’s open-source release.
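
When supplying your own model directory, a quick pre-flight check can catch a missing or empty folder before inference starts. The sketch below uses only System.IO and assumes the models carry the .onnx extension. The ModelDirectoryCheck name is hypothetical, and the sketch does not know which file names the library expects, so it only reports what is present.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper for illustration; not part of the Florence2 package.
public static class ModelDirectoryCheck
{
    // Returns the full paths of all .onnx files under modelDirectory,
    // or an empty array when the directory does not exist.
    public static string[] FindOnnxFiles(string modelDirectory)
    {
        if (!Directory.Exists(modelDirectory))
            return Array.Empty<string>();

        return Directory
            .EnumerateFiles(modelDirectory, "*.onnx", SearchOption.AllDirectories)
            .OrderBy(path => path, StringComparer.Ordinal)
            .ToArray();
    }
}
```

An empty result from FindOnnxFiles("./models") suggests the folder still needs to be populated, for example by running the downloader shown in the Quick Start.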


🤝 Contributing

Contributions, issues, and pull requests are welcome! If you find a bug or have a feature request, feel free to open an issue.


📄 License

MIT — see the LICENSE file for details.

Compatible target frameworks
.NET: net8.0, net9.0, and net10.0 are compatible. For each of these, the android, browser, ios, maccatalyst, macos, tvos, and windows variants were computed.

NuGet packages (1)

Showing the top NuGet package that depends on Florence2:

Package Downloads
mostlylucid.llmalttext

AI-powered alt text generation and OCR using Florence-2 Vision Language Model. Automatically generates descriptive alt text for images and extracts text content.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version | Downloads | Last Updated
25.12.63049 | 481 | 12/7/2025
25.12.63048 | 200 | 12/7/2025
25.12.63047 | 205 | 12/7/2025
25.7.59767 | 2,199 | 7/18/2025
24.11.53800 | 4,857 | 11/18/2024
24.11.53799 | 149 | 11/18/2024
24.10.53218 | 726 | 10/25/2024
24.9.51644 | 1,229 | 9/3/2024
24.7.50588 | 719 | 7/25/2024
24.7.50576 | 144 | 7/25/2024
24.7.50575 | 138 | 7/25/2024
24.7.50572 | 145 | 7/25/2024
24.7.50455 | 155 | 7/23/2024
24.7.50454 | 175 | 7/23/2024