LlmTornado 3.4.5

dotnet add package LlmTornado --version 3.4.5

NuGet\Install-Package LlmTornado -Version 3.4.5

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="LlmTornado" Version="3.4.5" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

paket add LlmTornado --version 3.4.5

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: LlmTornado, 3.4.5"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

// Install LlmTornado as a Cake Addin
#addin nuget:?package=LlmTornado&version=3.4.5

// Install LlmTornado as a Cake Tool
#tool nuget:?package=LlmTornado&version=3.4.5

🌪️ LLM Tornado - one .NET library to consume OpenAI, Anthropic, Google, DeepSeek, Cohere, Mistral, Azure, Groq, and self-hosted APIs.

At least one new large language model is released each month. Wouldn't it be awesome if using the latest, shiny model was as easy as switching one argument? LLM Tornado acts as a gateway, allowing you to do just that. Think SearX but for LLMs!

OpenAI, Anthropic, Google, DeepSeek, Cohere, Mistral, Azure, and Groq are currently supported, along with any OpenAI-compatible inference servers, such as Ollama. Check the full Feature Matrix here. 👈

Tornado also acts as an API harmonizer for these Providers. For example, suppose a request accidentally passes temperature to a reasoning model, where such an argument is not supported. We take care of that, to maximize the probability of the call succeeding. This applies to various whims of the Providers, such as developer_message vs system_prompt (in Tornado there is just a System role for Messages), Google having completely different endpoints for embedding multiple texts at once, and many other annoyances.
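The harmonization idea can be illustrated with a minimal, library-free sketch. It is not Tornado's actual implementation: the Harmonize function, the reasoningModels set, and the parameter names are all hypothetical and only show the general technique of dropping an argument that the target model is assumed not to accept.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: none of these names are part of LlmTornado.
// Models assumed (for illustration only) to reject the "temperature" parameter.
HashSet<string> reasoningModels = new(StringComparer.OrdinalIgnoreCase) { "o3-mini" };

// Returns a copy of the request parameters with unsupported arguments removed.
Dictionary<string, object> Harmonize(string model, Dictionary<string, object> parameters)
{
    Dictionary<string, object> result = new(parameters);

    if (reasoningModels.Contains(model))
    {
        result.Remove("temperature");
    }

    return result;
}

Dictionary<string, object> request = new() { ["temperature"] = 0.7, ["max_tokens"] = 256 };
Dictionary<string, object> harmonized = Harmonize("o3-mini", request);

Console.WriteLine(harmonized.ContainsKey("temperature")); // False
Console.WriteLine(harmonized.ContainsKey("max_tokens"));  // True
```

The original request is copied rather than mutated, so the caller can reuse the same arguments when switching to a model that does accept them.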

⭐ Awesome things you can do with Tornado:

Chat with your documents
Voice call with AI using your microphone
Orchestrate Assistants
Generate images
Summarize a video (local file / YouTube)
Turn text & images into high quality embeddings
Switch providers mid-conversation (OpenAI, Cohere, and Anthropic, with parallel tool calling & streaming):

https://github.com/lofcz/LlmTornado/assets/10260230/05c27b37-397d-4b4c-96a4-4138ade48dbe

... and a lot more! Now, instead of relying on one LLM provider, you can combine the unique strengths of many. Unlike OpenRouter and similar libraries, Tornado exposes these capabilities via seamlessly integrated vendor extensions, which can usually be invoked in a few lines of code.

⚡Getting Started

Install LLM Tornado via NuGet:

dotnet add package LlmTornado

Optional: extra features and quality-of-life extension methods are distributed in the Contrib addon:

dotnet add package LlmTornado.Contrib

🪄 Quick Inference

Running inference across multiple providers is as easy as changing the ChatModel argument. A Tornado instance can be constructed with multiple API keys; the correct key is then selected automatically based on the model:

TornadoApi api = new TornadoApi(new List<ProviderAuthentication>
{
    new ProviderAuthentication(LLmProviders.OpenAi, "OPEN_AI_KEY"),
    new ProviderAuthentication(LLmProviders.Anthropic, "ANTHROPIC_KEY"),
    new ProviderAuthentication(LLmProviders.Cohere, "COHERE_KEY"),
    new ProviderAuthentication(LLmProviders.Google, "GOOGLE_KEY"),
    new ProviderAuthentication(LLmProviders.Groq, "GROQ_KEY"),
    new ProviderAuthentication(LLmProviders.DeepSeek, "DEEP_SEEK_KEY"),
    new ProviderAuthentication(LLmProviders.Mistral, "MISTRAL_KEY")
});

List<ChatModel> models = [
    ChatModel.OpenAi.O3.Mini, ChatModel.Anthropic.Claude37.Sonnet,
    ChatModel.Cohere.Command.RPlus, ChatModel.Google.Gemini.Gemini2Flash,
    ChatModel.Groq.Meta.Llama370B, ChatModel.DeepSeek.Models.Chat,
    ChatModel.Mistral.Premier.MistralLarge
];

foreach (ChatModel model in models)
{
    string? response = await api.Chat.CreateConversation(model)
        .AppendSystemMessage("You are a fortune teller.")
        .AppendUserInput("What will my future bring?")
        .GetResponse();

    Console.WriteLine(response);
}

💡 Instead of a strongly typed model, you can pass a string: await api.Chat.CreateConversation("gpt-4o"). Tornado will automatically resolve the provider.
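To make the idea of picking a key from a model name concrete, here is a minimal, library-free sketch. It does not describe how Tornado resolves providers internally: the prefix table, the keysByProvider dictionary, and the ResolveKey function are hypothetical and assume, purely for illustration, that a provider can be guessed from a model-name prefix.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical sketch: all names below are illustrative, not LlmTornado internals.
Dictionary<string, string> keysByProvider = new()
{
    ["OpenAi"] = "OPEN_AI_KEY",
    ["Anthropic"] = "ANTHROPIC_KEY"
};

// Assumed mapping from model-name prefix to provider.
(string Prefix, string Provider)[] prefixes =
[
    ("gpt-", "OpenAi"),
    ("claude-", "Anthropic")
];

// Returns the API key registered for the provider that owns the model, or null.
string? ResolveKey(string modelName)
{
    foreach ((string prefix, string provider) in prefixes)
    {
        if (modelName.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
            && keysByProvider.TryGetValue(provider, out string? key))
        {
            return key;
        }
    }

    return null;
}

Console.WriteLine(ResolveKey("gpt-4o"));            // OPEN_AI_KEY
Console.WriteLine(ResolveKey("claude-3-7-sonnet")); // ANTHROPIC_KEY
Console.WriteLine(ResolveKey("unknown") is null);   // True
```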

❄️ Vendor Extensions

Tornado has a powerful concept of VendorExtensions which can be applied to various endpoints and are strongly typed. Many Providers offer unique/niche APIs, often enabling use cases otherwise unavailable. For example, let's set a reasoning budget for Anthropic's Claude 3.7:

public static async Task AnthropicSonnet37Thinking()
{
    Conversation chat = Program.Connect(LLmProviders.Anthropic).Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.Anthropic.Claude37.Sonnet,
        VendorExtensions = new ChatRequestVendorExtensions(new ChatRequestVendorAnthropicExtensions
        {
            Thinking = new AnthropicThinkingSettings
            {
                BudgetTokens = 2_000,
                Enabled = true
            }
        })
    });
    
    chat.AppendUserInput("Explain how to solve differential equations.");

    ChatRichResponse blocks = await chat.GetResponseRich();

    if (blocks.Blocks is not null)
    {
        foreach (ChatRichResponseBlock reasoning in blocks.Blocks.Where(x => x.Type is ChatRichResponseBlockTypes.Reasoning))
        {
            Console.ForegroundColor = ConsoleColor.DarkGray;
            Console.WriteLine(reasoning.Reasoning?.Content);
            Console.ResetColor();
        }

        foreach (ChatRichResponseBlock message in blocks.Blocks.Where(x => x.Type is ChatRichResponseBlockTypes.Message))
        {
            Console.WriteLine(message.Message);
        }
    }
}

🔮 Custom Providers

Instead of consuming commercial APIs, one can easily roll their own inference server with the myriad of tools available. Here is a simple demo of streaming a response with Ollama, but the same approach can be used for any custom provider:

public static async Task OllamaStreaming()
{
    TornadoApi api = new TornadoApi(new Uri("http://localhost:11434")); // default Ollama port
    
    await api.Chat.CreateConversation(new ChatModel("falcon3:1b")) // <-- replace with your model
        .AppendUserInput("Why is the sky blue?")
        .StreamResponse(Console.Write);
}

https://github.com/user-attachments/assets/de62f0fe-93e0-448c-81d0-8ab7447ad780

🔎 Advanced Inference

Streaming

Tornado offers several levels of abstraction, trading more details for more complexity. The simple use cases where only plaintext is needed can be represented in a terse format:

await api.Chat.CreateConversation(ChatModel.Anthropic.Claude3.Sonnet)
    .AppendSystemMessage("You are a fortune teller.")
    .AppendUserInput("What will my future bring?")
    .StreamResponse(Console.Write);

Streaming with Rich content

When plaintext is insufficient, switch to the StreamResponseRich() or GetResponseRich() APIs. Tools requested by the model can be resolved later and never returned to the model. This is useful in scenarios where we use the tools without intending to continue the conversation:

// Ask the model to generate two images, and stream the result:
public static async Task GoogleStreamImages()
{
    Conversation chat = api.Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.Google.GeminiExperimental.Gemini2FlashImageGeneration,
        Modalities = [ ChatModelModalities.Text, ChatModelModalities.Image ]
    });
    
    chat.AppendUserInput([
        new ChatMessagePart("Generate two images: a lion and a squirrel")
    ]);
    
    await chat.StreamResponseRich(new ChatStreamEventHandler
    {
        MessagePartHandler = async (part) =>
        {
            if (part.Text is not null)
            {
                Console.Write(part.Text);
                return;
            }

            if (part.Image is not null)
            {
                // In our tests this executes Chafa to turn the raw base64 data into Sixels
                await DisplayImage(part.Image.Url);
            }
        },
        BlockFinishedHandler = (block) =>
        {
            Console.WriteLine();
            return ValueTask.CompletedTask;
        },
        OnUsageReceived = (usage) =>
        {
            Console.WriteLine();
            Console.WriteLine(usage);
            return ValueTask.CompletedTask;
        }
    });
}

Tools with immediate resolve

Tools requested by the model can be resolved and the results returned immediately. This has the benefit of automatically continuing the conversation:

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.O,
    Tools =
    [
        new Tool(new ToolFunction("get_weather", "gets the current weather", new
        {
            type = "object",
            properties = new
            {
                location = new
                {
                    type = "string",
                    description = "The location for which the weather information is required."
                }
            },
            required = new List<string> { "location" }
        }))
    ]
})
.AppendSystemMessage("You are a helpful assistant")
.AppendUserInput("What is the weather like today in Prague?");

ChatStreamEventHandler handler = new ChatStreamEventHandler
{
    MessageTokenHandler = (x) =>
    {
        Console.Write(x);
        return Task.CompletedTask;
    },
    FunctionCallHandler = (calls) =>
    {
        calls.ForEach(x => x.Result = new FunctionResult(x, "A mild rain is expected around noon.", null));
        return Task.CompletedTask;
    },
    AfterFunctionCallsResolvedHandler = async (results, handler) => { await chat.StreamResponseRich(handler); }
};

await chat.StreamResponseRich(handler);

Tools with deferred resolve

Instead of resolving the tool call, we can postpone/quit the conversation. This is useful for extractive tasks, where we care only for the tool call:

Conversation chat = api.Chat.CreateConversation(new ChatRequest
{
    Model = ChatModel.OpenAi.Gpt4.Turbo,
    Tools = new List<Tool>
    {
        new Tool
        {
            Function = new ToolFunction("get_weather", "gets the current weather")
        }
    },
    ToolChoice = new OutboundToolChoice(OutboundToolChoiceModes.Required)
});

chat.AppendUserInput("Who are you?"); // user asks something unrelated, but we force the model to use the tool
ChatRichResponse response = await chat.GetResponseRich(); // the response contains one block of type Function

The GetResponseRichSafe() API is also available and is guaranteed not to throw on the network level. The response is wrapped in a network-level wrapper containing additional information. For production use cases, either wrap all Tornado APIs that produce HTTP requests in try {} catch {}, or use the safe APIs.
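The general idea of a non-throwing wrapper can be sketched without the library. This is not how GetResponseRichSafe() is implemented, and the TrySafe helper and its tuple shape are hypothetical. The sketch only shows the pattern of capturing an exception and returning it alongside the result, with simple stand-ins in place of real network calls.

```csharp
#nullable enable
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper (not an LlmTornado API): runs a request and captures any exception
// instead of letting it propagate to the caller.
async Task<(bool Ok, T? Data, Exception? Error)> TrySafe<T>(Func<Task<T>> request)
{
    try
    {
        T data = await request();
        return (true, data, null);
    }
    catch (Exception e)
    {
        return (false, default, e);
    }
}

// Stand-ins for real network calls.
var success = await TrySafe(() => Task.FromResult("A sunny day"));
var failure = await TrySafe<string>(() => throw new HttpRequestException("connection refused"));

Console.WriteLine(success.Ok);             // True
Console.WriteLine(success.Data);           // A sunny day
Console.WriteLine(failure.Ok);             // False
Console.WriteLine(failure.Error?.Message); // connection refused
```

With Tornado itself, the same try {} catch {} shape would surround calls such as chat.GetResponseRich(). Which exception types the library throws is not specified here, so the sketch catches the base Exception type.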

Simple frontend example - REPL

This interactive demo can be expanded into an end-user-facing interface in the style of ChatGPT. We show how to use strongly typed tools together with streaming and resolving parallel tool calls. ChatStreamEventHandler is a convenient class with a subscription interface for listening to the various streaming events:

public static async Task OpenAiFunctionsStreamingInteractive()
{
    // 1. set up a sample tool using a strongly typed model
    ChatPluginCompiler compiler = new ChatPluginCompiler();
    compiler.SetFunctions([
        new ChatPluginFunction("get_weather", "gets the current weather in a given city", [
            new ChatFunctionParam("city_name", "name of the city", ChatPluginFunctionAtomicParamTypes.String)
        ])
    ]);
    
    // 2. in this scenario, the conversation starts with the user asking for the current weather in two of the supported cities.
    // we can try asking for the weather in the third supported city (Paris) later.
    Conversation chat = api.Chat.CreateConversation(new ChatRequest
    {
        Model = ChatModel.OpenAi.Gpt4.Turbo,
        Tools = compiler.GetFunctions()
    }).AppendUserInput("Please call functions get_weather for Prague and Bratislava (two function calls).");

    // 3. repl
    while (true)
    {
        // 3.1 stream the response from llm
        await StreamResponse();

        // 3.2 read input
        while (true)
        {
            Console.WriteLine();
            Console.Write("> ");
            string? input = Console.ReadLine();

            if (input?.ToLowerInvariant() is "q" or "quit")
            {
                return;
            }
            
            if (!string.IsNullOrWhiteSpace(input))
            {
                chat.AppendUserInput(input);
                break;
            }
        }
    }

    async Task StreamResponse()
    {
        await chat.StreamResponseRich(new ChatStreamEventHandler
        {
            MessageTokenHandler = async (token) =>
            {
                Console.Write(token);
            },
            FunctionCallHandler = async (fnCalls) =>
            {
                foreach (FunctionCall x in fnCalls)
                {
                    if (!x.TryGetArgument("city_name", out string? cityName))
                    {
                        x.Result = new FunctionResult(x, new
                        {
                            result = "error",
                            message = "expected city_name argument"
                        }, null, true);
                        continue;
                    }

                    x.Result = new FunctionResult(x, new
                    {
                        result = "ok",
                        weather = cityName.ToLowerInvariant() is "prague" ? "A mild rain" : cityName.ToLowerInvariant() is "paris" ? "Foggy, cloudy" : "A sunny day"
                    }, null, true);
                }
            },
            AfterFunctionCallsResolvedHandler = async (fnResults, handler) =>
            {
                await chat.StreamResponseRich(handler);
            }
        });
    }
}

Other endpoints such as Images, Embedding, Speech, Assistants, Threads and Vision are also supported!
Check the links for simple-to-understand examples!

👉 Why Tornado?

50,000+ installs on NuGet under previous names Lofcz.Forks.OpenAI, OpenAiNg.
Used in commercial projects incurring charges of thousands of dollars monthly.
The license will never change. Looking at you HashiCorp and Tiny.
Supports streaming, functions/tools, modalities (images, audio), and strongly typed LLM plugins/connectors.
Great performance, nullability annotations.
Extensive test suite.
Maintained actively for two years, often with day 1 support for new features.

📚 Documentation

Most public classes, methods, and properties (90%+) are extensively XML documented. Feel free to open an issue here if you have any questions.

PRs are welcome!

💜 License

This library is licensed under the MIT license.

Product	Compatible and additional computed target framework versions.
.NET	net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed.

net8.0
- Newtonsoft.Json (>= 13.0.3)

NuGet packages (2)

Showing the top 2 NuGet packages that depend on LlmTornado:

Package	Description	Downloads
LlmTornado.Contrib	Provides extra functionality to LlmTornado.	937
LlmTornado.Demo	Package Description	91

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last updated
3.4.6	0	3/21/2025
3.4.5	44	3/20/2025
3.4.4	174	3/17/2025
3.4.3	65	3/15/2025
3.4.2	51	3/15/2025
3.4.1	49	3/15/2025
3.4.0	50	3/15/2025
3.3.2	238	3/9/2025
3.3.1	258	3/7/2025
3.3.0	237	3/2/2025
3.2.8	104	2/28/2025
3.2.7	108	2/26/2025
3.2.6	1,169	2/25/2025
3.2.5	124	2/20/2025
3.2.4	123	2/17/2025
3.2.3	525	2/10/2025
3.2.2	147	2/7/2025
3.2.1	78	2/6/2025
3.2.0	103	2/4/2025
3.1.35	86	2/2/2025
3.1.34	88	1/31/2025
3.1.33	165	1/24/2025
3.1.32	107	1/23/2025
3.1.31	110	1/22/2025
3.1.30	128	1/17/2025
3.1.29	157	12/19/2024
3.1.28	87	12/19/2024
3.1.27	136	12/14/2024
3.1.26	420	11/22/2024
3.1.25	88	11/22/2024
3.1.24	104	11/21/2024
3.1.23	100	11/20/2024
3.1.22	108	11/20/2024
3.1.21	113	11/18/2024
3.1.20	95	11/18/2024
3.1.19	98	11/17/2024
3.1.18	90	11/16/2024
3.1.17	162	11/5/2024
3.1.16	93	11/4/2024
3.1.15	225	10/22/2024
3.1.14	350	9/14/2024
3.1.13	170	9/1/2024
3.1.12	181	8/20/2024
3.1.11	129	8/18/2024
3.1.10	123	8/6/2024
3.1.9	103	8/6/2024
3.1.8	100	7/24/2024
3.1.7	87	7/24/2024
3.1.6	82	7/23/2024
3.1.5	123	7/19/2024
3.1.4	105	7/19/2024
3.1.3	167	6/23/2024
3.1.2	138	6/15/2024
3.1.1	112	6/15/2024
3.1.0	103	6/15/2024
3.0.17	114	6/8/2024
3.0.16	103	6/8/2024
3.0.15	155	5/21/2024
3.0.14	142	5/21/2024
3.0.13	139	5/20/2024
3.0.11	120	5/18/2024
3.0.10	122	5/15/2024
3.0.9	142	5/15/2024
3.0.8	124	5/9/2024
3.0.7	149	5/5/2024
3.0.6	113	5/2/2024
3.0.5	103	5/1/2024
3.0.4	105	5/1/2024
3.0.3	102	5/1/2024
3.0.2	101	5/1/2024
3.0.1	126	4/27/2024
3.0.0	128	4/27/2024

fix openai sse when tool parsing is disabled