Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained
1.0.14
A newer version of this package is available. See the version list below for details.
Requires NuGet 3.3.0 or higher.
dotnet add package Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained --version 1.0.14
NuGet\Install-Package Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained -Version 1.0.14
<PackageReference Include="Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained" Version="1.0.14" />
paket add Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained --version 1.0.14
#r "nuget: Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained, 1.0.14"
// Install Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained as a Cake Addin
#addin nuget:?package=Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained&version=1.0.14
// Install Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained as a Cake Tool
#tool nuget:?package=Microsoft.ServiceFabricApps.FabricHealer.Windows.SelfContained&version=1.0.14
FabricHealer
Configuration as Logic and auto-mitigation in Service Fabric clusters
FabricHealer is a Service Fabric application that attempts to automatically fix a set of reliably solvable problems that can occur in Service Fabric applications (including containers), host virtual machines, and logical disks (scoped to space usage problems only). Most repairs use a set of Service Fabric API calls, but some, like Disk repair, are fully customizable. All repairs are safely orchestrated through the Service Fabric RepairManager system service. Repair workflow configuration is written as Prolog-like logic with supporting external predicates written in C#.
FabricHealer's Configuration-as-Logic feature is made possible by a new logic programming library for .NET, Guan. The fun starts when FabricHealer detects supported error or warning health events reported by FabricObserver.
FabricHealer is implemented as a stateless singleton service that runs on all nodes in a Linux or Windows Service Fabric cluster. It is a .NET Core 3.1 application and has been tested on Windows (2016/2019) and Ubuntu (16.04/18.04).
All warning and error health reports created by FabricObserver and subsequently repaired by FabricHealer are user-configured - developer control extends from unhealthy event source to related healing operations.
FabricObserver and FabricHealer are part of a family of highly configurable Service Fabric observability tools that work together to keep your clusters green.
To learn more about FabricHealer's configuration-as-logic model, click here.
FabricHealer requires that FabricObserver and the RepairManager (RM) service be deployed.
For VM-level repair, the InfrastructureService (IS) service must also be deployed.
Note: FabricHealer must be run under the LocalSystem account (see ApplicationManifest.xml) in order to function correctly. This means on Windows, by default, it will run as System user. On Linux, by default, it will run as root user. You do not have to make any changes to ApplicationManifest.xml for this to be the case.
Using FabricHealer
FabricHealer is a service specifically designed to auto-mitigate Service Fabric service issues that are generally the result of bugs in user code.
Let's say you have a service that leaks memory or ephemeral ports. You would use FabricHealer to keep the problem in check while your developers find the root cause and fix the bug(s) that lead to excessive resource usage. FabricHealer is a temporary mitigation, not a fix. This is how you should think about auto-mitigation generally: FabricHealer aims to keep your cluster green while you fix your bugs. With its configuration-as-logic support, you can easily specify that some repair for some service should only be attempted for n weeks or months, while your dev team fixes the underlying issues with the problematic service. FabricHealer should be thought of as a "disappearing task force" in that it can provide stability during times of instability, then "go away" when bugs are fixed.
FabricHealer comes with a number of already-implemented/tested target-specific logic rules. You will only need to modify existing rules to get going quickly. FabricHealer is a rule-based repair service and the rules are defined in logic. These rules also form FabricHealer's repair workflow configuration. This is what is meant by Configuration-as-Logic. The only use of XML-based configuration with respect to repair workflow is enabling automitigation (big on/off switch), enabling repair policies, and specifying rule file names. The rest is just the typical Service Fabric application configuration that you know and love. Most of the settings in Settings.xml are overridable parameters and you set the values in ApplicationManifest.xml. This enables versionless parameter-only application upgrades, which means you can change Settings.xml-based settings without redeploying FabricHealer.
Repair ephemeral port usage issue for application service process
## Ephemeral Ports - Number of ports in use for any SF service process belonging to the specified SF Application.
## Attempt the restart code package mitigation for the offending service if the number of ephemeral ports it has opened is greater than 5000.
## Maximum of 5 repairs within a 5 hour window.
Mitigate(AppName="fabric:/IlikePorts", MetricName="EphemeralPorts", MetricValue=?MetricValue) :- ?MetricValue > 5000,
TimeScopedRestartCodePackage(5, 05:00:00).
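The "maximum of 5 repairs within a 5 hour window" constraint in the rule above can be pictured as a count over a trailing time window. The Python sketch below is illustrative only: it is not FabricHealer or Guan code, the `TimeScopedLimiter` class and its `try_repair` method are hypothetical names, and it assumes the window is a sliding one measured back from the current time, which the rule comments do not confirm.

```python
from collections import deque
from datetime import datetime, timedelta


class TimeScopedLimiter:
    """Hypothetical model: allow at most max_repairs within a trailing window."""

    def __init__(self, max_repairs: int, window: timedelta):
        self.max_repairs = max_repairs
        self.window = window
        self._history = deque()  # timestamps of repairs already attempted

    def try_repair(self, now: datetime) -> bool:
        # Forget repairs that have aged out of the window (assumed sliding).
        while self._history and now - self._history[0] >= self.window:
            self._history.popleft()
        if len(self._history) >= self.max_repairs:
            return False  # limit reached: this repair is skipped
        self._history.append(now)
        return True


# Mirrors the arguments of TimeScopedRestartCodePackage(5, 05:00:00).
limiter = TimeScopedLimiter(max_repairs=5, window=timedelta(hours=5))
start = datetime(2022, 3, 15, 12, 0)

# Six triggers, ten minutes apart: the first five are allowed, the sixth is not.
results = [limiter.try_repair(start + timedelta(minutes=10 * i)) for i in range(6)]
```

Under this model, a further trigger five hours after the first repair is allowed again, because the oldest repair has left the window.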
Repair memory usage issue for application service process
## Memory - Percent In Use for any SF service process belonging to the specified SF Application.
## Attempt the restart code package mitigation for the offending service if the percentage (of total) physical memory it is consuming is at or exceeding 70.
## Maximum of 3 repairs within a 30 minute window.
Mitigate(AppName="fabric:/ILikeMemory", MetricName="MemoryPercent", MetricValue=?MetricValue) :- ?MetricValue >= 70,
TimeScopedRestartCodePackage(3, 00:30:00).
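To make the structure of the two rules above concrete, the following Python sketch treats each rule as data: the named arguments in the rule head (AppName, MetricName) select the rule, and the body constraint on the metric value decides whether the repair applies. This is a simplification for illustration, not how Guan evaluates logic; the `Rule` type, the `select_repair` function, and the `"RestartCodePackage"` label are hypothetical. Note the difference between the strict `>` in the ports rule and the inclusive `>=` in the memory rule.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    app_name: str                            # AppName in the rule head
    metric_name: str                         # MetricName in the rule head
    compare: Callable[[float, float], bool]  # constraint operator in the rule body
    threshold: float
    repair: str                              # illustrative label for the repair


# The two example rules, transcribed as data.
RULES = [
    Rule("fabric:/IlikePorts", "EphemeralPorts", operator.gt, 5000, "RestartCodePackage"),
    Rule("fabric:/ILikeMemory", "MemoryPercent", operator.ge, 70, "RestartCodePackage"),
]


def select_repair(app_name: str, metric_name: str, metric_value: float) -> Optional[str]:
    """Return the repair of the first rule whose head and constraint both match."""
    for rule in RULES:
        if (rule.app_name == app_name
                and rule.metric_name == metric_name
                and rule.compare(metric_value, rule.threshold)):
            return rule.repair
    return None  # no rule applies, so nothing is repaired
```

For example, `select_repair("fabric:/ILikeMemory", "MemoryPercent", 70)` selects the repair because the memory rule uses `>=`, while `select_repair("fabric:/IlikePorts", "EphemeralPorts", 5000)` returns `None` because the ports rule requires strictly more than 5000.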
Quickstart
To quickly learn how to use FabricHealer, please see the simple scenario-based examples.
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
- Microsoft.Logic.Guan (>= 1.0.4)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
Version | Downloads | Last updated |
---|---|---|
1.2.15 | 201 | 10/31/2024 |
1.2.14 | 252 | 7/11/2024 |
1.2.13 | 602 | 3/20/2024 |
1.2.12 | 309 | 2/29/2024 |
1.2.11 | 582 | 10/19/2023 |
1.2.10 | 335 | 10/9/2023 |
1.2.9 | 359 | 9/25/2023 |
1.2.8 | 326 | 9/20/2023 |
1.2.7 | 346 | 9/14/2023 |
1.2.5 | 332 | 5/25/2023 |
1.2.4 | 330 | 5/8/2023 |
1.2.3 | 415 | 5/3/2023 |
1.2.2 | 445 | 4/17/2023 |
1.1.0.960 | 1,078 | 7/12/2022 |
1.1.0.831 | 732 | 7/12/2022 |
1.0.14 | 762 | 3/15/2022 |
1.0.13 | 683 | 2/9/2022 |
1.0.12 | 666 | 1/24/2022 |
Added support for new FabricObserver 3.1.25 - new ephemeral ports metric (Percentage in use of total dynamic ports configured for machine).
Updated Disk logic rules with Folder Size Warning repair workflow.
Added more descriptions to all rules files to help clarify how to compose successful related logic.
Added ObserverName named argument to Mitigate CompoundTerm (e.g., Mitigate(ObserverName=DiskObserver) :- ...).
Added GetRepairRulesForSupportedObserver function to add more flexibility to getting related rules Lists. This will help limit required FH code changes to support new FO capabilities.
Renamed rules text files to '[repair type].guan'. Ex: AppRules.guan, DiskRules.guan, etc.
EnableTelemetryProvider is now an Application Parameter.
Code improvements.