DistributedHashMap 0.1.1

Install-Package DistributedHashMap -Version 0.1.1
dotnet add package DistributedHashMap --version 0.1.1
<PackageReference Include="DistributedHashMap" Version="0.1.1" />
paket add DistributedHashMap --version 0.1.1
#r "nuget: DistributedHashMap, 0.1.1"
// Install DistributedHashMap as a Cake Addin
#addin nuget:?package=DistributedHashMap&version=0.1.1

// Install DistributedHashMap as a Cake Tool
#tool nuget:?package=DistributedHashMap&version=0.1.1

distributed-hashmap

A distributed hashmap for Dapr.

This is a simple lock-free implementation of a hashmap that allows fast concurrent writes. It works almost like a regular hashmap, except that it stores its data in Dapr State instead of in memory.
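To make the idea concrete, here is a minimal sketch of a bucketed hashmap on top of a key/value state store. It is not the library's code. `FakeStateStore`, `bucket_key`, `put` and `get` are hypothetical names, the store is an in-memory stand-in for Dapr State, and the ETag-style compare-and-set is an assumption about how lock-free writes could be done (Dapr state stores do support ETags, but this README does not say how the package uses them).

```python
import hashlib
import json


class FakeStateStore:
    """In-memory stand-in for a key/value state store with ETags (hypothetical)."""

    def __init__(self):
        self._data = {}  # state key -> (etag, serialized value)

    def get(self, key):
        # Returns (etag, value); etag 0 and value None mean "no entry yet".
        return self._data.get(key, (0, None))

    def try_save(self, key, value, etag):
        # Compare-and-set: write only if the entry is unchanged since it was read.
        current_etag, _ = self._data.get(key, (0, None))
        if current_etag != etag:
            return False
        self._data[key] = (current_etag + 1, value)
        return True


def bucket_key(map_name, key, bucket_count):
    # Hash the key to pick the bucket (one state entry) that holds it.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % bucket_count
    return f"{map_name}_bucket_{index}"


def put(store, map_name, key, value, bucket_count=16):
    state_key = bucket_key(map_name, key, bucket_count)
    while True:  # retry on conflict instead of taking a lock
        etag, raw = store.get(state_key)
        bucket = json.loads(raw) if raw else {}
        bucket[key] = value
        if store.try_save(state_key, json.dumps(bucket), etag):
            return


def get(store, map_name, key, bucket_count=16):
    _, raw = store.get(bucket_key(map_name, key, bucket_count))
    return json.loads(raw).get(key) if raw else None
```

A writer that loses the compare-and-set simply re-reads the bucket and tries again, which is why no lock is needed in this model.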

Features

  1. Subscribe/unsubscribe to key changes/deletes/inserts.
  2. Supported in multiple languages.
  3. Store a large number of items without worrying about race conditions.

Implementations:

Limitations

There are several limitations with the current version:

Size

There's no way to get the number of keys in the hashmap without causing contention or iterating over every bucket. Therefore, no such function is provided. Open an issue if you want this anyway.
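As a rough illustration of why counting is expensive, the sketch below counts keys by reading every bucket. `count_keys` and `read_bucket` are hypothetical names, and the buckets are plain in-memory dicts standing in for state entries.

```python
def count_keys(read_bucket, bucket_count):
    """Count keys by reading every bucket; read_bucket(i) returns bucket i as a dict."""
    total = 0
    for index in range(bucket_count):
        total += len(read_bucket(index))  # one state read per bucket
    return total


# Example data: three buckets, and a reader that records each read it performs.
buckets = [{"a": 1}, {}, {"b": 2, "c": 3}]
reads = []


def read_bucket(index):
    reads.append(index)
    return buckets[index]


total_keys = count_keys(read_bucket, len(buckets))
```

The number of reads grows with the number of buckets, not with the number of keys asked about, so a size query on a large map touches the whole map.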

Rebuilding

When creating the hashmap, estimate the number of keys you expect to store and the maximum load of each bucket. If the expected number of keys is set too low or the maximum load is set too high, you may end up with too much contention or unexpected rebuilds of the hashmap.
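One way to sanity-check a sizing guess is to simulate it. The sketch below hashes synthetic keys into a given number of buckets and reports the fullest bucket. `max_bucket_load` and `would_rebuild` are hypothetical helpers, SHA-256 is an arbitrary choice of hash, and the rule "rebuild once any bucket holds the maximum load" is taken from the description in this section, not from the package's code.

```python
import hashlib


def max_bucket_load(key_count, bucket_count):
    """Hash key_count synthetic keys into bucket_count buckets; return the fullest bucket's size."""
    loads = [0] * bucket_count
    for i in range(key_count):
        digest = hashlib.sha256(f"key-{i}".encode("utf-8")).digest()
        loads[int.from_bytes(digest[:8], "big") % bucket_count] += 1
    return max(loads)


def would_rebuild(key_count, bucket_count, max_load):
    # A rebuild is expected once any single bucket holds max_load items.
    return max_bucket_load(key_count, bucket_count) >= max_load
```

Because keys never spread perfectly evenly, the fullest bucket is at least as large as the average load (keys divided by buckets), so the maximum load should leave headroom above that average.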

Once a hashmap bucket holds maxLoad items, a rebuild is triggered. Every reader and writer of the hashmap then immediately starts copying all keys from the hashmap into a new generation of the hashmap. Because readers and writers do not coordinate, each of them has to participate to ensure redundancy. Old keys from previous generations are not deleted. While this is fairly fast, it still takes several minutes once the hashmap grows beyond ~100,000 items.
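The rebuild described above can be modelled in a few lines. This is a single-process sketch, not the library's implementation: `GenerationalMap` and `bucket_index` are hypothetical names, doubling the bucket count on each rebuild is an assumption, and the uncoordinated participation of multiple readers and writers is not modelled.

```python
import hashlib


def bucket_index(key, bucket_count):
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count


class GenerationalMap:
    """In-memory model: each generation is a list of buckets (dicts)."""

    def __init__(self, bucket_count=4, max_load=3):
        self.max_load = max_load
        self.generations = [[{} for _ in range(bucket_count)]]

    def put(self, key, value):
        buckets = self.generations[-1]
        bucket = buckets[bucket_index(key, len(buckets))]
        bucket[key] = value
        if len(bucket) >= self.max_load:
            self._rebuild()  # bucket is full: start a new generation

    def get(self, key):
        buckets = self.generations[-1]
        return buckets[bucket_index(key, len(buckets))].get(key)

    def _rebuild(self):
        old = self.generations[-1]
        bucket_count = len(old) * 2  # assumption: double the bucket count
        new = [{} for _ in range(bucket_count)]
        for bucket in old:
            for key, value in bucket.items():
                new[bucket_index(key, bucket_count)][key] = value
        # The old generation is kept, mirroring "old keys are not deleted".
        self.generations.append(new)
```

Every key is copied on each rebuild, which is consistent with the rebuild time growing with the size of the map.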

Concurrency

In my experiments, there aren't many issues with concurrency except for how different languages approach parallel tasks. For example, forking in PHP has very little overhead, allowing over 2,000 threads to write to a hashmap concurrently, while C# tends to bog down once the number of actively writing threads exceeds the number of physical cores on the machine.

Performance

Performance is pretty good but could be better. Here are timings for continuous writes and reads on each platform:

language  operation  number of items  threads  time (s)
C#        writes               1,000       10      1.79
C#        reads                1,000        1      0.94
C#        writes              10,000       10     15.17
C#        reads               10,000        1     10.62
PHP       writes               1,000       10      1.82
PHP       reads                1,000        1      1.97
PHP       writes              10,000       10     18.54
PHP       reads               10,000        1     21.97

The largest difference with reading in PHP is that there's no async/await. Running reads on multiple threads gives roughly the same performance characteristics as C#, which indicates that the bottleneck is the sidecar.
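To compare the rows more easily, the timings in the table above can be converted to throughput (items per second). The snippet below only does that arithmetic on the published numbers. `RESULTS` and `throughput` are names introduced here for illustration.

```python
# (language, operation, items, threads, seconds), copied from the table above
RESULTS = [
    ("C#", "writes", 1_000, 10, 1.79),
    ("C#", "reads", 1_000, 1, 0.94),
    ("C#", "writes", 10_000, 10, 15.17),
    ("C#", "reads", 10_000, 1, 10.62),
    ("PHP", "writes", 1_000, 10, 1.82),
    ("PHP", "reads", 1_000, 1, 1.97),
    ("PHP", "writes", 10_000, 10, 18.54),
    ("PHP", "reads", 10_000, 1, 21.97),
]


def throughput(items, seconds):
    """Items processed per second of wall-clock time."""
    return items / seconds


for language, operation, items, threads, seconds in RESULTS:
    rate = throughput(items, seconds)
    print(f"{language:4} {operation:6} {items:>6} items, {threads:>2} threads: {rate:7.1f} items/s")
```

Note that the write rows use 10 threads and the read rows use 1, so write and read rates are not directly comparable per thread.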

Version  Downloads  Last updated
0.1.1    96         6/8/2021
0.1.0    105        6/8/2021
0.0.2    99         5/24/2021
0.0.1    97         5/24/2021

Fixes:

- Use ConfigureAwait(false)
- Possible loss of subscriptions on rebuild