.NET Framework ConcurrentDictionary augmented with Lazy'1 reduces duplicated computation


Example

Problem

ConcurrentDictionary shines when it comes to instantly returning of existing keys from cache, mostly lock free, and contending on a granular level. But what if the object creation is really expensive, outweighing the cost of context switching, and some cache misses occur?

If the same key is requested from multiple threads, one of the objects resulting from colliding operations will be eventually added to the collection, and the others will be thrown away, wasting the CPU resource to create the object and memory resource to store the object temporarily. Other resources could be wasted as well. This is really bad.

Solution

We can combine ConcurrentDictionary<TKey, TValue> with Lazy<TValue>. The idea is that ConcurrentDictionary GetOrAdd method can only return the value which was actually added to the collection. The loosing Lazy objects could be wasted in this case too, but that's not much problem, as the Lazy object itself is relatively unexpensive. The Value property of the losing Lazy is never requested, because we are smart to only request the Value property of the one actually added to the collection - the one returned from the GetOrAdd method:

public static class ConcurrentDictionaryExtensions
{
    public static TValue GetOrCreateLazy<TKey, TValue>(
        this ConcurrentDictionary<TKey, Lazy<TValue>> d,
        TKey key,
        Func<TKey, TValue> factory)
    {
        return
            d.GetOrAdd(
                key,
                key1 =>
                    new Lazy<TValue>(() => factory(key1),
                    LazyThreadSafetyMode.ExecutionAndPublication)).Value;
    }
}

Caching of XmlSerializer objects can be particularly expensive, and there is a lot of contention at the application startup too. And there is more to this: if those are custom serializers, there will be a memory leak too for the rest of the process lifecycle. The only benefit of the ConcurrentDictionary in this case is that for the rest of the process lifecycle there will be no locks, but application startup and memory usage would be inacceptable. This is a job for our ConcurrentDictionary, augmented with Lazy:

private ConcurrentDictionary<Type, Lazy<XmlSerializer>> _serializers =
    new ConcurrentDictionary<Type, Lazy<XmlSerializer>>();

public XmlSerializer GetSerialier(Type t)
{
    return _serializers.GetOrCreateLazy(t, BuildSerializer);
}

private XmlSerializer BuildSerializer(Type t)
{
    throw new NotImplementedException("and this is a homework");
}