Scott Hanselman

Comparing two techniques in .NET Asynchronous Coordination Primitives

December 11, '12 Comments [13] Posted in Back to Basics | Learning .NET

Last week, in my post on updating my Windows Phone 7 application to Windows 8, I shared some code from Michael L. Perry that protects access to a shared resource with a critical section in a way that works comfortably with the new await/async keywords. Protecting shared resources like files is a little more subtle now that asynchrony is so easy. We'll see this more and more as Windows 8 and Windows Phone 8 promote the idea that apps shouldn't block for anything.

After that post, my friend and mentor (he doesn't know he's my mentor but I just decided that he is just now) Stephen Toub, expert on all things asynchronous, sent me an email with some excellent thoughts and feedback on this technique. I include some of that email here with permission as it will help us all learn!

I hadn’t seen the Awaitable Critical Section helper you mention below before, but I just took a look at it, and while it’s functional, it’s not ideal.  For a client-side solution like this, it’s probably fine.  If this were a server-side solution, though, I’d be concerned about the overhead associated with this particular implementation.

I love Stephen Toub's feedback on all things. Always firm but kind. Stephen Cleary makes a similar observation in the comments and also points out that immediately disabling the button works too. ;) It's also worth noting that Cleary's excellent AsyncEx library has lots of async-ready primitives and supports both Windows Phone 8 and 7.5.

The SemaphoreSlim class was updated on .NET 4.5 (and Windows Phone 8) to support async waits. You would have to build your own IDisposable Release, though. (In the situation you describe, I usually just disable the button at the beginning of the async handler and re-enable it at the end; but async synchronization would work too).
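To make Cleary's suggestion concrete, here is a minimal sketch of using SemaphoreSlim.WaitAsync directly, with try/finally standing in for the IDisposable releaser. The GuardedStore class, its members, and the Task.Delay placeholder for real file I/O are all hypothetical names invented for illustration; they are not from the original app.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical example class; names are illustrative only.
public sealed class GuardedStore
{
    // One permit: at most one caller is inside the protected region at a time.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int _writers; // callers currently inside the protected region

    public int Saves { get; private set; }
    public int MaxObservedWriters { get; private set; }

    public async Task SaveAsync()
    {
        await _gate.WaitAsync(); // asynchronous wait; no thread is blocked
        try
        {
            int now = ++_writers;
            if (now > MaxObservedWriters) MaxObservedWriters = now;

            await Task.Delay(10); // stand-in for real asynchronous file I/O

            Saves++;
            _writers--;
        }
        finally
        {
            _gate.Release(); // always hand the permit back, even on failure
        }
    }
}
```

Starting several SaveAsync calls at once should leave MaxObservedWriters at 1, since each caller waits its turn at the gate. The try/finally is the part that a "using"-friendly releaser would replace.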

Ultimately what we're trying to do is create "Async Coordination Primitives" and Toub talked about this in February.

Here, in layman's terms, is what we're trying to do, why it's interesting, and a definition of a Coordination Primitive (stolen from MSDN):

Asynchronous programming is hard because there is no simple method to coordinate between multiple operations, deal with partial failure (one of many operations fail but others succeed) and also define execution behavior of asynchronous callbacks, so they don't violate some concurrency constraint. For example, they don't attempt to do something in parallel. [Coordination Primitives] enable and promote concurrency by providing ways to express what coordination should happen.

In this case, we're trying to handle locking when using async, which is just one kind of coordination primitive. From Stephen Toub's blog:

Here, we’ll look at building support for an async mutual exclusion mechanism that supports scoping via ‘using.’

I previously blogged about a similar solution (http://blogs.msdn.com/b/pfxteam/archive/2012/02/12/10266988.aspx), which would result in a helper class like this:

Here Toub uses the new lightweight SemaphoreSlim class and indulges our love of the "using" pattern to create something very lightweight.

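using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class AsyncLock
{
    // A semaphore with a single permit acts as the mutex.
    private readonly SemaphoreSlim m_semaphore = new SemaphoreSlim(1, 1);
    // Cached, already-completed task handed out on the uncontended fast path.
    private readonly Task<IDisposable> m_releaser;

    public AsyncLock()
    {
        m_releaser = Task.FromResult((IDisposable)new Releaser(this));
    }

    public Task<IDisposable> LockAsync()
    {
        var wait = m_semaphore.WaitAsync();
        // Fast path: the lock was acquired synchronously, so return the cached task.
        // Slow path: hand back the releaser once the semaphore grants the lock.
        return wait.IsCompleted ?
            m_releaser :
            wait.ContinueWith((_, state) => (IDisposable)state,
                m_releaser.Result, CancellationToken.None,
                TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
    }

    // Disposing the releaser (at the end of a 'using' block) frees the lock.
    private sealed class Releaser : IDisposable
    {
        private readonly AsyncLock m_toRelease;
        internal Releaser(AsyncLock toRelease) { m_toRelease = toRelease; }
        public void Dispose() { m_toRelease.m_semaphore.Release(); }
    }
}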

How lightweight and how is this different from the previous solution? Here's Stephen Toub, emphasis mine.

There are a few reasons I’m not enamored with the referenced AwaitableCriticalSection solution. 

First, it has unnecessary allocations; again, not a big deal for a client library, but potentially more impactful for a server-side solution.  An example of this is that often with locks, when you access them they’re uncontended, and in such cases you really want acquiring and releasing the lock to be as low-overhead as possible; in other words, accessing uncontended locks should involve a fast path.  With AsyncLock above, you can see that on the fast path where the task we get back from WaitAsync is already completed, we’re just returning a cached already-completed task, so there’s no allocation (for the uncontended path where there’s still count left in the semaphore, WaitAsync will use a similar trick and will not incur any allocations).
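The fast path Toub describes can be observed directly on SemaphoreSlim. The sketch below is illustrative only (FastPathDemo and Observe are made-up names): it shows that WaitAsync returns an already-completed task while a permit is still available, and a pending task once the permit is taken. That IsCompleted check is the branch AsyncLock.LockAsync relies on.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; not part of Toub's code.
public static class FastPathDemo
{
    public static void Observe(out bool uncontendedCompleted, out bool contendedCompleted)
    {
        var semaphore = new SemaphoreSlim(1, 1);

        // Uncontended: a permit is available, so the wait finishes synchronously.
        Task first = semaphore.WaitAsync();
        uncontendedCompleted = first.IsCompleted;

        // Contended: the only permit is held, so this task stays pending.
        Task second = semaphore.WaitAsync();
        contendedCompleted = second.IsCompleted;

        semaphore.Release(); // hands the permit to the pending waiter
        second.Wait();       // the waiter completes shortly after the release
        semaphore.Release(); // return the permit taken by the second wait
    }
}
```

In the uncontended case AsyncLock can return its cached releaser task without scheduling a continuation; only the contended case pays for ContinueWith.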

Lots here to parse. One of the interesting meta-points is that a simple client-side app with a user interacting (like my app) has VERY different behaviors than a high-throughput server-side application. Translation? I can get away with a lot more on the client side...but should I when I don't have to?

His solution requires fewer allocations and zero garbage collections.

Overall, it’s also just much more unnecessary overhead.  A basic microbenchmark shows that in the uncontended case, AsyncLock above is about 30x faster with 0 GCs (versus a bunch of GCs in the AwaitableCriticalSection example).  And in the contended case, it looks to be about 10-15x faster.

Here's the microbenchmark comparing the two...remembering of course there's, "lies, damned lies, and microbenchmarks," but this one is pretty useful. ;)

using System;
using System.Diagnostics;
using System.Threading.Tasks;

// AwaitableCriticalSection is the helper from the earlier post; it is not defined here.
class Program
{
    static void Main()
    {
        const int ITERS = 100000;
        while (true)
        {
            Run("Uncontended AL ", () => TestAsyncLockAsync(ITERS, false));
            Run("Uncontended ACS", () => TestAwaitableCriticalSectionAsync(ITERS, false));
            Run("Contended AL ", () => TestAsyncLockAsync(ITERS, true));
            Run("Contended ACS", () => TestAwaitableCriticalSectionAsync(ITERS, true));
            Console.WriteLine();
        }
    }

    // Times one test run and prints the elapsed milliseconds.
    static void Run(string name, Func<Task> test)
    {
        var sw = Stopwatch.StartNew();
        test().Wait();
        sw.Stop();
        Console.WriteLine("{0}: {1}", name, sw.ElapsedMilliseconds);
    }

    static async Task TestAsyncLockAsync(int iters, bool contended)
    {
        var mutex = new AsyncLock();
        if (contended)
        {
            // Queue up every waiter while the lock is held, then drain them in order.
            var waits = new Task<IDisposable>[iters];
            using (await mutex.LockAsync())
                for (int i = 0; i < iters; i++)
                    waits[i] = mutex.LockAsync();
            for (int i = 0; i < iters; i++)
                using (await waits[i]) { }
        }
        else
        {
            // Acquire and release repeatedly with no other waiters.
            for (int i = 0; i < iters; i++)
                using (await mutex.LockAsync()) { }
        }
    }

    static async Task TestAwaitableCriticalSectionAsync(int iters, bool contended)
    {
        var mutex = new AwaitableCriticalSection();
        if (contended)
        {
            // Same pattern as above, using the AwaitableCriticalSection API.
            var waits = new Task<IDisposable>[iters];
            using (await mutex.EnterAsync())
                for (int i = 0; i < iters; i++)
                    waits[i] = mutex.EnterAsync();
            for (int i = 0; i < iters; i++)
                using (await waits[i]) { }
        }
        else
        {
            for (int i = 0; i < iters; i++)
                using (await mutex.EnterAsync()) { }
        }
    }
}

Stephen Toub is using SemaphoreSlim, the "lightest weight" option available, rather than RegisterWaitForSingleObject:

Second, and more importantly, the AwaitableCriticalSection is using a fairly heavy synchronization mechanism to provide the mutual exclusion.  The solution is using Task.Factory.FromAsync(IAsyncResult, …), which is just a wrapper around ThreadPool.RegisterWaitForSingleObject (see http://blogs.msdn.com/b/pfxteam/archive/2012/02/06/10264610.aspx).  Each call to this is asking the ThreadPool to have a thread block waiting on the supplied ManualResetEvent, and then to complete the returned Task when the event is set.  Thankfully, the ThreadPool doesn’t burn one thread per event, and rather groups multiple events together per thread, but still, you end up wasting some number of threads (IIRC, it’s 63 events per thread), so in a server-side environment, this could result in degraded behavior.
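For a feel of what that heavier mechanism involves, here is a rough sketch of turning a WaitHandle into a Task with ThreadPool.RegisterWaitForSingleObject. It is an illustration of the general shape only, not the actual internals of Task.Factory.FromAsync or of AwaitableCriticalSection; the WaitHandleTasks class and ToTask method are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; the class and method names are illustrative only.
public static class WaitHandleTasks
{
    public static Task ToTask(WaitHandle handle)
    {
        var tcs = new TaskCompletionSource<bool>();

        // Ask the ThreadPool to watch the handle. One of its wait threads
        // blocks on our behalf and runs the callback when the handle is set.
        RegisteredWaitHandle registration = ThreadPool.RegisterWaitForSingleObject(
            handle,
            (state, timedOut) => tcs.TrySetResult(true),
            null,             // no state object needed
            Timeout.Infinite, // wait with no timeout
            true);            // fire the callback only once

        // Once the task has completed, drop the registration.
        tcs.Task.ContinueWith(_ => registration.Unregister(null));
        return tcs.Task;
    }
}
```

Every pending task produced this way is backed by a registered wait on a pool thread, which is the cost Toub is pointing at for server-side code. SemaphoreSlim.WaitAsync avoids it by tracking waiters itself.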

All in all, an education for me - and I hope you, Dear Reader - as well as a few important lessons.

  • Know what's happening underneath if you can.
  • Code Reviews are always a good thing.
  • Ask someone smarter.
  • Performance may not matter in one context but it can in another.
  • You can likely get away with this or that, until you totally can't. (Client vs. Server)

Thanks Stephen Toub and Stephen Cleary!

About Scott

Scott Hanselman is a former professor, former Chief Architect in finance, now speaker, consultant, father, diabetic, and Microsoft employee. I am a failed stand-up comic, a cornrower, and a book author.

Tuesday, December 11, 2012 6:52:54 PM UTC
You learn something new every day :)
Tuesday, December 11, 2012 6:55:33 PM UTC
Hi Scott, thank you for that great stuff! This will give me some days to think and learn about :)
awsomedevsigner
Tuesday, December 11, 2012 7:31:06 PM UTC
Interesting. I'm also wondering how both of these compare in time/space utilization to the Reactive Extensions-based approach utilized by ReactiveUI's ReactiveCommandAsync to solve the same original problem (no more than one concurrent background operation).

Offhand, I'd guess the SemaphoreSlim is still the performance winner here, although given the LINQ-esque nature of Rx I'm not certain if mayhaps it might optimize towards that. Probably worth investigating in greater depth, by someone at some point.
Tuesday, December 11, 2012 8:24:00 PM UTC
I'm just wondering whether an async lock would perform better even if a sync lock would also do. For example, assume we have a method
object GetSomething(){
lock(syncRoot){
...
}
}


which is heavily contended, would we gain performance by changing it to
async Task<object> GetSomethingAsync(){
using(await xyz.LockAsync()){
...
}
}


(and of course change all calling code as well). I think it's context switching vs. scheduling a task continuation (and other await magic).

Bluesman
Daniel Weber
Tuesday, December 11, 2012 10:10:25 PM UTC
Hello Scott,
Very interesting article!
I have researched this a lot lately and found only Stephen's articles. I'm involved in the Siaqodb project; the WinRT version of our product has a completely async API, and we need mutual exclusion on our API methods so that a database call ensures async thread-safe access to a db file.
Our product is client side, and we chose the SemaphoreSlim WaitAsync() approach, which works perfectly for us. One very important note, though: it does not support reentrancy.

This is quite problematic because you may have a recursive method, or one that is not directly recursive: say methodA calls methodB, methodB calls methodC, and methodC calls methodA again. If this happens, all the solutions above will deadlock.
So it would be very interesting to see this subject debated more :) and to discuss possible solutions...
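The reentrancy point in the comment above is easy to reproduce. The sketch below is illustrative (ReentrancyDemo and TryReenterAsync are made-up names) and uses a timeout on the nested wait so the demo ends instead of hanging; with a plain WaitAsync() the nested call would never complete.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; names are illustrative only.
public static class ReentrancyDemo
{
    // Returns true only if the nested acquisition succeeded.
    public static async Task<bool> TryReenterAsync()
    {
        var gate = new SemaphoreSlim(1, 1);

        await gate.WaitAsync(); // outer acquisition takes the only permit
        try
        {
            // Nested acquisition from the same logical flow. SemaphoreSlim has
            // no notion of an owner, so this waits like any other caller.
            bool reentered = await gate.WaitAsync(TimeSpan.FromMilliseconds(200));
            if (reentered) gate.Release();
            return reentered;
        }
        finally
        {
            gate.Release(); // release the outer acquisition
        }
    }
}
```

TryReenterAsync completes with false: the nested wait times out because the permit is still held by the same code path.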



Cristoph
Wednesday, December 12, 2012 4:10:50 AM UTC
Thanks to you and Stephen for the review. Even though I use that code just on the client side, I will update my library to use AsyncLock. Code has a way of meandering onto platforms for which it was not intended.
Wednesday, December 12, 2012 9:01:24 AM UTC
Think my brain just exploded.
Phil Murray
Wednesday, December 12, 2012 9:16:14 AM UTC
What would go well at the top of this article would be a paragraph starting with:

"In layman's terms a Coordination Primitive is..."
Mike
Wednesday, December 12, 2012 2:22:40 PM UTC
@Mike - (I suspect that perhaps you may already know this and were simply suggesting a way to make the article more approachable for the lay-geek, but) some real-world examples of coordination primitives would be traffic lights, stop signs, and school crossing guards...all are used to ensure that simultaneous access to the intersections won't cause crashes or people to get trampled. In the same way, concurrently executing threads must guard against running over each other or otherwise stomping on mutually shared resources and so there are facilities provided (some by the operating system and some by the language itself) to act like the traffic coordination facilities listed previously. Wikipedia (which incidentally, recently celebrated 750 years of American independence), has loads of examples for the curious.

Cheers.
Brian
Wednesday, December 12, 2012 3:53:46 PM UTC
So essentially what has happened here is that, they changed the API so that you can't use synchronous calls, and went for 100% asynchronous only because they essentially don't trust developers to use synchronous sparingly and in turn we are now forced to do these ridiculous blocking and awaits to force synchronicity into an asynchronous only API.

Did I get this right?
Eric Malamisura
Wednesday, December 12, 2012 5:39:50 PM UTC
Eric - They ADDED async and await support, which I want to use because it makes the UI more responsive.
Friday, December 14, 2012 4:56:23 AM UTC
Hi Scott,

AsyncLock and Releaser class are depending on each other. None of the class can be compiled independently. Though the solution looks good, the design principle is perfectly violated :).

Regards,
Rajesh
Rajesh
Sunday, December 16, 2012 1:25:12 AM UTC
Hi Scott. I reformatted the AsyncLock class, which helped me understand its logic. Maybe it will be useful for someone else:
http://pol84.tumblr.com/post/38024311178/aynclock-less-cryptic-formatting
Pol