Saturday, October 29, 2011

Eduasync part 1: introduction

I've been waiting to start this blog series for a couple of months. It's nice to finally get cracking.

Hopefully some of you have already read some of my thoughts around C# 5's async feature, mostly written last year. Since that initial flurry of posts, I've been pretty quiet, but I'm still really excited about it. Really, really excited. I've given a few talks on it, and I have a few more still to give - and this blog series will partly be complementary to those talks. In particular, there's a DevExpress webcast which covers most of the same ground, with similar code. (It was before the CTP refresh, and also before my laptop was stolen in a burglary, so the code here is a rewrite.)

Async from a compiler's point of view

Most of this blog series (at least the bits I anticipate at the moment) will deal with what the compiler does with async methods. (I haven't used async delegates much at all, but I can't imagine that the machinery is particularly different.)

As far as I've seen, most of the coverage on the web so far has dealt with using async. That's natural, logical and entirely proper. Oh, and a bit boring after a while. I like knowing how a feature works before I go too far using it. This is a personal idiosyncrasy, and if you're happy just using async with no "under the hood" details, that's absolutely fine. It's probably worth unsubscribing from my blog for a little while, that's all.

This can all be seen as pretty similar to my Edulinq series of posts, which is why I've called it Eduasync this time.

My plan is to walk you through what the C# compiler relies on - the types which are currently part of AsyncCtpLibrary.dll, and how it interacts with Task / Task<T> from .NET 4. We'll then look at the code generated by the compiler - essentially a state machine - and some of the less obvious aspects of it. I'll give examples of any bugs I've found in the CTP, just for the heck of it - and as a way of checking whether they're fixed in later versions. (Obviously I've let the C#/VB team know about these as I've come across them.)

I'll assume that you know the basics of using async - so if you don't, now would be a good time to look at the numerous resources on the Visual Studio Async home page. There are loads of videos, specs (including the C# spec changes, most importantly from my point of view) and more.

Get the source now

There's already quite a bit of source code (everything I'm currently planning on writing about, which is almost inevitably less than I'll actually end up writing about) on the Google Code Eduasync project. This takes a different approach from Edulinq - instead of just a couple of projects (production and tests, basically) I've got a separate project for each topic I want to talk about, with pretty minimal code for that topic. The reason for this is to show the evolution of the code - starting off with almost nothing, and progressing until we've got an implementation which achieves at least the bare-bones important bits of an async system.

I've numbered the projects within the solution, although the assemblies themselves don't have the same numbers. They all use a default namespace of just Eduasync, and they don't refer to each other. Each is meant to be self-contained - oh, and there are no references to AsyncCtpLibrary.dll. The whole point is to reimplement that library :) Of course, you'll still need the CTP installed to get the compiler changes.

The Google Code repository will also contain the blog posts eventually, including any diagrams I need to create (such as the one in a minute).

The three blocks and two boundaries

One of the things I've found important to think about in async is the various parts involved. I've ended up with a mental model like this:

[Diagram: three blocks (the caller, the async method, and the awaitable) with a boundary between each adjacent pair.]

The bits in blue and red are the ones we're focusing on here: the contents of the async method, and the boundaries between that and the code that calls it, and the tasks (or other awaitable types) that it awaits.

For most of this series we're not really going to care much about what the caller does with the result, or how the awaitable object behaves other than in terms of the methods and properties used by the C# 5 compiler. I'll discuss the flexibility afforded though - and how it doesn't extend to the "caller/async" boundary, only the "async/awaitable" boundary.
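
To make "the methods and properties used by the C# 5 compiler" a little more concrete, here's a minimal sketch of a custom awaitable. It assumes the shape the pattern took in the released C# 5 compiler (GetAwaiter, IsCompleted, OnCompleted and GetResult, plus the INotifyCompletion interface) rather than the CTP's version, which may differ in detail. The ImmediateAwaitable, ImmediateAwaiter and AwaitableDemo names are made up for illustration; they aren't part of Eduasync or AsyncCtpLibrary.dll.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical awaitable (illustration only): it has already "completed"
// with a fixed value by the time anyone awaits it.
public struct ImmediateAwaitable
{
    private readonly int value;

    public ImmediateAwaitable(int value) { this.value = value; }

    // The compiler looks for a GetAwaiter method on whatever is awaited.
    public ImmediateAwaiter GetAwaiter() { return new ImmediateAwaiter(value); }
}

// Hypothetical awaiter. INotifyCompletion is an assumption about the
// released C# 5 pattern; it isn't mentioned in the post itself.
public struct ImmediateAwaiter : INotifyCompletion
{
    private readonly int value;

    public ImmediateAwaiter(int value) { this.value = value; }

    // True here, so the async method carries on synchronously
    // and never needs to register a continuation.
    public bool IsCompleted { get { return true; } }

    // Only used when IsCompleted is false; this sketch just runs it inline.
    public void OnCompleted(Action continuation) { continuation(); }

    // Supplies the value of the await expression.
    public int GetResult() { return value; }
}

public static class AwaitableDemo
{
    public static async Task<int> AddOneAsync(int input)
    {
        int result = await new ImmediateAwaitable(input);
        return result + 1;
    }

    public static void Main()
    {
        Console.WriteLine(AddOneAsync(41).Result); // prints 42
    }
}
```

Nothing here derives from Task: the await expression compiles purely because the right members are present, which is the kind of flexibility the "async/awaitable" boundary allows. The return type of AddOneAsync, by contrast, still has to be one the compiler knows how to build.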

Just to give an explicit example of all of this, here's a simple little program to asynchronously determine the size of the Stack Overflow home page:

using System;
using System.Net;
using System.Threading.Tasks;

class Program
{
    // Caller (block 1)
    static void Main()
    {
        // Pass the URL of the Stack Overflow home page here
        Task<int> sizeTask = DownloadSizeAsync("");
        Console.WriteLine("In Main, after async method call...");
        Console.WriteLine("Size: {0}", sizeTask.Result);
    }

    // Async method (block 2)
    static async Task<int> DownloadSizeAsync(string url)
    {
        var client = new WebClient();
        // Awaitable (block 3)
        var awaitable = client.DownloadDataTaskAsync(url);
        Console.WriteLine("Starting await...");
        byte[] data = await awaitable;
        Console.WriteLine("Finished awaiting...");
        return data.Length;
    }
}

The comments should make it reasonably clear what the blocks in the diagram mean. It's not ideal in that the first two blocks are basically methods, whereas the third block is an object - but I've found that it still makes sense when we're thinking about the interactions involved at the boundaries. Notably:

  • How does the async method create an appropriate value to return to the caller?
  • How does the async method interact with the awaitable when it hits an "await" expression?
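
One rough way to picture those two questions is to write a tiny async method by hand, without the async and await keywords. The sketch below is only an approximation under stated assumptions: it uses TaskCompletionSource<T> as a stand-in for whatever the compiler really uses to build the returned task, it skips the state machine entirely, and it assumes the released .NET awaiter members on Task<T> (GetAwaiter, IsCompleted, OnCompleted, GetResult) rather than the CTP's. BoundariesDemo and LengthAsync are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class BoundariesDemo
{
    // Hand-written approximation (not compiler output) of:
    //   static async Task<int> LengthAsync(Task<byte[]> source)
    //   {
    //       byte[] data = await source;
    //       return data.Length;
    //   }
    public static Task<int> LengthAsync(Task<byte[]> source)
    {
        // Caller/async boundary: something has to create the task
        // which is handed back to the caller. TaskCompletionSource
        // is a stand-in chosen for this sketch.
        var tcs = new TaskCompletionSource<int>();

        // Async/awaitable boundary: ask the awaitable for an awaiter.
        var awaiter = source.GetAwaiter();

        // The rest of the method, packaged up so it can run later.
        Action continuation = () =>
        {
            try
            {
                byte[] data = awaiter.GetResult();
                tcs.SetResult(data.Length);
            }
            catch (Exception e)
            {
                // Simplification: cancellation is reported as a fault too.
                tcs.SetException(e);
            }
        };

        if (awaiter.IsCompleted)
        {
            // Already finished: carry on synchronously.
            continuation();
        }
        else
        {
            // Not finished: ask to be called back, then return to the caller.
            awaiter.OnCompleted(continuation);
        }
        return tcs.Task;
    }

    public static void Main()
    {
        var source = new TaskCompletionSource<byte[]>();
        Task<int> lengthTask = LengthAsync(source.Task);
        Console.WriteLine(lengthTask.IsCompleted); // prints False
        source.SetResult(new byte[5]);
        Console.WriteLine(lengthTask.Result); // prints 5
    }
}
```

The first bullet corresponds to the tcs lines, and the second to everything that touches awaiter. The two concerns barely overlap, even in a method this small.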

We can (and we're going to) look at these boundaries very separately. We'll start off with the first bullet, in part two, which will hopefully follow in the next few days.


