Wednesday, November 30, 2011

Switched to LuaLaTeX


Have you ever tried to write a slightly complex command in LaTeX? I have on a few occasions, and in the end it somehow worked, but it has always been ugly. LuaTeX/LuaLaTeX, however, provides real scripting within your documents:

\directlua{
  for i=0, 15 do
    tex.print("Math: $x_{" .. i .. "}$")
  end
}

That is just awesome. In plain LaTeX this would be ugly, and it gets even uglier if you have to deal with floating-point numbers, external files etc. Well, for now I do not need any complex macros, so I cannot report actual experience with the Lua side, but I did encounter some problems when translating my LaTeX document to LuaLaTeX two weeks ago.

unicode-math does not work properly

When enabling the unicode-math package to use Unicode characters in formulas (⊂∀∃∂ etc.), I have to select a font using the \setmathfont command, otherwise Unicode symbols do not get displayed. I tried “Latin Modern Math”, “STIXGeneral”, “XITS Math”, “Neo Euler” and “Asana Math”. However, with all of these fonts formulas do not look as good as with the standard LaTeX lmodern package, which is usable from LuaLaTeX too, but \setmathfont overrides it. Some of the fonts are too bold, some have ugly ℝ, ℚ, ℂ (\mathbb) and \mathcal symbols, etc. Thus I decided to port the uniinput package provided by the Neo project (they create the keyboard layout I am using) to LuaLaTeX. I thought it would be a nice opportunity to check out the Lua capabilities; however, I faced the next problem.

Lua is not Unicode aware

That is really annoying: LuaLaTeX’s claim is to support a) sophisticated scripting and b) native Unicode. However, they chose Lua as the scripting language, which does not support Unicode natively. I could not find the functions I needed to write a succinct macro for declaring a Unicode character in math mode (for example, ℝ should be replaced with \mathbb{R}), simply something to split an 8-bit UTF-8 string into its Unicode characters and to convert between character codes and strings. I did not want to write that myself, so I chose a quick and dirty way: using some regexp magic and a Ruby script to convert uniinput.sty into uniinput-lualatex.sty. It works now, you can use it if you want to…
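For illustration, here is a Python sketch of the two primitives the post says were missing: splitting a string into Unicode code points, and mapping each math character to a LaTeX command (the table and function names here are my own, not taken from the actual uniinput port).

```python
def codepoints(s):
    """Split a string into its Unicode code points."""
    return [ord(ch) for ch in s]

# Character <-> code point conversions
assert ord("ℝ") == 0x211D
assert chr(0x211D) == "ℝ"

# A declaration table like the one uniinput effectively builds:
# map each Unicode math character to the LaTeX command it expands to.
math_chars = {"ℝ": r"\mathbb{R}", "ℚ": r"\mathbb{Q}", "∂": r"\partial"}

def to_latex(s):
    """Replace known Unicode math characters with LaTeX commands."""
    return "".join(math_chars.get(ch, ch) for ch in s)

print(to_latex("ℝ and ℚ"))  # \mathbb{R} and \mathbb{Q}
```

In Lua 5.1 (as shipped with LuaTeX at the time) strings were byte arrays, so even this much required manual UTF-8 decoding; later Lua versions added a utf8 library for exactly this.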

Making it work with KileIP

KileIP currently has the latex command hard-coded to display previews of formulas. I was too lazy to fix that, and I wanted to be able to fall back to LaTeX if there were unexpected problems, so I made my document work with both LaTeX and LuaLaTeX:

\usepackage{iftex}
\ifPDFTeX
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
  \usepackage{uniinput}
  \usepackage{lmodern}
  \usepackage{amsmath}
  \usepackage{amssymb}
\else
  \ifLuaTeX
    \usepackage{luatextra}
    \usepackage{amsmath}
    \usepackage{amssymb}
    \usepackage{lmodern}
    \usepackage{uniinput-lualatex}
    \usepackage{fontspec}
  \fi
\fi

Well, next time I need a complex macro I will certainly use Lua and it will hopefully work with my setup. :)



Eduasync part 13: first look at coroutines with async

(This part covers project 18 in the source code.)

As I mentioned in earlier parts, the "awaiting" part of async methods is in no way limited to tasks. So long as we have a suitable GetAwaiter() method which returns a value of a type which in turn has suitable methods on it, the compiler doesn't really care what's going on. It's time to exploit that to implement some form of coroutines in C#.

Introduction to coroutines

The fundamental idea of coroutines is to have multiple methods executing cooperatively, each of them maintaining their position within the coroutine when they yield to another. You can almost think of them as executing in multiple threads, with only one thread actually running at a time, and signalling between the different threads to control flow. However, we don't really need multiple threads once we've got continuations - we can have a single thread with a complex flow of continuations, and still only a very short "real" stack. (The control flow is stored in normal collections instead of being implicit on the thread's stack.)

Coroutines were already feasible in C# through the use of iterator blocks, but the async feature of C# allows a slightly more natural way of expressing them, in my view. (The linked Wikipedia page gives a sketch of how coroutines can be built on top of generators, which is the general concept that iterator blocks implement in C#.)

I have implemented various flavours of coroutines in Eduasync. It's possible that some (all?) of them shouldn't strictly be called coroutines, but they're close enough to the real thing in feeling. This is far from an exhaustive set of approaches. Once you've got the basic idea of what I'm doing, you may well want to experiment with your own implementations.

I'm not going to claim that the use of coroutines in any of my examples really makes any sense in terms of making real tasks easier. This is purely for the sake of interest and twisting the async feature for fun.

Round-robin independent coroutines

Our first implementation of coroutines is relatively simple. A coordinator effectively "schedules" the coroutines it's set up with in a round-robin fashion: when one of the coroutines yields control to the coordinator, the coordinator remembers where the coroutine had got to, and then starts the next one. When each coroutine has executed its first piece of code and yielded control, the coordinator will go back to the first coroutine to continue execution, and so on until all coroutines have completed.
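The scheduling just described can be sketched in Python, with generators standing in for async methods: each yield plays the role of "await coordinator", and the coordinator keeps a queue of paused coroutines. (The names and structure here are mine, not from the Eduasync source.)

```python
from collections import deque

def coordinator(coroutines):
    """Run generator-based coroutines round-robin until all finish."""
    queue = deque(gen() for gen in coroutines)
    while queue:
        current = queue.popleft()
        try:
            next(current)              # run until the next yield
            queue.append(current)      # it yielded: back of the queue
        except StopIteration:
            pass                       # it returned: drop it

def first():
    print("Starting first")
    yield
    print("Returned to first")

def second():
    print("Starting second")
    yield
    print("Returned to second")

coordinator([first, second])
# Starting first / Starting second / Returned to first / Returned to second
```

The interleaved output mirrors the C# sample below: every coroutine runs to its first yield before any of them is resumed.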

The coroutines don't know about each other, and no data is being passed between them.

Hopefully it's reasonably obvious that the coordinator contains all the smarts here - the coroutines themselves can be relatively dumb. Let's look at what the client code looks like (along with the results) before we get to the coordinator code.

Client code

The sample code contains three coroutines, all of which take a Coordinator parameter and have a void return type. These are passed to a new coordinator using a collection initializer and method group conversions; the coordinator is then started. Here's the entry point code for this:

private static void Main(string[] args)
{
    var coordinator = new Coordinator {
        FirstCoroutine,
        SecondCoroutine,
        ThirdCoroutine
    };
    coordinator.Start();
}

When each coroutine is initially started, the coordinator passes a reference to itself as the argument to the coroutine. That's how we solve the chicken-and-egg problem of the coroutine and coordinator having to know about each other. The way a coroutine yields control is simply by awaiting the coordinator. The result type of this await expression is void - it's just a way of "pausing" the coroutine.

We're not doing anything interesting in the actual coroutines - just tracing the execution flow. Of course we could do anything we wanted, within reason. We could even await a genuinely asynchronous task such as fetching a web page asynchronously. In that case the whole coroutine collection would be "paused" until the fetch returned.

Here's the code for the first coroutine - the second and third ones are similar, but use different indentation for clarity. The third coroutine is also shorter, just for fun - it only awaits the coordinator once.

private static async void FirstCoroutine(Coordinator coordinator)
{
    Console.WriteLine("Starting FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine...");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine");
    Console.WriteLine("Yielding from FirstCoroutine again...");

    await coordinator;

    Console.WriteLine("Returned to FirstCoroutine again");
    Console.WriteLine("Finished FirstCoroutine");
}

And here's the output...

Starting FirstCoroutine
Yielding from FirstCoroutine...
    Starting SecondCoroutine
    Yielding from SecondCoroutine...
        Starting ThirdCoroutine
        Yielding from ThirdCoroutine...
Returned to FirstCoroutine
Yielding from FirstCoroutine again...
    Returned to SecondCoroutine
    Yielding from SecondCoroutine again...
        Returned to ThirdCoroutine
        Finished ThirdCoroutine
Returned to FirstCoroutine again
Finished FirstCoroutine
    Returned to SecondCoroutine again
    Finished SecondCoroutine

Hopefully that's the output you expected, given the earlier description. Again it may help if you think of the coroutines as running in separate pseudo-threads: the execution within each pseudo-thread is just linear, and the timing is controlled by our explicit "await" expressions. All of this would actually be pretty easy to implement using multiple threads which really did just block on each await expression - but the fun part is keeping it all in one real thread. Let's have a look at the coordinator.

The Coordinator class

Some of the later coroutine examples end up being slightly brainbusting, at least for me. This one is relatively straightforward though, once you've got the basic idea. All we need is a queue of actions to execute. After initialization, we want our queue to contain the coroutine starting points.

When a coroutine yields control, we just need to add the remainder of it to the end of the queue, and move on to the next item. Obviously the async infrastructure will provide "the remainder of the coroutine" as a continuation via the OnCompleted method.

When a coroutine just returns, we continue with the next item in the queue as before - it's just that we won't add a continuation to the end of the queue. Eventually (well, hopefully) we'll end up with an empty queue, at which point we can stop.

Initialization and a choice of data structures

We'll represent our queue using Queue<T> where the T is a delegate type. We have two choices here though, because we have two kinds of delegate - one which takes the Coordinator as a parameter (for the initial coroutine setup) and one which has no parameters (for the continuations). Fortunately we can convert between the two in either direction very simply, bearing in mind that all of this is within the context of a coordinator. For example:

// If we're given a coroutine and want a plain Action
Action<Coordinator> coroutine = ...; 
Action action = () => coroutine(this);

// If we're given a plain Action and want an Action<Coordinator>:
Action continuation = ...; 
Action<Coordinator> coroutine = ignored => continuation();

I've arbitrarily chosen to use the first option, so there's a Queue<Action> internally.

Now we need to get the collection initializer working. The C# compiler requires an appropriate Add method (which is easy) and also checks that the type implements IEnumerable. We don't really need to be able to iterate over the queue of actions, so I've used explicit interface implementation to reduce the likelihood of GetEnumerator() being called inappropriately, and made the method throw an exception for good measure. That gives us the skeleton of the class required for setting up:

public sealed class Coordinator : IEnumerable
{
    private readonly Queue<Action> actions = new Queue<Action>();

    // Used by collection initializer to specify the coroutines to run
    public void Add(Action<Coordinator> coroutine)
    {
        actions.Enqueue(() => coroutine(this));
    }

    // Required for collection initializers, but we don't really want
    // to expose anything.
    IEnumerator IEnumerable.GetEnumerator()
    {
        throw new NotSupportedException("IEnumerable only supported to enable collection initializers");
    }
}

(Note that I haven't used XML documentation anywhere here - it's great for real code, but adds clutter in blog posts.)

For production code I'd probably prevent Add from being called after the coordinator had been started, but there's no need to do it in our well-behaved sample code. We're only going to add extra actions to the queue via continuations, which will be added due to await expressions.

The main execution loop and async infrastructure

So far we've got code to register coroutines in the queue - so now we need to execute them. Bearing in mind that the actions themselves will be responsible for adding continuations, the main loop of the coordinator is embarrassingly simple:

// Execute actions in the queue until it's empty. Actions add *more*
// actions (continuations) to the queue by awaiting this coordinator.
public void Start()
{
    while (actions.Count > 0)
    {
        actions.Dequeue().Invoke();
    }
}

Of course, the interesting bit is the code which supports the async methods and await expressions. We know we need to provide a GetAwaiter() method, but what should that return? Well, we're just going to use the awaiter to add a continuation to the coordinator's queue. It's got no other state than that - so we might as well return the coordinator itself, and put the other infrastructure methods directly in the coordinator.

Again, this is slightly ugly, as the extra methods don't really make sense on the coordinator - we wouldn't want to call them directly from client code, for example. However, they're fairly irrelevant - we could always create a nested type which just had a reference to its "parent" coordinator if we wanted to. For simplicity, I haven't bothered with this - I've just implemented GetAwaiter() trivially:

// Used by await expressions to get an awaiter
public Coordinator GetAwaiter()
{
    return this;
}

So, that leaves just three members still to implement: IsCompleted, OnCompleted and GetResult. We always want the IsCompleted property to return false, as otherwise the coroutine will just continue executing immediately without returning to cede control; the await expression would be pointless. OnCompleted just needs to add the continuation to the end of the queue - we don't need to attach it to a task, or anything like that. Finally, GetResult is a no-op - we have no results, no exceptions, and basically nothing to do. You might want to add a bit of logging here, if you were so inclined, but there's no real need.

So, here are the final three members of Coordinator:

// Force await to yield control
public bool IsCompleted { get { return false; } }

public void OnCompleted(Action continuation)
{
    // Put the continuation at the end of the queue, ready to
    // execute when the other coroutines have had a go.
    actions.Enqueue(continuation);
}

public void GetResult()
{
    // Our await expressions are void, and we never need to throw
    // an exception, so this is a no-op.
}

And that's it! Fewer than 50 lines of code required, and nothing complicated at all. The interesting behaviour is all due to the way the C# compiler uses the coordinator when awaiting it.

We need AsyncVoidMethodBuilder as before, as we have some async void methods - but that doesn't need to do anything significant. That's basically all the code required to implement these basic round-robin coroutines.


Our first foray into the weird and wonderful world of coroutines was relatively tame. The basic idea of a coordinator keeping track of the state of all the different coroutines in one sense or another will keep coming back to us, but with different ways of controlling the execution flow.

Next time we'll see some coroutines which can pass data to each other.



Conditional Formatting on Google Docs, farewell Excel.



Upcoming speaking engagements

It's just occurred to me that I've forgotten to mention a few of the things I'll be up to in the near-ish future. (I've talked about next week's Progressive .NET session before.) This is just a quick rundown - follow the links for more blurb and details.

.NET Developer Network - Bristol, September 21st (evening)

I'll be talking about async in Bristol - possibly at a high level, possibly in detail, depending on the audience experience. This is my first time talking with this particular user group, although I'm sure there'll be some familiar faces. Come along if you're in the area.

Øredev 2011 - Malmö, November 9th

It's a whistle-stop trip to Sweden as I'm running out of vacation days; I'm flying out on the Tuesday evening and back on the Wednesday evening, but while I'm there I'll give two talks:

  • Async 101 (yes, more async; I wonder at what point I'll have given as many talks about it as Mads)
  • Effective technical communication (not a particularly technical talk, but definitely specific to technical communication)

Last year I had an absolute blast - looking forward to this year, even though I won't have as much time for socializing.

Stack Overflow Dev Days 2011 - London, November 14th - cancelled!

Update: Dev Days has been cancelled. I'm still hoping to do something around this topic, and there may be small-scale meet-ups in London anyway.

Two years ago I talked about how humanity had let the world of software engineering down. This was one of the best talks I've ever given, and introduced the world to Tony the Pony. Unfortunately that puts the bar relatively high for this year's talk - at least, high by my own pretty low standards.

In a somewhat odd topic for a Christian and a happy employee of a company with a code of conduct which starts "Don't be evil," this year's talk is entitled "Thinking in evil." As regular readers are no doubt aware, I love torturing the C# language and forcing the compiler to work with code which would make any right-thinking software engineer cringe. I was particularly gratified recently when Eric Lippert commented on one of my Stack Overflow answers that this was "the best abuse of C# I've seen in a while." I'm looking forward to talking about why I think it's genuinely a good idea to think about nasty code like this - not to use it, but to get to know your language of choice more intimately. Like last time, I have little idea of exactly what this talk will be like, but I'm really looking forward to it.



Tuesday, November 29, 2011

Weekend Project: Porting Kickoff to QML

Last week I decided to do a small evening-and-weekend project to improve my QML skills. As I had already worked on quite a bit of QML for the new screen locker and the window-switching layouts, I wanted to do something more complex, but still something I could work on in the evenings.

Given also Marco’s post, we slowly have to transition the Plasma Workspaces to Plasma Components, so it makes sense to look in the Plasma area for a widget which could use a rewrite.

I decided to try to rewrite our application launcher Kickoff in QML. There are various reasons for this. First of all, I am somewhat familiar with the code base, as I have worked at least a little with it in the past. The code is nicely separated into core and UI parts and makes strong use of Qt’s Model/View/Controller framework, which means it’s very easy to add a QML-based interface to it.

But most importantly, I think Kickoff is the applet which would benefit most from a QML port. Because Kickoff uses the Model/View/Controller framework, it is QWidget based and embedded into Plasma’s graphics scene through a proxy widget. If you are familiar with the differences, you can spot where Kickoff just does not fit with the rest of Plasma, e.g. the white background or the blue highlight, which just do not look and feel like Plasma. Given that Kickoff is one of the first visual elements new users will see, it is rather important to present the UI in a consistent and elegant way.

While it is still a work in progress and more of a draft, you can get a feeling for the differences in this screenshot:
Classic Kickoff vs. QML based Kickoff

Thanks to the possibilities of QML, all the data presented in the QML-based Kickoff is currently just mock data and not yet connected to the real C++ based models. This makes it possible to concentrate on the UI without worrying about the required changes in the underlying models.

My expectation is that after the QML port Kickoff should also perform much better. Removing those proxy widgets is clearly a plus, and QML makes it much easier to lazy-load the data and view components. E.g. in the mockup I presented, all the data is loaded only when needed; if you only use the favorites tab, there is no need to load e.g. the data for the applications. So overall a worthwhile project for learning QML.

Overall we still have plenty of Plasmoids which need a QML port, so get your hands dirty.

Before I close this part, I’d like to invite you all to Plasma Bug Days, which will be held this Friday and Saturday (December 2nd and 3rd), more info on Aaron’s blog. Come give us a hand if you want the 4.8 release to really rock!



KDE Telepathy 0.2 Released

We are pleased to announce the second release of KDE Telepathy.

KDE Telepathy is a suite of applications which together form an instant-messaging client allowing you to talk on Jabber, Gmail, Facebook, MSN and much more. KDE Telepathy stands out from previous instant-messaging clients by integrating into the KDE workspace and Plasma, while still being usable like a traditional IM application.

This release features:

  • KWallet integration for storing passwords
  • A plasmoid for instant access to a contact
  • Ability to set your status to the currently playing track from Amarok, Clementine or any other MPRIS2-compatible player
  • Auto Away
  • Progress bar and cancel button for file transfers
  • Over 130 bug fixes and tweaks since 0.1!


The whole team met at the Woshibon 2 sprint in Cambridge, UK (14th-18th September). The sprint was sponsored by both KDE e.V. and Collabora, and allowed us not only to sort out many of the details of making this release, but also to plan more long-term goals.

The Future

This is still a very early release and far from what we want to call the "finished product", however all the functionality works and many of us now use it as a daily instant messaging client.

In the future we shall have contact aggregation, audio and video calls as well as more plasma integration, and much more!

We always appreciate help from more developers, designers and testers; so if you are interested in helping, please check out our wiki page or join us in IRC at #kde-telepathy on freenode.

Getting the Latest Release

The latest tarballs are available from KDE's FTP server.

Be sure to check our wiki page for installation guidelines and troubleshooting help.

Packages may be available for your distribution.

Other News

I'm stepping down as manager for the 0.3 release. I have another project I want to work on as well as a tonne of uni work over the Christmas period. I'm handing over to the nearly as awesome Martin Klapetek to sort things out.



What is this thing you call a "type"? Part Two

Well that was entirely predictable; as I said last time, if you ask ten developers for a definition of "type", you get ten different answers. The comments to the previous article make for fascinating reading!

Here's my attempt at describing what "type" means to me as a compiler writer. I want to start by considering just the question of what a type is and not confuse that with how it is used.

Fundamentally, a type in C# is a mathematical entity that obeys certain algebraic rules, just as natural numbers, complex numbers, quaternions, matrices, sets, sequences, and so on, are mathematical entities that obey certain algebraic rules.

By way of analogy, I want to digress for a moment and ask the question "what is a natural number?" That is, what fundamentally characterizes the numbers 0, 1, 2, 3, ... that we use to describe the positions of items in a sequence and the sizes of everyday collections of things?

This question has received a lot of attention over the centuries; the definitive answer that we still use today was created by Giuseppe Peano in the 19th century. Peano said that a natural number is defined as follows:

  • Zero is a natural number.
  • Every natural number has a "successor" natural number associated with it.
  • No natural number has zero as its successor.
  • Unequal natural numbers always have unequal successors.
  • If you start from zero, take its successor, and then take the successor of that, and so on, you will eventually encounter every natural number. (*)

Surprisingly, that's all you need. Any mathematical entity that satisfies those postulates is usable as a natural number. (**) Notice that there's nothing in there whatsoever about adding natural numbers together, or subtracting, multiplying, dividing, taking exponents, and so on. All of those things can be bolted on as necessary. For example, we can say that addition is defined as follows:

  • (n + 0) = n
  • (n + (successor of m)) = ((successor of n) + m)

And you're done; you've got a recursive definition of addition. We can similarly define "less than":

  • (n < 0) = false
  • (0 < (successor of m)) = true
  • ((successor of n) < (successor of m)) = (n < m)

We can define every operation on natural numbers in this way; try defining multiplication, just for fun. (Hint: assume that you've already defined addition.)
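As a concreteness check, here is one possible transcription of these recursive definitions into Python, including the multiplication exercise. The encoding is my own (zero as None, successor as a one-element tuple); any encoding satisfying the postulates would do.

```python
ZERO = None

def succ(n):
    """The successor of a natural number."""
    return (n,)

def add(n, m):
    # (n + 0) = n ;  (n + succ(m)) = (succ(n) + m)
    while m is not ZERO:
        n, m = succ(n), m[0]
    return n

def less(n, m):
    # (n < 0) = false ; (0 < succ(m)) = true
    # ((succ n) < (succ m)) = (n < m)
    while True:
        if m is ZERO:
            return False
        if n is ZERO:
            return True
        n, m = n[0], m[0]

def mul(n, m):
    # The exercise: (n * 0) = 0 ; (n * succ(m)) = (n * m) + n,
    # defined in terms of the addition we already have.
    total = ZERO
    while m is not ZERO:
        total, m = add(total, n), m[0]
    return total

def to_int(n):
    """Decode back to a Python int, for inspection only."""
    i = 0
    while n is not ZERO:
        i, n = i + 1, n[0]
    return i

two, three = succ(succ(ZERO)), succ(succ(succ(ZERO)))
assert to_int(add(two, three)) == 5
assert to_int(mul(two, three)) == 6
assert less(two, three) and not less(three, two)
```

Note that the arithmetic never looks inside the numbers in any way other than through zero and successor, which is exactly the point of the axiomatic approach.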

We can come up with a similar "axiomatic" definition of "type":

  • Object is a type
  • "Declared" types are types (note that int, uint, bool, string, and so on can be considered "declared" types for our purposes; the runtime "declares" them for you.)
  • If T is a type and n is a positive integer then "n-dimensional array of T" is also a type.

And that's pretty much it as far as "safe" C# 1.0 is concerned. (To be as strict as the Peano axioms for natural numbers we'd want to also throw in some similar safeguards; for example, we don't ever want to be in a situation where the type "one-dimensional array of double" has type equality with the type "int", just as we don't ever want to be in a situation where the successor of a natural number is zero.)

Things get a bit more complex when we throw in generic types and pointer types, but I'm sure you can see that we could come up with a precise axiomatic description of generic types and pointer types with a little bit of work. That is, type parameter declarations are types, generic type declarations constructed with the right number of type arguments are types, and so on.

We can then start piling on algebraic operations. Just as we defined "less than" on numbers, we can define the "is a subtype of" relation on types.

  • T <: object is true for any type T that is not object
  • object <: T is false for any type T
  • if S <: T and T <: U then S <: U
  • ... and so on...
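These rules lend themselves to mechanical checking. Here is a small Python sketch (the rule set and the example type names are mine, not the C# specification's) that decides the strict-subtype relation from declared base-type edges plus the object rule and transitivity:

```python
def is_subtype(s, t, bases):
    """Is S a strict subtype of T?  `bases` maps each type name to its
    declared direct base types.  Rules: T <: object for T != object;
    object <: T is never true; transitivity over declared bases."""
    if s == "object":
        return False          # object <: T is false for any type T
    if t == "object":
        return True           # T <: object for any other type T
    # Transitivity: walk up the declared bases from S looking for T.
    seen, stack = set(), list(bases.get(s, []))
    while stack:
        b = stack.pop()
        if b == t:
            return True
        if b not in seen:
            seen.add(b)
            stack.extend(bases.get(b, []))
    return False

bases = {"string": ["object"],
         "List": ["IEnumerable"],
         "IEnumerable": ["object"]}
assert is_subtype("List", "IEnumerable", bases)   # declared edge
assert is_subtype("List", "object", bases)        # the object rule
assert not is_subtype("object", "string", bases)
```

The point of the sketch is the same as the post's: the checker consults only the rules, never any "implementation" of the types.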

Just as I do not care how numbers are "implemented" in order to manipulate them algebraically, I also do not care how types are "implemented" in order to manipulate them with the rules of type algebra. All I care about are the axioms of the system, and the rules that define the algebraic operators that I've made up for my own convenience.

So there you go; we have a definition of "type" that does not say anything whatsoever about "a set of values" or "a name". A type is just an abstract mathematical entity, a bit like a number, that obeys certain defining axioms, and therefore can be manipulated algebraically by making up rules for useful operators -- just like numbers can be manipulated abstractly according to algebraic rules that we make up for them.

Now that we have a sketch of an axiomatic definition of what a type is, what are we going to do with it?

Perhaps the most fundamental purposes of static type checking in a programming language like C# are to associate a type with every relevant storage location, to associate a type with every (†) relevant compile-time expression, and to then ensure that it is impossible for a value associated with an incompatible type to be written to or read from any storage location. A compile-time proof of runtime type safety (‡): that's the goal.

The key word there is proof; now that we have developed an axiomatic definition of "type" we can start constructing proofs based on these axioms. The C# specification defines:

  • what type is associated with every compile-time expression; of course, expressions whose types cannot be determined might be program errors
  • what type is associated with every storage location
  • what constitutes an acceptable assignment from an expression of a given type (§) to a storage location of a given type

The task of the compiler writer is then to implement those three parts of the specification: associating a type with every expression, associating a type with every storage location, and then constructing a proof that the assignment is valid given the rules. If the compiler is able to construct a proof then the assignment is allowed. The tricky bit is this: the specification typically just gives the rules of the system, not a detailed algorithm describing how to construct proofs using those rules. If any proof exists then the compiler is required to find it. Conversely, if no proof exists, the compiler is required to deduce that too and produce a suitable compile-time type error.

Unfortunately, as it turns out, that's impossible in general. A system where it is possible for every expression to be classified as either "type safe" or "type unsafe" in a finite number of logical steps is called a "decidable" system. As Gödel famously proved, natural number arithmetic as axiomatized above is undecidable; there are statements in formal arithmetic that can be neither proved nor disproved in a finite number of logical steps. Assignment type checking is also in general undecidable in programming languages with both nominal subtyping (¶) and generic variance. As I mentioned a while back, it turns out that it would be possible to put additional restrictions on type declarations such that nominal subtyping would become decidable, but we have not ever done this in C#. Rather, when faced with a situation that produces an infinitely long proof of type compatibility, the compiler just up and crashes with an "expression was too complex to analyze" error. I'm hoping to fix that in a hypothetical future version of the compiler, but it is not a high priority because these situations are so unrealistic.

Fortunately, situations where type analysis is impossible in general, or extremely time consuming, are rare in realistic C# programs; rare enough that we can do a reasonable job of writing a fast compiler.

Summing up: A type is an abstract mathematical entity defined by some simple axioms. The C# language has rules for associating types with compile-time expressions and storage locations, and rules for ensuring that the expression type is compatible with the storage type. The task of the compiler writer is to use those axioms and algebraic rules to construct a proof or disproof that every expression has a valid type and that every assignment obeys the assignment compatibility rules.

Though the axioms of what makes a type are pretty simple, the rules of associating types to expressions and determining assignment compatibility are exceedingly complex; that's why the word "type" appears 5000 times in the specification. To a large extent, the C# language is defined by its type rules.


(*) I am of course oversimplifying here; more formally, the real axiom formally states that proof by induction works on natural numbers. That is: if a property is true of zero, and the property being true for a number implies the truth for its successor, then the property holds for all numbers.

(**) It is instructive to consider why the third, fourth and fifth postulates are necessary. Without the third postulate, there need only be one number: zero, which is its own successor! Without the fourth postulate we could say that there are only two numbers: the successor of zero is one, the successor of one is one. Without the fifth postulate there could be *two* zeros: "red zero" and "blue zero". The successor of red zero is red one, the successor of blue zero is blue one, and so on. In each case, the system without the axiom satisfies the remaining four axioms, but in each case it seems very contrary to our intuition about what a "natural number" is.

(†) We fudge this in C# of course; null literals, anonymous functions and method groups are technically classified as "expressions without any type". However, in a legal program they must occur in a context where the type of the expression can be inferred from its surrounding context.

(‡) The cast operator gives the lie to this, of course; one meaning of the cast operator is "there is no compile-time proof that this is type safe; I assert that it is safe, so generate code that checks my assertion at runtime." Also, unsafe array covariance makes this goal unreachable.

(§) Again, this is fudged in a few places. For example, it is not legal to assign an expression of type int to a variable of type short, unless the expression of type int is a compile-time constant known to fit into a short.

(¶) That is, a language in which you state, by name, the base type of a newly declared type. For example, "interface IFoo<T> : IBar<T>" uses nominal subtyping.



We just launched Techademy! - Stefan Koopmanschap

As I've found out after starting my own company, training is a hot topic. On the one hand, everyone wants and needs training, but on the other hand, training seems to be really expensive. And while in-depth expert training has good value (and I deliver those on a regular basis), I felt there should be a way for web developers to stay up-to-date on recent developments in a quick and not too expensive way. Talking with my friend, old colleague and soon-to-be freelancer Joshua Thijssen, I found someone who felt the same way. This is what lay at the root of a new training concept we have just launched: Techademy.



Transportation Simulation Games



Monday, November 28, 2011

Why have a stack?

Last time I discussed why it is that we have all the .NET compilers target an "intermediate language", or "IL", and then have jitters that translate IL to machine code: because doing so ultimately reduces the costs of building a multi-language, multi-hardware platform. Today I want to talk a bit about why IL is the way it is; specifically, why is it a "stack machine"?

To begin with, what is a "stack machine"? Let's consider how you might design a machine language that could describe the operation of adding together two integers to make a third. You could do it like this:

add [address of first addend], [address of second addend], [address of sum]

When the machine encounters this instruction it looks up the values of the addends stored in the two addresses, somehow adds them together -- how it does so is its business -- and stores the result in the third address.

You might instead say that there is a special region of memory called the "accumulator" which adds each given value into its current contents:

increase_accumulator [address of first addend]
increase_accumulator [address of second addend]
save_accumulator [address of sum]

Or, you could say that there is a special region of memory called the "stack" which can grow and shrink; you get access to the items on the top:

push_value_at [address of first addend]
push_value_at [address of second addend]
add
pop_value_into [address of sum]

The "add" instruction takes the two values off the top of the stack, somehow adds them, and then puts the result back on the stack; the net result is that the stack shrinks by one.

A virtual machine where most of the instructions are of this third form is called a "stack machine", for obvious reasons.
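To make this concrete, here is a minimal sketch of such a stack machine in Python; the opcode names mirror the pseudo-instructions above and are invented for illustration:

```python
def run(program, memory):
    """Interpret a tiny stack machine.

    Each instruction is a tuple: ("push", addr) pushes the value stored
    at an address, ("add",) pops two values and pushes their sum, and
    ("pop", addr) pops the top of the stack into an address.
    """
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "pop":
            memory[instr[1]] = stack.pop()
    return memory

# sum = first + second, with no explicit temporaries in the program text
mem = run([("push", "first"), ("push", "second"), ("add",), ("pop", "sum")],
          {"first": 2, "second": 3, "sum": 0})
print(mem["sum"])  # 5
```

Notice that the program itself never names a temporary location; the stack provides the scratch space implicitly.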

IL specifies a stack machine, just like many other virtual machines. But most hardware instruction sets actually more closely resemble the second form: registers are just fancy accumulators. Why then are so many virtual machines specifying stack machines?

There are several reasons, but again, it primarily comes down to lowering costs. Stack machines are very easy to understand, they are very easy to write a compiler front-end for, they are easy to build interpreters for, they are easy to build jitters for, and they provide a concise abstraction for the common case. The common case is that the result of any particular operation is only going to be of interest for a brief period.

Imagine, for example, if we chose the first strategy for IL, and then had to compile an expression like x = A() + B() + C(); What would we have to do in the first case? Something like this:

create_temporary_storage // first temporary, for result of A()
call A(), [address of first temporary storage]
create_temporary_storage // second temporary, for result of B()
call B(), [address of second temporary storage]
create_temporary_storage // third temporary, for result of the first addition
add [address of first temporary storage], [address of second temporary storage], [address of third temporary storage]

You see how this goes? The IL is getting huge, and all so that we can keep track of precisely which memory locations are used to store values that we will never care about again. A stack abstraction lets the stack implementation deal with the temporary storages; in a stack machine, the same code looks something like:

push [address of x]
call A() // implicitly creates a temporary storage by pushing the result on the stack
call B()
add // pops the results of A() and B(), pushes their sum
call C()
add
store // store the result on top of the stack in the address just below it

The code is much smaller and much easier to understand. A stack machine is a very simple way to describe a complex computation; by being able to write code for such a simple machine, it lowers the cost of making a compiler. And not only is it easier to write compilers and jitters that target simple stack machines, it is easier to write other code analysis tools as well. The IL verifier, for example, can quickly determine when there is a code path through a method that, say, misaligns the stack, or passes the wrong types of arguments to a method.



Making UITableViews look not so plain

As most of you probably know, UITableViews are incredibly useful and versatile views to use in your applications.
If you have ever tried to customize a UITableView, though, you know that as soon as you start adding lots of UIViews, UILabels, and UIImageViews to a cell's contentView, the table view starts to scroll slower and slower, and becomes choppier and choppier.
What we are going to explore today is how to remedy that situation.
To download the entire Xcode project, you can ...



Why IL?

One of the earliest and most frequently-asked questions we got when we announced the Roslyn project was "is this like LLVM for .NET?"

No, Roslyn is not anything like LLVM for .NET. LLVM stands for Low-Level Virtual Machine; as I understand it (admittedly never having used it), compiler "front ends" take in code written in some language -- say C++ -- and emit equivalent code written in the LLVM language. Another compiler then takes the code written in the LLVM language and translates it into optimized machine code.

We already have such a system for .NET; in fact, .NET is entirely built upon it and always has been, so Roslyn isn't it. The C#, VB and other compilers take in programs written in those languages and spit out code written in the Common Intermediate Language (CIL, also commonly called MSIL or just IL). Then another compiler -- either the jitter, which runs "just in time" at runtime, or the NGEN tool which runs before runtime -- translates the IL into optimized machine code that can actually run on the target platform.

I'm occasionally asked why we use this strategy; why not just have the C# compiler write out optimized machine code directly, and skip the middleman? Why have two compilers to go from C# to machine code when you could have one?

There are a number of reasons, but they pretty much all boil down to one good reason: the two-compilers-with-an-intermediate-language system is much less expensive in our scenario.

That might seem counterintuitive; after all, now we have two languages to specify, two languages to analyze, and so on. To understand why this is such a big win you have to look at the larger picture.

Suppose you have n languages: C#, VB, F#, JScript .NET, and so on. Suppose you have m different runtime environments: Windows machines running on x86 or x64, XBOX 360, phones, Silverlight running on the Mac... and suppose you go with the one-compiler strategy for each. How many compiler back-end code generators do you end up writing? For each language you need a code generator for each target environment, so you end up writing n x m code generators.

Suppose instead you have every language generate code into IL, and then you have one jitter per target environment. How many code generators do you end up writing? One per language to go to IL, and one per environment to go from IL to the target machine code. That's only n + m, which is far less than n x m for reasonably-sized values of n and m.
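The arithmetic is easy to check; here is a trivial sketch with made-up example counts for n and m:

```python
languages = ["C#", "VB", "F#", "JScript .NET", "IronPython"]   # n = 5 (illustrative list)
targets = ["x86", "x64", "XBOX 360", "phone", "Silverlight"]   # m = 5 (illustrative list)

# One-compiler strategy: a back-end code generator for every (language, target) pair.
one_compiler = len(languages) * len(targets)   # n x m = 25

# Intermediate-language strategy: n front ends to IL, plus m jitters from IL.
via_il = len(languages) + len(targets)         # n + m = 10

print(one_compiler, via_il)  # 25 10
```

And the gap only widens as n and m grow.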

Moreover, there are other economies in play as well. IL is deliberately designed so that it is very easy for compiler writers to generate correct IL. I'm an expert on the semantic analysis of the C# language, not on efficient code generation for cellular phone chipsets. If I had to write a new backend code generator for every platform the .NET framework runs on, I'd be spending all my time doing a bad job of writing code generators instead of doing a good job writing semantic analyzers.

The cost savings go the other way too; if you want to support a new chipset then you just write yourself a jitter for that chipset and all the languages that compile to IL suddenly start working; you only had to write *one* jitter to get n languages on your new platform.

This cost-saving strategy of putting an intermediate language in the middle is not at all new; it goes back to at least the late 1960s. My favourite example of this strategy is the Infocom Z-Machine; the Infocom developers wrote their games in a language (Zork Implementation Language) that compiled to an intermediate Z-Code language, and then wrote Z-Machine interpreters for a variety of different platforms; as a result they could write n games and have them run on m different platforms at a cost of n + m, not n x m. (This approach also had the enormous benefit that they could implement virtual memory management on hardware that did not support virtual memory natively; if the game was too big to fit into memory, the interpreter could simply discard code that wasn't being used at the moment and page it back in again later as needed.)

Next time I'll talk a bit about why IL is specified the way it is.



Building a PC, Part VII: Rebooting

I've had more or less the same PC, with various updates, since 2007. I've written about most of it here:

While the advice in those original articles is still quite sound, my old 2007 era case was feeling mighty creaky. I needed a new chassis. I also wanted a motherboard that supported native 6 Gbps SATA for the latest generation of SSDs that truly benefit from them. The buzz around the Sandy Bridge based Core i7-2600k was nearly deafening, and I've fallen completely in love with my last HTPC build based on the same technology. (Oh, and even if you already read that article, read it again because I added new PicoPSU and case information that takes it from awesome to sublime – on the order of 17 watts idle!)

So I decided it was time to build myself a nice Sandy Bridge system. What I ended up with is easily the best case and motherboard combination I've ever laid hands on. Read on!

I cut out a lot of the initial research work by relying on my old, dear friends at Tech Report and their current workstation recommendations:

As for the case, I was impressed by the Tech Report review of the Corsair 600T, which even comes in a heart-stopping pseudo stormtrooper white. WANT.


When it comes to power supplies, I'm crazy about efficiency, and fortunately there are now lots of so-called "80 Plus Gold" PSUs out there, offering a staggering 90% efficiency under most loads. Power supply efficiency is important because the wasted power is dumped back into your case as heat: the less efficient your PSU, the more heat buildup you'll have under load. I chose the Seasonic X-760 – which, when bench tested, indeed delivered the promised 90% efficiency – but any high quality 80 Plus Gold model will generally do.

The CPU (and possibly, depending on your tastes, the video card) is the biggest heat generator inside your PC. The better and more efficient the CPU cooler, the quieter your whole system can be. This also affects how much you can overclock. I chose the Thermalright Venomous-X Silent Edition on the basis of it being the current top dog for efficiency, and because it had a simple mounting system. Large coolers can be a real bear to install. And did I mention it comes with an especially quiet fan out of the box, too?

Once I had all the parts in hand, it was a simple matter of building it up, as documented in my previous post series. I adore this Corsair case; it is an absolute joy to work in. Everything in it is cleverly designed, from the rear cable routing area with rubber grommets all over the place for easily passing cables back and forth, to the tool-less 2.5" and 3.5" bays, to the super easily removable side panels. It's like they read a giant list of all my prior complaints with every PC case I've ever used and fixed every. single. last. one of them.

The end result is what you see here:


There are some significant tweaks visible in the above picture that I do recommend:

  1. Use tin snips to remove the rear exhaust grill. We don't need it back there, and the exhaust airflow is critical. Fan grills affect low-speed fan airflow more than you'd think:
    Wire grills also have an effect: ~20%. This was checked with an anemometer on several different fans of 80, 92 and 120mm size, at full and lower speeds. The airflow reduction went as high as 24% but it was never below 19%. At 12V, the reduction in airflow with most fans will be relatively harmless, though there is an increase in turbulence noise (audible to me). But at the low airflow rates SPCR members run fans, I think the airflow reduction is significant.
  2. Install a 140mm rear exhaust fan. The Noctua NF-P14 is expensive at $25 but is cleverly designed to give you 140mm of super-quiet fan in the space a 120mm fan would normally take. It just barely fits in the rear exhaust fan slot with a little nudging. But it does fit; it's the beige fan in the above picture. It also comes with its own speed reducers and accessories.

  3. Use fan speed reducers on all the fans. The case has two 200mm fans, plus the 140mm fan we're adding. I couldn't get the Asus motherboard's "QFan" fan control system to work, as it seems to require 4-pin fans, and all the ones I had (including the ones that came with the case) are 3-pin. While I do prefer dynamic, temperature based control when I can get it, the next best thing is to use hardware to slow down the fans. I like the Zalman ZM-RC56 resistor connector as the simplest solution, but it's getting hard to find for some reason. The Zalman Fan Mate 2 will also work, and allows you to individually adjust the speed of each fan. The case also has a built in fan controller – that's the knob you see on the front top – but I found it too limited in range for my purposes.

  4. Add acoustic foam to taste. Between inexpensive eggcrate foam and thin, adhesive backed open-cell foam, you can easily reduce that last 10-20% of fan noise to a very pleasant white noise. It works well in the areas pictured, and also on the interior of the side panel "facing" the fans. See item 6 in my Building a Quiet PC post for details.

And then, of course, the overclockening. What kind of geek would I be if I didn't attempt to turn this baby up to 11? This is another area where Sandy Bridge is a revelation: so long as you buy one of the blessed "K" series processors, overclocking is as simple as setting the multiplier to the desired value. It is ridiculously simple. And my results, for once, were immediately as good as the ones everyone else was crowing about: 4.4 GHz totally stable!


(beware: there is one nasty little issue with the Asus motherboard's auto-overclock feature. The PLL Overvoltage setting, which auto-overclock "helpfully" enables, completely bollixes up resuming from sleep. Just turn it off, and all is well. I don't even want to tell you how long it took me to figure that one out.)

The total package with a nice SSD delivers a near-perfect Windows Experience score:


I won't lie to you. This is not a compact build. It's big! Those roomy side areas come at a cost, and that makes it a very wide case. But that's to be expected for a desktop powerhouse machine. And since my last case lasted me from 2007-2011, I'll happily accept a little bulk for something that's easy to work on and upgrade over time.


It's a fantastic new reboot of my system, and I didn't expect to be this excited about the final result. This is not merely an incremental improvement over what I had, it's much quieter, easier to work on, and when overclocked to 4.4 GHz, noticeably faster too. (I do slightly mourn the loss of 8 GB of RAM, but I'll survive.)

In this build, I already had hard drives, DVD drive, a sound card, and so forth … but for completeness' sake I'll list everything here if you want to mirror this setup. Realize that some of this comes down to personal taste, so I'm just listing what I recommend. Feel free to change anything out, and bear in mind that Sandy Bridge has decent default onboard video as well.

Remember, if you can put together a LEGO kit, you can build this totally sweet PC for yourself, too. Good luck and happy building!



Akademy-fr FTW!

It’s been a good day today!

First of all, I arrived in Toulouse, where Akademy-fr is going to happen. I’m really happy to be part of this first (of many, I hope) Akademy-fr edition. :)

Secondly, KAlgebra has been accepted into the OVI store. As far as I know, it’s the first (of many, I hope, again :) ) application bundling the KDE libs. So all N9* users can install it without ugly tricks! \o/

Proof: KAlgebra at OVI store.

Greetings from Toulouse!!



Sunday, November 27, 2011

How not to write code: The evils of duplication



Optimization and generics, part 2: lambda expressions and reference types

As with almost any performance work, your mileage may vary (in particular the 64-bit JIT may work differently) and you almost certainly shouldn't care. Relatively few people write production code which is worth micro-optimizing. Please don't take this post as an invitation to make code more complicated for the sake of irrelevant and possibly mythical performance changes.

It took me a surprisingly long time to find the problem described in the previous blog post, and almost no time at all to fix it. I understood why it was happening. This next problem took a while to identify at all, but even when I'd found a workaround I had no idea why it worked. Furthermore, I couldn't reproduce it in a test case... because I was looking for the wrong set of triggers. I've now found at least some of the problem though.

This time the situation in Noda Time is harder to describe, although it concerns the same area of code. In various places I need to create new delegates containing parsing steps and add them to the list of steps required for a full parse. I can always use lambda expressions, but in many cases I've got the same logic repeatedly... so I decided to pull it out into a method. Bang - suddenly the code runs far slower. (In reality, I'd performed this refactoring first, and "unrefactored" it to speed things up.)

I think the problem comes down to method group conversions with generic methods and a type argument which is a reference type. The CLR isn't very good at them, and the C# compiler uses them more than it needs to.

Show me the benchmark!

The complete benchmark code is available of course, but fundamentally I'm doing the same thing in each test case: creating a delegate of type Action which does nothing, and then checking that the delegate reference is non-null (just to avoid the JIT optimizing it away). In each case this is done in a generic method with a single type parameter. I call each method in two ways: once with int as the type argument, and once with string as the type argument. Here are the different cases involved:

  • Use a lambda expression: Action foo = () => {};
  • Fake what I expected the compiler to do: keep a separate generic cache class with a static variable for the delegate; populate the cache once if necessary, and thereafter use the cache field
  • Fake what the compiler is actually doing with the lambda expression: write a separate generic method and perform a method group conversion to it
  • Do what the compiler could do: write a separate non-generic method and perform a method group conversion to it
  • Use a method group conversion to a static (non-generic) method on a generic type
  • Use a method group conversion to an instance (non-generic) method on a generic type, via a generic cache class with a single field referring to an instance of the generic class

(Yes, the last one is a bit convoluted - but the line in the method itself is simple: Action foo = ClassHolder<T>.SampleInstance.NoOpInstance;)

Remember, we're doing each of these in a generic method, and calling that generic method using a type argument of either int or string. (I've run a few tests, and the exact type isn't important - all that matters is that int is a value type, and string is a reference type.)

Importantly, we're not capturing any variables, and the type parameter is not involved in either the delegate type or any part of the implementation body.

Benchmark results

Again, times are in milliseconds - but this time I didn't want to run it for 100 million iterations, as the "slow" versions would have taken far too long. I've run this on the x64 JIT as well and seen the same effect, but I haven't included the figures here.

Times in milliseconds for 10 million iterations

Test                                   TestCase<int>   TestCase<string>
Lambda expression                                180            29684
Generic cache class                               90              288
Generic method group conversion                  184            30017
Non-generic method group conversion              178              189
Static method on generic type                    180            29276
Instance method on generic type                  202              299

Yes, it's about 150 times slower to create a delegate from a generic method with a reference type as the type argument than with a value type... and yet this is the first I've heard of this. (I wouldn't be surprised if there were a post from the CLR team about it somewhere, but I don't think it's common knowledge by any means.)


One of the tricky things is that it's hard to know exactly what the C# compiler is going to do with any given lambda expression. In fact, the method which was causing me grief earlier on isn't generic, but it's in a generic type and captures some variables which use the type parameters - so perhaps that's causing a generic method group conversion somewhere along the way.

Noda Time is a relatively extreme case, but if you're using delegates in any performance-critical spots, you should really be aware of this issue. I'm going to ping Microsoft (first informally, and then via a Connect report if that would be deemed useful) to see if there's an awareness of this internally as a potential "gotcha", and whether there's anything that can be done. Normal trade-offs of work required vs benefit apply, of course. It's possible that this really is an edge case... but with lambdas flying everywhere these days, I'm not sure that it is.

Maybe tomorrow I'll actually be able to finish getting Noda Time moved onto the new system... all of this performance work has been a fun if surprising distraction from the main job of shipping working code...



Thanksgiving turkey, it’s probably time…

Mmmm, after the jerk-flavored roasted turkey for Thanksgiving dinner (with an amazing Paula Deen corn casserole) I got some excellent coding mojo. That is, of course, after the tryptophan wore off. Then I realized that it has been over 6 months since the last bug fix release, and it kinda seemed like it was about time to get these fixes I’ve worked on over the last few months more widely distributed. So I tagged Bangarang 2.1 beta last night.

There was once a 2.1 target features list, but as I’ve realized over the last few months, with just myself as the main developer, it just makes sense to work on whatever motivates me, when I’m motivated to work on it. I’ll do target features and more fixed release schedules if the number of contributors grows. Fun is my biggest motivation and it’s gotten quite a bit done thus far. And yes, I do enjoy knocking out bugs as much as I enjoy adding features. :-)

I think there were enough new features to warrant the bump to 2.1 instead of another 2.0 bugfix release.  Anyway, I’ve really enjoyed these last few months of development and if anyone wants to join in on the fun, holler! Till then please help me test and report any issues you encounter.



Miscellaneous Java Swing stuff



Public Transit in Ottawa in the Age of Cars



Saturday, November 26, 2011

How to add syntax highlight to Blogger



Update #2: ELCImagePickerController

Welcome back to another update to the ELCImagePickerController. If you’re not familiar with this class, I’d suggest you check out the following posts: “Cloning UIImagePickerController using the Assets Library Framework” and “Update: ELCImagePickerController”. A lot has happened in the iOS world since the creation of this code. With the advent of every new version of iOS there are always exciting new features and bug fixes. Sometimes subtle code changes sneak into classes we’ve ...



What is this thing you call a "type"? Part one

(Eric is out camping; this posting is prerecorded. I'll be back in the office after Labour Day.)

The word "type" appears almost five thousand times in the C# 4 specification, and there is an entire chapter, chapter 4, dedicated to nothing but describing types. We start the specification by noting that C# is "type safe" and has "a unified type system" (*). We say that programs "declare" types, and that declared types can be organized by namespace. Clearly types are incredibly important to the design of C#, and incredibly important to C# programmers, so it is more than a little bit surprising that nowhere in the eight hundred pages of the specification do we ever actually define the word "type".

We sort of assume that the developer reading the specification already has a working understanding of what a "type" is; the spec does not aim to be either a beginner programming tutorial or a mathematically precise formal language description. But if you ask ten working line-of-business developers for a formal definition of "type", you might get ten different answers. So let's consider today the question: what exactly is this thing you call a "type"?

A common answer to that question is that a type consists of:

* A name
* A set (possibly infinite) of values

And possibly also:

* A finite list of rules for associating values not in the type with values in the type (that is, coercions; 123.45 is not a member of the integer type, but it can be coerced to 123.)

Though that is not a terrible definition as a first attempt, it runs into some pretty nasty problems when you look at it more deeply.

The first problem is that of course not every type needs to have a name; C# 3 has anonymous types which have no names by definition. Is "string[][]" the name of a type? What about "List<string[][]>" -- does that name a type? Is the name of the string type "string" or "String", or "System.String", or "global::System.String", or all four? Does a type's name change depending on where in the source code you are looking?

This gets to be a bit of a mess. I prefer to think of types as logically not having names at all. Program fragments "12" and "10 + 2" and "3 * 4" and "0x0C" are not names for the number 12, they are expressions which happen to all evaluate to the number 12. That number is just a number; how you choose to notate it is a fact about your notational system, not a fact about the number itself. Similarly for types; the program fragment "List<string[][]>" might, in its context, evaluate to refer to a particular type, but that type need not have that fragment as its name. It has no name.

The concept of a "set of values" is also problematic, particularly if those sets are potentially infinite "mathematical" sets.

To start with, suppose you have a value: how do you determine what its type is? There are infinitely many sets that can contain that value, and therefore infinitely many types that the value can be! And indeed, if the string "hello" is a member of the string type, clearly it is also a member of the object type. How are we to determine what "the" type of a value is?

Things get even more brain-destroying when you think, oh, I know, I'll just say that every type's set of values is defined by a "predicate" that tells me whether a given value is in the type or not. That seems very plausible, but then you have to think about questions like "are types themselves values?" If so, then "the type whose values are all the types that are not members of themselves" is a valid predicate, and hey, we've got Russell's Paradox all over again. (**)

Moreover, the idea of a type being defined by a predicate that identifies whether a given value is of that type or not is quite dissimilar to how we typically conceive of types in a C# program. If it were, then we could have "predicate types" like "x is an even integer" or "x is not a prime number" or "x is a string that is a legal VBScript program" or whatever. We don't have types like that in the C# type system.

So if a type is not a name, and a type is not a set of values, and a type is not a predicate that determines membership, what is a type? We seem to be getting no closer to a definition.

Next time: I'll try to come up with a more workable definition of what "type" means to both compiler writers and developers.


(*) An unfortunate half-truth, as pointer types are not unified into the type system in the version of the C# language which includes the "unsafe" subset of functionality.

(**) When Russell discovered that his paradox undermined the entire arithmetic theory of Frege, he and Whitehead set about inventing what is now known as the "ramified theory of types", an almost impossible-to-understand theory that is immune to these sorts of paradoxes. Though the mathematical underpinnings of type theory/set theory/category theory/etc are fascinating, I do not understand them nearly well enough to go into detail here. Remember, what we're looking for is a definition of "type" that is amenable to doing practical work in a modern programming language.



Eduasync part 16: Example of composition: majority voting

Note: For the rest of this series, I'll be veering away from the original purpose of the project (investigating what the compiler is up to) in favour of discussing the feature itself. As such, I've added a requirement for AsyncCtpLib.dll - but due to potential distribution restrictions, I've felt it safest not to include that in the source repository. If you're running this code yourself, you'll need to copy the DLL from your installation location into the Eduasync\lib directory before it will build - or change each reference to it.

One of the things I love about async is the compositional aspect. This is partly due to the way that the Task Parallel Library encourages composition to start with, but async/await makes it even easier by building the tasks for you. In the next few posts I'll talk about a few examples of interesting building blocks. I wouldn't be surprised to see an open source library with a proper implementation of some of these ideas (Eduasync is not designed for production usage) whether from Microsoft or a third party.

In project 26 of Eduasync, I've implemented "majority voting" via composition. The basic idea is simple, and the motivation should be reasonably obvious in this day and age of redundant services. You have (say) five different tasks which are meant to be computing the same thing. As soon as you have a single answer which the majority of the tasks agree on, the code which needs the result can continue. If the tasks disagree, or fail (or a combination leading to no single successful majority result), the overall result is failure too.

My personal experience with services requiring a majority of operations to return is with Megastore, a storage system we use at Google. I'm not going to pretend to understand half of the details of how Megastore works, and I'm certainly not about to reveal any confidential information about its internals or indeed how we use it, but basically when discussing it with colleagues at around the time that async was announced, I contemplated what a handy feature async would be when implementing a Megastore client. It could also be used in systems where each calculation is performed in triplicate to guard against rogue errors - although I suspect the chances of those systems being implemented in C# are pretty small.

It's worth mentioning that the implementation here wouldn't be appropriate for something like a stock price service, where the result can change rapidly and you may be happy to tolerate a small discrepancy, within some bounds.


Here are the signatures of the methods we'll implement:

public static Task<T> WhenMajority<T>(params Task<T>[] tasks)

public static Task<T> WhenMajority<T>(IEnumerable<Task<T>> tasks)

Obviously the first just delegates to the second, but it's helpful to have both forms, so that we can pass in a few tasks in an ad hoc manner with the first overload, or a LINQ-generated sequence of tasks with the second.

The name is a little odd - it's meant to match WhenAll and WhenAny, but I'm sure there are better options. I'm not terribly interested in that at the moment.

It's easy to use within an async method:

Task<int> firstTask = firstServer.ComputeSomethingAsync(input);
Task<int> secondTask = secondServer.ComputeSomethingAsync(input);
Task<int> thirdTask = thirdServer.ComputeSomethingAsync(input);

int result = await MoreTaskEx.WhenMajority(firstTask, secondTask, thirdTask);

Or using the LINQ-oriented overload:

var tasks = servers.Select(server => server.ComputeSomethingAsync(input));
int result = await MoreTaskEx.WhenMajority(tasks);

Of course we could add an extension method (dropping the When prefix as it doesn't make as much sense there, IMO):

int result = await servers.Select(server => server.ComputeSomethingAsync(input))
                          .Majority();

The fact that we've stayed within the Task<T> model is what makes it all work so smoothly. We couldn't easily express the same API for other awaitable types in general, although we could do it for any other specific awaitable type, of course. It's possible that it would work using dynamic, but I'd rather avoid that :) Let's implement it now.
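As a sketch of the extension method mentioned above - "Majority" is my assumed name, formed by dropping the "When" prefix, and it simply delegates to the static method developed in this post:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class MajorityTaskExtensions
{
    // Hypothetical extension method - "Majority" is an assumed name,
    // formed by dropping the "When" prefix from WhenMajority.
    public static Task<T> Majority<T>(this IEnumerable<Task<T>> tasks)
    {
        // Delegates straight to the static composition method from this post.
        return MoreTaskEx.WhenMajority(tasks);
    }
}
```

Because it's just a delegating wrapper, it inherits all the validation and semantics of WhenMajority itself.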


There are two parts to the implementation, in the same way that we implemented LINQ operators in Edulinq - and for the same reason. We want to go bang immediately if there are any clear input violations - such as the sequence of tasks being null or empty. This is in line with the Task-based Asynchronous Pattern white paper:

An asynchronous method should only directly raise an exception to be thrown out of the MethodNameAsync call in response to a usage error. For all other errors, exceptions occurring during the execution of an asynchronous method should be assigned to the returned Task.

Now it occurs to me that we don't really need to do this in two separate methods (one for precondition checking, one for real work). We could create an async lambda expression of type Func<Task<T>>, and make the method just return the result of invoking it - but I don't think that would be great in terms of readability.
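To illustrate that alternative single-method shape - this is only a sketch, with names of my own choosing, the released Task.WhenAny in place of the CTP's TaskEx, and a deliberately trivial body (it awaits the first task, not a majority):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SingleMethodSketch
{
    // Hypothetical one-method variant: preconditions throw synchronously
    // out of the call itself; the real work sits in an async lambda of
    // type Func<Task<T>>, so any exception it throws lands in the
    // returned task instead.
    public static Task<T> FirstResult<T>(IEnumerable<Task<T>> tasks)
    {
        if (tasks == null)
        {
            throw new ArgumentNullException("tasks");
        }
        var taskList = new List<Task<T>>(tasks);
        if (taskList.Count == 0)
        {
            throw new ArgumentException("Empty sequence of tasks");
        }
        // The async part, wrapped up and invoked immediately.
        Func<Task<T>> impl = async () =>
        {
            Task<T> first = await Task.WhenAny(taskList);
            return first.Result;
        };
        return impl();
    }
}
```

The validation still goes bang directly, while everything inside the lambda follows the "assign errors to the returned task" rule - at the cost, as noted, of some readability.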

So, the first part of the implementation performing validation is really simple:

public static Task<T> WhenMajority<T>(params Task<T>[] tasks)
{
    return WhenMajority((IEnumerable<Task<T>>) tasks);
}

public static Task<T> WhenMajority<T>(IEnumerable<Task<T>> tasks)
{
    if (tasks == null)
    {
        throw new ArgumentNullException("tasks");
    }
    List<Task<T>> taskList = new List<Task<T>>(tasks);
    if (taskList.Count == 0)
    {
        throw new ArgumentException("Empty sequence of tasks");
    }
    foreach (var task in taskList)
    {
        if (task == null)
        {
            throw new ArgumentException("Null task in sequence");
        }
    }
    return WhenMajorityImpl(taskList);
}
The interesting part is obviously in WhenMajorityImpl. It's mildly interesting to note that I create a copy of the sequence passed in to start with - I know I'll need it in a fairly concrete form, so it's appropriate to remove any laziness at this point.

So, here's WhenMajorityImpl, which I'll then explain:

private static async Task<T> WhenMajorityImpl<T>(List<Task<T>> tasks)
{
    // Need a real majority - so for 4 or 5 tasks, must have 3 equal results.
    int majority = (tasks.Count / 2) + 1;
    int failures = 0;
    int bestCount = 0;
    Dictionary<T, int> results = new Dictionary<T, int>();
    List<Exception> exceptions = new List<Exception>();
    while (true)
    {
        await TaskEx.WhenAny(tasks);
        var newTasks = new List<Task<T>>();
        foreach (var task in tasks)
        {
            switch (task.Status)
            {
                case TaskStatus.Canceled:
                    failures++;
                    break;
                case TaskStatus.Faulted:
                    failures++;
                    exceptions.Add(task.Exception);
                    break;
                case TaskStatus.RanToCompletion:
                    int count;
                    // Doesn't matter whether it was there before or not - we want 0 if not anyway
                    results.TryGetValue(task.Result, out count);
                    count++;
                    if (count > bestCount)
                    {
                        bestCount = count;
                        if (count >= majority)
                        {
                            return task.Result;
                        }
                    }
                    results[task.Result] = count;
                    break;
                default:
                    // Keep going next time. May not be appropriate for Created...
                    newTasks.Add(task);
                    break;
            }
        }
        // The new list of tasks to wait for
        tasks = newTasks;

        // If we can't possibly get a majority, bail out.
        if (tasks.Count + bestCount < majority)
        {
            throw new AggregateException("No majority result possible", exceptions);
        }
    }
}

I should warn you that this isn't a particularly efficient implementation - it was just one I wrote until it worked. The basic steps are:

  • Work out how many results make a majority, so we know when to stop
  • Keep track of how many "votes" our most commonly-returned result has, along with the counts of all the votes
  • Repeatedly:
    • Wait (asynchronously) for at least one of the remaining tasks to finish (many may finish "at the same time")
    • Start a new list of "tasks we're going to wait for next time"
    • Process each task in the current list, taking an action on each state:
      • If it's been cancelled, we'll treat that as a failure (we could potentially treat "the majority have been cancelled" as a cancellation, but for the moment a failure is good enough)
      • If it's faulted, we'll add the exception to the list of exceptions, so that if the overall result ends up as failure, we can throw an AggregateException with all of the individual exceptions
      • If it's finished successfully, we'll check the result:
        • Add 1 to the count for that result (the dictionary will use the default comparer for the result type, which we assume is good enough)
        • If this is greater than the previous "winner" (which could be for the same result), check for it being actually an overall majority, and return if so.
      • If it's still running (or starting), add it to the new task list
    • Check whether enough tasks have failed - or given different results - that a majority is now impossible. If so, throw an AggregateException to say so. It may contain some exceptions, but it may not (if there are three tasks which gave different results, none of them actually failed)

Each iteration of the "repeatedly" will have a smaller list to check than before, so we'll definitely terminate at some point.

I mentioned that it's inefficient. In particular, we're ignoring the fact that WhenAny returns a Task<Task<T>>, so awaiting that actually tells us which task has finished. We don't need to loop over the whole collection at that point - we could just remove that single task from the collection. We could do that efficiently if we kept a Dictionary<Task<T>, LinkedListNode<Task<T>>> and a LinkedList<Task<T>> - we'd just look up the completed task in the dictionary, remove its node from the list, and remove the entry from the dictionary. We wouldn't need to create a new collection each time, or iterate through all of the old one. However, that's a job for another day... as is allowing a cancellation token to be passed in, and a custom equality comparer.
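Here's a rough sketch of that O(1) bookkeeping - names are mine, and it uses the released Task.WhenAny rather than the CTP's TaskEx:

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class PendingTaskSet
{
    // A linked list of pending tasks plus a dictionary from each task to
    // its node, so the single completed task that WhenAny hands back can
    // be unlinked without rebuilding or rescanning the collection.
    public static async Task<Task<T>> RemoveNextCompleted<T>(
        LinkedList<Task<T>> pending,
        Dictionary<Task<T>, LinkedListNode<Task<T>>> nodes)
    {
        // WhenAny returns a Task<Task<T>>; awaiting it yields the first
        // task to finish, so we know exactly which entry to remove.
        Task<T> completed = await Task.WhenAny(pending);
        LinkedListNode<Task<T>> node;
        if (nodes.TryGetValue(completed, out node))
        {
            pending.Remove(node);
            nodes.Remove(completed);
        }
        return completed;
    }
}
```

Populating the two structures is a one-liner per task: `nodes[task] = pending.AddLast(task);`. The voting loop would then process only the returned task on each iteration, instead of walking the whole list.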


So we can make this implementation smarter and more flexible, certainly - but it's not insanely tricky to write. I'm reasonably confident that it works, too - as I have unit tests for it. They'll come in the next part. The important point from this post is that by sticking within the Task<T> world, we can reasonably easily create building blocks to allow for composition of asynchronous operations. While it would be nice to have someone more competent than myself write a bullet-proof, efficient implementation of this operation, I wouldn't feel too unhappy using a homegrown one in production. The same could not have been said pre-async/await. I just wouldn't have had a chance of getting it right.

Next up - the unit tests for this code, in which I introduce the TimeMachine class.