What is Async Programming, and Why Should You Care?

Async programming is a style of programming that lets a program keep doing useful work while it waits for events that happen independently of the main program flow, which can improve performance. Common tasks that benefit from being asynchronous include:

  • handling user interface events
  • communicating over a network
  • reading / writing to a secondary storage device

What's the problem?

As programmers, we are used to working with synchronous code. Synchronous functions are easy to understand - we call them, they do some work, and they return a value.

There are situations where this leads to very bad performance. Imagine we want to get the contents of every page given by a list of URLs.

public List<string> GetAllPages(List<string> urls)
{
    var pages = new List<string>();
    foreach (var url in urls)
    {
        pages.Add(GetPage(url));
    }
    return pages;
}

This will be very slow for large lists of URLs, because the foreach loop has to wait for each page request to finish before starting the next one. Each call to GetPage is therefore blocking: it blocks the calling thread until the result is returned.
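To make the cost concrete, here is a minimal sketch of the blocking version. GetPage is not defined in this post, so the one below is a hypothetical stand-in that sleeps for 200 ms instead of making a real request; the class name and example URLs are also just for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public static class BlockingDemo
{
    // Hypothetical stand-in for a real page download: it blocks the
    // calling thread for 200 ms and returns a fake page body.
    public static string GetPage(string url)
    {
        Thread.Sleep(200);
        return "contents of " + url;
    }

    public static List<string> GetAllPages(List<string> urls)
    {
        var pages = new List<string>();
        foreach (var url in urls)
        {
            // The loop waits here before it can start the next request.
            pages.Add(GetPage(url));
        }
        return pages;
    }

    public static void Main()
    {
        var urls = new List<string> { "a.example", "b.example", "c.example" };
        var timer = Stopwatch.StartNew();
        var pages = GetAllPages(urls);
        timer.Stop();
        // The waits add up: three pages take roughly three times as long as one.
        Console.WriteLine($"{pages.Count} pages in {timer.ElapsedMilliseconds} ms");
    }
}
```

Because the waits happen one after another, the total time grows linearly with the number of URLs.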

How can we fix it?

Async programming helps fix this issue by separating the function call and the function result. For example, we can start all the requests, and then wait for the results to arrive. Pseudocode might look something like this:

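def GetAllPages(urls):
    pages = list()

    # start every request without waiting for any of them
    for url in urls:
        StartRequest(url)

    # collect the responses once they have arrived
    for url in urls:
        pages.append(WaitForResponse(url))

    return pages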

This is rather clunky, as you have to define a start and an end function for each operation you want to make asynchronous. Thankfully, C# has a nice syntax to make this easier.

How does it work in C#?

In C#, asynchronous operations are represented using the Task and Task<T> types. The "start" function returns a Task<T> object, from which the result can later be obtained. If the function GetPage() was written using the Task-based Asynchronous Pattern (TAP), so that it returns a Task<string>, you could write the program as follows:

public List<string> GetAllPages(List<string> urls)
{
    var tasks = new List<Task<string>>();
    foreach (var url in urls)
    {
        // starts the asynchronous operation
        var task = GetPage(url);
        tasks.Add(task);
    }

    var pages = new List<string>();
    foreach (var task in tasks)
    {
        // blocks the thread to wait for the result
        pages.Add(task.Result);
    }

    return pages;
}
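The function above assumes a GetPage that returns Task<string>. As a rough, self-contained sketch, the version below uses a hypothetical GetPage that awaits Task.Delay for 200 ms in place of a real request (a real one might wrap something like HttpClient.GetStringAsync instead). The class name and URLs are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

public static class TaskDemo
{
    // Hypothetical stand-in: waits 200 ms without holding a thread,
    // then returns a fake page body.
    public static async Task<string> GetPage(string url)
    {
        await Task.Delay(200);
        return "contents of " + url;
    }

    public static List<string> GetAllPages(List<string> urls)
    {
        // Start every request before waiting for any of them.
        var tasks = new List<Task<string>>();
        foreach (var url in urls)
        {
            tasks.Add(GetPage(url));
        }

        // Block on each result in turn; the requests overlap in time.
        var pages = new List<string>();
        foreach (var task in tasks)
        {
            pages.Add(task.Result);
        }
        return pages;
    }

    public static void Main()
    {
        var urls = new List<string> { "a.example", "b.example", "c.example" };
        var timer = Stopwatch.StartNew();
        var pages = GetAllPages(urls);
        timer.Stop();
        // The delays overlap, so the total is close to one delay, not three.
        Console.WriteLine($"{pages.Count} pages in {timer.ElapsedMilliseconds} ms");
    }
}
```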

Whilst this works, it isn't ideal. Although the individual calls to GetPage no longer block, GetAllPages itself still blocks on .Result, so any external code that calls it is blocked until all the requests have finished.

To combat this, we can write the function in an asynchronous style. This usually takes three steps:

  1. Add the async keyword
  2. Change the return type to Task<T> (or Task if the function returns nothing)
  3. Replace synchronous waits (Task.Wait(), Task.Result) with the await keyword

The previous function would now look as follows:

public async Task<List<string>> GetAllPages(List<string> urls)
{
    var tasks = new List<Task<string>>();
    foreach (var url in urls)
    {
        // starts the asynchronous operation
        var task = GetPage(url);
        tasks.Add(task);
    }

    var pages = new List<string>();
    foreach (var task in tasks)
    {
        // doesn't block the thread
        pages.Add(await task);
    }

    return pages;
}
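As an alternative to awaiting each task in a loop, the standard library's Task.WhenAll can await the whole collection at once and return the results in the same order as the tasks. A minimal sketch, again assuming a hypothetical GetPage that uses Task.Delay in place of a real request:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Hypothetical stand-in for a real asynchronous page download.
    public static async Task<string> GetPage(string url)
    {
        await Task.Delay(200);
        return "contents of " + url;
    }

    public static async Task<List<string>> GetAllPages(List<string> urls)
    {
        // ToList() forces every request to start immediately.
        List<Task<string>> tasks = urls.Select(url => GetPage(url)).ToList();

        // Completes when every task has finished; doesn't block the thread.
        string[] pages = await Task.WhenAll(tasks);
        return new List<string>(pages);
    }

    public static async Task Main()
    {
        var urls = new List<string> { "a.example", "b.example" };
        List<string> pages = await GetAllPages(urls);
        foreach (var page in pages)
        {
            Console.WriteLine(page);
        }
    }
}
```

Both versions start all the requests up front; Task.WhenAll mainly saves writing the second loop by hand.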

await vs .Result

When you call .Result on a task that hasn't finished, the calling thread stays blocked until the task completes, doing no useful work while it waits.

On the other hand, using await frees up the thread to allow other tasks to run. For example, this means that a server can use only 4 system threads to handle 100 clients (as opposed to the 100 system threads in a naive approach).

Therefore, await should be used in place of .Result whenever possible.

Note: A function must be marked async for the await keyword to be used.
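The thread-saving claim above can be illustrated with a small experiment. This is only a sketch: HandleClient is a hypothetical handler in which a 500 ms Task.Delay stands in for waiting on IO, and the number of threads you actually observe depends on the machine and the runtime.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ClientDemo
{
    // Hypothetical client handler. While the delay is pending, no thread
    // is held; the method resumes on whichever thread is available.
    public static async Task<int> HandleClient(
        int id, ConcurrentDictionary<int, bool> threadsSeen)
    {
        await Task.Delay(500);
        // Record which thread the handler resumed on.
        threadsSeen[Thread.CurrentThread.ManagedThreadId] = true;
        return id;
    }

    public static async Task<int> HandleAll(
        int clientCount, ConcurrentDictionary<int, bool> threadsSeen)
    {
        var tasks = Enumerable.Range(0, clientCount)
            .Select(id => HandleClient(id, threadsSeen))
            .ToList();
        int[] ids = await Task.WhenAll(tasks);
        return ids.Length;
    }

    public static async Task Main()
    {
        var threadsSeen = new ConcurrentDictionary<int, bool>();
        int handled = await HandleAll(100, threadsSeen);
        Console.WriteLine(
            $"{handled} clients resumed on {threadsSeen.Count} distinct threads");
    }
}
```

If each handler blocked a thread instead (for example with Thread.Sleep), serving all 100 clients at the same time would need 100 threads.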

What's the catch?

Async programming can be very useful in certain situations. As a rule of thumb, it only increases performance when the program is IO-bound. You shouldn't use async functions to do CPU-bound calculations, as it:

  • provides almost no useful functionality
  • makes the code less readable
  • might decrease performance

One exception to this rule is the Task.Run() function, which allows CPU-bound work to be performed on a background (thread pool) thread, so the calling thread stays free while the calculation runs.
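A minimal sketch of that exception is shown below. SumTo is a made-up CPU-bound function used purely for illustration; any expensive calculation could take its place.

```csharp
using System;
using System.Threading.Tasks;

public static class CpuDemo
{
    // Deliberately CPU-bound: sums the integers from 1 to n.
    public static long SumTo(int n)
    {
        long total = 0;
        for (int i = 1; i <= n; i++)
        {
            total += i;
        }
        return total;
    }

    public static async Task<long> SumToInBackground(int n)
    {
        // Task.Run queues the work to the thread pool; awaiting the
        // returned task keeps the caller free while the sum is computed.
        return await Task.Run(() => SumTo(n));
    }

    public static async Task Main()
    {
        long result = await SumToInBackground(1000);
        Console.WriteLine(result); // prints 500500
    }
}
```

The calculation still uses a thread for its whole duration; Task.Run only moves it off the calling thread, which matters most when the caller is a UI thread.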

Footnote

Disclaimer - I'm a (mostly) self-taught programmer, and I use my blog to share things that I've learnt on my journey to becoming a better developer. Because of this, I apologise in advance for any inaccuracies I might have made - criticism and corrections are welcome!