Tuesday, June 30, 2015

C# async and await programming model from scratch

Introduction

This is a brief introduction to the async and await keywords for a developer who wants to understand the basics and get some insight into how things work internally.

Background

Asynchronous programming is now essential when we develop any application, because it avoids blocking the main thread on long running operations such as disk I/O, network operations and database access. In the normal case, if our program needs to do something with the results of these long operations, our code is stuck until the operation is done, and we proceed only from that point.

Using the async mechanism, we can trigger long running operations and carry on with other tasks. Those long running operations do their job on a different thread, and when they complete they notify our main code, which can then perform the follow-up actions. By "our code" we mean the main thread, which deals with the user interface, or the thread which primarily processes a web request. Sometimes we write these kinds of long running operations ourselves.

What is async and await

Put simply, these are two keywords introduced with C# 5.0 and .NET 4.5 to make asynchronous programs easier to write. They work at the method level. Of course we cannot make classes work in parallel, as a class is not a unit of execution.

Are these keywords known to the CLR, the .NET runtime, or are they a wrapper over the Task Parallel Library (TPL)? If they are wrappers, is it good for a language to depend on a library written in that same language?

We will find out the answer to these questions in this article.

History of .Net async programming

Threads were there from the very beginning of the .NET Framework. They were wrappers over operating system threads and were a little difficult to work with. Then more concepts such as the background worker, async delegates and the Task Parallel Library arrived to ease the async programming model. Those came as part of the class library. The C# language itself had no out-of-the-box support for async programming until the async and await keywords were introduced with C# 5.0. Let's see how async and await help us in async programming by examining each of these methods.

Example

Let's take the example of finding the factorial of each of the first N numbers that is exactly divisible by 3. We are using a console application for simplicity. If we had used a Windows application, we could easily have hooked into an async event handler and demonstrated the async features that way, but that would not help us learn the language features.

Synchronous code


We can see there is a loop that runs from 1 to 5 using a counter variable. It checks whether the current counter value is exactly divisible by 3. If so, it writes the factorial. The writing function calculates the factorial by calling the FindFactorialWithSimulatedDelay() method. In this sample, that method introduces a delay to simulate a real-life workload. In other words, this is the long running operation.
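
The synchronous version described above could be sketched as follows. This is a minimal reconstruction rather than the original listing: the method names WriteFactorial() and FindFactorialWithSimulatedDelay() come from the text, while the class name, the 500 ms delay and printing the plain counter for non-multiples of 3 are assumptions made for illustration.

```csharp
using System;
using System.Threading;

public static class SyncFactorialDemo
{
    public static void Main()
    {
        for (int counter = 1; counter <= 5; counter++)
        {
            if (counter % 3 == 0)
            {
                // Blocks the loop until the factorial is calculated.
                WriteFactorial(counter);
            }
            else
            {
                // Assumption: other numbers are simply echoed.
                Console.WriteLine(counter);
            }
        }
    }

    public static void WriteFactorial(int number)
    {
        int result = FindFactorialWithSimulatedDelay(number);
        Console.WriteLine("Factorial of {0} is {1}", number, result);
    }

    public static int FindFactorialWithSimulatedDelay(int number)
    {
        // Assumed delay that stands in for a real long running operation.
        Thread.Sleep(500);

        int result = 1;
        for (int i = 2; i <= number; i++)
        {
            result *= i;
        }
        return result;
    }
}
```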

We can easily see that the execution happens in sequence. The WriteFactorial() call in the loop waits until the factorial is calculated. Why should we wait here? Why can't we move to the next number, as there is no dependency between numbers? We can. But what about the Console.WriteLine statement in WriteFactorial()? It has to wait until the factorial is found. This means we can call FindFactorialWithSimulatedDelay() asynchronously, provided there is a callback into WriteFactorial(). When the async invocation happens, the loop can advance the counter to the next number and call WriteFactorial() again.

Threading is one way to achieve this. Since threading is difficult and needs more knowledge than a typical developer has, we use the async delegate mechanism instead. Below is the rewrite of the WriteFactorial() method using an async delegate.

Making it async using async delegates

One of the easier methods used earlier was Asynchronous Delegate Invocation. It uses the Begin/End method call mechanism. Here the runtime uses a thread from the thread pool to execute the code, and we can have callbacks once it is completed. The code below, which uses a Func delegate, explains it well.

There is no change in finding the factorial. We simply added a new function called WriteFactorialAsyncUsingDelegate() and modified Main to call this method from the loop.

As soon as BeginInvoke is called on the findFact delegate, the main thread goes back to the counter loop, increments the counter and continues looping. When the factorial is available, the anonymous callback is hit and the result is written to the console.
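
A sketch of the delegate-based version, under some assumptions: WriteFactorialAsyncUsingDelegate(), findFact and FindFactorialWithSimulatedDelay() are names from the text, while the class name, the delay values and the final sleep in Main are made up for illustration. Note that delegate BeginInvoke/EndInvoke works on the .NET Framework only; on .NET Core and later the same call throws PlatformNotSupportedException.

```csharp
using System;
using System.Threading;

public static class DelegateFactorialDemo
{
    public static void Main()
    {
        for (int counter = 1; counter <= 5; counter++)
        {
            if (counter % 3 == 0)
            {
                // Returns immediately; the loop moves on to the next number.
                WriteFactorialAsyncUsingDelegate(counter);
            }
            else
            {
                Console.WriteLine(counter);
            }
        }

        // Assumption: a crude wait so the process stays alive
        // until the pending callback has written its result.
        Thread.Sleep(1500);
    }

    public static void WriteFactorialAsyncUsingDelegate(int number)
    {
        Func<int, int> findFact = FindFactorialWithSimulatedDelay;

        // .NET Framework only: runs findFact on a thread pool thread.
        findFact.BeginInvoke(
            number,
            asyncResult =>
            {
                // EndInvoke hands back the return value of the delegate.
                int result = findFact.EndInvoke(asyncResult);
                Console.WriteLine("Factorial of {0} is {1}", number, result);
            },
            null);
    }

    public static int FindFactorialWithSimulatedDelay(int number)
    {
        Thread.Sleep(500); // assumed delay simulating the workload

        int result = 1;
        for (int i = 2; i <= number; i++)
        {
            result *= i;
        }
        return result;
    }
}
```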

We don't have a direct option to cancel the task. Also, waiting for one or more methods to finish is a little difficult.

We can also see that the piece of code is not wrapped as an object, and we need to battle with the IAsyncResult object to get the result back. TPL solves that problem too, and it looks more object oriented. Let's have a look.

Improving async programming using TPL

TPL was introduced in .NET 4.0. We can wrap the asynchronous code in a Task object and execute it. We can wait for one or many tasks to complete, cancel a task easily, and there is more to it. Below is a rewrite of our factorial writing code with TPL.

Here we can see that the first task is run, and then it continues with the next task. That second task is the completion handler, which receives the notification from the first task and writes the result to the console.
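
The task-plus-continuation shape described above might look like the sketch below. The method name WriteFactorialAsyncUsingTask() and the class name are hypothetical, and Task.Factory.StartNew is used because it is available from .NET 4.0; the original listing may have differed.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TplFactorialDemo
{
    public static void Main()
    {
        var pending = new List<Task>();

        for (int counter = 1; counter <= 5; counter++)
        {
            if (counter % 3 == 0)
            {
                pending.Add(WriteFactorialAsyncUsingTask(counter));
            }
            else
            {
                Console.WriteLine(counter);
            }
        }

        // Waiting on many tasks is a single call with TPL.
        Task.WaitAll(pending.ToArray());
    }

    // Hypothetical name: runs the calculation as a task and attaches
    // a continuation that acts as the completed handler.
    public static Task WriteFactorialAsyncUsingTask(int number)
    {
        Task<int> findTask =
            Task.Factory.StartNew(() => FindFactorialWithSimulatedDelay(number));

        return findTask.ContinueWith(
            completed => Console.WriteLine(
                "Factorial of {0} is {1}", number, completed.Result));
    }

    public static int FindFactorialWithSimulatedDelay(int number)
    {
        Thread.Sleep(500); // assumed delay simulating the workload

        int result = 1;
        for (int i = 2; i <= number; i++)
        {
            result *= i;
        }
        return result;
    }
}
```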

Still, this is not a language feature. We need to reference the TPL libraries to get the support. The main problem here is the effort needed to write the completed event handler. Let's see how this can be rewritten using the async and await keywords.

The language feature async and await

We are going to see how the TPL sample can be rewritten using the async and await keywords. We decorate the WriteFactorialAsyncUsingAwait method with the async keyword to denote that this function does its work asynchronously and may contain await keywords. Without async we cannot await.

Then we await the factorial finding function. The moment the await is encountered during execution (and the awaited task has not yet finished), control returns to the calling method and execution resumes from there. In our case that is the counter loop, which takes the next number. The awaited code is executed as a TPL task. As usual, it takes a thread from the pool and runs on it. Once that execution is completed, the statements below the await are executed.
Here too, we do not change anything in FindFactorialWithSimulatedDelay().

This avoids the need for extra callback handlers, and developers can write the code in a sequential manner.
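
A minimal sketch of the await-based version, assuming the work is pushed to the thread pool with Task.Run. WriteFactorialAsyncUsingAwait() and FindFactorialWithSimulatedDelay() are names from the text; the class name, the Task return type and the delay are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class AwaitFactorialDemo
{
    public static void Main()
    {
        var pending = new List<Task>();

        for (int counter = 1; counter <= 5; counter++)
        {
            if (counter % 3 == 0)
            {
                // Returns at the first await; the loop continues.
                pending.Add(WriteFactorialAsyncUsingAwait(counter));
            }
            else
            {
                Console.WriteLine(counter);
            }
        }

        Task.WaitAll(pending.ToArray());
    }

    public static async Task WriteFactorialAsyncUsingAwait(int number)
    {
        // The calculation runs on a thread pool thread.
        int result = await Task.Run(() => FindFactorialWithSimulatedDelay(number));

        // Runs only after the awaited task has completed,
        // with no explicit callback handler.
        Console.WriteLine("Factorial of {0} is {1}", number, result);
    }

    public static int FindFactorialWithSimulatedDelay(int number)
    {
        Thread.Sleep(500); // assumed delay simulating the workload

        int result = 1;
        for (int i = 2; i <= number; i++)
        {
            result *= i;
        }
        return result;
    }
}
```

Compared with the continuation style, the line after the await plays the role of the completed handler, which is why the method reads top to bottom.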

What is the relation between the Task Parallel Library and the async and await keywords?

The async and await keywords make use of TPL internally. More precisely, we can say async and await are syntactic sugar in the C# language. Still not clear? In other words, the .NET runtime doesn't know about the async and await keywords.

Look at the above disassembled code of WriteFactorialAsyncUsingAwait(). I used Reflector to disassemble the assembly.

Should a language depend on a library/class created with it?

This is an old question. If we look at C or C++, the language was always independent and the libraries fully depended on it. But starting with the introduction of the yield keyword in C#, we can see a marriage between language features (keywords) and libraries. Here yield, which is a language feature, depends on the IEnumerable interface, which is created using the language itself. The compiler then does the magic and replaces the yield keyword with the corresponding IEnumerable implementation, so the CLR never knows about the yield keyword.

Another example is the using keyword. It is tightly coupled with the IDisposable interface. Since then many such syntactic sugars have been added; more details can be found in the link below.

http://blogs.msdn.com/b/ericlippert/archive/2010/10/28/asynchrony-in-c-5-part-one.aspx

Personally, I prefer not to mix language features with libraries. Let the language evolve on its own and let libraries depend on the language. If we do it the other way around, the compiler is forced to inject more code, and we know that adding more lines does not come for free. Unfortunately, we are in a world where it is the coding that needs to be fast, not the execution of the code.

Should the compiler modify our code?

The main problem with the compiler modifying our code is debugging. There is a chance that the call stack of our application will contain symbols that were not written by us. Try looking at the call stack after raising an exception in an anonymous method. If there are many anonymous methods in the application, that's it: debugging becomes very hard.
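
The experiment suggested above can be tried with a small sketch like this one. The class and method names are made up for illustration. The exact frame text depends on the compiler version, but the anonymous method typically shows up under a compiler-generated name rather than anything we wrote.

```csharp
using System;

public static class CallStackDemo
{
    public static void Main()
    {
        Console.WriteLine(CaptureStackTrace());
    }

    // Raises an exception inside an anonymous method and returns
    // the stack trace recorded for it.
    public static string CaptureStackTrace()
    {
        Action fail = () =>
        {
            throw new InvalidOperationException("Raised inside an anonymous method");
        };

        try
        {
            fail();
            return string.Empty; // not reached
        }
        catch (InvalidOperationException ex)
        {
            // The top frame belongs to a method the compiler generated
            // for the lambda, not to a name present in the source.
            return ex.StackTrace;
        }
    }
}
```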

Should the language know parallel programming and threading?

This is another area to discuss. Since threading is managed by the OS, should the language care about threading? Should threading be a library or an integrated language feature?

Nowadays most hardware has multiple cores, and if the language doesn't provide integrated features, hardly anybody will leverage them. This is because either the development community is afraid of threading or it requires additional coding time. If the language offers an easy way, developers can focus more on the functionality or business side of the app than on threading, which is infrastructure related.

So I really want my language and its associated runtime to know about parallel and async programming. But please try to avoid tight coupling with the class library, and stop the compiler from altering my code.

When should we use it?

We can use async and await any time we are waiting for something, i.e. whenever we are dealing with asynchronous scenarios. Examples are file I/O, network operations and database operations. This helps us keep our UI responsive.
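
As a small illustration of the file I/O case, here is a sketch that awaits StreamReader.ReadToEndAsync(). The class name, the CountCharactersAsync() helper and the temporary file are hypothetical and exist only for the example.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncFileDemo
{
    public static void Main()
    {
        // Assumption: a throwaway temp file stands in for real input.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello async");

        int length = CountCharactersAsync(path).GetAwaiter().GetResult();
        Console.WriteLine("Read {0} characters", length);

        File.Delete(path);
    }

    // Hypothetical helper: reads a file without blocking the caller
    // while the read is in progress.
    public static async Task<int> CountCharactersAsync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            string text = await reader.ReadToEndAsync();
            return text.Length;
        }
    }
}
```

In a UI application the caller would await CountCharactersAsync() from an async event handler instead of blocking on the result as this console Main does.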

So go ahead and make sure your APIs are await-able.

References

https://msdn.microsoft.com/en-us/library/hh191443.aspx
http://stephenhaunts.com/2014/10/10/simple-async-await-example-for-asynchronous-programming/
https://richnewman.wordpress.com/2012/12/03/tutorial-asynchronous-programming-async-and-await-for-beginners/

Tuesday, June 23, 2015

Running Roslyn Analyzers from console application

This post requires prior knowledge from my earlier post. In the last post, we saw how our C# program can compile another C# program via the Roslyn compiler. Yes, it's without Visual Studio. If we are compiling without Visual Studio, how can a custom Roslyn DiagnosticAnalyzer be invoked to analyze the code fragment that is being compiled?

Here we are going to see how we can connect to custom Roslyn analyzers when our program is compiling another program using the .NET Compiler Platform.
First, we define our custom analyzer, which is nothing but a check to make sure that type names start with a capital letter. See the code below. Now let us move on to how this analyzer can be integrated into our programmatic compilation using Roslyn.
  1. GetSimpleCompilation()
    This is the same as in the previous post about simple compilation.
  2. GetAnalyzerAwareCompilation()
    Here we connect our compilation unit with the custom Roslyn diagnostic analyzer.
  3. GetAllDiagnosticsAsync()
    This is the Roslyn API that executes our custom analyzers on top of the source code input and returns the results. The important point here is that the compiler platform returns the same type, ImmutableArray<Diagnostic>, every time, i.e. regardless of whether it is a simple compilation or a compilation with analyzers.

    If the result from GetAllDiagnosticsAsync is an empty list, proceed to the actual compilation, which produces the assembly file.
  4. ShowDiagnostics()
    This too is the same as in the previous post.
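
The steps above can be sketched as follows. This is a reconstruction under assumptions, not the original sample: it targets a current Microsoft.CodeAnalysis.CSharp NuGet package, and the analyzer class name, the diagnostic id DEMO001, the messages and the input snippet are made up. GetSimpleCompilation() and GetAnalyzerAwareCompilation() reuse the step names from the text, but their signatures are guesses.

```csharp
using System;
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Diagnostics;

// Hypothetical analyzer: flags type names that do not start with a capital letter.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public class TypeNameCapitalAnalyzer : DiagnosticAnalyzer
{
    public const string DiagnosticId = "DEMO001"; // made-up id

    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        DiagnosticId,
        "Type name should start with a capital letter",
        "Type name '{0}' should start with a capital letter",
        "Naming",
        DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
    {
        get { return ImmutableArray.Create(Rule); }
    }

    public override void Initialize(AnalysisContext context)
    {
        // Ask Roslyn to call us back for every named type symbol.
        context.RegisterSymbolAction(AnalyzeNamedType, SymbolKind.NamedType);
    }

    private static void AnalyzeNamedType(SymbolAnalysisContext context)
    {
        string name = context.Symbol.Name;
        if (name.Length > 0 && !char.IsUpper(name[0]))
        {
            context.ReportDiagnostic(
                Diagnostic.Create(Rule, context.Symbol.Locations[0], name));
        }
    }
}

public static class AnalyzerDemo
{
    public static void Main()
    {
        CompilationWithAnalyzers compilation =
            GetAnalyzerAwareCompilation(GetSimpleCompilation("class sample { }"));

        // Compiler and analyzer diagnostics come back in one array.
        ImmutableArray<Diagnostic> diagnostics =
            compilation.GetAllDiagnosticsAsync().Result;

        foreach (Diagnostic diagnostic in diagnostics)
        {
            Console.WriteLine(diagnostic);
        }
    }

    public static CSharpCompilation GetSimpleCompilation(string source)
    {
        return CSharpCompilation.Create(
            "SampleAssembly",
            new[] { CSharpSyntaxTree.ParseText(source) },
            new MetadataReference[]
            {
                MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
            },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
    }

    public static CompilationWithAnalyzers GetAnalyzerAwareCompilation(
        CSharpCompilation compilation)
    {
        return compilation.WithAnalyzers(
            ImmutableArray.Create<DiagnosticAnalyzer>(new TypeNameCapitalAnalyzer()));
    }
}
```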

Updates

2022-08-10 - Added full sample to https://github.com/dotnet-demos/roslyn-compile-with-analyzers 

Tuesday, June 16, 2015

Using Roslyn to compile C# source code

Below are the steps to write a program that compiles a C# code snippet using Roslyn (now the .NET Compiler Platform) to produce an assembly. Simply put, it is a C# program compiling another C# program using the Roslyn compiler.

The Roslyn NuGet version used here is 1.0.0-rc2. This tutorial is organized in a zoom in/out fashion.

There are three main high-level steps: getting a compilation unit, compiling it, and processing the result and diagnostics it produces. The code snippet shows the steps.

  1. GetSimpleCompilation()
    Now let's look at what is happening in GetSimpleCompilation. As seen in the code, we get the source code, create a syntax tree out of it, and then prepare the references needed for the compilation. Finally, we create a CSharpCompilation using all of this information.
    1. GetSourceCode()
      This method simply returns the sample source code to compile. It can be replaced by any other code snippet.
    2. GetSyntaxTree()
      This method parses the input code and creates the CSharpSyntaxTree.
    3. GetReferences()
      A compilation always needs to know which references are required. Here it is very basic, as we are not using any advanced features.
  2. Compile()
    Now it is time to compile. It needs the assembly file name to proceed. Since GetOutpuFilePath() simply returns the dll file path, it is omitted here.
  3. ShowDiagnostics()
    This method shows any warnings from the compilation.
The compiled assembly can also be kept in memory and executed from there. Please refer to this link for more details.
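
The whole flow can be sketched as below. It is a reconstruction, not the original listing: it assumes a current Microsoft.CodeAnalysis.CSharp package rather than 1.0.0-rc2, emits to a MemoryStream instead of a dll path, and uses a made-up Calculator snippet as input. The method names follow the steps above; their signatures are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.Emit;

public static class RoslynCompileDemo
{
    public static void Main()
    {
        using (var assemblyStream = new MemoryStream())
        {
            EmitResult result = Compile(GetSimpleCompilation(), assemblyStream);
            ShowDiagnostics(result);
            Console.WriteLine(result.Success
                ? "Compilation succeeded"
                : "Compilation failed");
        }
    }

    public static CSharpCompilation GetSimpleCompilation()
    {
        return CSharpCompilation.Create(
            "SampleAssembly",
            new[] { GetSyntaxTree(GetSourceCode()) },
            GetReferences(),
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
    }

    // Made-up sample source; any other snippet can be returned instead.
    public static string GetSourceCode()
    {
        return "public class Calculator { public int Add(int a, int b) { return a + b; } }";
    }

    public static SyntaxTree GetSyntaxTree(string source)
    {
        return CSharpSyntaxTree.ParseText(source);
    }

    // Only the core library is referenced, which is enough for the sample.
    public static IEnumerable<MetadataReference> GetReferences()
    {
        return new MetadataReference[]
        {
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
        };
    }

    // Assumption: emits into a stream so the assembly stays in memory.
    public static EmitResult Compile(CSharpCompilation compilation, Stream output)
    {
        return compilation.Emit(output);
    }

    public static void ShowDiagnostics(EmitResult result)
    {
        foreach (Diagnostic diagnostic in result.Diagnostics)
        {
            Console.WriteLine(diagnostic);
        }
    }
}
```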

Happy coding...

Tuesday, June 9, 2015

WebAPI SelfHosting via OWIN v/s HttpSelfHostServer

WebAPI can be self-hosted in two ways: the traditional way, using HttpSelfHostServer, and the newer OWIN-based hosting. Below is an attempt to compare both techniques.

API Controller

Below is the controller class I am going to host using both the hosting techniques.

Differences between HttpSelfHostServer and OWIN Hosting

In HttpSelfHostServer we can see that most things are configurable, whereas in OWIN they work by convention.
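
To make the comparison concrete, here is a sketch of both hosting styles side by side. It assumes the .NET Framework with the Microsoft.AspNet.WebApi.SelfHost and Microsoft.AspNet.WebApi.OwinSelfHost NuGet packages installed. ValuesController, OwinStartup, the route template and the base address are hypothetical and not taken from the original post.

```csharp
using System;
using System.Web.Http;
using System.Web.Http.SelfHost;
using Microsoft.Owin.Hosting;
using Owin;

// Hypothetical controller hosted by both techniques.
public class ValuesController : ApiController
{
    public string Get()
    {
        return "Hello from self-hosted WebAPI";
    }
}

// OWIN looks for a method named Configuration(IAppBuilder) by convention.
public class OwinStartup
{
    public void Configuration(IAppBuilder app)
    {
        var config = new HttpConfiguration();
        config.Routes.MapHttpRoute(
            "DefaultApi", "api/{controller}/{id}",
            new { id = RouteParameter.Optional });
        app.UseWebApi(config);
    }
}

public static class SelfHostDemo
{
    public static void Main()
    {
        // Assumed address; pick one of the two hosting styles to run.
        RunWithOwin("http://localhost:8080");
    }

    public static void RunWithHttpSelfHostServer(string baseAddress)
    {
        // The address and routes are set explicitly on the configuration object.
        var config = new HttpSelfHostConfiguration(baseAddress);
        config.Routes.MapHttpRoute(
            "DefaultApi", "api/{controller}/{id}",
            new { id = RouteParameter.Optional });

        using (var server = new HttpSelfHostServer(config))
        {
            server.OpenAsync().Wait();
            Console.WriteLine("HttpSelfHostServer listening on " + baseAddress);
            Console.ReadLine();
        }
    }

    public static void RunWithOwin(string baseAddress)
    {
        using (WebApp.Start<OwinStartup>(baseAddress))
        {
            Console.WriteLine("OWIN host listening on " + baseAddress);
            Console.ReadLine();
        }
    }
}
```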

References

https://frendsrnd.wordpress.com/2014/02/03/httpselfhostserver-hosted-web-api-with-https-and-windows-authentication-enabled/

Tuesday, June 2, 2015

Invoking a method without parentheses () in C#

The biggest difficulty faced by any programmer coming to VB from C# is the lack of parentheses '()' when calling functions. In VB it is not required to put ( ) to call functions, though we can if we want. If we need to deal with a legacy codebase developed by classic VB developers, we cannot expect any brackets in the function calls.

This really reduces the readability of the program. Each time somebody sees a word, they need to spend some time working out whether it is a function or a property. Making sure what the exact parameters are also takes time. If we assume it takes a developer 30 seconds to understand the intention of a function call and its parameters without brackets, that adds up to a considerable amount of time on a project with thousands of lines and 30-50 developers.

We (at least I) tend to think that C# doesn't have this problem. But is that true? See the code snippet below.
Here the char.IsLower method is invoked 3 times, but no parentheses are used.
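
One way this can happen in C# is a method group conversion, where the method name is passed as a delegate and the callee performs the invocation. The sketch below is an assumption about what the missing snippet demonstrated; the class name, the helper and the three-character input are made up.

```csharp
using System;
using System.Linq;

public static class MethodGroupDemo
{
    public static void Main()
    {
        Console.WriteLine(CountLower("aBc"));
    }

    // char.IsLower is written without parentheses: it is converted to a
    // Func<char, bool> and Count() invokes it once per character,
    // so three times for a three-character string.
    public static int CountLower(string text)
    {
        return text.Count(char.IsLower);
    }
}
```

For the input "aBc" the method returns 2, since 'a' and 'c' are lower case.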