Saturday, May 12, 2007

.NET ThreadPool - Pitfalls and gotchas

Multithreading is used extensively in user interface applications, mainly to perform time-consuming operations in the background while keeping the user interface responsive and not blocking the user. While multithreading is good, having too many threads active at once can hurt performance instead of improving it, simply because of the number of expensive context switches that need to be performed.

A middle way, therefore, is to make use of thread pools. .NET provides a ready-made implementation in the form of the System.Threading.ThreadPool class. The CLR maintains a single thread pool for each process (in .NET 2.0 the default limit is 25 worker threads per processor). Asynchronous tasks are run through the methods of this class, typically by calling QueueUserWorkItem, which queues a request to be picked up by an available thread in the pool.
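As a rough illustration of the call described above, the sketch below queues a few work items with QueueUserWorkItem, waits for them to finish, and reads the pool's worker thread limit. The class and helper names (QueueDemo, RunWorkItems, MaxWorkerThreads) are made up for this example; only the ThreadPool, Interlocked and ManualResetEvent calls are framework APIs.

```csharp
using System;
using System.Threading;

public static class QueueDemo
{
    // Hypothetical helper: queues 'count' work items on the shared pool,
    // blocks the caller until all of them have run, and returns how many ran.
    public static int RunWorkItems(int count)
    {
        if (count <= 0) return 0;

        int completed = 0;
        int remaining = count;
        ManualResetEvent allDone = new ManualResetEvent(false);

        for (int i = 0; i < count; i++)
        {
            // The second argument is handed to the callback as 'state'.
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                Interlocked.Increment(ref completed);
                // The last work item to finish releases the waiting caller.
                if (Interlocked.Decrement(ref remaining) == 0)
                    allDone.Set();
            }, i);
        }

        allDone.WaitOne();
        allDone.Close();
        return completed;
    }

    // Hypothetical helper: the current upper limit on worker threads in the pool.
    public static int MaxWorkerThreads()
    {
        int workers, completionPorts;
        ThreadPool.GetMaxThreads(out workers, out completionPorts);
        return workers;
    }

    public static void Main()
    {
        Console.WriteLine("Max worker threads: " + MaxWorkerThreads());
        Console.WriteLine("Work items completed: " + RunWorkItems(10));
    }
}
```

Note that the caller here is the main thread, so blocking on the event does not tie up a pool thread. The limit reported by GetMaxThreads depends on the runtime version and the machine.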

While the ThreadPool makes life a lot easier for the developer (all the intricacies of creating, managing and destroying threads are hidden and happen behind the scenes) and also improves performance (a quick comparison between a manual Thread.Start() and ThreadPool.QueueUserWorkItem() shows a big difference), there are some pitfalls to remember when using it. Here are some:

1. The ThreadPool is leveraged by the .NET Framework itself for a lot of tasks. ADO.NET, .NET Remoting, timers and the built-in delegate BeginInvoke methods all use the ThreadPool internally. This means the thread pool does not belong to your application alone; it is also being used and loaded by the framework.

2. Tasks queued with QueueUserWorkItem can wait in the queue for a long time, but the actual work done by each task should be small and fast, so that no single pool thread stays tied up for long.

3. Once a task is submitted to the queue, you have no control over the thread that executes it, and no way to get its state or set the thread's priority. It is not possible to create named threads using the ThreadPool class, so there is no way to track a particular thread. It is therefore best to use the ThreadPool only when you want to run independent tasks asynchronously, with no need to prioritize them or to make sure they run in a particular order.

4. One ThreadPool is created per process, and a process can host multiple AppDomains. So if one application using the ThreadPool behaves badly, another application in the same process runs the risk of being affected!

5. It is critical to write code in such a way that deadlocks do not occur. While this is basic care with any use of threads, it matters even more with the ThreadPool because of point 1 above. The catch is explained below:

Let us say there is a method called "ConnectTo" that opens and closes a socket using the .NET "BeginConnect" and "EndConnect" methods, which internally make use of the ThreadPool, and that waits for the asynchronous operation to complete. There is also a task "WriteToSocket" that is submitted to the queue to run on the ThreadPool. Now imagine that 2 such tasks are created and the pool size is 2. Both threads in the ThreadPool are already occupied by the "WriteToSocket" tasks. Each of these tasks, however, calls "ConnectTo", which requires a thread from the ThreadPool in order to execute the asynchronous "BeginConnect" operation. No thread is free, so neither "ConnectTo" call can finish and neither "WriteToSocket" task can give its thread back: the famous deadlock situation.
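The shape of that scenario can be sketched roughly as follows. This is not real socket code: BeginFakeConnect is a made-up stand-in for an API like BeginConnect whose completion runs on a pool thread, and the timeout parameter is an assumption added so the example returns false instead of hanging if the pool is exhausted.

```csharp
using System;
using System.Threading;

public static class BlockingOnPoolDemo
{
    // Hypothetical stand-in for an asynchronous API such as BeginConnect:
    // the completion callback is dispatched on a ThreadPool thread.
    static void BeginFakeConnect(WaitCallback onConnected)
    {
        ThreadPool.QueueUserWorkItem(onConnected);
    }

    // Synchronous wrapper that blocks until the pool has run the callback.
    // Returns false if no pool thread ran it within the timeout.
    public static bool ConnectTo(int timeoutMs)
    {
        // Deliberately not disposed: after a timeout, a late callback
        // may still call Set() on this event.
        ManualResetEvent connected = new ManualResetEvent(false);
        BeginFakeConnect(delegate(object state) { connected.Set(); });
        return connected.WaitOne(timeoutMs);
    }

    // Runs the "WriteToSocket" task as a pool work item and reports
    // whether its nested ConnectTo call completed.
    public static bool WriteToSocket(int timeoutMs)
    {
        bool result = false;
        ManualResetEvent finished = new ManualResetEvent(false);
        ThreadPool.QueueUserWorkItem(delegate(object state)
        {
            // This pool thread is now blocked, waiting for work that
            // itself needs a pool thread.
            result = ConnectTo(timeoutMs);
            finished.Set();
        });
        finished.WaitOne();
        return result;
    }

    public static void Main()
    {
        Console.WriteLine("Connected: " + WriteToSocket(5000));
    }
}
```

When the pool has a free thread, the callback runs and ConnectTo returns true. If every pool thread were stuck inside WriteToSocket, the callback would never be scheduled; without the timeout, that wait would never end.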

Some rules of thumb to avoid a situation like the one above:

a. Do not create any class whose synchronous methods wait for asynchronous functions, since this class could be called from a thread on the pool.

b. Do not use any class inside an asynchronous function if that class blocks waiting for asynchronous functions.

c. Never block a thread running on the pool while it waits for another function that also runs on the pool. So basically, know which of the .NET built-in functions make use of the ThreadPool!
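One way to follow these rules is to hand the rest of the work to a callback instead of waiting inside the pool thread. The sketch below is only an illustration under assumptions: BeginFakeConnect and WriteToSocketAsync are hypothetical names, and the "connect" is simulated by a queued work item rather than a real socket.

```csharp
using System;
using System.Threading;

public static class NonBlockingDemo
{
    // Hypothetical stand-in for an asynchronous API whose completion
    // callback runs on a ThreadPool thread.
    static void BeginFakeConnect(WaitCallback onConnected)
    {
        ThreadPool.QueueUserWorkItem(onConnected);
    }

    // Instead of blocking until the connect completes, the work item starts
    // the connect and returns, and the remaining work runs in the callback.
    public static void WriteToSocketAsync(string payload, Action<string> onWritten)
    {
        ThreadPool.QueueUserWorkItem(delegate(object state)
        {
            BeginFakeConnect(delegate(object ignored)
            {
                // Runs later on whichever pool thread is free.
                onWritten("sent:" + payload);
            });
            // No wait here: this pool thread goes straight back to the pool.
        });
    }

    public static void Main()
    {
        ManualResetEvent done = new ManualResetEvent(false);
        string result = null;

        WriteToSocketAsync("hello", delegate(string r)
        {
            result = r;
            done.Set();
        });

        // Only the main thread waits, never a pool thread.
        done.WaitOne();
        Console.WriteLine(result);
    }
}
```

Because no pool thread ever waits on another pool work item, the two steps can share even a very small pool without getting stuck.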

4 comments:

Siddhesh said...

Please do post these articles on our MTP Dev forums!

Arati Rahalkar said...

What do you think I am going to do first thing on Monday? :-)

Actually, a lot of my blogs are on the MTP forums already.

Unknown said...

Hi Arati,

For the deadlock thing that you mentioned here, isn't it true that if the pool size was 2, with both threads taken by the 'WriteToSocket' tasks, and there are then some more requests for threads from the same pool, the pool will automatically spawn some more threads?
I think that's the whole point of the thread pool: it spawns threads on demand, taking CPU utilization into consideration.

Also, I have just started coding something in the C# multithreading model using a thread pool.
I am stuck. If you have time, can I share my design/problem with you? It's really very basic, I guess. Just that I'm new to multithreaded programming in C#. I've done this in C before using Pthreads.

Thanks,
Jas

Arati Rahalkar said...

Hi Jas,

I would love to help if I can :). Do send in your code snippet to me at arati.rahalkar@gmail.com. And I will try to answer the questions as much as I can.

Thanks,
Arati