shyaway


shyaway 2018. 9. 15. 04:33


Task Basics


Most computers, devices, and gadgets these days ship with powerful multi-core processors, which makes parallelism increasingly important in programming. The days of running a single process at a time (think DOS) are long gone. C#, one of the fanciest and probably most modern programming languages out there, or more precisely the .NET Framework, has supported parallel programming since .NET Framework 4. That version introduced the TPL (Task Parallel Library), which provides several data structures that support parallelism and a set of algorithms for tasks and task scheduling. The task in TPL seems to mimic the scheduler in the OS itself. I'm going to cover the details in a future post, comparing the Linux scheduler with tasks in .NET.




Overall architecture of parallel programming in .NET framework


Threads sit at the very bottom of the diagram. TPL and the other components are layered on top of them in the overall picture.


Once you finish tweaking your code and build the project, the C# compiler compiles it into IL. Along the way, code such as async-await is transformed into something quite different so that your work can run concurrently. Here's a brief look at the shape an async-await method takes at the IL level. If you are comfortable reading code, you'll get a rough idea of what goes on behind the scenes of async-await and Task, and exclaim, "Oh, that's why!"




This diagram and the IL code might be a bit overwhelming for a basics article, so let's wrap this part up quickly.




TPL?

TPL, the Task Parallel Library, provides a set of APIs and types in the System.Threading and System.Threading.Tasks namespaces. It simplifies adding parallelism and concurrency to your code. It is also an abstraction over threads, and the thread pool in particular, that handles state management, cancellation, and other low-level details for you. In the early days of parallel programming you had to manipulate threads and locks directly at a lower level. Now TPL does much of that work for you.
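To make that a little more concrete, here is a minimal sketch of two TPL entry points, Parallel.For and Task.Run. The class name TplSketch and its method names are made up for illustration and are not part of the library.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class TplSketch
{
    // Sums the squares of 0..count-1, letting Parallel.For spread
    // the iterations across thread pool threads.
    public static long SumOfSquaresParallel(int count)
    {
        long total = 0;
        Parallel.For(0, count, i =>
        {
            // Several iterations may run at once, so the shared
            // total is updated atomically.
            Interlocked.Add(ref total, (long)i * i);
        });
        return total;
    }

    // Runs the same computation as a single task and hands back
    // a Task<long> that the caller can wait on.
    public static Task<long> SumOfSquaresAsync(int count)
    {
        return Task.Run(() =>
        {
            long total = 0;
            for (int i = 0; i < count; i++)
            {
                total += (long)i * i;
            }
            return total;
        });
    }
}
```

Calling TplSketch.SumOfSquaresParallel(1000) and reading TplSketch.SumOfSquaresAsync(1000).Result should give the same number. Notice that no thread is created or locked by hand in either version.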





Difference between Thread, ThreadPool, Task

Many developers, including me, get confused by these three when thinking about parallelism. I'm going to give a brief explanation of each.



Thread

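Represents an actual OS-level thread, which has its own stack and kernel resources. The Thread object in C# lets you call Abort(), Suspend(), and Resume(), though Suspend() and Resume() have long been marked obsolete, and Abort() is not supported on .NET Core and later.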
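As a rough sketch of working with a thread directly, the snippet below starts a dedicated thread and waits for it with Join(). ThreadSketch and RunOnDedicatedThread are hypothetical names chosen for this example.

```csharp
using System;
using System.Threading;

// Hypothetical helper class, for illustration only.
public static class ThreadSketch
{
    // Runs the given work on a newly created thread, blocks until it
    // finishes, and returns the managed id of the thread that ran it.
    public static int RunOnDedicatedThread(Action work)
    {
        int workerId = -1;
        var thread = new Thread(() =>
        {
            workerId = Thread.CurrentThread.ManagedThreadId;
            work();
        });
        thread.IsBackground = true; // do not keep the process alive for this thread
        thread.Start();
        thread.Join();              // wait for the thread to finish
        return workerId;
    }
}
```

The returned id differs from the caller's Thread.CurrentThread.ManagedThreadId, which shows that the work really ran on a separate thread created just for it.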


ThreadPool

It's a wrapper around a pool of threads maintained by the CLR. The thread pool gives you almost no control, apart from setting the size of the pool or offloading work through the famous API you must have heard of at least once, QueueUserWorkItem.

The advantage is that you avoid the overhead of creating too many threads. The disadvantage is that there is no built-in way to find out when a work item has finished; you have to signal completion yourself. The thread pool is ideal for fire-and-forget jobs that you don't need to care about once they are queued. If your job is long-running, dedicated work, the thread pool can be a bad choice.
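Here is a small sketch of what that looks like in practice. Because QueueUserWorkItem hands nothing back to the caller, this example uses a CountdownEvent to learn when the queued items are done. ThreadPoolSketch and QueueAndWait are hypothetical names, and a CountdownEvent is only one of several ways to signal completion.

```csharp
using System;
using System.Threading;

// Hypothetical helper class, for illustration only.
public static class ThreadPoolSketch
{
    // Queues itemCount work items on the thread pool, waits for all
    // of them, and returns how many actually ran.
    public static int QueueAndWait(int itemCount)
    {
        int completed = 0;
        using (var done = new CountdownEvent(itemCount))
        {
            for (int i = 0; i < itemCount; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Interlocked.Increment(ref completed);
                    // The pool will not tell us this item finished,
                    // so the item reports it itself.
                    done.Signal();
                });
            }
            done.Wait(); // blocks until every item has signalled
        }
        return completed;
    }
}
```

All of the bookkeeping around completion is written by hand here, which is exactly the gap that Task fills.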


Task

It comes from TPL. Just like a thread pool work item, a task doesn't create its own OS thread. It's executed by a task scheduler, and the default scheduler simply runs tasks on the thread pool. But there's a significant difference between a task and a raw thread pool work item. When you use the thread pool directly, there's pretty much nothing you can do with the work afterwards. You put a job on the thread pool's queue and you never know what's going on underneath.

A task, on the other hand, gives you more powerful APIs and state management. You can cancel a work item, wait for it, and continue with another job as soon as it's done. Returning a result is a powerful feature as well. I mentioned earlier that the default task scheduler runs on the thread pool, and that the thread pool is ideal for small, lightweight units of work but not for long-running ones. Task is well aware of this limitation and provides a creation option for it, TaskCreationOptions.LongRunning, which hints to the scheduler that the work may need its own thread instead of tying up a thread pool worker.
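The features mentioned above can be sketched roughly as follows: returning a result, chaining a continuation, cancelling, and passing the LongRunning hint. TaskSketch and its method names are hypothetical. Whether a LongRunning task really gets its own thread is up to the scheduler, so the last method only reports what happened instead of assuming it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class TaskSketch
{
    // Returns a result from one task and continues with another
    // job as soon as the first one is done.
    public static int ComputeThenDouble(int input)
    {
        Task<int> square = Task.Run(() => input * input);
        Task<int> doubled = square.ContinueWith(t => t.Result * 2);
        return doubled.Result; // blocks until both steps have finished
    }

    // Cooperative cancellation: the work checks the token and stops
    // when asked. Returns true if the task ended up cancelled.
    public static bool CancelRunningWork()
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;
            Task work = Task.Run(() =>
            {
                while (!token.IsCancellationRequested)
                {
                    Thread.Sleep(10);
                }
                token.ThrowIfCancellationRequested();
            }, token);

            cts.Cancel();
            try
            {
                work.Wait();
            }
            catch (AggregateException)
            {
                // Waiting on a cancelled task throws; the state is
                // still available on the task itself.
            }
            return work.IsCanceled;
        }
    }

    // Starts work with the LongRunning hint. The task's result tells
    // whether it ran on a thread pool thread.
    public static Task<bool> StartLongRunning()
    {
        return Task.Factory.StartNew(
            () => Thread.CurrentThread.IsThreadPoolThread,
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```

Compared with the raw thread pool, waiting, cancelling, and getting a value back all go through the Task object itself, so no extra signalling code is needed.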




Conclusion

The team at Microsoft said the following when they released the TPL in .NET Framework 4.0. This one quote pretty much explains what Task is about.

Task is now the preferred way to queue work to the thread pool.


Task has evidently become the standard way to use the thread pool. But is calling QueueUserWorkItem directly completely outdated? I think that question deserves more thought. In the next post, I'm going to dig into the details of Task: how heavy a thread actually is, why Task became preferred over the thread pool, what types of tasks exist in TPL, and so on.



















