Showing posts with label concurrency.
Monday, December 14, 2015
JavaScript Concurrency
I'm pleased to announce my latest book, JavaScript Concurrency, from Packt Publishing, available on Amazon. This is a unique book in that it's more than just a basic rundown of the concurrency features available to our JavaScript code. Instead, the book uses features like promises, generators, and web workers as teaching tools for thinking concurrently. There's no shortage of concurrency books out there that teach us how to think in terms of concurrency; this one is specific to JavaScript, and the aim is to show you how to write JavaScript code that's concurrent by default, instead of treating concurrency as a bolt-on capability.
Thursday, December 8, 2011
Threads And Progress Indicators
If your application is to exhibit any simultaneity, it's likely going to use threads. That's assuming you haven't already divided the work and responsibilities up into separate processes, all executing independently of one another and communicating via inter-process messaging. Multi-threading is hard enough on its own; the multi-processing route, even with a framework for building applications around this type of concurrency, adds the further problem of managing multiple processes communicating with one another.
In an ideal world, there wouldn't be any communication at all between two concurrent flows of control. Once forked, our logic would happily flow, uninterrupted by the wants and needs of others, to its final destination. Sadly, no such magic orchestration exists in the real world. It doesn't exist in the abstract software world either because, like it or not, the software we write is a reflection of what we experience, and to that degree we're limited in creating something more sophisticated.
Having said that, and recognizing the dependencies between concurrent flows of control, how can we make use of them? Is there really a means by which our seemingly independent logic can collaborate, producing useful behavior larger than the sum of the individuals?
Overall Progress
If we're setting out to write a largely asynchronous application, one that uses threads to conduct activities concurrently, the ability of the application to gauge its progress is valuable: the overall progress of a task composed of multiple, dependent threads of execution. Threads of control have to collaborate to produce anything measurable in terms of completed work. Disjoint threads, threads that are truly independent and don't concern themselves with whatever else the application is doing, don't need to produce progress indicators.
However, for those larger tasks where we have multiple threads of control all working toward the same end goal, we would like to know how far along each thread is. Is there an end in sight? Is one of our workers starved of resources and simply spinning its wheels, not contributing to the big picture? This, of course, assumes we're even able to gauge the completeness of a thread. Sometimes, by their very nature, threads are intended to be long-running, pseudo-processes within the parent if you will.
The only real type of progress we can actively measure from concurrent swim-lanes comes from those that race to the end of the pool. Imagine the swimming pool is a task the application decided to launch. Inside this pool, each swim-lane represents a thread's route to completion. Each thread in the pool, of course, is a swimmer taking part in a race. Only when all swimmers have crossed the finish line do we have a completed task.
It's during the race that the application is interested in gauging the progress of both the race as a whole and the individual swimmers. Perhaps the application itself is performing several tasks, all in different pools, all with different swimmers. Which race is more interesting to spectators? The application can't know unless individual threads are continuously supplying progress indication. But we really shouldn't be exchanging data between threads, right? Isn't the intent to isolate individual threads of control as much as possible? Maybe so, but we're not necessarily exchanging information relevant to the state of the task's data, only to the overall application.
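To make this concrete, here's a minimal sketch using Python's threading and queue modules (my own illustration, not from any particular application): worker threads push (name, fraction-complete) pairs onto a queue, and the parent thread reads them to gauge the race. Only this management state crosses the thread boundary; the problem-domain data stays inside each worker.

import threading
import queue
import time

def worker(name, steps, progress):
    # Simulate a unit of work, reporting progress after each step.
    # Only management state, never problem-domain data, is shared.
    for step in range(1, steps + 1):
        time.sleep(0.1)  # stand-in for real work
        progress.put((name, step / float(steps)))

progress = queue.Queue()
swimmers = [
    threading.Thread(target=worker, args=(name, 4, progress))
    for name in ("lane-1", "lane-2")
]
for swimmer in swimmers:
    swimmer.start()

# The parent gauges both the individual swimmers and the race.
finished = 0
while finished < len(swimmers):
    name, fraction = progress.get()
    print("%s: %d%% complete" % (name, fraction * 100))
    if fraction >= 1.0:
        finished += 1

for swimmer in swimmers:
    swimmer.join()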
Application Thread Data
Data that pertains to the overall well-being of the application, task progress for instance, isn't the same as data that pertains to the nucleus of the task itself. Something that needs to be computed for the user's benefit usually means it's part of the problem domain, a space that would ideally be encapsulated tightly by the threads that make the computation happen. This insulation layer, the one that separates application management from application problem-domain logic, is essential to sharing data between concurrent flows.
So if we're able to establish such tight boundaries, by defining what constitutes the problem space and how the tasks that solve that problem are spawned and subsequently managed by the application, we can identify data that's safe to move around. Exchanging data between threads in this way isn't such a bad design pitfall. There is a separation of concerns here: the problem and the implementation. Making sure these two are distinct in their implementation is good programming practice anyway; how we go about doing it is another matter entirely, because making these clean-cut distinctions between the different tasks that form a computational solution isn't easy, or even possible in some cases.
But you may find value in exchanging data between threads when that data serves the overall aim of performing better, or performing smarter, or some other goal that falls outside the context of the problem's solution. Of course, this requires some abstract thought, perhaps beyond what is worthwhile in the majority of software we're writing.
I chose the progress indicator as an example here because it may prove to be of practical value to the application as a whole. Code that manages tasks can make informed decisions based on the progress, a simple piece of state about what one of the active swim-lanes is doing. And that's all it comes down to, really: ensuring that you're not sharing state between threads that is part of the solution. Other information, data about the tasks themselves, might be a smart thing to pass around the thread pool.
Monday, October 5, 2009
Simple Task Management
In distributed systems, tasks are often performed in parallel with one another. In these systems, the task is an important abstraction; there are likely to be thousands or millions of task instances distributed amongst nodes at any given time. In order to achieve concurrency, it is important that these tasks be of reasonable size. Otherwise, there exist large non-interruptible regions that cannot execute in parallel with other tasks.
Another essential abstraction in a task manager design is the manager itself. Call it a task runner if that sounds better; the idea is that it is responsible for running tasks. Not just blindly running tasks, either, but maintaining order amongst all the tasks that are competing for attention. Tasks also need to be disposed of when they have completed running or are otherwise unable to run.
Implementing this type of distributed task management is hard. There are many ways to go about implementing something this complex and concurrent. My suggestion is to first design the simplest task management system conceivable, then make inferences from it. An extremely simple structure for such a system is described below.
The Task class is specialized by the Search and Sort classes, meaning that Search and Sort are types of tasks. The Runner class is associated with the Task class because, as the name suggests, it is responsible for running tasks. The Runner instance maintains a queue of tasks to run. Below is an example illustrating how a controller would create a task and ask the runner to execute it.
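The original post illustrated this structure with a class diagram and a short snippet, neither of which is reproduced here. What follows is my own minimal reconstruction in Python of the structure just described; the method names (run(), push(), start()) are assumptions, not taken from the post.

from collections import deque

class Task:
    # Base abstraction; subclasses implement run().
    def run(self):
        raise NotImplementedError

class Search(Task):
    def run(self):
        print("searching...")

class Sort(Task):
    def run(self):
        print("sorting...")

class Runner:
    # Maintains a queue of tasks and runs them in order.
    def __init__(self):
        self.queue = deque()

    def push(self, task):
        self.queue.append(task)

    def start(self):
        # Tasks are disposed of as they complete; a concurrent
        # version could pop and run these in parallel instead.
        while self.queue:
            self.queue.popleft().run()

# How a controller might use it:
runner = Runner()
runner.push(Search())
runner.push(Sort())
runner.start()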
There is room in this simple design for concurrent events that will push tasks onto the Runner task queue. The Runner instance could also execute the tasks in parallel. The idea is to get the simple design right before even considering concurrency.
Labels: architecture, concurrency, design, manager, task
Tuesday, September 29, 2009
Python Yield Interleaving
Distributed, concurrent applications seem to be the hot topic of interest in computing these days. And they should be: with an ever-increasing amount of data to process, both locally and on the web, the need to speed things up is crucial. The best way to achieve this speed is by means of concurrency, doing multiple things at the same time. But true concurrency is hard to achieve, and sometimes impossible, as is the case with a single processor. This, however, does not mean that the responsiveness of applications cannot be improved.
Applications that are logically concurrent can support both true hardware concurrency and interleaving concurrency. Interleaving is a time-sharing technique used when true concurrency is not possible due to a single processor. If interleaving were not used on single-processor machines, trivial tasks would render the system useless due to the response time. If nothing else, the single-processor architecture has shown how important interleaving is to responsiveness.
Applications written in Python can also be designed to be logically concurrent. This can be achieved both by means of threads and by yielding data. Threads in Python are an interleaving construct due to the global interpreter lock, even on multiple-processor systems. Yielding data from a function is also a form of interleaving: each element yielded from a generator hands control back to the caller, and once the caller's flow has finished with that element, control resumes in the generator at the next element to be yielded. An example of this interleaving by means of yielding is shown below.
# Example: Python yield interleaving.

# Take an iterable and return it as a list,
# even if it is already a list.
def _return(_iterable):
    result = []
    for i in _iterable:
        print("RETURNING")
        result.append(i)
    return result

# Take an iterable and yield its elements one at a time.
def _yield(_iterable):
    for i in _iterable:
        print("YIELDING")
        yield i

if __name__ == "__main__":
    # Create an iterable.
    _string_list = "A B C".split()
    # Display the serial results of returning.
    for i in _return(_string_list):
        print(i)
    # Display the interleaved results of yielding.
    for i in _yield(_string_list):
        print(i)
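Running the script shows the difference. With _return(), all three RETURNING lines appear before any element is printed; with _yield(), control bounces between the generator and the caller, so the YIELDING lines and the elements interleave:

RETURNING
RETURNING
RETURNING
A
B
C
YIELDING
A
YIELDING
B
YIELDING
C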
Labels: concurrency, generator, interleaving, python, yield
Thursday, July 9, 2009
Multiprocessing Firefox
Newer hardware systems are likely to contain either multiple physical processors or a single physical processor with multiple cores. When writing applications with these systems as the target platform, it is often a good idea to utilize more than a single process within the application. Why do multiple processes make sense within a single application? One may be inclined to think of a process as being designated for a single running application, a one-to-one cardinality if you will. On top of moving away from the comfortable idea of a single process per application that so many developers are used to, there is also the messy problem of inter-process communication. This particular problem isn't quite as bad as it is made out to be; we just need the appropriate abstractions on top of the inter-process communication functionality. Going back to the question of why this is a good idea in the first place, the chief benefit of using multiple processes within a single application is performance. While the same application concurrency logic implemented using threads will offer better responsiveness, the multiprocessing approach offers an opportunity for true concurrency on systems with multiple processors.
The Firefox web browser is currently implementing a version of the browser that incorporates multiple processes rather than having the entire application run in a single process. As discussed above, this entry provides some rationale behind why the development team decided to implement this functionality. Aside from the inherent performance gains offered by systems with multiple processors, there is increased stability. The stability is increased because independent processes decouple the entire browser architecture, providing a degree of isolation not available in a single process. The stability gains are especially apparent when considering the multiple tabs used to view disparate web pages. Security could also potentially be improved as a side effect of the isolation provided by processes. Finally, not mentioned in the entry but equally relevant in design terms, the distribution of responsibilities within the web browser system becomes very clear. It sometimes takes a drastic move, such as moving entire pieces of code to a separate process, to improve on this design principle.
In this entry, a demonstration of the browser running a separate process for the tab content shows a clear stability improvement. Even after killing the content process, the user interface process remains intact.
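Python's multiprocessing module makes it easy to demonstrate this kind of isolation. The sketch below is my own analogy, not Firefox's actual architecture: the parent stands in for the user interface process, the child for the tab content process, and killing the child leaves the parent intact.

import multiprocessing
import time

def content():
    # Stand-in for a tab's content process.
    while True:
        time.sleep(1)

if __name__ == "__main__":
    tab = multiprocessing.Process(target=content)
    tab.start()
    time.sleep(1)
    tab.terminate()  # the "content process" dies...
    tab.join()
    print("user interface still running")  # ...the parent does not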
Monday, June 29, 2009
Python CPUs
When building Python applications with any level of concurrency, it is often useful to know how many CPUs are available for instruction processing on the system. With this information, decisions about how best to utilize the system for concurrency can be made at runtime.
In Python, a common way to retrieve the number of CPUs on a given system is to read it from system configuration values, as shown in this entry. Here is the actual function taken from the entry:
import os

def detectCPUs():
    """
    Detects the number of CPUs on a system. Cribbed from pp.
    """
    # Linux, Unix, and Mac OS X:
    if hasattr(os, "sysconf"):
        if "SC_NPROCESSORS_ONLN" in os.sysconf_names:
            # Linux and Unix:
            ncpus = os.sysconf("SC_NPROCESSORS_ONLN")
            if isinstance(ncpus, int) and ncpus > 0:
                return ncpus
        else:
            # Mac OS X:
            return int(os.popen("sysctl -n hw.ncpu").read())
    # Windows:
    if "NUMBER_OF_PROCESSORS" in os.environ:
        ncpus = int(os.environ["NUMBER_OF_PROCESSORS"])
        if ncpus > 0:
            return ncpus
    return 1  # Default
Using this detectCPUs() function, developers can retrieve the number of CPUs available on any major platform. This is done by checking what is available in the system configuration and using that to determine if and where the number of CPUs is stored.

There is one very basic problem with this approach: it introduces code at the application level that is already implemented at the language level, in Python's multiprocessing module. There is a much simpler and more elegant method of retrieving the number of CPUs available on the system:
import multiprocessing

cpus = multiprocessing.cpu_count()
This method of retrieving the CPU count is superior as far as the separation of concerns principle goes. The multiprocessing module is concerned with CPU matters, such as how many of them exist. Your application is also concerned with this information, obviously, otherwise it wouldn't be using multiprocessing to begin with. However, your application is only concerned with the return value, not with how the number of CPUs is retrieved. Since multiprocessing was only introduced in Python 2.6, there is a backport on PyPI for older versions; your application setup simply needs to depend on that package in order to support them.

A use case for the number of CPUs within a Python application is maximizing concurrency efficiency. The multiprocessing and threading modules share the same interfaces for most abstractions, which means the application can decide at runtime which module is best suited for the job based on the number of available CPUs. If there is a single CPU on the system in question, the threading module might be better suited. If there are multiple processors available, then multiprocessing might be the better choice.
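Here's a minimal sketch of that runtime decision. Because the two modules share an interface, only the class we instantiate changes; the single-versus-multiple CPU threshold is just illustrative:

import multiprocessing
import threading

def make_worker(target):
    # With multiple CPUs, processes enable true concurrency;
    # with one CPU, threads avoid the extra process overhead.
    if multiprocessing.cpu_count() > 1:
        return multiprocessing.Process(target=target)
    return threading.Thread(target=target)

def work():
    print("working")

if __name__ == "__main__":
    worker = make_worker(work)
    worker.start()
    worker.join()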
Monday, June 15, 2009
Combining Multiprocessing And Threading
In Python, there are two ways to achieve concurrency within a given application: multiprocessing and threading. Concurrency, whether in a Python application or an application written in another language, often coincides with events taking place. These events can be expressed directly in code much more effectively when using an event framework. The basic need of the developer using such a framework is the ability to publish events; in turn, things happen in response to those events. What the developer most likely isn't concerned with is the concurrency semantics involved with these event handlers. The circuits Python event framework takes care of this for the developer. What is interesting is how the framework chooses the concurrency method used: multiprocessing or threading.
With the multiprocessing approach, a new system process is created for each logical thread of control. This is beneficial on systems with more than one processor because the Python global interpreter lock isn't a concern, giving the application the potential to achieve true concurrency. With the threading approach, a new system thread, otherwise known as a lightweight process, is created for each logical thread of control. For applications using this approach, the Python global interpreter lock is a factor: on systems with more than one processor, true concurrency is not possible within the application itself. The good news is that both approaches can potentially be used inside a given application. There is an independent Python module for each method, and the abstractions inside these modules share nearly identical interfaces.
The circuits Python event framework will use either the multiprocessing module or the threading module, attempting multiprocessing in preference to threading. The approach to importing the required modules and defining the concurrency abstraction is illustrated below.
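The original post included the relevant snippet from circuits, which isn't reproduced here; the approach amounts to a conditional import along these lines (a sketch of the pattern, not the framework's exact code):

try:
    from multiprocessing import Process
    HAS_MULTIPROCESSING = True
except ImportError:
    # Fall back to threads, exposed under the same name.
    from threading import Thread as Process
    HAS_MULTIPROCESSING = False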
As you can see, the core Process abstraction within circuits is declared based on what modules exist on the system. If multiprocessing is available, it is used. Otherwise, the threading module is used. The only downfall to this approach is that as long as the multiprocessing module is available, threads cannot be used. Threads may be preferable to processes in certain situations.