In cloud computing environments, the highest-level elements are nodes, typically physical or virtual servers. Nodes supply the computing power needed to process any actions that take place within the cloud environment. At the next level down we have entities, or objects, in the cloud. These objects live on one or more nodes and may represent any abstract software concept: a file, a module, a class, an instance of a class, and so on. However, these objects are hard to identify in a cloud environment; often an object can be identified by nothing more than a URI that makes it unique.
Introducing a name server into a cloud environment allows objects in the cloud to be labeled. This gives the objects human-semantic meaning for any interested party: "blog/123" as opposed to "data/123". We know we are now dealing with a blog instead of some unknown piece of data, and can adjust the expected schema accordingly. This meaning isn't all that important in deployed systems that already know what type of data a given URI in the cloud refers to. However, when designing code that has even the slightest potential to move into a cloud, referring to a meaningful name as opposed to a bare URI can be extremely useful. This can be achieved by using a name server.
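To make the idea concrete, here is a minimal sketch of what a name server does, reduced to its essence: a mapping from meaningful names to opaque URIs. The class and the names registered in it are hypothetical, not part of any real framework.

```python
class NameServer:
    """Toy name server: maps human-readable names to opaque URIs."""

    def __init__(self):
        self._names = {}

    def register(self, name, uri):
        # A node announces that a meaningful name resolves to this URI.
        self._names[name] = uri

    def lookup(self, name):
        # Callers deal in names; only the name server knows the URIs.
        return self._names[name]

ns = NameServer()
ns.register("blog/123", "data/123")
print(ns.lookup("blog/123"))  # -> data/123
```

Code that asks for "blog/123" no longer cares what the underlying URI looks like, which is exactly the decoupling the name server buys us.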
But what about URIs that have been well designed and already offer meaning to the developers who use them? Do these URIs have any use for a name server? In a cloud environment they certainly do. Take the following URI: "http://127.0.0.1/blog/entry/123/". It is quite obvious that this URI refers to a specific blog entry. But which part of the URI gave us this meaning? It certainly wasn't "127.0.0.1"; it must have been "/blog/entry/123". That is meaningfulness that could easily be captured by a name server. To the world outside this hypothetical "blog cloud", the "127.0.0.1" does have meaning (obviously there would be an actual domain name in the real world). However, naming entities inside the cloud environment is what we are interested in here. In a cloud environment, nothing is certain. Nodes come and nodes go, all for different reasons, and it is in this scenario that internal cloud name servers really come in handy.
Let's assume there is now a name server installed in our hypothetical blog cloud environment. What do all our nodes containing data elements and processing power do now? That is, how do they configure themselves to use the name server? This is where the concept of presence broadcasting is introduced. Each node in the cloud needs to inform the name server that it could potentially contain a named object that some other node in the cloud may be interested in. If this were possible, nodes in the cloud could come and go as they please.
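A rough sketch of presence broadcasting, again with hypothetical names and addresses: each node announces the named objects it hosts when it joins, and withdraws them when it leaves, so the rest of the cloud always resolves names against the nodes that are currently alive.

```python
class NameServer:
    """Toy name server that tracks which node hosts each named object."""

    def __init__(self):
        self._registry = {}  # name -> node address

    def broadcast(self, node_address, names):
        # A node announces its presence and the named objects it hosts.
        for name in names:
            self._registry[name] = node_address

    def withdraw(self, node_address):
        # A departing node removes every name it registered.
        self._registry = {n: a for n, a in self._registry.items()
                          if a != node_address}

    def resolve(self, name):
        return self._registry.get(name)

ns = NameServer()
ns.broadcast("10.0.0.5", ["blog/entry/123"])
print(ns.resolve("blog/entry/123"))  # -> 10.0.0.5
ns.withdraw("10.0.0.5")
print(ns.resolve("blog/entry/123"))  # -> None
```

The point of the sketch is the lifecycle: nothing outside the name server has to be reconfigured when a node appears or disappears.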
The Pyro Python cloud computing framework supports all of the above. The concept behind Pyro is that behavior can be invoked on remote Python objects. In the cloud computing environment, this would mean that nodes can invoke behavior that is executed on other nodes. Pyro objects, which are essentially standard Python instances with a little extra Pyro declaration, can be named on the name server. This not only adds meaning to the objects in the cloud, but also maps the name to a specific node in the cloud. So, the "/blog/entry/123/" URI on the name server would map to "127.0.0.1" or some other node.
The Pyro framework also supports presence broadcasting. This means that before a Pyro object is used, the code that plans on using that object can broadcast itself to the name server. Any objects this code exposes to other nodes in the cloud are then also available in the name server. The idea of presence broadcasting in the cloud is a powerful one because it allows the cloud as a whole to grow and shrink as necessary. It gives the cloud its elasticity property. All of this could be implemented without a name server, but it would be far from seamless, even cumbersome.
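Pyro itself handles the remote plumbing, but the overall shape can be sketched with nothing more than the standard library's xmlrpc modules standing in for Pyro (all names and addresses here are hypothetical): an object is served on one "node", a toy name server maps a meaningful name to that node's location, and the caller invokes behavior through the name alone.

```python
import threading
from xmlrpc.server import SimpleXMLRPCServer
from xmlrpc.client import ServerProxy

# One "node" in the cloud, exposing a blog-entry object.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
port = server.server_address[1]
server.register_function(lambda: "entry 123", "get_entry")
threading.Thread(target=server.serve_forever, daemon=True).start()

# Toy name server: a meaningful name maps to the node that hosts it.
names = {"blog/entry/123": "http://127.0.0.1:%d/" % port}

# The caller never hard-codes a node address, only a name.
entry = ServerProxy(names["blog/entry/123"])
print(entry.get_entry())  # -> entry 123
```

If the object moved to another node, only the name server's mapping would change; the calling code would be untouched.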
Tuesday, June 9, 2009
Wednesday, June 25, 2008
Building scalable software objects.
With the computing resources available today, practicing software developers need to build architectures that can easily take advantage of those resources. Does this mean that every new application needs to be a globally distributed grid computing architecture? No. However, your code can be built with scalability in mind even when scalability is not a requirement.
The leading cloud computing (buzzword) resource at the moment is Amazon EC2. The elastic compute cloud offering from Amazon provides a consistent, stable API that allows developers to manage virtual machines on Amazon's network. This is an extremely powerful tool that provides computing on demand. You only pay for the computing resources that you need, and when you are finished with your machine, you terminate it. This essentially eliminates the need for a massive server setup in which machines sit idle most of the time.
So how does this all come together? Do you hire someone to manage these machines? Someone to launch new machines when the need arises and terminate them when they are no longer needed? This job is much better suited to the actual software that requires the resources. Ideally, the architecture should be built in such a way that it can determine when these distributed resources are needed and when they can be released.
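The decision logic the software needs is actually quite small. Here is a hedged sketch of an architecture deciding for itself when distributed resources are needed and when they can be released; the threshold and the launch/terminate callables are made up for illustration, not taken from any real provisioning API.

```python
def desired_machines(pending_jobs, jobs_per_machine=10, minimum=1):
    """How many machines the current workload justifies."""
    needed = -(-pending_jobs // jobs_per_machine)  # ceiling division
    return max(needed, minimum)

def rebalance(running, pending_jobs, launch, terminate):
    """Launch or terminate machines until supply matches demand."""
    target = desired_machines(pending_jobs)
    for _ in range(target - len(running)):
        running.append(launch())
    while len(running) > target:
        terminate(running.pop())
    return running

machines = []
rebalance(machines, pending_jobs=25,
          launch=lambda: "vm-%d" % len(machines),
          terminate=lambda vm: None)
print(len(machines))  # -> 3
```

In a real system `launch` and `terminate` would wrap the provider's API calls; the surrounding logic stays the same.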
This is simply a concept of resource management, and it is already implemented in operating systems and many applications in existence today. The main difference is that these resources may not necessarily live on the same physical hardware. When designing modern software, the potential use of remote resources should not be neglected.
Enomalism uses this approach for much of its machine control functionality. Put simply, an attempt to perform some action on a machine in Enomalism results in a simple redirect to the hypervisor, which could potentially be on some remote machine. The key word here is potentially; the same functionality is still realized if the hypervisor is on the same machine from which the request originated.
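That potentially-remote pattern can be sketched generically (none of this is Enomalism's actual code; the addresses, actions, and helper names are hypothetical): the caller asks for an action on a resource, and a dispatcher decides whether to handle it in-process or forward it to the node that owns the resource.

```python
LOCAL_NODE = "10.0.0.1"

def call_remote(node, action, resource):
    # Stand-in for a network hop; a real system would use RPC here.
    return "%s on %s handled by %s" % (action, resource, node)

def dispatch(action, resource, owner, handlers):
    """Run locally when we own the resource, otherwise redirect."""
    if owner == LOCAL_NODE:
        return handlers[action](resource)
    return call_remote(owner, action, resource)

handlers = {"reboot": lambda vm: "reboot on %s handled locally" % vm}
print(dispatch("reboot", "vm-7", LOCAL_NODE, handlers))   # local path
print(dispatch("reboot", "vm-7", "10.0.0.2", handlers))   # remote path
```

The calling code is identical in both cases, which is the whole point: components written this way scale out without being rewritten.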
This approach applies to almost any component you may build. Allowing for the potential to use remote resources is critical for scaling, even if the need is not there now. It will be someday.