
Friday, January 4, 2013

Decisions About Overhead

Premature optimization is all about the decisions we programmers make on overhead, before the overhead is actually witnessed in the running system. You might call it potential overhead that spins cycles on valuable resources while other code is competing for that same resource. The decisions about how best to go about minimizing overhead in software are made while it's being designed. While writing code, we notice, staring at a particular function and the structures within, that this should be altered. There is no way that all this initialization work needs to happen here before the real objective of the function even starts. And so the decision to optimize, right then and there, is made. In the most straightforward cases, yes, it's worth the five or ten minutes of refactoring and testing it takes for the obvious improvements. But even the simple ones amount to a lot of decision making without the relevant data.

What URIs Say About Resources

The web today is made up of boundless resources, each identified by its URI. I favor the term URI over URL because it denotes identity. So what information can we gather from the URI alone? Are we better off calling it a URL since the main purpose is to do a lookup? I don't think so, because the fact that URIs are used to look something up is implied knowledge; it's the identity of the resource we want to learn about. But can we attain this type of information from the URI alone, or is it a meaningless question? URIs should be designed to give the reader foreknowledge of what the resource is.

Uniqueness
URIs are unique. That is, there is a one-to-one mapping between a URI and the resource it points to. There are, of course, exceptions to this rule. For example, you might have a radio station web application that displays the currently playing artist. During the artist's air time, the application might have an artist/current URI that points to the artist's detail page. Alternatively, there might be a single canonical URI associated with the artist's page: artist/123, for instance.

So in the case of the former — where the artist can have two URIs pointing to their page — there is no one-to-one mapping of URI to resource. There might even be more than two URIs pointing to the artist's page — charts/top, for instance. But these URIs are unique in that they're referring to one resource. The underlying resource might change — the artist/current URI stays the same but the resource it points to will change frequently.

The artist/current URI is an example of a virtual resource. To the external agent, this appears to be where the resource lives. But this isn't where the resource lives; it isn't its canonical URI. The URI artist/123 is static and probably will never change. The virtual URI points to the canonical URI.

To better illustrate this concept, let's talk in hockey terms. Imagine the center for the home team. He is number 15. So his canonical resource URI looks like home/15. Now imagine that you want a URI for the home team player currently in possession of the puck. Our star center has the puck — so we can represent this as a virtual URI — puck/control. There isn't anything special about this URI — it just contains some logic that points to the home/15 URI.
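As a rough sketch of the hockey example, a virtual URI can be modelled as nothing more than a pointer to a canonical one. The table names, the second player, and the resolve and fetch helpers are my own assumptions for illustration; a real application would put this logic in its routing layer.

```python
# Hypothetical data: canonical URIs are the only ones that hold real data.
CANONICAL = {
    "home/15": {"position": "center", "number": 15},
    "home/22": {"position": "winger", "number": 22},
}

# Virtual URIs hold no data of their own, only a pointer that can change.
VIRTUAL = {"puck/control": "home/15"}


def resolve(uri):
    """Return the canonical URI that the given URI currently points to."""
    return VIRTUAL.get(uri, uri)


def fetch(uri):
    """Look up the resource behind a canonical or a virtual URI."""
    return CANONICAL[resolve(uri)]


if __name__ == "__main__":
    print(fetch("puck/control"))         # the center has the puck
    VIRTUAL["puck/control"] = "home/22"  # the puck changes hands
    print(fetch("puck/control"))         # same URI, different resource
```

The virtual URI never changes, while the canonical URI it resolves to does.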

Meaning
So it turns out that URIs carry some important information after all. And this is what I'm trying to figure out. Exactly how much information is of value to the reader of the URI? In theory, every URI on the web could be some arbitrary string — CD4F2ACF4, for example. It wouldn't matter because information is properly linked to other information. The readers don't care what the URI is — they only care about the anchor text.

I think this might have some degree of truth behind it, but the reality is that people do care about the URI and what it looks like. I know I do. In fact, before clicking on links, I find myself hovering over them to see where they go, trying to examine the URI to guess its worth before I go there. Mind you, I take a very active interest in URI design, so I doubt every single user will scrutinize, or even care for that matter, what a URI looks like.

But it turns out that even the most arbitrary URIs make subtle attempts to attach meaning. Consider our earlier URI — artist/123. What do we know about this page before visiting it? Even if you're a lay user — you're probably able to guess that it has something to do with an artist. We achieve two things with this URI — the vocabulary and the multiplicity.

The vocabulary establishes the kind of thing users can expect to see should they choose to follow the link — in this case, an artist. The multiplicity is established in two ways. First, we're explicitly choosing the term artist, not artists. Second, the reader can see the trailing identifier — 123.

So the most meaningful piece of information in this URI is artist. The arbitrary part, arbitrary from the reader's perspective, is both meaningless and important at the same time. The arbitrary identifier assigned to the resource is an important part of the URI — it's what makes it canonical. The number itself has no meaning to the user but it has utility in sharing that URI with others.
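To make the vocabulary and multiplicity idea concrete, here is a small sketch that pulls both out of a path like artist/123. The describe function and the keys it returns are hypothetical names of mine, and it assumes the simple two-segment layout used in this post.

```python
def describe(uri):
    """Split a URI path such as 'artist/123' into its readable parts."""
    vocabulary, _, identifier = uri.strip("/").partition("/")
    return {
        # The human-readable term: what kind of thing lives here.
        "vocabulary": vocabulary,
        # The arbitrary part: meaningless to read, useful to share.
        "identifier": identifier or None,
        # A trailing identifier hints at one resource, not a collection.
        "single": bool(identifier),
    }


if __name__ == "__main__":
    print(describe("artist/123"))
    print(describe("/artists/"))
```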

Evolution
It turns out that URI design has evolved quite a lot since the emergence of the web. We've seen a lot of resources — immeasurable resources — created over the years. This directly impacts our ability to create meaningful URIs for users. If it were simply a matter of incrementing the resource count once a new resource is created, we'd be all set. Unfortunately, that isn't true at all. There are new types of resources that need to be created as applications and organizations evolve. These new resource types are going to form an ever more complex mesh of relationships — links to other resources both new and old.

These new resource types — once invented to help solve the technological problems of the day — will also need virtual resources. The virtual resources are the logic of the web — they're not real data, just pointers to other canonical resources that store the real information that external agents update and use.

Keeping URIs meaningful for users is important as available information continues to expand. If we succumb to churning out completely arbitrary URIs, we're taking a step backward. Likewise, the URI itself is real data that needs to be shared and passed around, so we must be careful to add meaning, but not too much.

Thursday, September 8, 2011

Themes: The New Web Resource

Web resources are an integral part of web applications. So what exactly is a web resource? A web resource, generally speaking, is anything returned by an HTTP request. For example, HTML pages are the quintessential web resource. But there are other resource types too — images, CSS style sheets, etc. The browser depends on these resources to properly render the page for the user. So in this sense, what the user sees is actually a decoupled collection of smaller resources.

A few resources are more prevalent than others: HTML, CSS, JavaScript. But the fact of the matter is that anything can be a resource, meaning any data a remote agent solicits or modifies. There is no limit on what we can serve and make available to clients. There is no predefined set of resource types.

With the ever-growing complexity of applications and the seemingly limitless dependencies required to make them work, introducing new composite resource types is a major step toward a better web. Themes, like web pages, are a composite type of resource, one that will become ever more germane to web applications over the next couple of years.

Why resources?
So why do we call the things we retrieve with HTTP requests resources? Why not call them web objects or files? The term resource has more to do with the client software than the actual resource itself. The term resource reflects the fact that what the client is fetching is something it needs to function correctly. A dependency of sorts. For example, your television makes use of a digital content stream resource.

The term resource is especially useful when we're describing stuff that has to do with the web. Things that live on the web. The web is called the web because it's interconnected. The nodes in this graph are the resources.

When we're talking about a particular site, we're more inclined to refer to resources by their type — the page, the image, the flash, the video. But when we're talking on a pure HTTP web-as-hypermedia level, we're better with the ambiguity resources suggest. If some application has an API that lists resources, and an API that retrieves a specific resource, we can design the client around those concepts. Resources are an architectural design idiom while specific resource types describe specific resource attributes.

Themes are abstract
Up until now, we've only touched upon the most general ideas of what makes a resource on the web. Put simply, it's any web-accessible data — you can retrieve it by issuing a GET request. But the client must know about the resource's URI — where is it? Where does the client get such information? From other resources that know about it — this is the interconnectedness that makes the web so powerful.

Now that we have this resource, how does the client know what to do with it? The web browsers people use to surf the web on a daily basis know how to interpret dozens of resource types. In HTTP terms, the resource type is the content type, usually specified in the response headers. If the client knows what the resource contains, that is, its structure, it can make use of it.
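A minimal sketch of that idea: the client picks a parser based on the content type and gives up when it doesn't recognize one. The HANDLERS table and the interpret function are assumptions of mine, and real browsers do far more than this.

```python
import csv
import io
import json


def parse_csv(body):
    """Turn CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(body)))


# Hypothetical registry: content type -> function that understands it.
HANDLERS = {
    "application/json": json.loads,
    "text/csv": parse_csv,
}


def interpret(content_type, body):
    """Parse a response body according to its declared content type."""
    # Drop parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    try:
        handler = HANDLERS[media_type]
    except KeyError:
        raise ValueError("no handler for %s" % media_type) from None
    return handler(body)


if __name__ == "__main__":
    print(interpret("application/json; charset=utf-8", '{"name": "theme"}'))
    print(interpret("text/csv", "id,name\n1,theme\n"))
```

Without an entry in the table, the client is left guessing at the structure of the data.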

Imagine getting a CSV row with twenty columns in it. Suppose you hadn't received any schema for this data. You might be able to make sense of some things in the data by deriving their meaning, but you're essentially playing a guessing game. Without a standard (schema), making use of web resources is a lost cause. Web standards dictate a common set of rules so that browsers know how to display information. This is how the browser knows that a CSS web resource is used to alter the appearance of another: the HTML page.

The HTML page and the CSS stylesheet are so commonplace that we hardly think of them as abstract anymore — but they are. They're a simplified set of instructions — more abstract than saying "this group of pixels should be positioned here and should be coloured blue". Humans know how to make sense of higher-level instructions that hide the low-level details.

But even with high-level languages like HTML and CSS, languages that hide the messy stuff, we're still able to build applications that are incredibly difficult to comprehend, let alone maintain. Web sites and web applications are getting bigger and bigger, and there is a cost to that volume. Complexity means longer development time and/or more development resources.

Themes are the next level up on the abstraction ladder. A theme is simply the appearance of the site — like how desktop themes let users change the appearance of their local environment. Themes are more abstract than CSS styles because of the standards imposed on them — the same style rules are found in each theme — only the properties change.

Nothing new, just improved
You might be wondering what exactly sets themes apart from any other typical web resource we're used to. Like CSS style sheets. That is what themes are after all — a collection of styles and images. Both of these things we're used to seeing in every web page we visit.

But unlike your typical CSS that we're used to, themes carry a standardized interface HTML pages can utilize. The jQuery UI project really pioneered themes — the new resource type that we've come to expect. This is how we're able to achieve a consistent look and feel across all our applications. The standard theme interface allows this because the applications that use them must adhere to the theme styles.

Themes simply tie together older, more familiar resources that we're used to. They're more abstract, enabling us to build better user interfaces for the web. So are they justified in being referred to as a new resource type? I think so.

Sunday, November 28, 2010

Internal Quotas

Why do quotas exist in software? Their role is to curtail overages consumed by a program. We could let the hardware decide how to handle these situations, but users probably wouldn't appreciate that much. How are quotas measured and how do we decide on an upper limit that regulates the consumption of one resource or another? The easy way to decide on an upper quota limit is to measure the physical hardware limits. Portions of the physical resource are made available to software that needs it. Allocating quota limits gets interesting when several competing software units are in contention for it. Abstract software objects that need memory, a thread that needs CPU time, or an entire program that needs both - these are all things that need resources and also need quotas.

Think about programs running on your operating system. The OS decides who gets to use the CPU at any given time. If there is more than one CPU, the same logic applies - it is also up to the OS to decide which programs get to use each CPU. This is an example of a software quota. Once the program starts executing instructions, it only has so long before it must relinquish control of the CPU. This quota, the number of instructions executed by the program during its CPU occupancy, is established by the OS. The best a program can do is manage its own internal quotas.

What types of resources would a program want to restrain? If the underlying operating system manages physical hardware access, why bother? In dynamic programming languages, we don't need to explicitly allocate memory or tell the threading mechanism to switch contexts. There is more to quota management in software than just the physical hardware limitations.

Software is created with a problem domain in mind. In object-oriented software, different classes represent different concepts in the problem domain. Can we use these concepts as a basis to assign quota limits? For instance, if I've got two containers, one that stores user objects and one that stores document objects, I can limit these as I deem fit. I could give the user object container a quota that limits it to having no more than 100 objects at a given time. I could limit the document container to having only 10. I just made these limits up. The point is that you have the ability to do this based on the business requirements of your software, not necessarily having to occupy all physical resources until the OS says your quota is exhausted.
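Here is one way the two containers might look, as a sketch only. The BoundedContainer and QuotaExceeded names are made up for this example, and the limits are the same arbitrary 100 and 10 from above.

```python
class QuotaExceeded(Exception):
    """Raised when a container is already at its limit."""


class BoundedContainer:
    """A list-like container that refuses to grow past its quota."""

    def __init__(self, quota):
        self.quota = quota
        self._items = []

    def add(self, item):
        # Enforce the domain-level quota before touching memory at all.
        if len(self._items) >= self.quota:
            raise QuotaExceeded("limit of %d reached" % self.quota)
        self._items.append(item)

    def __len__(self):
        return len(self._items)


# Limits chosen by business requirements, not by hardware.
users = BoundedContainer(quota=100)
documents = BoundedContainer(quota=10)

if __name__ == "__main__":
    for number in range(10):
        documents.add("document %d" % number)
    try:
        documents.add("one too many")
    except QuotaExceeded as error:
        print("rejected:", error)
```

The program hits its own limit long before the OS has any reason to step in.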

The same idea applies to processor time within your program. If your application has several threads, you are in control of what domain concept gets to use the CPU, rather than leaving it up to the OS, which knows nothing about your problem domain. Understanding the domain concepts in your code and the ability to assign quotas to them gives you much more control and flexibility than you may have thought possible.

Tuesday, April 27, 2010

Passing URIs

Uniform Resource Identifiers (URIs) are what enable us to find things on the web. These things refer to a resource, uniformly identified by a string. A resource is any digital media that has been made available on the Internet. At a higher level, search engines are what allow us to find resources that live somewhere in the web. Without a URI, there would be nothing useful for the search engines to display in their results. Additionally, it would be impossible for search engines to crawl websites without the URIs that make up the structure of the site.

APIs can be built with a set of URIs as well. These URI-centric APIs are sometimes referred to as RESTful APIs. RESTful APIs have a close association with the HTTP protocol. Because of this we can pass parameters to resources through GET or POST requests made by a client. But these are often primitive types that can be represented as a string. For instance, if I'm using some API to update my user profile, a numeric user ID might be a POST parameter I need to send. This is necessary so the application knows which user to update. But what if I were able to pass an entire URI as the identifier for the resource I want to update? Does that even make sense? Well, let's first think about how applications identify resources internally.

The most common way for a web application to identify a resource internally is by a primary key in a database table. This key is typically an integer value that is automatically incremented every time a new record is inserted. This key is unique for each record in that table. This makes the primary key of a database table an ideal candidate for using as part of a URI. You'll often see integers as part of a URI, for instance "/user/4352/". There is a good chance that the number is a unique primary key in the database. This uniqueness maps well to URIs because every URI should be unique in one way or another.

One potential problem with using primary database keys in URIs is that different records in different database tables may share the same key. This doesn't necessarily weaken the uniqueness of the URI because it is still referring to a different type of resource. Consider two database records in two different database tables. These records both have the same integer primary key value, 5. The URIs for these two resources are still unique because they are referring to two entirely different concepts. So the first URI might be "/user/5/" and the second URI might be "/group/5/". But what if you don't care about the resource type?

A canonical URI might be composed of a UUID instead of the primary key of a database table. UUIDs themselves are unique and may refer to any resource. That is, a UUID doesn't need a context in order to be unique. If our above two URIs were to use UUIDs, they might look something like "/user/cadb1d94-5305-11df-98a5-001a929face2/" and "/group/d8eee85c-5305-11df-8d08-001a929face2". As you can see, we really don't need "user" or "group" as part of the URI. We could refer to a resource canonically with something like "/data/cadb1d94-5305-11df-98a5-001a929face2/". This could refer to either a user or a group. This can be both flexible and dangerous.
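A short sketch of building and parsing such context-free URIs with Python's standard uuid module. The "/data/" prefix comes from the example above; the function names are mine, and I'm using random (version 4) UUIDs here as an assumption, whereas the UUIDs shown above look time-based.

```python
import uuid


def canonical_uri(resource_id=None):
    """Build a context-free URI; a fresh random UUID is used if none is given."""
    resource_id = resource_id or uuid.uuid4()
    return "/data/%s/" % resource_id


def parse_canonical_uri(uri):
    """Extract the UUID from a '/data/<uuid>/' URI, or raise ValueError."""
    prefix, raw_id = uri.strip("/").split("/")
    if prefix != "data":
        raise ValueError("not a canonical URI: %s" % uri)
    # uuid.UUID raises ValueError itself if the string is malformed.
    return uuid.UUID(raw_id)


if __name__ == "__main__":
    uri = canonical_uri()
    print(uri)
    print(parse_canonical_uri(uri))
```

Nothing in the URI says whether the resource is a user or a group, which is exactly the flexibility and the danger.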

Having a canonical URI based on a UUID can be flexible because the client requesting the resource doesn't need to know the context. The client might have acquired this URI and has no idea what exactly it is a representation of. Even with just the UUID, the client now has the ability to discover the type of resource this URI is pointing to based on the data it returns. This can also be dangerous for exactly the same reason. If a client doesn't know how to handle the data returned by a canonical URI, chances of the client malfunctioning are higher. The data representations returned by URI resources are a lot like interfaces; different data types can still provide the same interfaces by having a subset of common keys.

The location part of a URI might also be useful for passing as parameters to web applications. Until now, I've only been talking about the path in which the server must look for the resource. But this is making the assumption that the resource in question still lives on the same server. By only passing primary database keys or UUIDs as parameters, we leave the location aspect out of the equation. It might be more flexible to pass a full URI as a parameter, even if the URI location is pointing to the same location at which the request arrived. It really isn't a big deal for a server to check when processing requests. If the resource lives here, we simply dissect the URI and process as usual. Otherwise, we forward the request to the appropriate location. I realize I'm oversimplifying this a little too much, but the idea is to think about passing whole URIs as parameters, not so much the details of how we should go about implementing a full-fledged distributed computing platform.
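The check itself could be as small as the following sketch. The LOCAL_HOSTS set and the api.example.com host are placeholders I've invented, and the actual forwarding is left out on purpose.

```python
from urllib.parse import urlsplit

# Hypothetical: the host names this server answers to.
LOCAL_HOSTS = {"api.example.com"}


def route(resource_uri):
    """Decide whether a URI parameter refers to a local or a remote resource."""
    parts = urlsplit(resource_uri)
    # A bare path has no location, so it can only mean "here".
    if not parts.netloc or parts.hostname in LOCAL_HOSTS:
        return ("local", parts.path)
    # Anything else would be forwarded to wherever the resource lives.
    return ("forward", resource_uri)


if __name__ == "__main__":
    print(route("/user/5/"))
    print(route("http://api.example.com/user/5/"))
    print(route("http://other.example.org/group/5/"))
```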

So remember that canonical URIs composed of UUIDs can be useful when treated with care. If context is important, don't use them. Stick to using primary database keys if it helps keep things simple. Try experimenting with a simple web application that will accept a full URI instead of an ID string of some sort. A flexible solution might even accept either or.

Thursday, March 18, 2010

jQuery API Attributes

jQuery is an excellent JavaScript toolkit for interacting with server APIs, especially RESTful, resource-oriented APIs. Each resource returned from such an API generally has a unique ID associated with it. This could be a database primary key or a UUID. Regardless, it is used to uniquely identify the resource so it may be referred to by a URI such as /resource/3495/.

jQuery web applications often build lists of user interface elements from resource lists. For example, /resource/list/ might return a list of resources in the form of (id, name). Once the jQuery callback has this list of id-name pairs, it can build an HTML list. The question is, how should the resource ID be stored in the user interface so that it can be used again as part of a resource URI if the user clicks a list element?

One solution is to store the ID directly in the DOM element when it is created. The benefit here is that the URI can be constructed from the click event. The event object itself has a currentTarget attribute which is our list element. Let's say we stored a uuid attribute as part of the list element. Inside the click event handler, we could do something like jQuery(event.currentTarget).attr("uuid"). This is all we need to build a URI for this specific resource.

Tuesday, November 3, 2009

File System Resources

RESTful web services often employ the concept of resources. When reading about RESTful web services, you will often hear the term resource or resource-oriented. This is because a key principle of a RESTful system is that of the URI. The uniform resource identifier is used to point to some resource, as the name suggests.

The concept of a uniform resource identifier says nothing about the context in which it is used. That is, a URI can point to a resource on the web, or it can point to a resource locally on the file system. When using a URI on the local file system, the URIs will only be unique within the local context. For instance, the URI file:///home/ probably isn't unique within the context of the web but would most surely be unique within the local system.

There are two types of resources we are interested in when constructing RESTful applications. There are remote resources that live on the web. And there are local resources that exist within the local file system. These two resource types really aren't all that different. The obvious difference, of course, is the context in which the resource is considered unique. The other difference sits at a level lower than that of a RESTful design: how the actual IO functionality is implemented. For instance, you can't perform read operations on a remote resource by invoking traditional file system functionality. The same is also true for write operations on remote resources.

One of the similarities between remote resources and local resources is the URI. The URI differs only slightly between the remote resource, typically using HTTP as the protocol, and the local resource which uses a file IO protocol.

Illustrated below is a simple class hierarchy that models a flexible resource. It is flexible in the sense that instantiated resources can be either remote or local in the application.

Here, the base class is Protocol. Inheriting from this class is the base Resource class, with the children resources, File and HTTP. The classes are purposefully incomplete in definition because this hierarchy allows for many implementation variations. The Protocol class is high level and probably serves as an interface. The reason we want to define the Protocol class in the first place is that in this context, where resources may not be using the same protocol, resources may be considered a protocol type.

The Resource class is what should define the higher-level resource functionality. This is where the uniform methods that should be functional for any resource type should be defined. These could map closely to HTTP methods or to some other consistent interface. The File and HTTP classes provide the lower-level implementations that are invoked by the Resource interface. This enables an application to use resource abstractions, both local and remote, with no regard for context, as the behavior can be invoked in a polymorphic way.
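One possible reading of that hierarchy in Python, offered as a sketch rather than the definitive design. The class names come from the description above; the get method, the _read hook, the SCHEMES table and the open_resource factory are my own assumptions, and only reading is covered.

```python
from urllib.parse import urlsplit
from urllib.request import url2pathname, urlopen


class Protocol:
    """Top-level interface: anything that is addressed by a URI."""

    def __init__(self, uri):
        self.uri = uri


class Resource(Protocol):
    """Uniform operations that work for any kind of resource."""

    def get(self):
        # Callers use get(); subclasses decide how the bytes are read.
        return self._read()

    def _read(self):
        raise NotImplementedError("_read()")


class File(Resource):
    """A local resource, read through the file system."""

    def _read(self):
        path = url2pathname(urlsplit(self.uri).path)
        with open(path, "rb") as handle:
            return handle.read()


class HTTP(Resource):
    """A remote resource, read with an HTTP GET request."""

    def _read(self):
        with urlopen(self.uri) as response:
            return response.read()


# Hypothetical lookup from URI scheme to implementation class.
SCHEMES = {"file": File, "http": HTTP, "https": HTTP}


def open_resource(uri):
    """Pick the right Resource subclass from the URI scheme."""
    scheme = urlsplit(uri).scheme
    if scheme not in SCHEMES:
        raise ValueError("unsupported scheme: %r" % scheme)
    return SCHEMES[scheme](uri)
```

Code that calls open_resource(uri).get() doesn't need to know whether the bytes came from disk or from the network.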

Tuesday, July 28, 2009

RESTful Python Objects.

Designing RESTful resources that behave like the web in this day and age makes for good API design practice. The resulting API has a very high level of simplicity that is sought after and valued by many developers. However, what about the client end? Can they too benefit from this elegant design? They sure can. Just like anything else in software design, APIs can be abused or they can be used as intended. So, why not make the client work similarly to how the resources themselves behave? Better yet, why not make them identical?

This can be achieved by mapping individual resources to Python instances. This makes for a good abstraction mapping. One resource, one Python instance. But this doesn't really help the Python developer if there are "special" methods they need to invoke on the instance just to interact with the actual resource. This instance acts as a proxy and so both the instance data and the instance behavior should be the same as the resource. This can be done by using the following principles:

The constructor could issue a POST HTTP request to create a new resource using the constructor parameters.
The attribute retrieval could be overridden to issue an HTTP GET request.
The attribute setting could be overridden to issue an HTTP PUT request.
The object deletion functionality could be overridden to issue an HTTP DELETE request.

That's it, the instance can then behave like a regular Python instance and be a RESTful resource at the same time. Mind you, these are just principles and not an ideal implementation, obviously. So, what is needed is an HTTP library of some kind to fully implement each of these methods. There will no doubt be variations to these methods as well. For instance, there is often the requirement of retrieving lists of resources as opposed to a single resource.

The following is a simple example illustrating these principles.

#Example; RESTful Python instances.

class RESTful(object):
    def __init__(self, uri, **kw):
        # Issue an HTTP POST request to construct a new resource.
        print("POSTING...")

    def __getattr__(self, name):
        # Issue an HTTP GET, possibly a conditional GET, to
        # retrieve the resource attribute.
        print("GETTING...")

    def __setattr__(self, name, value):
        # Issue an HTTP PUT request to update the resource
        # attribute.
        print("PUTTING...")

    def __del__(self):
        # Issue an HTTP DELETE request to destroy the resource.
        print("DELETING...")

class BlogEntry(RESTful):
    def __init__(self, uri, **kw):
        RESTful.__init__(self, uri, **kw)

if __name__ == "__main__":
    entry = BlogEntry("/blog", title="Hello", body="World")
    entry.body = "There"
    body = entry.body

Friday, May 8, 2009

Restish Resources In Python.

RESTful resource design is not only a common theme in the Python web application framework world, but in countless other languages and frameworks. However, Python has the advantage of rapid development. Not just for the developers who use the frameworks in question, but also the developers creating these frameworks. The lessons learned from previous framework implementations have a very rapid turnaround. The restish Python web application framework is an example of just this. The package started out as the Pylons web application framework. However, the developers of restish soon realized that Pylons had shortcomings for what they were trying to achieve. They then started to remove the problematic components from Pylons and what they eventually ended up with was a web application framework that was stripped down significantly. Just like Pylons, the restish package relies on the paster utility to create and to execute restish projects. As the name of the project indicates, the framework is best suited for designing RESTful resources.

With restish, there is more than one way to define a resource, as is commonly the case with any software component. The best design is to represent a resource by building a class for that resource. Luckily, this is a no-brainer with restish. As a developer, all that needs to be built for a resource is a class that extends resource.Resource. The methods defined by this class are then responsible for returning the appropriate content. These resource methods are then decorated to indicate which HTTP method they correspond with. This is incredibly powerful and insightful; methods corresponding to methods. That is not all the decorated resource methods are capable of, but I won't dig any further here. Resources can also be nested. The parent controller simply returns an instance of the child controller.

The abstractions provided by the restish package are very valuable to developers who want to design RESTful resources. This is a common design requirement that is elegantly implemented here. The sheer simplicity involved puts a smile on my face.

Thursday, March 19, 2009

The need for a REST interface in object-oriented applications

REST is a set of design criteria used for designing web-centric architectures. Much of the HTTP protocol incorporates ideas found in REST such as being connectible, resources, and a uniform interface. This uniform interface consists of methods that can operate on resources such as GET, POST, PUT, and DELETE. These are the most common methods employed by RESTful applications. The idea of resources states that each resource within a system is uniquely addressable. In fact, this is also part of the uniform interface found in RESTful designs. Many web clients, other than the web browser, use SOAP as the message transformation framework. However, SOAP is not as flexible as a RESTful design and yet there exist many clients and client libraries, in several languages, for SOAP services. There are also RESTful clients and client libraries, although not nearly as many. By the very nature of a RESTful design, objects in an object-oriented system map well to resources of a RESTful architecture. Perhaps developers should keep this in mind and have classes provide a RESTful interface.

What would a RESTful object-oriented interface look like? That is, what would the methods and attributes be? The first step to implementing a REST interface would be to define methods that map to the HTTP methods. For example, consider the following.

#Example; A RESTful Python interface.

class REST(object):
   def GET(self):
       raise NotImplementedError("GET()")
  
   def POST(self):
       raise NotImplementedError("POST()")
  
   def PUT(self):
       raise NotImplementedError("PUT()")
  
   def DELETE(self):
       raise NotImplementedError("DELETE()")

Here, we have a Python class called REST. This class defines the GET(), POST(), PUT(), and DELETE() HTTP methods. Each method, when invoked, will raise a NotImplementedError exception because this class is meant to be an interface. For a class to provide this interface, it would inherit from this class and redefine all methods, providing an implementation. What about attributes? If an instance of REST were to act as a proxy to some RESTful resource, it would need to know its URI. So uri would be a good candidate for an interface attribute. There are many other meta-data attributes associated with the HTTP protocol that we aren't concerned with here. What we want to highlight is the REST interface developers could potentially use when designing objects. On the topic of attributes, another question springs to mind. What about resource attributes? If all we know about a particular resource is the methods it supports and its uri, how can we represent the resource in the context of an object-oriented system? This would most likely be another interface that we would use in conjunction with the REST interface, used to interpret the representation of the remote resource. An alternative is to use WADL to define what resources should look like. However, WADL is too much like SOAP. The rigidity involved defeats the purpose of a RESTful architecture.

The REST interface discussed so far is really only useful as a proxy to a remote resource. That is, the object we are designing that provides this interface would use this interface to make an HTTP request to the HTTP server providing the resource. An analog would be the web browser application providing the REST interface and invoking the GET() method to retrieve a web page.
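To make the proxy idea a little more concrete, here is a minimal sketch, assuming Python 3 and the standard urllib.request module. The RemoteResource class, its opener parameter, and the body arguments are my own hypothetical additions for illustration; they are not part of the interface above.

```python
import urllib.request


class REST(object):
    """The interface from the example above, repeated so this sketch runs alone."""
    def GET(self):
        raise NotImplementedError("GET()")

    def POST(self):
        raise NotImplementedError("POST()")

    def PUT(self):
        raise NotImplementedError("PUT()")

    def DELETE(self):
        raise NotImplementedError("DELETE()")


class RemoteResource(REST):
    """Hypothetical proxy: each REST method becomes an HTTP request to self.uri."""

    def __init__(self, uri, opener=urllib.request.urlopen):
        self.uri = uri
        # The opener is injectable so the proxy can be exercised without a network.
        self._opener = opener

    def _send(self, method, body=None):
        request = urllib.request.Request(self.uri, data=body, method=method)
        with self._opener(request) as response:
            return response.read()

    def GET(self):
        return self._send("GET")

    def POST(self, body=b""):
        return self._send("POST", body)

    def PUT(self, body=b""):
        return self._send("PUT", body)

    def DELETE(self):
        return self._send("DELETE")
```

With something like this, RemoteResource("http://example.com/artist/123").GET() would play the role of the browser fetching a page; the example.com URI is only a placeholder.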

The "REST" interface could also be the resource itself. If the developer is designing an object-oriented HTTP web application, they could design objects within that system, exposed to the web, that provide the REST interface. The method information is always encoded in the HTTP request; otherwise it wouldn't be HTTP. If the base HTTP server forwards this request to an object that provides this interface, that object will always know what to do with the request. This same object can also act as a proxy and so on, forming a chain of RESTful resources.
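A rough sketch of that server-side forwarding might look like the following. The dispatch() function, the Entry and Forwarder classes, and the choice of status codes are all assumptions of mine, not something a particular server prescribes.

```python
class REST(object):
    """The interface from the example above, repeated so this sketch runs alone."""
    def GET(self):
        raise NotImplementedError("GET()")

    def POST(self):
        raise NotImplementedError("POST()")

    def PUT(self):
        raise NotImplementedError("PUT()")

    def DELETE(self):
        raise NotImplementedError("DELETE()")


class Entry(REST):
    """Hypothetical resource that only supports GET."""
    def __init__(self, body):
        self.body = body

    def GET(self):
        return self.body


class Forwarder(REST):
    """Hypothetical link in a chain: passes GET on to the next resource."""
    def __init__(self, target):
        self.target = target

    def GET(self):
        return self.target.GET()


SUPPORTED = ("GET", "POST", "PUT", "DELETE")


def dispatch(resource, method):
    """Forward an HTTP method name to the resource; returns (status, body)."""
    if method not in SUPPORTED:
        return 501, ""
    try:
        return 200, getattr(resource, method)()
    except NotImplementedError:
        # The resource provides the interface but not this method.
        return 405, ""
```

Because every object in the chain provides the same four methods, the server never needs to know whether it is talking to the resource itself or to a proxy for it.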

However, as with all distributed computing, this chain of resources poses a design challenge. How does the system manage new resource locations? If the system is to scale at all, it will need to. However, this problem will come down the road. Right now, the problem is the RESTful implementation at the design level in object-oriented systems. With a RESTful interface, these problems would be much easier to solve.

Tuesday, February 24, 2009

Optimistic provisioning in the cloud

One of the technological problems that cloud computing technologies are supposed to solve is the lack of computing power when it is needed. Computing on demand, so to speak. The elasticity of the cloud enables this.

The classic example of this is a web site that gets "slashdotted" and does not have the computing resources required to fulfil the requests. Your site dies, and readers (soon to be ex-readers) will be disappointed. Luckily, your site is running in a cloud environment and has the ability to "expand" its computing capacity when demand requires it.

What happens when the actual expansion takes place? Generally, a new virtual machine is created and that machine's resources are now available to the process that requires them. The process in this context refers to the overall business requirement that caused the expansion event in the first place. The process that says "give me more computing power" may in fact result from a general discussion amongst several nodes in the cloud.

Here, we have a simple controlling process that handles requests. These may be client requests or requests from other nodes in the same cloud. The controlling process then forwards the request to a resource management process. It is the responsibility of the resource manager to ensure that computing resources are available to fulfil every request. This is where the bottleneck lies, in has_resources(). In the most common case, there are plenty of resources available and has_resources() has very little work to do. However, when resources start to dry up, it needs to make more resources. This is where the costly work of the resource manager lives. It would be great if there were some way to know ahead of time what the peak resource demand will be.
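As a minimal sketch of that arrangement, assuming capacity can be counted in abstract units: only the name has_resources() comes from the description above. The ResourceManager class, handle_request(), and the unit_size and max_capacity parameters are hypothetical, and _provision_vm() merely stands in for creating a real virtual machine.

```python
class ResourceManager:
    """Hypothetical resource manager; capacity is counted in abstract units."""

    def __init__(self, capacity, unit_size=4, max_capacity=64):
        self.capacity = capacity          # units currently provisioned
        self.in_use = 0                   # units handed out to requests
        self.unit_size = unit_size        # units gained per new virtual machine
        self.max_capacity = max_capacity  # hard ceiling on provisioning
        self.provisioned_vms = 0

    def has_resources(self, needed):
        # Common case: the loop body never runs and this is one comparison.
        while self.in_use + needed > self.capacity:
            if self.capacity + self.unit_size > self.max_capacity:
                return False
            # Costly case: resources have dried up, so make more.
            self._provision_vm()
        return True

    def _provision_vm(self):
        # Stand-in for the expensive work of creating a virtual machine.
        self.capacity += self.unit_size
        self.provisioned_vms += 1


def handle_request(manager, cost=1):
    """The controlling process: ask the resource manager before doing the work."""
    if not manager.has_resources(cost):
        return "rejected"
    manager.in_use += cost
    return "handled"
```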

Unfortunately, there is really no reliable way to do this. The best we can do in this situation is guesswork. The resource manager could monitor how much resource demand changes from one time interval to the next. Certain thresholds could then be set and, once reached, we could provide resources based on what the probable resource demand will be in the near future.

For instance, let's say I have a simple blog running within a cloud environment. I post a new entry, "a ton of traffic". Now, before I post this entry, I have an average demand of 5 requests per hour. An hour after posting, the resource manager notices that my average has doubled to 10 requests per hour. This is something that could be handled very comfortably by my service. However, the suddenness of this relatively large change could put the resource manager on alert. Now, hour two after posting "a ton of traffic", the number of requests reaches 20 requests per hour. It seems that this trend of rising demand is continuing. The resource manager would then proceed to make more resources available.

With this approach, there is always the risk of over-provisioning resources. This type of data can be misleading. However, it does lend a guiding light toward proactive provisioning. Besides, if the statistical data is misleading, it is better to clean up over-provisioned resources than to attempt a huge provisioning job during high resource demand.
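The threshold idea from the example above could be sketched roughly as follows. The should_provision() function and its growth_threshold and consecutive parameters are hypothetical, and the values chosen (demand doubling two intervals in a row) are only one possible reading of the example.

```python
def should_provision(hourly_requests, growth_threshold=2.0, consecutive=2):
    """Return True once demand has grown by at least growth_threshold
    for `consecutive` intervals in a row."""
    streak = 0
    for previous, current in zip(hourly_requests, hourly_requests[1:]):
        if previous > 0 and current / previous >= growth_threshold:
            streak += 1
            if streak >= consecutive:
                return True
        else:
            # Growth stalled, so the alert is dropped.
            streak = 0
    return False
```

With the numbers from the example, [5, 10] only puts the manager on alert, while [5, 10, 20] triggers provisioning. Raising consecutive trades slower reaction for less over-provisioning.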

Monday, December 15, 2008

Resource design with twisted.web

The twisted.web Python package offers several resource abstractions that, in my opinion, make for a quality RESTful design. Putting aside the Python web framework requirement of "does it do everything I could possibly ever need?" and focusing on designing an API that is cohesive and easily understandable, the twisted.web package is very useful indeed.

The package offers a Resource class that serves as the base class for all resources in your application. As a developer, you will extend (and redefine if needed) this class to provide the behavior of your resources. To create a uniform API, from the RESTful perspective, we implement render_GET(), render_POST(), etc. The value returned from these methods is the response given to the client.

Here is a trivial example of using the Resource class to implement a blogging system.

#Resource design with twisted.web
#Written for Python 3, where Twisted expects bytes for paths and response bodies.

from twisted.web import server, resource, util
from twisted.internet import reactor

# Sample data; an entry's id is its index in this list.
blog_data = [{'title': 'First Entry', 'body': 'First Entry Body...'},
             {'title': 'Second Entry', 'body': 'Second Entry Body...'}]

class Blog(resource.Resource):
    # Root resource: any path without a registered child redirects to the index.
    def getChild(self, path, request):
        # An absolute path keeps the redirect valid for nested unknown paths too.
        return util.Redirect(b'/index')

class BlogIndex(resource.Resource):
    isLeaf = True
    def render_GET(self, request):
        # One link per entry, keyed by its list index.
        template = '<a href="/entry/%s/">%s</a><br/>'
        content = ''.join(template % (idx, entry['title'])
                          for idx, entry in enumerate(blog_data))
        return ('<html>%s</html>' % content).encode('utf-8')

class BlogEntry(resource.Resource):
    isLeaf = True
    def render_GET(self, request):
        template = '<html><h1>%s</h1><p>%s</p></html>'
        # postpath holds the path segments after /entry, e.g. [b'0', b''].
        blog_id = int(request.postpath[0])
        blog_entry = blog_data[blog_id]
        page = template % (blog_entry['title'], blog_entry['body'])
        return page.encode('utf-8')

blog_obj = Blog()
blog_obj.putChild(b'index', BlogIndex())
blog_obj.putChild(b'entry', BlogEntry())
reactor.listenTCP(8080, server.Site(blog_obj))
reactor.run()

Some things to note in this example:

The Blog resource is the root resource. It redefines the getChild() method in order to return the Redirect resource which will take us to the BlogIndex resource.
The BlogIndex resource simply displays a list of sample blog entries.
The BlogEntry resource represents a specific blog entry. This resource requires a blog id (list index). It will then display the blog entry.
The main program first creates the Blog resource and adds the child resources. The server is then started using the Blog resource as the root resource.

Friday, December 5, 2008

Trac provides a good example of RESTful resources

The Trac issue tracking and wiki system provides a good example of a RESTful web resource. Good RESTful resources are connected. This means that the associations between resources are expressed within the resources themselves. If a given resource is associated with another resource, the second resource should be navigable from the first.

This is the basic concept behind links in hypermedia.

Trac, however, takes this a step further with a trivial feature that is much more valuable than it may seem at first glance. Trac integrates very nicely with Subversion. In a large percentage of cases, one or more changesets are associated with a ticket. It is also fairly trivial to link to a changeset from within a ticket. Here is the interesting feature. The title attribute of the anchor that links to the changeset is the message associated with the changeset. Here is what I mean.
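To illustrate the idea in code, here is a small sketch that renders such an anchor. The changeset_link() function, the /changeset base URL, and the [revision] link text are assumptions for illustration only; this is not Trac's actual code or markup.

```python
from html import escape


def changeset_link(revision, message, base_url="/changeset"):
    """Render an anchor whose title attribute carries the changeset message."""
    return '<a href="%s/%d" title="%s">[%d]</a>' % (
        base_url, revision, escape(message, quote=True), revision)
```

Hovering over a link built this way shows the commit message without a second request, which is the whole point of the feature.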

This subtle feature has saved me the time required to actually follow a link to view the changeset resource on numerous occasions. Most of the time, we're only interested in the message. Trac has realized this usability issue and addressed it.

Sunday, October 12, 2008

A resource style implementation

If you are thinking of implementing a RESTful web service, it should follow a resource oriented architecture. Naturally, a key abstraction in this implementation would be a resource. Here is a sample resource diagram:

Now this is a pretty trivial class and one that is not terribly useful on its own. The idea that it represents within the context of a resource oriented architecture is a powerful one. If we can design a resource abstraction before concerning ourselves with HTTP and other deployment concerns, we have a better chance of getting the design right.

The above class implemented in Python might look like this.

class Resource:
    def __init__(self):
        self.uuid=''
        self.name=''
        self.content=False
        self.timestamp=False

    def set_uuid(self, uuid):
        self.uuid=uuid

    def set_name(self, name):
        self.name=name

    def set_content(self, content):
        self.content=content

    def set_timestamp(self, timestamp):
        self.timestamp=timestamp

    def get_uuid(self):
        return self.uuid

    def get_name(self):
        return self.name

    def get_content(self):
        return self.content

    def get_timestamp(self):
        return self.timestamp

Again, fairly straightforward and not terribly useful on its own. It is a useful start to implementing a resource oriented architecture. This class could be extended in several ways to be more useful in an application. For instance, the content attribute could actually be a resource representation. This is another abstraction in our architecture.

This simple class will take any object and turn it into a representation we can use. Python helps us out here by allowing us to override the __repr__() method to get a string representation of the object when we return it.
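A minimal sketch of what such a Representation class might look like follows. Only the class name, set_obj(), and the use of __repr__() come from this discussion; the get_obj() method and the choice of JSON as the output format are my own assumptions.

```python
import json


class Representation:
    """Hypothetical sketch: wraps any object and renders it via __repr__()."""

    def __init__(self):
        self.obj = None

    def set_obj(self, obj):
        self.obj = obj

    def get_obj(self):
        return self.obj

    def __repr__(self):
        # JSON is just one possible format; fall back to repr() for
        # objects the json module cannot serialize.
        try:
            return json.dumps(self.obj, sort_keys=True)
        except TypeError:
            return repr(self.obj)
```

Since the class defines no __str__(), calling str() on an instance falls back to __repr__().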

As it stands, we have two classes: Resource and Representation. The Resource class is anything in our resource oriented system. We then suggested a potential Representation class that will turn any object into a representation to be defined by __repr__(). I'm not going to go into much detail about what the __repr__() implementation might look like because the possibilities are limitless. I'll save that for another discussion.

Here is how we could potentially use the two classes we have defined above.

if __name__=='__main__':
  rep_obj=Representation()
  rep_obj.set_obj({'date': 'now', 'title': 'My Title'})
  resource_obj=Resource()
  resource_obj.set_uuid('123')
  resource_obj.set_name('my_resource')
  resource_obj.set_content(str(rep_obj))

First, we create a representation of some object, in this case a dictionary. We then create a resource for this representation. Usually, the resource will determine the resource type upon request from a client. For demonstration purposes, I've elided these details.

The important idea to take away from this discussion is that when implementing a resource oriented architecture, the concept of resource and resource representation can be built independently of the problem domain.

Thursday, September 4, 2008

Building software objects as resources

The term resource oriented architecture often refers to software design in the context of the web. Object design can take this principle into account before the web even needs to be considered. It is, after all, simply an architectural principle used to build a scalable and robust system.

One approach here is to take into account the four most common HTTP methods and implement this functionality in a base class. For example, if we were to implement this in Python, we might implement a base class that defines these methods. But these methods may be abstract, not intended to actually do anything.


class Resource:
   def get(self, *args, **kw):
       pass

   def put(self, *args, **kw):
       pass

   def post(self, *args, **kw):
       pass

   def delete(self, *args, **kw):
       pass

In this case, other resources in your system would inherit from Resource and define the behavior of these methods. The key idea here being that you are building a resource oriented architecture at a level of abstraction that is not concerned with the web at all. This is powerful because it lets the developer shift the application code from framework to framework.

The actual code that deals with web requests is simple. It says "I got a GET request for this resource. Application, what should I tell the client?". And the application can easily respond in a way that makes sense to the framework. Not only that framework, but possibly others as well.
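As one hedged illustration of how thin that web-facing code can be, here is a sketch of a WSGI adapter around the Resource base class. The Greeting class, the make_wsgi_app() function, and the decision to treat a None return value as "method not supported" are assumptions of mine; WSGI is just one of many frameworks this could target.

```python
class Resource:
    def get(self, *args, **kw):
        pass

    def put(self, *args, **kw):
        pass

    def post(self, *args, **kw):
        pass

    def delete(self, *args, **kw):
        pass


class Greeting(Resource):
    """Hypothetical application resource; it knows nothing about the web."""
    def get(self, *args, **kw):
        return "hello"


def make_wsgi_app(resource):
    """Hypothetical adapter: the only web-specific code in the system."""
    def app(environ, start_response):
        method = environ["REQUEST_METHOD"].lower()
        handler = None
        if method in ("get", "put", "post", "delete"):
            handler = getattr(resource, method)
        # "Application, what should I tell the client?"
        body = handler() if handler else None
        if body is None:
            start_response("405 Method Not Allowed",
                           [("Content-Type", "text/plain")])
            return [b""]
        start_response("200 OK",
                       [("Content-Type", "text/plain; charset=utf-8")])
        return [body.encode("utf-8")]
    return app
```

Moving to another framework would mean rewriting make_wsgi_app() only; Greeting and any other resources would stay as they are.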


Boduch's Blog