
Wednesday, January 9, 2013

Encoding Context In Links

A page in a web application shows formatted objects, menus, or things that pertain directly to the application domain, such as a list of objects from the database, or a single object selected from the database and presented to the user. The list, or the single object, is identified by the link the user has clicked or pasted into the browser address bar. Which of the two doesn't matter: what counts is the notion of uniform resource identification, and the mechanism realizing this idea is secondary. The goal is that the link text identifies a resource within the application. But is that the extent of the information encoded into a single URI? Or do applications, as they become more sophisticated from a user interface perspective, need to provide context to the pages that render these URIs?

Navigation and Object URIs

How do we go about designing a web application that does well with both the navigational structure presented to the user, and the URI format used to traverse the application? They really are distinct problems, are they not? If so, I find that I've artificially blended the two concepts together. Maybe that's because, as a user, I've unintentionally joined the two ideas. After all, they both play a role in the same job: they take the user from one place to another. One presents a graphical layout to the user, often with styled elements using some kind of grid layout. The other is just a string destined for the browser address bar. An action in the navigational menu often means a change in the URI. The change is triggered from the user interface, and I would think that this is the expected pattern of behavior, as opposed to the user constantly typing in the desired location, unless the menu is really that bad.

What URIs Say About Resources

The web today is made up of boundless resources, each identified by its URI. I favor the term URI over URL because it denotes identity. So what information can we gather from the URI alone? Are we better off calling it a URL since the main purpose is to do a lookup? I don't think so, because the fact that URIs are used to look something up is implied knowledge; it's the identity of the resource we want to learn about. But can we attain this type of information from the URI alone, or is it a meaningless question? URIs should be designed to give the reader foreknowledge of what the resource is.

Uniqueness
URIs are unique. That is, there is a one-to-one mapping between a URI and the resource it points to. There are, of course, exceptions to this rule. For example, you might have a radio station web application that displays the currently playing artist. During the artist's air time, the application might have an artist/current URI that points to the artist's detail page. Alternatively, there might be a single canonical URI associated with the artist's page, artist/123 for instance.

So in the case of the former — where the artist can have two URIs pointing to their page — there is no one-to-one mapping of URI to resource. There might even be more than two URIs pointing to the artist's page — charts/top, for instance. But these URIs are unique in that they're referring to one resource. The underlying resource might change — the artist/current URI stays the same but the resource it points to will change frequently.

The artist/current URI is an example of a virtual resource: to the external agent, this appears to be where the resource lives. But this isn't where the resource lives, because it isn't its canonical URI. The URI artist/123 is static and probably will never change. The virtual URI points to the canonical URI.

To better illustrate this concept, let's talk in hockey terms. Imagine the center for the home team. He is number 15. So his canonical resource URI looks like home/15. Now imagine that you want a URI for the home team player currently in possession of the puck. Our star center has the puck — so we can represent this as a virtual URI — puck/control. There isn't anything special about this URI — it just contains some logic that points to the home/15 URI.
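The hockey example can be sketched in a few lines of JavaScript. This is only an illustration under assumptions: the `rink` object, the `virtualUris` map, and the `resolve` function are hypothetical names, and a real application would do this lookup on the server.

```javascript
// Hypothetical application state: which home player has the puck.
const rink = { puckCarrier: 15 };

// Virtual URIs map to small pieces of logic that compute a canonical URI.
const virtualUris = new Map([
  ['puck/control', () => `home/${rink.puckCarrier}`],
]);

// Virtual URIs are translated; canonical URIs pass through untouched.
function resolve(uri) {
  const lookup = virtualUris.get(uri);
  return lookup ? lookup() : uri;
}

console.log(resolve('puck/control')); // home/15
rink.puckCarrier = 9;                 // the puck changes hands
console.log(resolve('puck/control')); // home/9
console.log(resolve('home/15'));      // home/15
```

The virtual URI itself never changes; only the canonical URI it resolves to does.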

Meaning
So it turns out that URIs carry some important information after all. And this is what I'm trying to figure out. Exactly how much information is of value to the reader of the URI? In theory, every URI on the web could be some arbitrary string — CD4F2ACF4, for example. It wouldn't matter because information is properly linked to other information. The readers don't care what the URI is — they only care about the anchor text.

I think this might have some degree of truth behind it, but the reality is that people do care about the URI and what it looks like. I know I do. In fact, before clicking on links, I find myself hovering over them to see where they go, trying to examine the URI to guess its worth before I go there. Mind you, I take a very active interest in URI design, so I doubt every single user will scrutinize, or even care for that matter, what a URI looks like.

But it turns out that even the most arbitrary URIs make subtle attempts to attach meaning. Consider our earlier URI — artist/123. What do we know about this page before visiting it? Even if you're a lay user — you're probably able to guess that it has something to do with an artist. We achieve two things with this URI — the vocabulary and the multiplicity.

The vocabulary establishes the kind of thing users can expect to see should they choose to follow the link — in this case, an artist. The multiplicity is established in two ways. First, we're explicitly choosing the term artist, not artists. Second, the reader can see the trailing identifier — 123.

So the most meaningful piece of information in this URI is artist. The arbitrary part, arbitrary from the reader's perspective, is both meaningless and important at the same time. The arbitrary identifier assigned to the resource is an important part of the URI — it's what makes it canonical. The number itself has no meaning to the user but it has utility in sharing that URI with others.
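To make the vocabulary and multiplicity idea concrete, here is a small sketch that pulls both pieces out of a path such as artist/123. The `describeUri` function and the shape of its return value are assumptions made for illustration.

```javascript
// Split a path into the meaningful part (vocabulary) and the
// arbitrary part (identifier). A present identifier suggests a
// single resource rather than a collection.
function describeUri(path) {
  const segments = path.split('/').filter(Boolean);
  const vocabulary = segments[0];
  const identifier = segments.length > 1 ? segments[1] : null;
  return { vocabulary, identifier, single: identifier !== null };
}

console.log(describeUri('artist/123'));
console.log(describeUri('/artists/'));
```

The first call reports the vocabulary artist with identifier 123; the second has no identifier, so it reads as a collection.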

Evolution
It turns out that URI design has evolved quite a lot since the emergence of the web. We've seen a lot of resources — immeasurable resources — created over the years. This directly impacts our ability to create meaningful URIs for users. If it were simply a matter of incrementing the resource count once a new resource is created, we'd be all set. Unfortunately, that isn't true at all. There are new types of resources that need to be created as applications and organizations evolve. These new resource types are going to form an ever more complex mesh of relationships — links to other resources both new and old.

These new resource types — once invented to help solve the technological problems of the day — will also need virtual resources. The virtual resources are the logic of the web — they're not real data, just pointers to other canonical resources that store the real information that external agents update and use.

Keeping URIs meaningful for users is important as available information continues to expand. If we succumb to churning out completely arbitrary URIs, we're taking a step backward. Likewise, the URI itself is real data that needs to be shared and passed around, so we must be careful to add meaning, but not too much.

Thursday, May 26, 2011

Remembering jQuery UI Accordion Selections

The jQuery UI accordion widget is a great way to group logical sections of your page together. The accordion widget is actually an alternative to the tabs widget - they're both just layout containers.

So if I'm using an accordion widget as the main layout component on one of my pages, it would be nice if we had a way to preserve the selected section. My approach to this is by using hashes in the URL. Below is the basic HTML markup:

<html>
    <head>
        
        <title>jQuery UI Accordion Selection</title>
        
        <link type="text/css" href="jqueryuitheme.css" rel="stylesheet"/>
        <script type="text/javascript" src="jquery.min.js"></script>
        <script type="text/javascript" src="jqueryui.min.js"></script>
        <script type="text/javascript" src="accordion.js"></script>
        
    </head>
    <body>
        
        <div id="accordion">
            <h3><a href="#section1">Section 1</a></h3>
            <div>
                Section 1 content...
            </div>
            <h3><a href="#section2">Section 2</a></h3>
            <div>
                Section 2 content...
            </div>
            <h3><a href="#section3">Section 3</a></h3>
            <div>
                Section 3 content...
            </div>
        </div>
                    
    </body>
</html>

And here is the accordion.js file:

$(document).ready(function(){
    
    //Get the selected accordion index, based on the URL hash.
    var index = $('#accordion h3').index(
                $('#accordion h3 a[href="'+window.location.hash+'"]')
                .parent());
                
    //The index will be -1 if there is no hash in the URL.  This
    //is necessary if we want the first section expanded by default.
    if (index < 0){
        
        index = 0;
        
    }
    
    //The change event handler will add the hash to the URL when
    //a section is selected.
    var change = function(event, ui){
    
        window.location.hash = ui.newHeader.children('a').attr('href');
        
    }
       
    //Build the accordion, using the URL hash selection and the
    //change event handler.
    $('#accordion').accordion({active: index,
                               change: change});
                                   
});

We can now use the URL hash to set the accordion selection. This is how it works:

The first variable - index - is the index of the accordion selection. So 0 is the first accordion section, 1 the second, and so on. This selector is a little tricky because inside the #accordion div, we have alternating h3 and div elements, so the h3 indexes aren't linear. To get around this, we have two selectors. The first one filters out all the h3 elements we're interested in. Now that we have only h3 elements, we can call index() on them - we won't get invalid indexes due to intermingled divs. We pass index() the h3 element we're interested in, namely, the one with the a that has an href matching the current URL hash.

Phew, so now we've got an index we can use to select the appropriate section. Or do we? If the user is visiting this page for the first time, there won't be any hash in the URL. This means the value of index will be -1. We don't want that, so we give it a value of 0 if there is no real index to use. This way the first section is active by default. This is the expected behaviour, I would say, 99% of the time.

Next, we define an event handler - change - that is triggered when the user changes accordion sections. All this handler does is set the URL hash, allowing the default functionality associated with changing sections to run normally. For example, changing to section two in our example will add #section2 to the URL.

Finally, we build the accordion widget using the selected index and the change event handler. This isn't a perfect solution, but it does solve some headaches with accordion widgets - like the back button and bookmarks.
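The index lookup described above can also be written without the DOM, which makes the fallback rule easy to check in isolation. This is a sketch, assuming a hypothetical `sectionIndex` helper that receives the header hrefs as a plain array; it is not part of jQuery UI.

```javascript
// Return the position of the hash among the section hrefs,
// falling back to the first section when there is no match.
function sectionIndex(hrefs, hash) {
  const index = hrefs.indexOf(hash);
  return index < 0 ? 0 : index;
}

const sectionHrefs = ['#section1', '#section2', '#section3'];
console.log(sectionIndex(sectionHrefs, '#section2')); // 1
console.log(sectionIndex(sectionHrefs, ''));          // 0
```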

Tuesday, April 27, 2010

Passing URIs

Uniform Resource Identifiers (URIs) are what enable us to find things on the web. These things refer to a resource, uniformly identified by a string. A resource is any digital media that has been made available on the Internet. At a higher level, search engines are what allow us to find resources that live somewhere in the web. Without a URI, there would be nothing useful for the search engines to display in their results. Additionally, it would be impossible for search engines to crawl websites without the URIs that make up the structure of the site.

APIs can be built with a set of URIs as well. These URI-centric APIs are sometimes referred to as RESTful APIs. RESTful APIs have a close association with the HTTP protocol. Because of this we can pass parameters to resources through GET or POST requests made by a client. But these are often primitive types that can be represented as a string. For instance, if I'm using some API to update my user profile, a numeric user ID might be a POST parameter I need to send. This is necessary so the application knows which user to update. But what if I were able to pass an entire URI as the identifier for the resource I want to update? Does that even make sense? Well, let's first think about how applications identify resources internally.

The most common way for a web application to identify a resource internally is by a primary key in a database table. This key is typically an integer value that is automatically incremented every time a new record is inserted. This key is unique for each record in that table. This makes the primary key of a database table an ideal candidate for using as part of a URI. You'll often see integers as part of a URI, for instance "/user/4352/". There is a good chance that the number is a unique primary key in the database. This uniqueness maps well to URIs because every URI should be unique in one way or another.

One potential problem with using primary database keys in URIs is that different records in different database tables may share the same key. This doesn't necessarily weaken the uniqueness of the URI because it is still referring to a different type of resource. Consider two database records in two different database tables. These records both have the same integer primary key value, 5. The URIs for these two resources are still unique because they are referring to two entirely different concepts. So the first URI might be "/user/5/" and the second URI might be "/group/5/". But what if you don't care about the resource type?

A canonical URI might be composed of a UUID instead of the primary key of a database table. UUIDs themselves are unique and may refer to any resource. That is, a UUID doesn't need a context in order to be unique. If our above two URIs were to use UUIDs, they might look something like "/user/cadb1d94-5305-11df-98a5-001a929face2/" and "/group/d8eee85c-5305-11df-8d08-001a929face2". As you can see, we really don't need "user" or "group" as part of the URI. We could refer to a resource canonically with something like "/data/cadb1d94-5305-11df-98a5-001a929face2/". This could refer to either a user or a group. This can be both flexible and dangerous.

Having a canonical URI based on a UUID can be flexible because the client requesting the resource doesn't need to know the context. The client might have acquired this URI and have no idea what exactly it is a representation of. Even with just the UUID, the client now has the ability to discover the type of resource this URI is pointing to based on the data it returns. This can also be dangerous for exactly the same reason. If a client doesn't know how to handle the data returned by a canonical URI, the chances of the client malfunctioning are higher. The data representations returned by URI resources are a lot like interfaces; different data types can still provide the same interfaces by having a subset of common keys.
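Here is a rough sketch of that idea: a type-agnostic /data/ URI, plus a check that the returned data offers the keys the client depends on. The in-memory `store`, its records, and the `lookup` and `hasKeys` functions are all hypothetical; the UUIDs are the ones used above.

```javascript
// Hypothetical records keyed by UUID; the fields are made up.
const store = new Map([
  ['cadb1d94-5305-11df-98a5-001a929face2', { type: 'user', name: 'alice' }],
  ['d8eee85c-5305-11df-8d08-001a929face2', { type: 'group', name: 'admins', members: [] }],
]);

// Resolve a canonical /data/<uuid>/ URI without knowing the resource type.
function lookup(uri) {
  const match = /^\/data\/([0-9a-f-]{36})\/?$/.exec(uri);
  if (!match) return null;
  return store.get(match[1]) || null;
}

// The "interface" check: does the data provide every key we rely on?
function hasKeys(data, keys) {
  return data !== null && keys.every((key) => key in data);
}

const data = lookup('/data/cadb1d94-5305-11df-98a5-001a929face2/');
console.log(data.type);                  // user
console.log(hasKeys(data, ['name']));    // true
console.log(hasKeys(data, ['members'])); // false
```

A client that only needs a name can treat users and groups alike; one that needs members can detect the mismatch before it malfunctions.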

The location part of a URI might also be useful for passing as parameters to web applications. Until now, I've only been talking about the path in which the server must look for the resource. But this is making the assumption that the resource in question still lives on the same server. By only passing primary database keys or UUIDs as parameters, we leave the location aspect out of the equation. It might be more flexible to pass a full URI as a parameter, even if the URI points to the same location at which the request arrived. This really isn't a big deal for a server to check when processing requests. If the resource lives here, we simply dissect the URI and process as usual. Otherwise, we forward the request to the appropriate location. I realize I'm oversimplifying this a little too much, but the idea is to think about passing whole URIs as parameters, not so much the details of how we should go about implementing a full-fledged distributed computing platform.

So remember that canonical URIs composed of UUIDs can be useful when treated with care. If context is important, don't use them. Stick to using primary database keys if it helps keep things simple. Try experimenting with a simple web application that will accept a full URI instead of an ID string of some sort. A flexible solution might even accept either or.
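As a starting point for that experiment, here is a minimal sketch of a parameter handler that accepts either a bare ID or a full URI. The `LOCAL_ORIGIN` value, the /user/ path, and the `resolveUserParam` function are assumptions for illustration; it relies only on the standard `URL` class.

```javascript
// Assumed origin of the server handling the request.
const LOCAL_ORIGIN = 'http://example.com';

function resolveUserParam(param) {
  // A bare numeric ID is treated as a user path on this server.
  const raw = /^\d+$/.test(param) ? `/user/${param}/` : param;
  // Relative values resolve against the local origin; full URIs keep their own.
  const url = new URL(raw, LOCAL_ORIGIN);
  return {
    local: url.origin === LOCAL_ORIGIN, // process here, or forward?
    origin: url.origin,
    path: url.pathname,
  };
}

console.log(resolveUserParam('4352'));
console.log(resolveUserParam('http://example.com/user/5/'));
console.log(resolveUserParam('http://other.example.org/user/5/'));
```

The first two calls resolve to the local server; the third is flagged as living elsewhere, which is where the forwarding logic would go.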

Monday, April 5, 2010

Hashing URIs

Clients of URIs often need to specify some sort of client parameter that is returned as part of the HTTP response. This parameter is intended to preserve the application state by encoding it into the URI.

Instead of passing the client state as an HTTP GET parameter, a common practice is to provide it as a hash segment of the URI, the part that follows the # character. This segment represents the client state. It is a preferable URI practice because the application state is still encoded as part of the resource URI but doesn't need to be sent to the server.

One aspect of this practice to be wary of is mistaking application state for resources. If the intent is to use the application state as the hashed URI segment, only use it for application states and not data resources.
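A small sketch of the practice, using the standard `URL` and `URLSearchParams` classes. The `withState` and `readState` helpers, and the tab and sort keys, are made up for illustration.

```javascript
// Store client state in the hash segment, leaving the resource path alone.
function withState(uri, state) {
  const url = new URL(uri);
  url.hash = new URLSearchParams(state).toString();
  return url.toString();
}

// Read the client state back out of the hash segment.
function readState(uri) {
  const url = new URL(uri);
  return Object.fromEntries(new URLSearchParams(url.hash.slice(1)));
}

const link = withState('http://example.com/artist/123', { tab: 'albums', sort: 'year' });
console.log(link);
console.log(readState(link));
console.log(new URL(link).pathname); // /artist/123
```

The path still identifies the artist resource; the hash only carries how the client wants to present it.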

Thursday, December 3, 2009

Software Maintenance

This entry talks about some of the problems faced by current software maintenance practices. It highlights some of the various maintenance methods used to maintain deployed software. Of course, not many seem to do it right.

Problems arise mainly because of incompatibility between software versions. Typically, the entire software package has a version assigned to it. This includes all the constituent parts of the application such as modules and data structures. What if these individual components were given a version instead of the whole? Well, according to the entry, there are systems in existence that do just that.

What about taking this atomic versioning idea to the URIs of RESTful APIs? This isn't anything new, as many APIs support it. The main problem faced by web applications when performing server-side upgrades is cached clients. JavaScript that interacts with the URIs on the client's behalf may be stuck using an old API version. This is fine if the API version number is part of the URI, but backward compatibility can only be maintained so far back. This can be dealt with much more easily if the expected version is part of the URI. If an unexpected version is requested, a message can be displayed to the client telling them to download a newer client. Alternatively, the new client code could be transparently delivered to the client as a response to using an incorrect version number.
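A server-side version check along these lines might look like the sketch below. The /api/v<N>/ path layout, the `SUPPORTED_VERSIONS` list, and the `checkVersion` function are assumptions, not any particular framework's API.

```javascript
// Versions this hypothetical server is still willing to serve.
const SUPPORTED_VERSIONS = [2, 3];

// Inspect the version segment of a request path such as /api/v2/users/.
function checkVersion(path) {
  const match = /^\/api\/v(\d+)\//.exec(path);
  if (!match) {
    return { ok: false, reason: 'no version in URI' };
  }
  const version = Number(match[1]);
  if (SUPPORTED_VERSIONS.includes(version)) {
    return { ok: true, version };
  }
  // This is where a "please update your client" message would be sent.
  return { ok: false, version, reason: 'unsupported version' };
}

console.log(checkVersion('/api/v3/users/'));
console.log(checkVersion('/api/v1/users/'));
```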

Tuesday, November 3, 2009

File System Resources

RESTful web services often employ the concept of resources. When reading about RESTful web services, you will often hear the term resource or resource-oriented. This is because a key principle of a RESTful system is that of the URI. The uniform resource identifier is used to point to some resource, as the name suggests.

The concept of a uniform resource identifier says nothing about the context in which it is used. That is, a URI can point to a resource on the web, or it can point to a resource locally on the file system. When using a URI on the local file system, the URIs will only be unique within the local context. For instance, the URI file:///home/ probably isn't unique within the context of the web but would most surely be unique within the local system.

There are two types of resources we are interested in when constructing RESTful applications. There are remote resources that live on the web, and there are local resources that exist within the file system. These two resource types really aren't all that different. The obvious difference is the context in which the resource is considered unique. The other difference sits at a level lower than that of a RESTful design: how the actual IO functionality is implemented. For instance, you can't perform read operations on a remote resource by invoking traditional file system functionality. The same is true in the other direction: you can't read a local resource by invoking HTTP functionality.

One of the similarities between remote resources and local resources is the URI. The URI differs only slightly between the remote resource, typically using HTTP as the protocol, and the local resource which uses a file IO protocol.

Illustrated below is a simple class hierarchy that models a flexible resource. It is flexible in the sense that instantiated resources can be either remote or local in the application.

Here, the base class is Protocol. Inheriting from this class is the base Resource class, with the children resources, File and HTTP. The classes are purposefully incomplete in definition because this hierarchy allows for many implementation variations. The Protocol class is high level and probably serves as an interface. The reason we want to define the Protocol class in the first place is that in this context, where resources may not be using the same protocol, resources may be considered a protocol type.

The Resource class is what should define the higher-level resource functionality. This is where the uniform methods that should be functional for any resource type should be defined. These could map closely to HTTP methods or to some other consistent interface. The File and HTTP classes provide the lower-level implementations that are invoked by the Resource interface. This enables an application to use resource abstractions, both local and remote, with no regard for context, as the behavior can be invoked in a polymorphic way.
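The hierarchy described above could be sketched as follows. This is one possible reading, not a definitive design: the classes are named `FileResource` and `HttpResource` here (rather than File and HTTP) to avoid clashing with built-in globals, the `openResource` factory is hypothetical, and the `read` methods return placeholder strings instead of performing real IO.

```javascript
// High-level interface: every resource speaks some protocol.
class Protocol {
  get scheme() {
    throw new Error('scheme must be provided by a subclass');
  }
}

// Uniform methods that any resource type should support.
class Resource extends Protocol {
  constructor(uri) {
    super();
    this.uri = uri;
  }
  read() {
    throw new Error('read() must be implemented by a subclass');
  }
  describe() {
    return `${this.scheme} resource at ${this.uri}`;
  }
}

// Lower-level implementations; real code would do file or network IO here.
class FileResource extends Resource {
  get scheme() { return 'file'; }
  read() { return `read ${this.uri} using file IO`; }
}

class HttpResource extends Resource {
  get scheme() { return 'http'; }
  read() { return `read ${this.uri} using an HTTP GET`; }
}

// Hypothetical factory: pick the implementation from the URI scheme.
function openResource(uri) {
  return uri.startsWith('file://') ? new FileResource(uri) : new HttpResource(uri);
}

// The caller treats local and remote resources the same way.
for (const uri of ['file:///home/', 'http://example.com/user/5/']) {
  const resource = openResource(uri);
  console.log(resource.describe(), '->', resource.read());
}
```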

Wednesday, September 16, 2009

Pyro URIs

Pyro is a distributed object framework written entirely in Python. Pyro massively simplifies the task of distributing Python instances across a network. The notion of an object proxy is employed by the Pyro system. An object proxy in Pyro is indistinguishable from a standard Python instance, hence the term proxy. But nodes within the network must be able to create these proxy instances somehow. Additionally, in order to create these proxy instances, the node needs to know about the state of the original, canonical instance. Identifying the original Python instance within a deployed Pyro system is straightforward with URIs. The URI concept is the sole responsibility of the name server in the deployed Pyro system.

The good news is that the object URIs used in the highly distributed Pyro system have a close parallel to the URIs used in a traditional HTTP resource-oriented architecture. The translation of an HTTP URI to a Pyro URI wouldn't be overly difficult to achieve. Both systems share properties of addressability. Since the name server does the work of locating the object within a set of distributed nodes, this would be a good candidate setup for a small-scale distributed API. Who knows, maybe something like this could be built at a large scale. I wouldn't count on it until it is proven to work.

Tuesday, June 2, 2009

Addressability Lost Within The Realm Of Ajax

With RESTful APIs and applications, and with HTTP in general, addressability plays a huge role. With web applications that project an Ajax user interface, this addressability is typically gone. There is no notion of URI as far as the end user is concerned. This is because Ajax web applications often have a single URI: the application itself. From this point forward, the end user navigates through the various application states while all the addressable URIs are contacted beneath the fancy Ajax interfaces. If you are an end user using these types of applications, do you necessarily care? That would depend on what you are using the application for and the type of user you are. For most end users, if the user interface is well designed, the addressability of the resources involved with the application is a non-issue. Even developers, the type one would think would be interested in the underlying application data, might be too enthralled with how useful the user interface is. If the user interface of an Ajax web application is poorly designed, things change. The data at the other end of the URI suddenly becomes much more interesting.

In more traditional web applications, the ones without the fancy asynchronous JavaScript toolkits, the URI of almost every resource is accessible from the address bar in the web browser. This is obviously the benefit of having addressable resources in a web application. The URIs are immediately apparent to the end user via the web browser address bar. With Ajax web applications, end users cannot point their web browser to a specific URI and expect the application to behave accordingly. This, I find, is one of the more annoying drawbacks to using Ajax web applications. Although the user experience of Ajax web applications is improving at an ever-increasing rate, the sacrifice of addressability, a powerful concept, has to be made.

But does addressability really need to be lost completely in Ajax web applications? Just because the URI isn't in its normal location, the web browser address bar, doesn't mean the end user can't know about it. In fact, many web applications that provide an Ajax user interface will also provide a public RESTful API used by that interface, complete with documentation. However, this doesn't solve the problem of the end user who doesn't care about API documentation and just wants page X of the application to appear when they point their web browser to URI Y. I think something like this could potentially work. There would have to be two separate RESTful APIs: one for the application resource data, and one for the user interface. The user interface API would interest the end users because they could use these URIs to point the web browser to a specific application state. These user interface application URIs wouldn't even need to be reflected in the web browser address bar. As long as the current user interface application state is advertised somewhere in the user interface, it could work. And it would be incredibly useful.


Boduch's Blog