Showing posts with label component. Show all posts

Tuesday, July 27, 2010

Decentralized Components

Software components are often central to other software components. A component is considered central to another component when the fabric of the larger component requires it. If you were to follow the chain of dependencies within a set of components that constitute a software system, you're bound to find one or more centralized components. Centralization in software systems is necessary to a certain extent because something needs to initialize the system. There are, however, areas where decentralized components are beneficial because the chain of command is better distributed, which leads to lower coupling. A component can be anything from a sub-system to a package. In object-oriented design, classes are often viewed as components, depending on your viewpoint. There are many ways that components become centralized in a system and many ways to decentralize them to some degree. A component can be centralized in a hierarchy as well as by composition. Low coupling, as a design principle, can lead to decentralization, but there are other factors that need to be taken into account such as the strength of relationships between components. A simple example sub-system will help to illustrate some of the issues with centralized components and how decentralization can improve the design.

Our sample system will be a printing sub-system. The system must be able to differentiate between, and print to, both colour ink printers and black ink printers. The most centralized component of the system is a printer abstraction. We want to design specialized abstractions for the two types of supported printers as well. For the purposes of this example system, all components are classes. A real printing sub-system's requirements go way beyond this example, but we're not interested in anything other than the very high-level concepts of the system.

Compositional centralization in software is a whole-part relationship between two or more components. The parts are considered central to the whole because they don't serve any purpose outside of it. Looking at our example printing sub-system, Printer components have one or more colours associated with them. Our initial design shows Printer objects as composites and Colour components as centralized parts of the Printer component. This makes good design sense because colours represent a facet of printing capabilities and are strongly related to Printer components.

Now let's take a step back and look at this design decision at a higher level. Printer components have colours out of necessity. You can't print something that doesn't fall within the visible colour spectrum. Now that the relationship between Printer and Colour components has been firmly established, let's think about the strength of that relationship. Colour components are central to Printer components when they are currently printing something. That is, when they are in a printing state. In any other state, Printer components don't necessarily need Colour components. This means that Colour components are more closely related to the state of the Printer than to the Printer itself.

The fact that Colour components are pertinent parts of Printer components only during a specific Printer state has notable design implications. We earlier determined that Printer and Colour components are closely related; colours are central to printers. Objects composed of other objects often directly instantiate their own parts because the whole has the necessary initialization information. We can, however, reduce direct coupling by using a factory to create Colour components when they are needed.

Suppose we have a ColourFactory component with static methods for constructing Colour component instances. This factory can be used by Printer components to create colours.
This is an indirect way to reduce coupling. The ColourFactory component creates colours upon request by the Printer component. Additionally, Colour components become decentralized to Printer components while maintaining a strong conceptual relationship. By introducing a colour factory into the design, the responsibility of creating colours has shifted away from Printer components. When a Printer component wants to create a Colour component, it needs to tell the factory which colour to create. It does this by providing the factory with enough information to construct the component. This could be as simple as specifying a string such as "red". The conceptual responsibilities of colour creation remain with the Printer component while the implementation responsibilities of colour creation are with the ColourFactory component.
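To make the factory idea concrete, here is a minimal sketch in Python. The names Colour, create(), print_page() and the small lookup table are assumptions made for illustration; only ColourFactory, Printer, and the idea of passing a string such as "red" come from the design described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Colour:
    """A colour identified by name, carrying an RGB value."""
    name: str
    rgb: tuple


class ColourFactory:
    """Creates Colour instances from a simple name such as "red"."""

    # Hypothetical lookup table; a real system would know more colours.
    _KNOWN = {
        "black": (0, 0, 0),
        "red": (255, 0, 0),
        "green": (0, 255, 0),
        "blue": (0, 0, 255),
    }

    @staticmethod
    def create(name):
        try:
            return Colour(name, ColourFactory._KNOWN[name])
        except KeyError:
            raise ValueError("unsupported colour: %r" % name) from None


class Printer:
    """Asks the factory for a colour only while it is printing."""

    def print_page(self, text, colour_name="black"):
        # The Colour lives only for the duration of this call,
        # so the relationship is transient rather than persistent.
        colour = ColourFactory.create(colour_name)
        return "%s in %s" % (text, colour.name)
```

The Printer decides which colour it wants, while the knowledge of how to build one stays in the factory.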

Let's go back to the requirements of our printing sub-system for a moment. The system needs to support both colour printers and black printers. In our sub-system, the Printer component is abstract. There are two concrete implementations of the Printer component: ColourPrinter and BlackPrinter. The difference between the two printer types is the set of supported colours returned by the colours() method. The abstract Printer component provides the centralized printing behavior. This shared behavior should be low-level in nature, functionality that is expensive to replicate in sub-classes. The key point here is that certain characteristics of the printer concept remain central to the Printer component. There is some behavior that is common to all printer components and cannot be decentralized, hence the generalization hierarchy.

Our two specialized printing components, ColourPrinter and BlackPrinter, have different implementations of the colours() method that will return the supported colours of the printer. The Printer component defines an abstract colours() method with the intention of being implemented by a specialized component. The ColourPrinter and BlackPrinter components each provide their own specialized implementations of the method. The colours() method is decentralized from the abstract Printer component by polymorphism. The common, concrete functionality offered by the Printer component uses the colours() method. Our system uses decentralized behavior to realize the centralized behavior found in our abstract Printer component.
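The hierarchy can be sketched as follows. The can_print() method and the concrete colour names are assumptions added for illustration; the class names and the abstract colours() method follow the description above.

```python
from abc import ABC, abstractmethod


class Printer(ABC):
    """Abstract printer holding the shared, centralized behavior."""

    @abstractmethod
    def colours(self):
        """Return the names of the colours this printer supports."""

    def can_print(self, colour):
        # Centralized behavior realized through the decentralized
        # colours() method supplied by each sub-class.
        return colour in self.colours()


class ColourPrinter(Printer):
    def colours(self):
        return ("cyan", "magenta", "yellow", "black")


class BlackPrinter(Printer):
    def colours(self):
        return ("black",)
```

Printer itself cannot be instantiated, which keeps the decision about supported colours out of the centralized component.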

Decentralization in component design can lower coupling between components which, in my opinion, should be sought after as a design goal. It does so by replacing persistent relationships with transient relationships where it makes sense to do so. Conceptually, the parts of a whole component may only exist during a specific time interval or state. Using factories to create these parts and use them when needed can decentralize them from their whole. In generalization hierarchies, specific functionality can be decentralized from a component by providing a specialized implementation of it in a child component. Some system facets are better off in a centralized form as they are shared among the various decentralized components.

Tuesday, March 30, 2010

Structured Components

With the UML, there is such a thing as a structured classifier. Structured classifiers allow for the modeling of the constituent parts that make up the classifier.

This is handy because we can then show internal parts within the classifier, the connections between the internal parts and the boundary of the classifier itself. These connections can also illuminate the intricacies of the interface connections amongst the internal parts of the classifier.

So your main choices of modeling elements to use when showing a structured classifier are components and structured classes. I think it makes much better sense to use a component for this purpose. Components are geared slightly more toward the logical structure of a classifier, and a component is a structured classifier. Structured classes allow the same thing, but I think classes are better used in the traditional way they were used before the advent of structured classifiers. They are good at showing the attributes, operations, and relationships with other classes. Components can show the same class in a different diagram but from a structural viewpoint.

Thursday, September 17, 2009

Python Components

There are probably an endless number of definitions of what constitutes a Python component. The question I have is what is the correct definition or is there a correct definition for a Python component? It seems to me that some things lean more toward being the preferred form of a Python component while others build on this concept and others still are radically different than the vanilla component.

Of course, figuring out what a component is exactly might be a good start. Using the most general idea of what a component is and what a component is not would help us to translate these properties over to the Python world. I think in the most general sense, a component is any replaceable piece of any software system. So, a component can be pulled out of some system and replaced with an equivalent component that honors the original interfaces. If a new component cannot do this transparently without causing the system to fail, it isn't a component. It may be considered a component once it has this property, but until then, it isn't.
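As a small illustration of replaceability, consider two interchangeable pieces that offer the same interface. All names here (ListStore, SetStore, index_words) are hypothetical and exist only for this sketch.

```python
class ListStore:
    """Keeps unique items in a list."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def items(self):
        return sorted(self._items)


class SetStore:
    """Drop-in replacement that keeps the items in a set instead."""

    def __init__(self):
        self._items = set()

    def add(self, item):
        self._items.add(item)

    def items(self):
        return sorted(self._items)


def index_words(store, words):
    """The 'system': it relies only on add() and items()."""
    for word in words:
        store.add(word)
    return store.items()
```

Because index_words() depends only on the interface, either store can be swapped in without the rest of the system noticing.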

Having described what a component is at the most basic, generic level, how do we decompose Python systems in the same way? We want to take a piece of a given system written in Python and replace it with another piece. Obviously it needs to conform to the required and provided interfaces of the slot it wishes to fill. But aside from that, what can physically be considered a Python component? At the most fundamental level, most developers would probably consider the module a valid candidate for a Python component. A module, in Python, is basically how source code is organized. It is in fact a source code file that supports the modularity concept, hence the name.

The egg is another candidate for a standard Python component. Eggs are a standard way to distribute Python packages. In fact, eggs are Python packages. They typically contain multiple Python modules. So are eggs just another type of Python component, but at a higher level than modules? That is tough to say because eggs can be treated as if they were Python modules once they have been deployed on a given system.

The most compelling feature of eggs, besides the ease of installation, is entry points. Entry points of Python eggs offer services to other eggs installed on the system. Eggs can advertise these services for free. No intervention is necessary on the developer's behalf. The entry points provided by eggs are also a good candidate for what can be considered a Python component simply because of the enhanced feature set that they offer.

Wednesday, April 15, 2009

Trac component registration and management

Trac is a highly flexible project management system written in Python and based around a component architecture. In fact, a large portion of the base Trac system is indeed a set of components. Example components from this set would include the Trac ticketing system or the Trac wiki formatting engine. Using a component based architecture is a smart design decision in the majority of software solutions for more reasons than one. Perhaps the most compelling reason to implement a component based architecture is the replaceability that components provide. Components both require and provide interfaces, which means that a component can easily be swapped for a different component that provides the same interfaces as the original. At the very core of Trac is a small set of classes that define how components in Trac work. Like any well designed software core, it is small and unlikely to change drastically in the future. Keeping the core small also matters because it is depended upon by all Trac components in any given Trac installation. This core is not only required for interface purposes, but also for component registration and management. This way Trac always knows during its lifetime what components are available to it. The core set of classes for dealing with components in Trac consists of ComponentMeta, ComponentManager, and Component.

The most important class here for Trac component developers is the Component class. This class is the external interface Trac provides to the outside world. The Component class is intended to be generalized or extended by each component within the Trac plugin. The ComponentMeta class is used to register defined components within a Trac environment by performing meta operations. That is, by transforming the original Component class as necessary. The ComponentManager class acts as a storage pool for all components in a Trac environment. Any time the Trac system needs access to any given component, it is retrieved through this class. This provides a centralized place for all components to live. Although the Component class is all the developers need concern themselves with, since the behavior of the other two classes is encapsulated, it is nonetheless useful to have a general idea of why they exist.

The Component class declares ComponentMeta as its metaclass. Given this declaration in Python, whenever Component or one of its subclasses is defined, the result returned from ComponentMeta.__new__() is what the class will ultimately be. This is a useful feature of the language because it allows the behavior of the original class to be modified based on the context. The ComponentMeta.__new__() method has all the contextual data provided to it as parameters, including the class name, the base classes, and the dictionary of class attributes. The ComponentMeta class not only registers the various interfaces provided by the component in question, but will also redefine the Component constructor while still preserving the functionality of the original constructor. It does this by defining a nested maybe_init() function inside the ComponentMeta.__new__() method. The nested maybe_init() function will become the new Component constructor. The reason for redefining the original constructor is so that a ComponentManager instance may now be passed to the Component constructor. This ComponentManager instance will then store the component. What really makes this useful is that if the original constructor existed within the Component in question, it is still invoked by the new maybe_init() constructor.
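The constructor replacement can be sketched roughly as follows. This is a simplified illustration written for Python 3, not Trac's actual source: the class and function names follow the description above, while the components dictionary, the Greeter class, and the omission of interface registration are assumptions of the sketch.

```python
class ComponentManager:
    """Minimal stand-in for the storage pool."""

    def __init__(self):
        self.components = {}


class ComponentMeta(type):
    def __new__(mcs, name, bases, namespace):
        new_class = super().__new__(mcs, name, bases, namespace)
        if name == "Component":
            return new_class  # leave the base class untouched

        # Constructor defined directly on this class, if any.
        original_init = namespace.get("__init__")

        def maybe_init(self, compmgr, init=original_init, cls=new_class):
            # Store the component in the manager, then run the
            # original constructor if the class defined one.
            if cls not in compmgr.components:
                compmgr.components[cls] = self
                if init is not None:
                    init(self)

        new_class.__init__ = maybe_init
        return new_class


class Component(metaclass=ComponentMeta):
    pass


class Greeter(Component):
    def __init__(self):
        self.greeting = "hello"
```

With this in place, Greeter(manager) stores the new instance in manager.components and still runs the original Greeter constructor. The sketch makes no attempt to reuse an existing instance on a second call, and it ignores constructors inherited from parent components.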

The ComponentManager is where Trac components are stored once loaded. As mentioned above, the ComponentMeta class dynamically injects functionality into the Component instantiation process that will store the component in a ComponentManager instance. Components stored in the ComponentManager instance can be retrieved by simple item access, using the component class as the key. This is implemented by the ComponentManager using the __getitem__() method. If a component is requested by this method that is not currently enabled, the ComponentManager will enable it before returning it. Otherwise it will simply return it. Developers also have an opportunity with Trac to subclass the ComponentManager and override the empty methods it invokes when enabling components. This could potentially be useful if enabling a component is a meaningful event.
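A minimal sketch of such a storage pool follows, assuming lookup keyed by the component class through __getitem__(). The hook name component_activated() and the LoggingManager and Wiki classes are assumptions of this sketch, and plain classes stand in for real Trac components.

```python
class ComponentManager:
    """Stores one instance per component class."""

    def __init__(self):
        self.components = {}

    def __getitem__(self, cls):
        component = self.components.get(cls)
        if component is None:
            # Not activated yet: create it, store it, notify the hook.
            component = cls()
            self.components[cls] = component
            self.component_activated(component)
        return component

    def component_activated(self, component):
        """Empty hook; subclasses may override it."""


class LoggingManager(ComponentManager):
    """Treats activation as a meaningful event by recording it."""

    def __init__(self):
        super().__init__()
        self.activated = []

    def component_activated(self, component):
        self.activated.append(type(component).__name__)


class Wiki:
    pass
```

The first manager[Wiki] lookup activates the component and fires the hook; later lookups return the same instance without firing it again.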

Monday, March 30, 2009

Gaphor editor adapters

The Gaphor UML modeling tool, which is written in Python, uses a pop-up style editor widget which allows in-line editing of certain diagram elements. The widget itself isn't overly interesting. It makes trivial changes to modeling elements quicker, which is always helpful in any software solution. What we are interested in here is the method used to display the widget based on the element type. Gaphor relies heavily on the Zope interface and component frameworks. The Zope interface framework is utilized by Gaphor to define various interfaces that are provided by classes throughout the application. The component framework is utilized for the purpose of registering components and adapters. What exactly are adapters? Adapters are a type of component, automatically created by Zope when using the interface and component frameworks in conjunction with one another. This doesn't happen by itself; there are some carefully placed rules involved with defining adapters. When used right, Zope adapters are a very powerful tool that provides maximum usage of an interface. Gaphor defines an extensive set of Zope adapters. Here we are interested in the editor adapter.

There are actually several adapters created for the IEditor interface, one for each diagram element that supports the editor widget. The Gaphor adapter approach is an alternative design to providing behavior that varies by type. A more traditional approach may have been to create a class hierarchy for the editor widget. Each class in the hierarchy would correspond to a different diagram element type. The differing behavior would then be implemented in each class in the hierarchy while similar behavior remains untouched and varying behavior gets replaced in a polymorphic manner. This is similar to how the editor adapters in Gaphor are defined and registered. One key difference in the design is how each class, or adapter, is instantiated when the need arises. With the class hierarchy approach, we would need extra logic to ensure that the correct instance type is created to use in conjunction with the diagram element widget. With Zope adapters, we simply instantiate the IEditor interface, providing the object we are adapting as a parameter. In the Gaphor case, the IEditor interface is instantiated with the diagram element widget as a parameter. The correct adapter instance is then returned by the Zope component framework, complete with the alternate functionality specific to that diagram element type. A similar effect can be achieved with the class hierarchy design. The class that is instantiated would accept the widget that is being adapted.
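The lookup-by-type idea can be sketched with a tiny hand-rolled registry. To be clear, this is not the Zope component API: the adapter() decorator, the editor_for() function, and the item and editor classes are all hypothetical stand-ins that only mimic the behavior described above.

```python
_adapters = {}  # maps an adapted type to its editor adapter class


def adapter(adapted_type):
    """Class decorator registering an editor for one element type."""
    def register(cls):
        _adapters[adapted_type] = cls
        return cls
    return register


def editor_for(item):
    """Return the editor adapter registered for the item's type."""
    for klass in type(item).__mro__:
        factory = _adapters.get(klass)
        if factory is not None:
            return factory(item)
    raise TypeError("no editor registered for %r" % type(item).__name__)


class ClassItem:
    def __init__(self, name):
        self.name = name


class CommentItem:
    def __init__(self, body):
        self.body = body


@adapter(ClassItem)
class ClassItemEditor:
    def __init__(self, item):
        self.item = item

    def get_text(self):
        return self.item.name


@adapter(CommentItem)
class CommentItemEditor:
    def __init__(self, item):
        self.item = item

    def get_text(self):
        return self.item.body
```

The caller asks for an editor and never names a concrete editor class, which is the property the adapter approach gives Gaphor without extra selection logic at the call site.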

The adapter approach is a solid one because it emphasizes the importance of the interface contract provided by classes and the modularity offered by creating components. Being able to directly instantiate an interface speaks loudly in terms of what can be used in that context.

Thursday, March 26, 2009

The Trac component loader

All modern web application frameworks need replaceable application components. Reasons for this requirement are plenty. Some applications will share common functionality with other applications, such as identity management. However, having the ability to extend this functionality or replace it entirely is absolutely crucial. Technology requirements change too fast to assume that a single implementation of some feature will ever be sufficient for any significant length of time. Moreover, tools that are better for the job that your component currently does will emerge, and developers need a way to exploit the benefit of these tools without having to hack the core system. Trac is a Python web framework in its own right. That is, it implements several framework capabilities found in other frameworks such as Django and TurboGears. Trac is highly specialized as a project management system. So, you wouldn't want to use Trac as some generalized web framework for some other domain. Project management for software projects is such a huge domain by itself that it makes sense to have a web framework centered around it. Trac defines its own component system that allows developers to create new components that build on existing Trac functionality or replace it entirely. The component framework is flexible enough to allow loading of multiple format types: eggs and .py files. The component loader used by the Trac component system is in fact so useful that it sets the standard for how other Python web frameworks should load components.

The Trac component framework will load all defined components when the environment starts its HTTP server and uses the load_components() function to do so. This function is defined in the loader.py module. The load_components() function can be thought of as the aggregate component loader as it is responsible for loading all component types. The load_components() function will accept a list of loaders in a keyword parameter. It uses these loaders to differentiate between component types. The parameter has two default loaders that will load egg components and .py source file components. The load_components() function will also accept an extra path parameter which allows the specified path to be searched for additional Trac components. This is useful because developers may want to maintain a repository of Trac components that do not reside in site-packages or the Trac environment. The load_components() function also needs an environment parameter. This parameter refers to the Trac environment in which the loader is currently executing. This environment is needed by various loaders in order to determine if the loaded components should be enabled. This would also be a requirement of a custom loader if a developer were so inclined to write one. There is other useful environment information available to new loaders that could potentially provide more enhanced functionality.

As mentioned, the load_components() function specifies two default loaders for loading Trac components. These loaders are actually factories that build and return a loader function. This is done so that data from the load_components() function can be built into the loader function without having to alter the loader signature which is invoked by the load_components() function. This offers maximum flexibility. The first default loader, load_eggs(), will load Trac components in the egg format. It does so by iterating through the specified component search paths. The plugins directory of the current Trac environment is part of the included search path by default. For each egg file found, the working set object, which is part of the pkg_resources package, is then extended with the found egg file. Next, the distribution object, which represents the egg, is checked for trac.plugins entry points. Each found entry point is then loaded. What is interesting about this approach is that it allows subordinate Trac components to be loaded. This means that if a found egg distribution contains a large amount of code and a couple of small Trac components, only the Trac components are loaded. The same cannot be said about the load_py_files() loader, which is the second default loader provided by load_components(). This function works in the same way as the load_eggs() function in that it will search the same paths, except instead of looking for egg files, it looks for .py files. When found, the loader will import the entire file, even if there are no subordinate Trac components within the module. In both default loaders, if the path in which any components were found is the plugins directory of the current Trac environment, that component will automatically be enabled. This is done so that the act of placing the component in the plugins directory also acts as an enabling action, thus eliminating a step.
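The factory pattern behind the loaders can be sketched as follows. This is a simplified illustration rather than Trac's loader.py: make_suffix_loader() is a hypothetical factory, the signatures are assumptions, and the loaders only classify paths instead of importing anything.

```python
import posixpath


def make_suffix_loader(suffix, auto_enable_dir=None):
    """Factory: build a loader with the fixed signature loader(paths)."""

    def loader(paths):
        found = []
        for path in paths:
            if path.endswith(suffix):
                # Components found in the designated directory are
                # enabled simply by virtue of being placed there.
                enabled = posixpath.dirname(path) == auto_enable_dir
                found.append((posixpath.basename(path), enabled))
        return found

    return loader


def load_components(paths, loaders):
    """Aggregate loader: run every loader over the same search paths."""
    found = []
    for loader in loaders:
        found.extend(loader(paths))
    return found
```

Each factory call bakes its own data (the suffix and the auto-enable directory) into the returned closure, so load_components() can call every loader the same way regardless of component type.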

There are some limitations with the Trac component framework. The _log_error() function nested inside the _load_eggs() loader shouldn't be a nested function. There is no real rationale for doing so. Loading Python source files as Trac components is also quite limiting because we lose any notion of subordinate components. This is because we can't define entry points inside Python source files. If building Trac components, I would recommend only building eggs as the format.

Monday, December 22, 2008

Gaphor services

The Gaphor UML modeling tool offers a simple service framework under the covers. The Gaphor application itself defines some core services that fit within this framework. For example, it defines an ActionManager service that is used to handle various actions within the application.

All services that are defined within Gaphor (or as extensions to Gaphor), implement the IService interface. This interface is defined in the interfaces module. Here is a simple illustration of what this interface looks like.



As you can see, this straightforward interface requires that any Gaphor service implement an init and a shutdown method.
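A rough sketch of such an interface follows. Gaphor defines IService with the Zope interface framework; this sketch uses an abstract base class from the standard library as a stand-in so that it runs on its own. The application parameter of init() and the EchoService class are assumptions made for illustration.

```python
from abc import ABC, abstractmethod


class IService(ABC):
    """Stand-in for the service interface: two required methods."""

    @abstractmethod
    def init(self, application):
        """Prepare the service for use within the application."""

    @abstractmethod
    def shutdown(self):
        """Release whatever the service acquired in init()."""


class EchoService(IService):
    """Hypothetical service that only tracks whether it is running."""

    def __init__(self):
        self.running = False

    def init(self, application):
        self.application = application
        self.running = True

    def shutdown(self):
        self.running = False
```

Any class that leaves out init() or shutdown() cannot be instantiated, which is roughly the guarantee the interface contract is meant to give.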

The ActionManager service is a good example of a Gaphor service. This service is responsible for maintaining actions in the application. It is also a service that is responsible for loading other services. For example, the FileManager service is also an action provider since it implements the IActionProvider interface. The ActionManager service loads all components that provide the IActionProvider interface and registers them as action providers within the application.

Friday, October 17, 2008

The turbogears database component

The TurboGears database component is a Python module which allows the framework to connect to some database specified in the project configuration. I'd like to explore some interesting features and implementation details of this component. No particular reason, I just find it interesting since I've built several projects using this framework. The source for this component can be found here (I'm pointing to the 1.0.7 tag because it is the most recent stable release).

The responsibility of this component is to provide database access to the TurboGears project that is being developed. This includes providing support for both SQLAlchemy and SQLObject. The database TurboGears component implements a conditional declaration of various SQLAlchemy and SQLObject classes and functions depending on which one is enabled for the project. This is done by attempting to import both packages. If both the SQLAlchemy and SQLObject packages are installed and importable, all declarations are executed as illustrated below.
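The conditional declaration pattern looks roughly like this. It is a sketch of the general technique, not the TurboGears source; the available_orms() helper is hypothetical.

```python
# Try each ORM in turn; a missing package just leaves the name as None.
try:
    import sqlalchemy
except ImportError:
    sqlalchemy = None

try:
    import sqlobject
except ImportError:
    sqlobject = None


def available_orms():
    """Return the names of the ORM packages that could be imported."""
    candidates = (("sqlalchemy", sqlalchemy), ("sqlobject", sqlobject))
    return [name for name, module in candidates if module is not None]


if sqlobject is not None:
    # SQLObject-specific classes and functions would be declared here.
    pass

if sqlalchemy is not None:
    # SQLAlchemy-specific classes and functions would be declared here.
    pass
```

Code further down the module can then check the module names against None before touching anything ORM-specific.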



I'm going to focus on SQLObject TurboGears support here because the current support for SQLObject is stronger than SQLAlchemy support. There are two classes defined by the TurboGears database component that help carry out SQLObject support. These classes are AutoConnectHub and PackageHub and are illustrated below.



Typically, in a TurboGears project, you'll have several database abstractions using SQLObject. In simple projects, these abstractions are stored in model.py. However, you may have enough abstractions to justify spanning them across several modules. Each one of these modules defines a package hub to connect to a database. A package hub is a simple variable that states the name of the project package. Since this is a TurboGears project, it will most likely contain a configuration file which specifies a database connection string. There are, however, other ways to do this.

When TurboGears encounters a package hub, a new PackageHub instance is created. This instance then creates an AutoConnectHub instance by calling the set_hub() method. This AutoConnectHub instance, which is now a member of the PackageHub instance, is the bridge between your TurboGears project and the database connection.

However, for some reason, the set_hub() method is not invoked in the constructor of the PackageHub instance. It is invoked for all access methods (if needed). Perhaps there is good reasoning for this design but if it is possible, I wonder if it would make more sense to have this instance created in the constructor. If this were the case, it would eliminate the need for checking if the AutoConnectHub instance exists for every other method of the PackageHub instance.
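The lazy creation described here can be sketched as follows. This is a simplified illustration and not the TurboGears implementation: the connection URI, the getConnection() bodies, and the hub attribute are assumptions; only the PackageHub, AutoConnectHub, and set_hub() names come from the component.

```python
class AutoConnectHub:
    """Stand-in for the object that bridges to the database."""

    def __init__(self, uri):
        self.uri = uri

    def getConnection(self):
        return "connection to %s" % self.uri


class PackageHub:
    def __init__(self, packagename):
        self.packagename = packagename
        self.hub = None  # deliberately not created in the constructor

    def set_hub(self):
        # A real project would read the URI from its configuration;
        # this value is made up for the sketch.
        self.hub = AutoConnectHub("sqlite:///%s.db" % self.packagename)

    def getConnection(self):
        # Every access method has to repeat this existence check.
        if self.hub is None:
            self.set_hub()
        return self.hub.getConnection()
```

One plausible reason for the lazy approach is that a package hub is declared at module level, where the project configuration may not have been loaded yet. Moving set_hub() into the constructor would remove the repeated check, but only if the configuration is guaranteed to be available at that point.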

One last improvement I think would make sense for this component is to classify some of the functions defined here. For example, there are several functions that are either SQLObject or SQLAlchemy specific. I think these should at least be class methods. There could be two classes here called SODBTools and SADBTools. The ORM-specific functions could then be moved to these classes accordingly. This is just a minor improvement I feel would have major design perks.