Showing posts with label turbogears. Show all posts

Tuesday, October 13, 2009

Shrinking Python Frameworks

An older entry by Ian Bicking talks about the shrinking world of Python web application frameworks. It is still a valid observation today, I think. There is no shortage of Python web application frameworks to choose from. Quite the contrary, it seems that a new one springs into existence every month. This often happens because a set of developers has a very niche application to build and the existing web application frameworks don't cut it. Either that, or the existing frameworks are missing a feature or two, or have too many moving parts, so the developers make their own modifications. Whatever the reason, some of these developers will release their frameworks as open source projects.

The shrinking aspect refers to the number of frameworks that are a realistic choice for implementing a production-grade application. Most of the newer Python web application frameworks, still in their infancy, are simply not stable enough.

Take Pylons and TurboGears, for instance. Both are still decent web frameworks, and you can't have TurboGears without Pylons now. However, they are somewhat problematic to implement applications with. Even if they are stable enough, there are complexities that shouldn't exist in a framework. Besides, I have yet to see a stable TurboGears release.

Taking complexity to a new level is Zope. This framework has been around for a long time and is extremely stable. But unless you have been using it for several years, it isn't really worth it, because the potential for misuse is so high.

The choice of which Python web application framework to use really comes down to how much programming freedom you want. If you want everything included, Django does everything and is very stable. However, if there are still many unknowns in the application in question, there are many stable frameworks that will simply expose WSGI controllers and allow the developers to do as they will.

Thursday, April 23, 2009

TurboGears i18n catalog loading

In a perfect world, every single application would have full translations for every major spoken language. This ability is referred to as internationalization. Internationalization becomes increasingly important when dealing with web applications. The whole idea behind web applications is portability amongst many disparate clients, so it is safe to assume that not all of these clients are going to speak a universal language. Fortunately, there are many internationalization tools at developers' disposal. One of the more popular tools on Linux platforms is the GNU gettext utility. In fact, Python has a built-in module that is built on top of this technology, and some Python developers have built abstractions on top of the gettext module, providing further internationalization capabilities for Python web applications.

The gettext Python module uses message catalogs to store the translated messages used throughout any given application. These catalogs can then be compiled into a more efficient format for performance purposes. Using a single message catalog would suffice for monolithic applications that cannot be extended. However, all modern Python web application frameworks can be extended one way or another, and the components that extend these frameworks are more than likely to need internationalization of their own messages. The problem of storing and using translated messages suddenly becomes much more difficult.

The TurboGears Python web application framework provides internationalization support. TurboGears provides a function that will translate the specified string input. Of course, the message must exist in a message catalog, but that is outside the scope of this discussion. The tg_gettext() function is the default implementation provided by the framework. The reason it is the default implementation, and not the only implementation, is that a custom text translation function may be specified in the project configuration. Either way, custom or default, the text translation function becomes the global _() function that can then be used anywhere in the application. Below is an illustration of how the tg_gettext() function works.
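As a rough sketch (not the actual TurboGears source), the behaviour can be pictured like this, with the message catalogs simplified into a plain dictionary for illustration:

    # Rough sketch of the default translation behaviour described above, with
    # the message catalogs simplified into a plain dictionary for illustration.
    catalogs = {
        'fr': {'Welcome': 'Bienvenue'},
    }

    def tg_gettext(key, locale='fr'):
        catalog = catalogs.get(locale)
        if not catalog:
            return key                   # no catalog for this locale: return the key unchanged
        return catalog.get(key, key)     # translated message, or the key if missing

    _ = tg_gettext                       # becomes the global _() used across the application
    print(_('Welcome'))                  # prints 'Bienvenue'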



The get_catalog() function does as the name says and returns the message catalog of the specified locale. The function will search the locale directory specified in the i18n.locale_dir configuration value. Once a message catalog has been loaded by TurboGears, it is then stored in the _catalogs dictionary. The _catalogs dictionary acts as a cache for message translations. Illustrated below is the work-flow of this function.
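A minimal sketch of that work-flow; in TurboGears the locale directory comes from the i18n.locale_dir configuration value, which is hard-coded here for illustration:

    # Minimal sketch of get_catalog() and the _catalogs cache described above.
    import gettext

    LOCALE_DIR = 'locales'   # stands in for the i18n.locale_dir configuration value
    _catalogs = {}           # locale -> loaded message catalog (the cache)

    def get_catalog(locale):
        catalog = _catalogs.get(locale)
        if catalog is None:
            catalog = gettext.translation('messages', LOCALE_DIR,
                                          languages=[locale], fallback=True)
            _catalogs[locale] = catalog   # once loaded, it stays loaded
        return catalog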



As mentioned earlier, this method of using a single message catalog is all well and good until your application needs to be extended by third-party components. If these components have their own message catalogs, any text translation calls made to _() will use the base application catalog. The reason is that the first time a catalog is loaded, it stays loaded because of the _catalogs cache. This means that even if an extension to the web application framework were to update the i18n.locale_dir configuration value, it would make no difference once the specified locale has already been cached. Ideally, the get_catalog() function could key the _catalogs cache on the locale directory rather than on the locale alone.

Monday, March 23, 2009

Registering configuration values in ECP

The Enomaly Elastic Computing Platform has an extension module API that allows developers to register new ECP components. These components include new web controllers, new RESTful API controllers, and so on. One component that cannot be registered is configuration values. Extension modules can be viewed as smaller applications that are executed within ECP, and these smaller extension module applications need to be configured. There are always going to be values that should be configurable within any application, such as storage locations. Currently, extension modules must implement their own settings abstractions. This functionality already exists in the ECP core, and the way configuration values are accessed and stored should be consistent; hence the need for the custom settings class. It would make sense for extension modules in ECP to have the ability to register their own configuration values. This way, configuration values would be accessed and stored in exactly the same way across the platform. An additional complication arises when trying to use the configuration editor. The configuration editor is tightly coupled with TurboGears widgets and thus requires that all extension modules be tightly coupled with TurboGears widgets as well. Ideally, when configuration values are registered, which currently is not possible, additional metadata suitable for generating a display widget for the configuration value could also be registered.

The current implementation of the ECP Settings class uses managed Python attributes to seamlessly save and load configuration values. Every time a managed Settings attribute is accessed, the Variable class will attempt to load the configuration value. Likewise, when a managed Settings attribute is altered, the Variable class will attempt to store the configuration value. It is easier to use managed attributes for simple storage and retrieval operations. The alternative is to use the Variable class directly, and in fact, earlier implementations of ECP did exactly that: every time a configuration value was needed, we had to invoke Variable.load() while specifying a default value in case the configuration value didn't exist. The new Settings class was introduced to alleviate some of this troubled configuration access. A single instance of the Settings class is created in the configuration.py module, and this instance can then be used throughout the ECP application, including extension modules. Configuration categories are also incorporated into the Settings class. This is done by applying the same concept as the Settings class to each category; the category class is then set as an attribute of Settings. This allows us to access configuration values in the form of settings.kvm.bridge, a syntax that makes for much more readable code when used in context.

However, a problem with this method of managing configuration values soon became apparent. There will always be a need to add new configuration values. Most notably, extension modules are going to need this capability, since developers are going to want to access configuration values in the same way as the rest of ECP. There is a need to be able to register new configuration values, which would eliminate this extensibility problem. If every new configuration value needed by an extension module, or the core application for that matter, has to be added to the Settings class by hand, the class will grow rapidly and become very challenging to maintain. Additionally, the configuration editor is very tightly coupled to TurboGears widgets, because extension modules need to display their configuration values in the configuration editor. This is done by the extension module defining a hook that passes in the TurboGears widgets used to display its configuration values in the configuration editor. This isn't ideal, since it also couples the extension modules to ECP dependencies (TurboGears). Ideally, the widgets for displaying configuration values should be generated by the configuration editor based on minimal metadata provided by the extension module at registration time.
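To make the managed-attribute idea concrete, here is a minimal sketch of the pattern, assuming a Variable class with load() and store() methods as described; the category and value names are only illustrative:

    # Minimal sketch of managed configuration attributes, assuming a Variable
    # class with load()/store() as described; the names are illustrative only.
    class Variable:
        def __init__(self, name, default=None):
            self.name = name
            self._value = default

        def load(self):
            # In ECP this would read the persisted value; here it just returns it.
            return self._value

        def store(self, value):
            # In ECP this would persist the value.
            self._value = value

    def managed(name, default=None):
        """Build a property that loads on access and stores on assignment."""
        variable = Variable(name, default)
        return property(lambda self: variable.load(),
                        lambda self, value: variable.store(value))

    class KvmSettings:
        bridge = managed('kvm.bridge', 'br0')

    class Settings:
        kvm = KvmSettings()

    settings = Settings()
    print(settings.kvm.bridge)    # loads the value
    settings.kvm.bridge = 'br1'   # stores the value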

The new approach to ECP configuration value management is to have configuration values registered in the Settings class. The same approach of using managed attributes to access and store configuration values is still used. What is different is the ability to register a value and have these managed attributes automatically built for the developer. This is accomplished by introducing a new MetaSettings class, whose purpose is to dynamically construct new categorization classes and methods that become attributes of the settings instance. There is also a new settings.register() method that can be used to register new configuration values. The end result of registering a new configuration value with settings.register() is the same access syntax as before. The name of the module passed to settings.register() becomes an attribute of the settings instance. There are also metadata parameters in the settings.register() method that allow developers to specify a title and description for the configuration value. In the current ECP configuration management implementation, this information must be specified in the TurboGears widget. With the new implementation, the managed attribute functionality found in the ECP core no longer needs to be duplicated, and there is now a much more uniform interface.
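Hypothetically, registering and then using a value might look something like this; apart from the module name, title, and description mentioned above, the parameter names are assumptions, and the tiny Settings class here only mimics the attribute syntax rather than the real MetaSettings machinery:

    # Hypothetical shape of the settings.register() call described above; the
    # parameter names beyond module, title, and description are assumptions.
    class _Category(object):
        pass

    class Settings(object):
        def register(self, module, name, default=None, title='', description=''):
            category = getattr(self, module, None)
            if category is None:
                category = _Category()
                setattr(self, module, category)   # module name becomes an attribute
            setattr(category, name, default)      # title/description would drive the editor

    settings = Settings()
    settings.register(module='kvm', name='bridge', default='br0',
                      title='KVM bridge',
                      description='Network bridge used by KVM virtual machines.')

    print(settings.kvm.bridge)                    # same access syntax as before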

With this new configuration registration functionality in place, there is now an opportunity for great improvements in the configuration editor. We could now potentially eliminate the coupling to TurboGears widgets and have each configuration widget generated automatically. Grouping by extension module is now also possible in the configuration editor.

Wednesday, March 18, 2009

Evaluating file-monitoring techniques in Python

There is a general need to monitor changes made to files in any computer system. The question is, why? The short answer is that when a file has changed, the state of the system has also changed, and there are going to be reactions to that change in state. These events that take place in response to file changes usually happen at a very low system level. At the application level, there is also a need to monitor the system state, or sub-states such as files. For instance, if we are working with a web development framework such as TurboGears and we want the development HTTP server to reload every time a source code file changes, those files need to be monitored. Once a change has been detected by the monitoring process, the process can then reload the HTTP server. Another use for files is to communicate between different processes within a system. One process, or many, can monitor a given file and react accordingly when the state of that file changes.

There are two approaches I'm going to evaluate here. The first is a generic method, which is blocking. The second is the CherryPy method, which is non-blocking. Although the two methods differ at a higher level, they are the same in that they determine whether a file has changed state based on its modification date.

The first of the two methods is a blocking method. This method will block the flow of control within an application, meaning that any code that comes after this logic will not be executed until the file monitoring code is complete. The reason this method is blocking to begin with is that it involves a loop, and breaking out of the loop is the only way to terminate the file monitoring logic. The developer using this method can specify an interval at which the logic will test the specified files for changes. It will check the modification date of each file and compare it to the last modification date recorded by this method. If the date is later than the last recorded date, the file has changed. Obviously, the more files being monitored, the more overhead involved, so care must be taken not to overload the system by monitoring too many files. This method of file monitoring is generic and can be used in many contexts with little modification, since it wasn't designed with any particular application domain in mind.
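A minimal sketch of this blocking, poll-based approach (the interval and callback are illustrative):

    # Minimal sketch of the blocking approach described above: compare each
    # file's modification time against the last one recorded, at a fixed interval.
    import os
    import time

    def monitor(paths, on_change, interval=1.0):
        mtimes = {path: os.path.getmtime(path) for path in paths}
        while True:                              # blocks the calling flow of control
            time.sleep(interval)
            for path in paths:
                current = os.path.getmtime(path)
                if current > mtimes[path]:
                    mtimes[path] = current
                    on_change(path)              # react to the change in state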

The non-blocking method comes from the CherryPy Python web application framework, which uses it to monitor changes made to Python modules within a given CherryPy application. Once a module state change has been detected, it is an indication that the HTTP server should be restarted to reflect those changes in the running application. The monitoring logic is periodically run in a separate thread of control at a specified interval. This means that if the server is in the middle of processing a request, the control flow does not block in the middle of the request. The method used to determine whether a file has changed is the same as in the blocking method: the modification date is compared to the previous modification date. This method of monitoring for file state changes on a file system is a very elegant solution. The main downfall is that it is context-specific; it was designed with CherryPy in mind. However, it is not so tightly coupled to CherryPy that it could not be used somewhere else. Some minor changes would do the trick.
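A hedged sketch of the same modification-date comparison, scheduled periodically on a background thread so the main flow of control never blocks:

    # Sketch of the non-blocking variant: the same mtime comparison, but run
    # periodically on a background timer thread instead of looping in the caller.
    import os
    import threading

    class BackgroundMonitor:
        def __init__(self, paths, on_change, interval=1.0):
            self.paths = paths
            self.on_change = on_change
            self.interval = interval
            self.mtimes = {path: os.path.getmtime(path) for path in paths}

        def start(self):
            self._check()

        def _check(self):
            for path in self.paths:
                current = os.path.getmtime(path)
                if current > self.mtimes[path]:
                    self.mtimes[path] = current
                    self.on_change(path)
            timer = threading.Timer(self.interval, self._check)
            timer.daemon = True                  # don't keep the process alive
            timer.start()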

Lastly, if you need to monitor file system state changes, which method do you want to use? That is, which method is best suited for your application? There are a couple of factors to consider. First, if your application can only support a single process, the blocking method is out of the question. However, this is rarely the case; the application could simply spawn a new file monitoring process. This could introduce a new problem, though, because the file monitoring process would need to communicate with the main application process. Having done this, you will have introduced a new communication channel in your application, increasing the complexity considerably. The CherryPy, non-blocking, file monitoring approach could prove to be the better approach if the application you are developing needs to react to file system state changes as well as changes in state from other resources. The challenge here is that it is not nearly as generic as the blocking method and would require a larger development time investment. Do some investigation as to what state changes your application must respond to. If only file system state changes, the blocking method may suffice. In most other cases, the non-blocking approach may be better suited.

Friday, March 13, 2009

The need for a simplified pypi package.

Given the growing complexity of many Python applications these days, developers often use other packages and libraries to help manage this complexity. TurboGears, for instance, will fetch several other packages from PyPI when it is installed and install them as well. PyPI provides access to packages that members of the Python community have published, often because they feel the code will be useful in contexts beyond their own.

However, the PyPI code itself isn't exactly a simple Python package for hosting egg files; it is a full-featured, hosted solution. The setuptools package can fetch eggs listed on a simple HTML page, in which case you wouldn't even need anything other than Apache. What would be nice is a middle ground: a Python package that uses CherryPy or some other web framework to host the actual packages and provides a very simplistic management interface. I think something like this would be very valuable for projects that are limited by having to retrieve dependencies from PyPI and need their own repository.
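As a rough sketch of that middle ground, a CherryPy application could serve a directory of eggs behind the kind of simple HTML listing setuptools already understands (the directory path here is illustrative):

    # Rough sketch: CherryPy serving a local directory of eggs behind a simple
    # HTML listing that setuptools can consume (e.g. via easy_install -f).
    import os
    import cherrypy

    PACKAGE_DIR = os.path.abspath('packages')   # illustrative local egg directory
    os.makedirs(PACKAGE_DIR, exist_ok=True)

    class PackageIndex:
        @cherrypy.expose
        def index(self):
            links = ''.join('<a href="%s">%s</a><br/>' % (name, name)
                            for name in sorted(os.listdir(PACKAGE_DIR)))
            return '<html><body>%s</body></html>' % links

    config = {'/': {'tools.staticdir.on': True,
                    'tools.staticdir.dir': PACKAGE_DIR}}

    if __name__ == '__main__':
        cherrypy.quickstart(PackageIndex(), '/', config)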

I'm not too sure how difficult this would actually be to implement; I'm only thinking of the need for such a solution at the moment. Perhaps I'll do some experimentation and write about what I find.

Tuesday, March 10, 2009

Babel in the Trac trunk.

With the Trac 0.11 branch, internationalization and localization are not possible. I do, however, like the approach they are taking to implement this feature incrementally. In the 0.11 branch, the gettext() function is actually defined in translation.py. Although it doesn't do anything useful in this branch of the software, it is implemented, and more importantly, it is used throughout the application. The pipes of the Trac translation architecture have been assembled; they just don't have anything flowing through them yet.

In the trunk version of translation.py, we have something radically different. There is a new TranslationsProxy class. This class is considered a proxy because the original translation functions still exist; only now they invoke methods on this class. Here is an illustration of the TranslationsProxy class.



The attributes are as follows.
  • The _current attribute is the current thread of control.
  • The _null_translations attribute is an empty message catalogue.
  • The _plugin_domains attribute is a dictionary that contains locale domains that may be added to the TranslationsProxy in a programmatic way.
  • The _plugin_domains_lock attribute is a threading lock that is acquired when loading plugin translation domains.
The need for the _current and _plugin_domains_lock attributes arises when activating a locale. The threading lock is acquired while loading plugin translation domains. Once loaded, the lock is released and the translation message catalogue is set as an attribute on the current thread. This allows different threads of control to have different message catalogues.
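As a simplified sketch of that pattern (not the actual Trac source), the proxy can be pictured like this:

    # Simplified sketch of the thread-safe proxy pattern described above; this
    # is not the actual Trac source.
    import gettext
    import threading

    class TranslationsProxy:
        def __init__(self):
            self._current = threading.local()           # per-thread active catalogue
            self._null_translations = gettext.NullTranslations()
            self._plugin_domains = {}                   # domain -> list of locale dirs
            self._plugin_domains_lock = threading.RLock()

        def add_domain(self, domain, locale_dir):
            with self._plugin_domains_lock:
                self._plugin_domains.setdefault(domain, []).append(locale_dir)

        def activate(self, locale):
            translations = gettext.NullTranslations()
            with self._plugin_domains_lock:             # held while loading plugin domains
                for domain, dirs in self._plugin_domains.items():
                    for locale_dir in dirs:
                        translations.add_fallback(
                            gettext.translation(domain, locale_dir,
                                                languages=[locale], fallback=True))
            self._current.translations = translations   # each thread gets its own catalogue

        def gettext(self, message):
            active = getattr(self._current, 'translations', self._null_translations)
            return active.gettext(message)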

I really enjoy the way the plugin message catalogues are merged with the existing catalogue in a thread-safe way. Based on this feature alone, I'm very excited about the next major Trac release.

I wonder whether some of the other Python web frameworks out there handle internationalization message catalogue extensibility this way. I'm sure TurboGears doesn't. I've found having only a single catalogue for my applications to be very limiting.

Saturday, January 10, 2009

TurboGears 2.0 and SQLObject support

Since I use the TurboGears Python framework quite often, I was curious to see whether SQLObject would still be supported by TurboGears 2.0. Searching around on the web yielded no results, so I took a look at the TurboGears source.

It does not look like SQLObject will be around in TurboGears 2.0.

Wednesday, December 3, 2008

New tools in the TG controllers module

After taking a look at some changes made to the TurboGears controller module in trunk, I must say I'm impressed with the improvements over the current 1.0.x branch.

The first change I noticed was that all classes are now derived from the WSGIController class from the Pylons Python web framework. Also new, and the most interesting in my view, is the hook processing mechanism implemented by the DecoratedController class. What this means is that developers writing TurboGears applications can define hooks that are processed in any of these controller states:
  • before validation
  • before the controller method is called
  • before the rendering takes place
  • after the rendering has happened
If nothing else, I think this will add great value in monitoring the state transitions in larger TurboGears applications. Some requests can be quite large, especially during development, and it is handy to know where these requests are failing. You can now easily log attribute values of your controller instance before validation takes place, which could give some insight into why validation is failing with values that look valid. These hook processors also allow for pre- and post-processing around every state transition within the controller life-cycle.
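A hedged sketch of what attaching such hooks might look like; the decorator names mirror the controller states listed above and are assumptions about the API rather than code taken from the trunk:

    # Hedged sketch of per-method hooks on a TG2-style controller; the decorator
    # names mirror the controller states above and are assumptions, not verified
    # against the trunk source.
    from tg import expose
    from tg.controllers import TGController
    from tg.decorators import before_validate, after_render

    def log_hook(label):
        def hook(*args, **kwargs):
            print('controller hook fired:', label)
        return hook

    class RootController(TGController):

        @expose('json')
        @before_validate(log_hook('before validation'))
        @after_render(log_hook('after rendering'))
        def index(self, **kwargs):
            return dict(page='index')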

It looks like using a controller is not all that different from the current TurboGears. Simply extend the TGController class and expose your methods as needed.

Tuesday, November 4, 2008

Some TurboGears configuration thoughts

The TurboGears Python web framework uses CherryPy as its web server. CherryPy offers several useful features beyond strictly "web server" functionality, one of which is project configuration. TurboGears essentially builds on the CherryPy configuration functionality. TurboGears also uses a Python package called ConfigObj to help distribute the responsibilities. If there is one thing the TurboGears project does well, it is reusing existing functionality rather than rebuilding it.

When retrieving a configuration value in a TurboGears project using get(), TurboGears simply consults the CherryPy configuration. However, the TurboGears framework is responsible for keeping the CherryPy configuration component up to date. For example, let's say you need to update the project configuration dynamically as the result of some client request. This can be done in one of two ways: you can pass a key/value dictionary to update(), or you can pass a configuration file to update_config(). In both cases, we are essentially updating the CherryPy configuration. The ConfigObj package comes in handy when we need to read configuration files, not when we already have a key/value configuration dictionary.
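For illustration, the two update paths look roughly like this; the configuration keys and the file path are made up for the example, and the calls assume the TurboGears 1.x config module described above:

    # Illustrative use of the TurboGears config functions discussed above; the
    # keys and the file path are made up for the example.
    from turbogears import config

    # Read a value, falling back to a default if the key does not exist.
    greeting = config.get('myapp.greeting', 'hello')

    # 1) Update the configuration from a key/value dictionary.
    config.update({'myapp.greeting': 'bonjour'})

    # 2) Update the configuration from a configuration file.
    config.update_config(configfile='dev.cfg')

    print(config.get('myapp.greeting'))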

My main criticism of the TurboGears config module is that I wish there were a factory function that simply generated a configuration instance. This instance would be a representation of the configuration for the entire project, and all the functions currently defined in the TurboGears config module could be instance methods of this new configuration class. I haven't looked in the trunk yet to see what's happening there with the config module.

Friday, October 17, 2008

The turbogears database component

The TurboGears database component is a Python module that allows the framework to connect to the database specified in the project configuration. I'd like to explore some interesting features and implementation details of this component. No particular reason; I just find it interesting, since I've built several projects using this framework. The source for this component can be found here (I'm pointing to the 1.0.7 tag because it is the most recent stable release).

The responsibility of this component is to provide database access to the TurboGears project being developed. This includes providing support for both SQLAlchemy and SQLObject. The TurboGears database component conditionally declares various SQLAlchemy and SQLObject classes and functions depending on which one is enabled for the project. This is done by attempting to import both packages. If both the SQLAlchemy and SQLObject packages are installed and importable, all declarations are executed, as illustrated below.
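A rough sketch of that conditional-import pattern (this is not the actual turbogears.database source):

    # Rough sketch of the conditional-import pattern described above.
    try:
        import sqlalchemy
        _sqlalchemy_available = True
    except ImportError:
        _sqlalchemy_available = False

    try:
        import sqlobject
        _sqlobject_available = True
    except ImportError:
        _sqlobject_available = False

    if _sqlalchemy_available:
        # SQLAlchemy-specific helpers (session, metadata, and so on) would be
        # declared here.
        pass

    if _sqlobject_available:
        # SQLObject-specific helpers (AutoConnectHub, PackageHub) would be
        # declared here.
        pass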



I'm going to focus on the SQLObject support here, because the current support for SQLObject is stronger than the SQLAlchemy support. There are two classes defined by the TurboGears database component that carry out SQLObject support. These classes are AutoConnectHub and PackageHub and are illustrated below.
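Structurally, the relationship between the two classes can be sketched roughly like this; the connection URI is illustrative and this is not the real source:

    # Rough structural sketch of the two classes, based on the description in
    # this post rather than the real turbogears.database source.
    class AutoConnectHub:
        """Holds the actual database connection for a given connection URI."""
        def __init__(self, uri):
            self.uri = uri

    class PackageHub:
        """Stands in for a package's hub and defers to an AutoConnectHub."""
        def __init__(self, packagename):
            self.packagename = packagename
            self.hub = None

        def set_hub(self):
            # In TurboGears this looks up the package's connection string in
            # the project configuration; here it is hard-coded for illustration.
            self.hub = AutoConnectHub('sqlite:///:memory:')

        def getConnection(self):
            if self.hub is None:   # created lazily, as discussed below
                self.set_hub()
            return self.hub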



Typically, in a TurboGears project, you'll have several database abstractions using SQLObject. In simple projects, these abstractions are stored in model.py. However, you may have enough abstractions to justify spanning them across several modules. Each one of these modules defines a package hub to connect to a database. A package hub is a simple variable that states the name of the project package. Since this is a TurboGears project, it will most likely contain a configuration file which specifies a database connection string, although there are other ways to do this.
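In a TurboGears 1.x model module, declaring the package hub typically looks something like this; the package name and the example class are illustrative:

    # Typical TurboGears 1.x model.py boilerplate; the package name and the
    # example class are illustrative.
    from turbogears.database import PackageHub
    from sqlobject import SQLObject, StringCol

    hub = PackageHub('myproject')   # named after the project package
    __connection__ = hub            # SQLObject classes in this module use this hub

    class Language(SQLObject):
        code = StringCol(length=5)
        name = StringCol()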

When TurboGears encounters a package hub, a new PackageHub instance is created. This instance then creates an AutoConnectHub instance by calling the set_hub() method. The AutoConnectHub instance, now a member of the PackageHub instance, is the bridge between your TurboGears project and the database connection.

However, for some reason, the set_hub() method is not invoked in the constructor of the PackageHub instance. Instead, it is invoked from each access method (if needed). Perhaps there is good reasoning for this design, but if it is possible, I wonder whether it would make more sense to create this instance in the constructor. If that were the case, it would eliminate the need to check whether the AutoConnectHub instance exists in every other method of the PackageHub instance.

One last improvement I think would make sense for this component is to classify some of the functions defined here. For example, there are several functions that are either SQLObject- or SQLAlchemy-specific. I think these should at least be class methods. There could be two classes here, called SODBTools and SADBTools, and the ORM-specific functions could then be moved to these classes accordingly. This is a minor change, but I feel it would have major design perks.

Thursday, September 18, 2008

Dynamic turbogears widgets and validation.

If you are using TurboGears widgets and need to build widgets dynamically within the context of a request, it simply doesn't work. Well, that's not entirely true, so here is a case where it breaks down.

Say you are displaying a form to a user that contains a selectbox widget. This selectbox lists the languages supported by your application, from which the user can choose. For this, you would use a TurboGears widget. Now, say that you also want the default selection to be the language of the requesting client. This is a case where the widget needs to be dynamically constructed within the context of the request.
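A hedged sketch of that dynamic construction, using TurboGears 1.x widget names; the get_locale() stand-in and the template path are assumptions:

    # Hedged sketch of building the language selector inside the controller so
    # its default follows the requesting client; the get_locale() stand-in and
    # the template path are assumptions.
    from turbogears import controllers, expose, validators
    from turbogears.widgets import SingleSelectField

    def get_locale():
        # Stand-in for however the client's locale is determined (for example,
        # from the Accept-Language header); hard-coded for illustration.
        return 'fr'

    class Root(controllers.Controller):

        @expose(template='myapp.templates.settings')
        def settings(self):
            languages = [('en', 'English'), ('fr', 'French'), ('de', 'German')]
            selector = SingleSelectField(
                name='language',
                options=languages,
                default=get_locale(),              # client's locale as the default
                validator=validators.NotEmpty(),
            )
            return dict(language_selector=selector)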

So far, so good. We can build TurboGears widgets in web controllers, no problem. Now, what if we want to enforce some validation on the language selector? Say we use a TurboGears validator to state that this widget may not be empty. Well, when the widget validation fails, so does your widget, because it can no longer be constructed. This is because the validation logic is taken care of for the developer outside the controller.

This is an obscure case, but it does come up. Perhaps the TurboGears validation decorator should offer more flexibility in how widget validation failure is handled. Maybe this is already possible?

Thursday, August 7, 2008

The future of TurboGears

I wonder what the future holds for TurboGears? The current version is 1.0.5 and it doesn't look like the next major release is going to be available anytime soon.

Hopefully this changes. There are a lot of great features that the TG framework provides, and it would be a shame if developers started moving away due to lack of development. That's not to say that there isn't ongoing development on the project; the releases just need to happen more frequently. There is talk about the 2.0 TG release being based on the Pylons web framework. This could be really good, except for the fact that there is nothing stopping developers from using Pylons on its own.

So, the future looks bleak for TurboGears. I hope I'm wrong and the project stays on track while keeping people interested in the project. This is especially hard with Django out in the wild.

Wednesday, July 9, 2008

Try something new or use what you know?

Good question. What would a software object do? Well, it will do what its creator tells it to do. And a software object knows only what its creator tells it. This doesn't seem right. What about the system in which the object lives? The system is alive and constantly changing state. Shouldn't the software object know about these changes?

Well, in some cases, they do. You pass external variables into your object by sending your object messages. This is the fancy term for "method invocation". Your object then has knowledge about its environment.

But what about your software object's attributes that are initialized during construction? Sometimes these values are hard-coded and can only be changed further down the lifeline of the object. Is this the best way to initialize attribute values? Or are we better off allowing for the possibility of dynamic attribute value initialization?

TurboGears allows developers to retrieve configuration values using a key. It also allows the developer to specify a default value should the key not exist. Using this feature, or a similar pattern for that matter, we can initialize attribute values by checking whether there is something new that our software object didn't previously know about. If not, it will always use the specified default.
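A small sketch of that pattern; the configuration key and the default path are illustrative, not real TurboGears settings:

    # Small sketch of the pattern described above: initialize an attribute from
    # configuration when a value exists, otherwise fall back to a default. The
    # key and default path are illustrative.
    from turbogears import config

    class ReportWriter:
        def __init__(self):
            self.output_dir = config.get('myapp.report.output_dir', '/tmp/reports')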

Thursday, June 26, 2008

A simple TurboGears i18n principle.

I've been doing a lot of internationalization improvements in Enomalism lately and I've noticed a generally good practice to use when building i18n applications.

TurboGears adds a built-in _() Python function that serves as a basic message translator. During run-time, if the label is found in the message catalog for the current locale, the label is replaced. Simple enough.

Things tend to get messy when there are %s string formatters in the label. I find that this severely limits the opportunity for language label re-use.
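A quick, made-up illustration of the limitation: a label that embeds a %s formatter is welded to one sentence structure and one substitution, while a plain label stands on its own and can be reused anywhere.

    # Made-up labels illustrating the re-use problem described above; the
    # lambda stands in for the TurboGears _() translator.
    _ = lambda label: label

    # Embedded formatter: tied to a single sentence structure and substitution.
    print(_('%s virtual machines were deleted') % 3)

    # Plain label: reusable in buttons, headings, and messages alike.
    print(_('Virtual machines'))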