In the latest release of the boduch Python library, Set instances can now be iterated over. This is done by defining a custom iterator class, SetIterator, that is returned by the Set.__iter__() method. I thought I would test this new functionality further, in the hope of discovering some new unit tests to include with the library. But before I could even get to the Set iteration testing, I discovered an entirely new bug in the Set class.
Firstly, here is the code I used to find the bug.
#Example; boduch Set bug.
from boduch.data import Set
from boduch.handle import Handle
from boduch.event import subscribe, threaded, EventSetPush

class MyHandle(Handle):
    def __init__(self, *args, **kw):
        Handle.__init__(self, *args, **kw)

    def run(self):
        pass

if __name__=="__main__":
    threaded(True)
    subscribe(EventSetPush, MyHandle)
    set_obj1=Set()
    set_obj2=Set()
    set_obj1.push("data1")
    set_obj2.push("data2")
    print "SET1",set_obj1.data
    print "SET2",set_obj2.data
Here, we define a custom event handle called MyHandle. The run method doesn't actually do anything because I discovered the bug before I wrote any handling code. In the main program, we set the event manager to threaded mode. Next, we subscribe our custom event handle to the EventSetPush event. This means that every time Set.push() is invoked, so is MyHandle.run() (in a new thread, since we are running in threaded mode here). We then create two set instances and push some data onto each set. Finally, we print the underlying Python lists associated with each set instance.
Here is my initial output.
SET1 ['data1', 'data2']
SET2 ['data1', 'data2']
Slightly different from what was expected. Each set instance should have had exactly one element. Instead, the lists look identical. Naturally, I assumed that they were the same list. This led me to start examining the thread manager, thinking that since I was testing in threaded mode, there must be some sort of cross-thread data contamination. Thankfully, the problem got much simpler once I was able to eliminate this as a potential cause. Next in line was the event manager. I tried everything to prove that the Set instances were in fact the same instance. Not so. The instances had different memory addresses.
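A quick way to check both suspicions is to compare object identities directly. The sketch below is not boduch code: Container is a hypothetical stand-in for any class that keeps its items in a data attribute, and the list is deliberately handed to both instances just to show what the symptom looks like from the outside.

```python
class Container(object):
    """Hypothetical stand-in for a class that stores its items in `data`."""

    def __init__(self, data):
        self.data = data


def same_instance(a, b):
    """True when both names refer to one and the same object."""
    return a is b


def same_data(a, b):
    """True when both objects hold a reference to the same list."""
    return a.data is b.data


shared = []                       # one list, deliberately given to both
obj1 = Container(shared)
obj2 = Container(shared)
obj1.data.append("data1")
obj2.data.append("data2")

print(same_instance(obj1, obj2))  # False: two distinct instances
print(same_data(obj1, obj2))      # True: one underlying list
print(obj1.data)                  # ['data1', 'data2']
```

Two distinct instances with identical contents is exactly the combination seen in the output above, so checking the data attribute with "is" narrows things down faster than comparing the instances themselves.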
I then realized that Set inherits from Type but the constructor for Type was not invoked. Odd. I tried to think of a reason why I would want to inherit from something with no static functionality and not initialize it. I think I may have done this because the underlying list instances of Set objects are stored in an attribute called data. Type instances also define a data attribute. I must have thought, during the original implementation, that defining a data attribute for the Set class would have some adverse effect on the Type functionality. Not so. So now, the Type constructor is invoked, but with no parameters. This means that the initial value of the Set.data attribute is actually an empty dictionary, since this is what the Type constructor initializes it as. The Set constructor then reinitializes the data attribute to a list accordingly.
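The constructor chain described here can be sketched roughly as follows. BaseType and ListSet are simplified, hypothetical stand-ins rather than the actual boduch Type and Set classes. The only details taken from the text are that the base constructor initializes data as an empty dictionary and that the subclass constructor then replaces it with a list.

```python
class BaseType(object):
    """Hypothetical stand-in for Type: `data` starts as an empty dict."""

    def __init__(self):
        self.data = {}


class ListSet(BaseType):
    """Hypothetical stand-in for Set: `data` ends up as a list."""

    def __init__(self):
        # Invoke the base constructor with no parameters.
        BaseType.__init__(self)
        # Recorded only so the sketch can show the intermediate state.
        self.data_type_after_base = type(self.data).__name__
        # Replace the dictionary with a fresh list for this instance.
        self.data = []


set_obj = ListSet()
print(set_obj.data_type_after_base)  # dict
print(set_obj.data)                  # []
```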
This, however, wasn't the problem either. I was still getting the strange output that pointed so convincingly to the Set.data attribute of both instances referring to the same list instance. So, I took a look at the way in which the data attribute is initialized for Set instances. The Set constructor accepts a data keyword parameter. The default value of this parameter is an empty list. This parameter then becomes the Set.data attribute. Just for fun, I decided to take away this parameter and have the data attribute be initialized as an empty list inside the constructor.
Sure enough, that did it. I got the correct output for my two set instances. The data attribute of both instances must have been pointing to the same keyword parameter value. I have a feeling that this may be caused somewhere in the event manager. Or maybe not. I haven't tested this scenario outside the library yet.
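For what it's worth, the symptom can be reproduced without the library, the event manager, or threads. In Python, a default parameter value is evaluated once, when the function is defined, so a list used as a default is shared by every call that doesn't pass its own. The classes below are hypothetical stand-ins, not the boduch Set class; that the library's constructor looked like the first one is an assumption based on the description of the data keyword parameter.

```python
class SharedDefaultSet(object):
    """Hypothetical stand-in: list default in the constructor signature."""

    def __init__(self, data=[]):
        # The same default list object is reused on every call
        # that does not pass `data` explicitly.
        self.data = data

    def push(self, item):
        self.data.append(item)


class PerInstanceSet(object):
    """Hypothetical stand-in: list created inside the constructor."""

    def __init__(self):
        self.data = []

    def push(self, item):
        self.data.append(item)


shared1, shared2 = SharedDefaultSet(), SharedDefaultSet()
shared1.push("data1")
shared2.push("data2")
print(shared1.data)                  # ['data1', 'data2']
print(shared2.data)                  # ['data1', 'data2']
print(shared1 is shared2)            # False
print(shared1.data is shared2.data)  # True

fixed1, fixed2 = PerInstanceSet(), PerInstanceSet()
fixed1.push("data1")
fixed2.push("data2")
print(fixed1.data)                   # ['data1']
print(fixed2.data)                   # ['data2']
```

If the keyword parameter is wanted back at some point, a common idiom is to default it to None and create the list inside the constructor when nothing was passed.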
I'm going to get this release out as soon as possible. The data keyword parameter for the Set constructor will be removed for now. As a side note, this will also affect the iteration functionality for Hash instances in the next release, since the Hash.__iter__() method will return a SetIterator instance containing the hash keys. Each key will simply need to be pushed onto the set instead.
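A rough sketch of what pushing the keys onto a set might look like is shown below. SimpleSet, SimpleSetIterator and SimpleHash are hypothetical stand-ins, and the push(key, value) signature on the hash is an assumption made for illustration; none of this is the library's actual implementation.

```python
class SimpleSetIterator(object):
    """Hypothetical stand-in for SetIterator: walks a set's list."""

    def __init__(self, set_obj):
        self._items = set_obj.data
        self._index = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._index >= len(self._items):
            raise StopIteration
        item = self._items[self._index]
        self._index += 1
        return item

    next = __next__  # Python 2 spelling of the same method


class SimpleSet(object):
    """Hypothetical stand-in for Set, with no data keyword parameter."""

    def __init__(self):
        self.data = []

    def push(self, item):
        self.data.append(item)

    def __iter__(self):
        return SimpleSetIterator(self)


class SimpleHash(object):
    """Hypothetical stand-in for Hash, backed by a dictionary."""

    def __init__(self):
        self.data = {}

    def push(self, key, value):
        self.data[key] = value

    def __iter__(self):
        # Without a data keyword parameter on the set constructor,
        # each key is pushed onto a fresh set one at a time.
        keys = SimpleSet()
        for key in self.data:
            keys.push(key)
        return SimpleSetIterator(keys)


hash_obj = SimpleHash()
hash_obj.push("a", 1)
hash_obj.push("b", 2)
print(sorted(hash_obj))  # ['a', 'b']
```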