The documentation on the Counter class has some great examples showing off its capabilities. I thought I would share my experience in taking a counter and using it in conjunction with coroutines. Dave Beazley has an excellent introduction to coroutines in Python, including the coroutine decorator. You can do some interesting things with counters and coroutines on their own, but I figured I would try to combine the two.
What I came up with was a coroutine for feeding words into a counter. Another coroutine gets the instance of the counter every time it is updated with new words, and displays the most common word. Quite simple, but it accomplishes a lot given the amount of code.
"""
Simple word counter coroutines using the Counter
class.
"""
import re
import sys
import time
from collections import Counter
from random import choice
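
def coroutine(func):
    """
    David Beazley's decorator to make a function a
    coroutine.
    """
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        # Advance to the first yield so the coroutine can
        # accept send(). next(cr) works on Python 2.6+ and 3.
        next(cr)
        return cr
    return start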

@coroutine
def word_counter(target):
    """
    The word_counter coroutine takes a target coroutine
    to send the counter to when it gets updated. When
    this coroutine receives data, the counter gets updated
    with a new list of words. We then send the target
    coroutine the counter instance.
    """
    counter = Counter()
    while True:
        data = (yield)
        # Raw string so \w reaches the regex engine intact.
        counter.update(re.findall(r'\w+', data))
        target.send(counter)

@coroutine
def most_common():
    """
    The most_common coroutine receives counter instances
    when they're updated, and prints the most common item,
    along with its count value.
    """
    while True:
        counter = (yield)
        word, count = counter.most_common(1)[0]
        # The trailing \r returns to the start of the line so
        # each update overwrites the previous one.
        sys.stdout.write(
            'Most Common: "%s" (%d) \r' % (word, count)
        )
        sys.stdout.flush()

# Main Demo
if __name__ == '__main__':
    # Static words used to generate text and feed the
    # word_counter coroutine.
    words = (
        'Ada',
        'Bash',
        'C',
        'Delphi',
        'Erlang',
        'Fortran',
        'Groovy',
        'Haskell',
        'Java',
        'Lisp',
        'Python',
        'Ruby',
        'Smalltalk',
        'Tcl',
        'VisualBasic'
    )
    # Create our word counter, passing in the most_common
    # coroutine as the target.
    counter = word_counter(
        most_common()
    )
    # Feed the coroutine for a while. Generate a line of
    # text, and send it to the counter. We're sleeping so
    # we can actually see the output.
    for i in range(20):
        line = ' '.join(choice(words) for _ in range(100))
        counter.send(line)
        time.sleep(0.5)
The word_counter coroutine creates the Counter instance and continues to feed it with strings as they're sent. On its own, that wouldn't serve much purpose: creating the counter and updating it as new text arrives is only part one of its responsibility. The second part is to notify a target coroutine when the counter changes state. That is, when new text arrives and is fed into the counter, we want to pass the counter down the coroutine pipeline for further processing.
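To make the first part concrete, here is a small sketch of what happens to a counter across two updates. The sample strings are made up for illustration; only Counter.update and re.findall are taken from the code above.

```python
import re
from collections import Counter

counter = Counter()

# First chunk of text: every occurrence of a word adds one to its tally.
counter.update(re.findall(r'\w+', 'Python Ruby Python'))
first = counter.most_common(1)[0]

# Second chunk: update() adds to the existing tallies instead of
# replacing them, so the leader can change over time.
counter.update(re.findall(r'\w+', 'Ruby Ruby Lisp'))
second = counter.most_common(1)[0]

print(first)   # ('Python', 2)
print(second)  # ('Ruby', 3)
```

This is why word_counter can keep a single Counter alive inside its loop: each send() just adds to the running totals.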
And that is the essence of this example. When we created the word_counter coroutine, we passed it the most_common coroutine as the target. Remember, word_counter will send its target the counter instance. All most_common does is take the counter and print some data about it. In this case, we can use these two coroutines to keep a real-time display of the most common word passed to the counter.
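One detail worth spelling out is that the target is sent the same Counter object every time, not a copy. The sketch below assumes a made-up target called recorder (it is not part of the example above) that keeps both the object it was sent and a snapshot of its contents, so the difference is visible.

```python
import re
from collections import Counter

def coroutine(func):
    """Prime a generator so it is ready to receive send() calls."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def word_counter(target):
    counter = Counter()
    while True:
        data = (yield)
        counter.update(re.findall(r'\w+', data))
        target.send(counter)

@coroutine
def recorder(seen, snapshots):
    """Hypothetical target: remember what it was sent."""
    while True:
        counter = (yield)
        seen.append(counter)             # the live, shared object
        snapshots.append(dict(counter))  # a copy of its state right now

seen, snapshots = [], []
pipeline = word_counter(recorder(seen, snapshots))
pipeline.send('Ada Ada C')
pipeline.send('C C C')

print(seen[0] is seen[1])  # True: one Counter, sent twice
print(snapshots[0])        # state after the first send
print(snapshots[1])        # state after the second send
```

A target that only displays the current state, like most_common, is unaffected by this. A target that wants to compare states over time would need to take its own copy, as recorder does here.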
In summary, the Counter class keeps a tally of every word it sees, which makes the most common one easy to look up. We can use a coroutine to process this data as new input becomes available. Notice that the word_counter coroutine intentionally passes the counter instance to a generic target. This means that we can easily swap out the most_common coroutine in favour of something else. Alternatively, the most_common coroutine could forward the counter on to another target coroutine for further processing. The choice really depends on the application, but there is much flexibility.
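As a rough sketch of both options, the code below swaps most_common for a target that collects the top n words, and puts a forwarding stage in front of it. The names at_least and top_n, the threshold, and the sample sentences are all invented for this illustration.

```python
import re
from collections import Counter

def coroutine(func):
    """Prime a generator so it is ready to receive send() calls."""
    def start(*args, **kwargs):
        cr = func(*args, **kwargs)
        next(cr)
        return cr
    return start

@coroutine
def word_counter(target):
    counter = Counter()
    while True:
        data = (yield)
        counter.update(re.findall(r'\w+', data))
        target.send(counter)

@coroutine
def at_least(threshold, target):
    """Hypothetical forwarding stage: pass the counter on only once
    its most common word has been seen `threshold` times."""
    while True:
        counter = (yield)
        word, count = counter.most_common(1)[0]
        if count >= threshold:
            target.send(counter)

@coroutine
def top_n(n, results):
    """Hypothetical replacement for most_common: collect the top n
    (word, count) pairs instead of printing one."""
    while True:
        counter = (yield)
        results.append(counter.most_common(n))

results = []
pipeline = word_counter(at_least(3, top_n(2, results)))
pipeline.send('Java Lisp Java')    # Java is only at 2: nothing forwarded
pipeline.send('Lisp Java Python')  # Java reaches 3: counter forwarded
print(results)  # [[('Java', 3), ('Lisp', 2)]]
```

Note that word_counter did not change at all. It only knows that its target has a send() method, which is what makes the stages interchangeable.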