Using iterators and generators in multi-threaded applications
24 May 2012 – Bangalore
Python iterators and generators have almost the same behavior, but
there are subtle differences, especially when the iterator/generator
is used in a multi-threaded application.
In the iterator case, sharing it between threads only creates a race
condition, as multiple threads try to update self.i at the same time.
That is why we see wrong output, and why the output changes every time
we run the program. This can be easily fixed by protecting that piece
of code with a lock.
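As a minimal sketch of that fix, assuming Python 3 (where the next
method is spelled __next__): the class name Counter and the helper
consume are made up for illustration, while self.i and c2 follow the
names used in the text.

```python
import threading


class Counter:
    """Counting iterator whose increment is protected by a lock."""

    def __init__(self):
        self.i = 0
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Only one thread at a time may read, update and return self.i.
        with self.lock:
            self.i += 1
            return self.i


def consume(it, n):
    """Hypothetical helper: call next() on the shared iterator n times."""
    for _ in range(n):
        next(it)


c2 = Counter()
threads = [threading.Thread(target=consume, args=(c2, 100000))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 200000 calls were made by the threads, so the next value is 200001.
value = next(c2)
print("c2", value)
```

Holding the lock around both the increment and the return matters: if
only the increment were protected, another thread could change self.i
before the value is read back.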
If we run the program now, we’ll get the expected value for c2.
$ python count.py
...
c2 200001
A similar approach won’t work for generators, as we don’t have control
over how the next method is called. Whatever changes we make to the
generator function, multiple threads can still call the next method at
the same time.
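To make the problem concrete, here is a small sketch, assuming CPython 3.
There, calling next on a generator that another thread is already
running raises ValueError. The generator slow_count and the two events
are made up for the illustration; they only force the overlap to happen
every time instead of leaving it to chance.

```python
import threading

entered = threading.Event()   # set once a thread is inside the generator
release = threading.Event()   # lets that thread finish its step


def slow_count():
    """Hypothetical generator that pauses inside its body."""
    i = 0
    while True:
        entered.set()
        release.wait()   # the generator stays "running" while we wait here
        i += 1
        yield i


gen = slow_count()
results = []
worker = threading.Thread(target=lambda: results.append(next(gen)))
worker.start()
entered.wait()           # the worker is now inside the generator body

try:
    next(gen)            # second, overlapping call from the main thread
    error = None
except ValueError as exc:
    error = str(exc)
finally:
    release.set()        # let the worker complete its call
worker.join()

print(error)     # generator already executing
print(results)   # [1]
```

No lock placed inside slow_count could prevent this, because the error
is raised before the generator body gets a chance to run.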
The only way to fix it is to wrap the generator in an iterator that
holds a lock, so that only one thread at a time can call the
generator’s next method.
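A wrapper along these lines might look like the following sketch,
assuming Python 3. The name threadsafe_iter is the one used in the text;
the generator count and the helper consume are made up for the
illustration.

```python
import threading


class threadsafe_iter:
    """Wrap an iterator or generator so that only one thread at a
    time can advance it."""

    def __init__(self, it):
        self.it = iter(it)
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Serialize every call to the underlying next method.
        with self.lock:
            return next(self.it)


def count():
    """Hypothetical plain generator with no locking of its own."""
    i = 0
    while True:
        i += 1
        yield i


def consume(it, n):
    """Hypothetical helper: call next() on the shared iterator n times."""
    for _ in range(n):
        next(it)


c1 = threadsafe_iter(count())
threads = [threading.Thread(target=consume, args=(c1, 100000))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The threads consumed 200000 values, so the next one is 200001.
c1_value = next(c1)
print("c1", c1_value)
```

The lock lives in the wrapper rather than in the generator, which is why
the generator function itself needs no changes.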
Now you can take any iterator or generator and make it thread-safe by
wrapping it with threadsafe_iter.
This can be made even easier by writing a decorator.
Now we can use this decorator to make any generator thread-safe.
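Here is one way such a decorator and its use could look, again as a
sketch assuming Python 3. The decorator name threadsafe_generator is an
assumption, as are count and consume; the wrapper class is repeated only
so that the snippet runs on its own.

```python
import functools
import threading


class threadsafe_iter:
    """Lock-protected wrapper around an iterator or generator."""

    def __init__(self, it):
        self.it = iter(it)
        self.lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        with self.lock:
            return next(self.it)


def threadsafe_generator(f):
    """Decorator (hypothetical name): wrap each generator that f
    returns in threadsafe_iter."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return threadsafe_iter(f(*args, **kwargs))
    return wrapper


@threadsafe_generator
def count():
    i = 0
    while True:
        i += 1
        yield i


def consume(it, n):
    """Hypothetical helper: call next() on the shared iterator n times."""
    for _ in range(n):
        next(it)


c1 = count()   # already thread-safe, thanks to the decorator
threads = [threading.Thread(target=consume, args=(c1, 100000))
           for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

c1_value = next(c1)
print("c1", c1_value)   # c1 200001
```

Note that each call to count() returns a separate wrapped generator
with its own lock, so threads must share the same returned object to
share the sequence.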