Two weeks ago, I wrote a post called Python: tips, tricks and idioms, where I went through many features of Python. Now I want to narrow down on just a few and look at them in more depth. The first is decorators, which I did not cover at all, and the second is context managers, for which I gave only one example. Again, all the code samples are on gist.github.com.
There is a reason I put them together: they share the same goal. Both help separate what you are actually trying to do (the “business logic”) from extra code added for clean-up, performance and so on (the “administrative logic”). In other words, they let us package away, in a reusable form, code that we don’t want cluttering the main logic.
Decorators
Decorators are easy to use, even if you have never seen them. You could probably guess what is going on, even if not how or why. Take this for example:
@cache
def web_lookup(url):
    page = urlopen(url)
    try:
        return page.read()
    finally:
        page.close()
Overlooking for now precisely which library urlopen comes from… we can assume that the results of web_lookup() will be cached, so that the page is not fetched every time we ask for the same URL. It is that simple: we can use any decorator just by putting @some_decorator before our function. But how do we write one?
First, we need to understand what the decorator is doing. @cache is just syntactic sugar for the following.
web_lookup = cache(web_lookup)
So this is important: cache is a function that takes another function as its argument and returns a new function, one that can be used just as the original could, but presumably adding some extra logic along the way. For our first decorator, let us start with something simple: a decorator that squares the result of the function it wraps.
def square(func):
    def _square(num):
        return func(num) ** 2
    return _square

# which we can then use like this
@square
def plus(num):
    """Adds 1 to a number"""
    return num + 1
So in this little example, every time we call plus(), the number will have 1 added to it, but, because of our decorator, the result will also be squared.
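Putting the pieces together, here is a quick check (repeating the definitions above so the snippet runs on its own):

```python
def square(func):
    def _square(num):
        return func(num) ** 2
    return _square

@square
def plus(num):
    """Adds 1 to a number"""
    return num + 1

print(plus(3))  # (3 + 1) ** 2, so this prints 16
```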
But there is a problem with this: plus() is no longer the plus() function that we defined, but another function wrapping it. Things like the docstring have gone missing, and help(plus) will no longer work. Fortunately, the functools library has a decorator to fix exactly that: functools.wraps(). Always use functools.wraps() when writing a decorator.
from functools import wraps

def square(func):
    @wraps(func)
    def _square(num):
        return func(num) ** 2
    return _square
But did you notice? wraps() is itself a decorator that takes an argument. How do we write something like that? It gets a little more involved, but let us start with the code: a decorator that raises the result of the wrapped function to some power.
def power(pow):
    def _power(func):
        @wraps(func)
        def _pow(num):
            return func(num) ** pow
        return _pow
    return _power

@power(2)
def plus(num):
    """Adds 1 to a number"""
    return num + 1
So yes, three functions, one inside the other. The result is that power() is no longer the decorator itself but a function that returns the decorator. The issue is one of scoping, which is why we have to nest the functions: when _pow() is called, the value of pow comes from the scope of the enclosing power() function.
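To see the scoping in action, here is the decorator applied to a hypothetical function (the name double is just for illustration), which also shows wraps() preserving the docstring:

```python
from functools import wraps

def power(pow):
    def _power(func):
        @wraps(func)
        def _pow(num):
            return func(num) ** pow
        return _pow
    return _power

# a hypothetical example function, just for illustration
@power(3)
def double(num):
    """Doubles a number"""
    return num * 2

print(double(2))       # (2 * 2) ** 3, so this prints 64
print(double.__doc__)  # wraps() kept the docstring: Doubles a number
```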
So now we know how to write highly reusable function decorators. Or do we? There is still a problem: our internal function _square() or _pow() takes exactly one argument, so any function it wraps can take only one argument. What we want is to accept any number of arguments, and that is where the star operator comes in.
Star operator
The * (star) operator can be used in a function definition to accept an arbitrary number of positional arguments, all of which are collected into a single tuple. An example might help.
def join_words(*args):
    """Joins all the words into a single string"""
    return " ".join(args)

print(join_words("Hello", "world"))
The * operator can also be used for the reverse case: when we have an iterable that we want to pass as the arguments to a function. This is called argument unpacking.
words = ("Hello", "world")
print(join_words(*words))
The same basic idea also works for keyword arguments, using the ** (double star) operator. But instead of collecting the arguments into a tuple, we get a dictionary. We can also use both together. Here are some examples.
def print_args(*args, **kwargs):
    print(args)
    print(kwargs)

print_args("Hello", "world", count=2, letters=10)
# output:
# ('Hello', 'world')
# {'count': 2, 'letters': 10}

# or calling the function with argument unpacking
words = ("Hello", "world")
arguments = {'count': 2, 'letters': 10}
print_args(*words, **arguments)
Better Decorators
So now we can go back and make our decorators truly generic. Let’s do it with the simplest one we wrote, @square.
def square(func):
    @wraps(func)
    def __square(*args, **kwargs):
        return func(*args, **kwargs) ** 2
    return __square
Now, no matter what arguments are passed in, we happily pass them straight through to the function we are wrapping.
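To prove the point, the generic version can now wrap a two-argument function (add below is a hypothetical example), something the one-argument version could not do:

```python
from functools import wraps

def square(func):
    @wraps(func)
    def __square(*args, **kwargs):
        return func(*args, **kwargs) ** 2
    return __square

# a two-argument function, something the earlier version could not wrap
@square
def add(a, b):
    """Adds two numbers"""
    return a + b

print(add(1, 2))      # (1 + 2) ** 2, so this prints 9
print(add(a=2, b=3))  # keyword arguments pass through too: 25
```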
So let us go back to our web_lookup function, first writing the caching inline, and then writing the decorator to see the difference.
saved = {}

def web_lookup(url):
    if url in saved:
        return saved[url]
    page = urlopen(url)
    try:
        data = page.read()
    finally:
        page.close()
    saved[url] = data
    return data
That is how it might look, and the problem here is that the caching code is mixed up with what web_lookup() is supposed to do. That makes the function harder to maintain, harder to reuse, and, if we have done something like this all over our code, harder to change the way we cache. So our very generic decorator might look like this.
import functools

def cache(obj):
    saved = obj.saved = {}
    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in saved:
            saved[key] = obj(*args, **kwargs)
        return saved[key]
    return memoizer

# now our nice clean web_lookup()
@cache
def web_lookup(url):
    page = urlopen(url)
    try:
        return page.read()
    finally:
        page.close()
So that can wrap any function, with any number of arguments, just by putting @cache before it. But I did not write that function myself; I lifted it straight from the Python Decorator Library, which has many examples of decorators you can use.
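A quick way to see the caching at work, without hitting the network, is to record every time the wrapped function actually runs (slow_square and the calls list below are hypothetical names, just for illustration):

```python
import functools

def cache(obj):
    saved = obj.saved = {}
    @functools.wraps(obj)
    def memoizer(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in saved:
            saved[key] = obj(*args, **kwargs)
        return saved[key]
    return memoizer

calls = []  # record of every time the body actually runs

@cache
def slow_square(n):
    calls.append(n)
    return n * n

print(slow_square(4))  # computed: 16
print(slow_square(4))  # served from the cache: 16
print(len(calls))      # the body only ran once: 1
```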
Context Managers
In the previous post, I did a single example of using a context manager, opening a file. It looked like this:
with open('/etc/passwd', 'r') as f:
    print(f.read())

# which is equivalent to the longer
f = open('/etc/passwd', 'r')
try:
    print(f.read())
finally:
    f.close()
Admittedly the context manager version is only a little shorter, and the file would eventually be garbage collected anyway (at least in CPython), but there are other cases where getting the clean-up wrong is a much bigger problem. For example, the threading library can also use a context manager.
import threading

lock = threading.Lock()

with lock:
    print('Critical section')
This is nice and simple, and you can see from the indent what the critical section is and be sure the lock is released.
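Here is a minimal sketch of why this matters, assuming four threads bumping a shared counter; with the lock the final count is always exact:

```python
import threading

lock = threading.Lock()
counter = 0

def increment():
    global counter
    for _ in range(100000):
        with lock:  # acquired on entry, released on exit, even on error
            counter += 1

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # always 400000; without the lock it could come up short
```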
If you are dealing with a file-like object that does not have a context manager of its own, contextlib provides the closing() context manager. So let us go back and improve our web_lookup() function.
from contextlib import closing

def web_lookup(url):
    with closing(urlopen(url)) as page:
        return page.read()
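All closing() needs is an object with a close() method. A small sketch with a hypothetical Resource class shows it being closed for us:

```python
from contextlib import closing

class Resource:
    """A hypothetical file-like object: it has close() but no __enter__/__exit__."""
    def __init__(self):
        self.closed = False
    def read(self):
        return "data"
    def close(self):
        self.closed = True

r = Resource()
with closing(r) as page:
    print(page.read())  # data
print(r.closed)  # closing() called close() for us: True
```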
We can also write our own context managers. All that is needed is the @contextmanager decorator on a function with a yield in it. The yield marks the point at which the context manager pauses while the code inside the with statement runs. The following, for example, can be used to time how long something takes.
from contextlib import contextmanager
import time

@contextmanager
def timeit():
    start = time.time()
    try:
        yield
    finally:
        print("It took", time.time() - start, "seconds")

# this might take a few seconds
with timeit():
    list(range(1000000))
The try/finally in this case is optional, but without it the time would not be printed if an exception were raised inside the with statement.
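A quick sketch demonstrates the point: even when the body raises, the finally clause still reports the time, and the exception still propagates (the caught flag here is just for illustration):

```python
from contextlib import contextmanager
import time

@contextmanager
def timeit():
    start = time.time()
    try:
        yield
    finally:
        print("It took", time.time() - start, "seconds")

caught = False
try:
    with timeit():
        raise ValueError("something went wrong")
except ValueError:
    caught = True  # the exception propagated, but the timing still printed

print(caught)  # True
```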
We can also write more complicated context managers. The following is something like the redirect_stdout() that was added to contextlib in Python 3.4. It takes whatever is printed to stdout and puts it in a file (or file-like object). You could use it, for example, if you had timeit() context managers all through your code and wanted to start sending the results to a log file. Here the yield is followed by a value, which is why we can then use the with ... as syntax.
from contextlib import contextmanager
import io, sys

@contextmanager
def redirect_stdout(fileobj=None):
    if fileobj is None:
        fileobj = io.StringIO()  # in Python 2 use BytesIO
    oldstdout = sys.stdout
    sys.stdout = fileobj
    try:
        yield fileobj
    finally:
        sys.stdout = oldstdout

with redirect_stdout() as f:
    help(pow)
help_text = f.getvalue()

with open('some_log_file', 'w') as f:
    with redirect_stdout(f):
        help(pow)

# the above can also be written as
with open('some_log_file', 'w') as f, redirect_stdout(f):
    help(pow)
The last with statement also shows off compound with statements, which are just the same as putting one with inside another.
Finally, it is worth mentioning, at least in passing, that any class can be turned into a context manager by adding __enter__() and __exit__() methods. __enter__() does more or less what the code before the yield would do, and __exit__() what the code after it would.
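As a sketch, here is a hypothetical class-based version of the timeit() context manager from earlier, with the two halves of the generator split across the two methods:

```python
import time

class Timer:
    """A hypothetical class-based version of the timeit() context manager."""
    def __enter__(self):
        # what went before the yield
        self.start = time.time()
        return self
    def __exit__(self, exc_type, exc_value, traceback):
        # what went after the yield (the finally block)
        self.elapsed = time.time() - self.start
        print("It took", self.elapsed, "seconds")
        return False  # do not suppress any exception

with Timer() as t:
    list(range(1000000))
```

Returning False (or None) from __exit__() lets any exception from the with body propagate, just as the try/finally version did.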
And that is all for this round. I hope you learned something new and interesting. Don’t forget to follow me on Twitter if you want more Python tips, such as when I write about sets and dictionaries next time. In the meantime, if you are looking for more, there is an excellent book, the Python Cookbook, Third Edition, from O’Reilly Media. I have been reading parts of it and might include a few things I learned from it in my next post. Or, if you want something simpler, try Learning Python, 5th Edition.