I almost exclusively use Python in my research. I write 3D interactive experiments using Panda3D and I collect, analyze, and visualize my data using NumPy, SciPy, and matplotlib. While I have been using Python for almost 5 years now, I only began using Python for scientific programming when I joined the Computational Cognitive Science Group at MIT. The sort of programming I do these days is very different from the software engineering I used to focus on.
In particular, I almost never use for loops when I am doing any form of serious number crunching. For loops do have applications, but I think they tend to be overused, especially in Python. There are two reasons:
1. Almost anything numerical can be done with NumPy.
2. Almost everything else can be done with list comprehensions.
So what? Does it really matter whether you use NumPy vs. a list comprehension vs. a normal for loop? It absolutely does and I will go through a few examples in this post.
Let me start with a story. Once upon a time, when I was still new to scientific computing, I needed to compute an array of correlations between my model’s predictions and human data for many different settings of the model’s parameters. Naively, my code involved a lot of for loops. Something like this:
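The original code is lost to history, but it was a sketch along these lines (the parameter names, shapes, and the use of np.corrcoef here are illustrative, not the originals):

```python
import numpy as np

# Hypothetical setup: model predictions for every combination of four
# parameters, plus a single array of human data to correlate against.
rng = np.random.RandomState(0)
pred = rng.rand(2, 3, 4, 5, 20)   # predictions, 20 data points per setting
human = rng.rand(20)              # human data

# One nested for loop per parameter -- this is the part that doesn't scale.
corrs = np.empty(pred.shape[:-1])
for i in range(pred.shape[0]):
    for j in range(pred.shape[1]):
        for k in range(pred.shape[2]):
            for l in range(pred.shape[3]):
                corrs[i, j, k, l] = np.corrcoef(pred[i, j, k, l], human)[0, 1]
```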
… ouch. I shudder to think that I used to code like that! What makes this such poor code is that I have four nested for loops (and possibly more, if I had more parameters). Wouldn’t it be so much nicer to do something like:
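Something like this, perhaps, using np.ndindex to collapse all of the nested loops into one (again, shapes and names are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
pred = rng.rand(2, 3, 4, 5, 20)   # hypothetical model predictions
human = rng.rand(20)              # hypothetical human data

# A single loop over every index tuple, no matter how many parameters
# there are.
corrs = np.empty(pred.shape[:-1])
for idx in np.ndindex(*corrs.shape):
    corrs[idx] = np.corrcoef(pred[idx], human)[0, 1]
```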
Or, even better, if you could just operate over the arrays without any loops at all?
Well, you can!
In this post, I’m going to go through a few interesting examples of how NumPy, list comprehensions, and for loops can be applied to the same problems, and how well these different approaches actually perform. You can find all of the code in this blog post here.
Timing
Let me first start by introducing a simple timing function that I’ll be using to make these comparisons (you’ll notice I do use a for loop here!):
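A sketch of that function (the lookup of names in the caller's scope and the output format are my best guesses at the original behavior):

```python
import sys
import time
import numpy as np

def timer(*names, repeat=10000):
    """Run each named function `repeat` times and print the mean and
    standard deviation of the per-call runtimes."""
    caller = sys._getframe(1)
    for name in names:
        # Look the function up by name in the caller's scope.
        func = caller.f_locals.get(name) or caller.f_globals.get(name)
        times = np.empty(repeat)
        for i in range(repeat):   # yes, a for loop!
            tic = time.perf_counter()
            func()
            times[i] = time.perf_counter() - tic
        print("%s: %.3g +/- %.3g seconds" % (name, times.mean(), times.std()))
```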
This function takes the names of an arbitrary number of local functions, runs each function 10000 times, and prints out the mean and standard deviation of these runs.
Ranges and Lists
Ok. One of the simplest things you need to be able to do in Python is to create a list of numbers, for example, from 0 to 1000 (exclusive):
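The three candidates look something like this. (A caveat on the reconstruction: this post is from the Python 2 era, where range built a full list and xrange was the lazy version; under Python 3, range itself is lazy.)

```python
import numpy as np

def lazy_range():          # plays the role of Python 2's xrange
    return range(1000)

def list_range():          # plays the role of Python 2's range
    return list(range(1000))

def numpy_range():
    return np.arange(1000)
```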
And when we time this, the fastest method is xrange. This isn’t terribly surprising, because xrange doesn’t actually create a list of numbers; it creates a lazy sequence object that produces the numbers on demand. If we actually force it to be a list:
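In Python 3 terms (where range plays the role of xrange), forcing the list looks like:

```python
def forced_list():
    # list(xrange(1000)) in the original Python 2 code: materialize the
    # lazy sequence as a real list.
    return list(range(1000))
```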
And when we time it, we see that xrange isn’t actually faster than numpy.arange. So this is a case where you should keep your application in mind (e.g., if you need a list of so many numbers that it won’t fit in memory, you would want to use xrange).
Another thing to keep in mind is whether you need the list to work with other NumPy functions or not. NumPy will convert lists and tuples to ndarray types, and this can actually take a significant amount of time. Let’s look at some conversions to and from NumPy arrays:
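The comparison presumably pitted conversion against direct construction, something like this (the function names are my own):

```python
import numpy as np

def list_to_array():
    return np.array(range(1000))      # build a sequence, convert to ndarray

def array_to_list():
    return list(np.arange(1000))      # build an ndarray, convert to a list

def direct_list():
    return list(range(1000))          # the appropriate type from the start

def direct_array():
    return np.arange(1000)
```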
And timing them, we see that it is an order of magnitude slower to create these lists and arrays by conversion than to create them as the appropriate type in the first place!
Array Operations
Let’s move on to actually using our arrays to do things.
Sums
One really common operation is summing a list of numbers; NumPy here has the advantage both in terms of readability…
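A sketch of the two versions being compared (loop_sum and numpy_sum, reconstructed):

```python
import numpy as np

arr = np.arange(1000)

def loop_sum():
    # Accumulate the total one element at a time.
    total = 0
    for x in arr:
        total += x
    return total

def numpy_sum():
    # One call, one line.
    return np.sum(arr)
```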
… and speed (though you’ll notice that the mean runtime of loop_sum is within one standard deviation of numpy_sum’s mean, so this isn’t very significant).
Mean and Standard Deviation
We often want to do more complex array operations, such as finding the mean and standard deviation of a list of numbers. NumPy makes this pretty trivial compared to writing these functions by hand:
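A reconstruction of the hand-written versus NumPy versions (using the population standard deviation, which matches NumPy's default):

```python
import math
import numpy as np

arr = np.arange(1000)

def loop_mean(x):
    total = 0.0
    for v in x:
        total += v
    return total / len(x)

def loop_std(x):
    # Population standard deviation: sqrt of the mean squared deviation.
    m = loop_mean(x)
    total = 0.0
    for v in x:
        total += (v - m) ** 2
    return math.sqrt(total / len(x))

def numpy_mean(x):
    return np.mean(x)

def numpy_std(x):
    return np.std(x)
```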
The difference in speed is pretty significant, too. NumPy’s core is written in C, so its functions are heavily optimized; unsurprisingly, they’re faster than unoptimized pure Python.
Squares
I mentioned at the beginning of this post that many loops can be replaced with list comprehensions. In particular, if you have a loop where you are doing some computation and then appending to a list, you should probably be using a list comprehension instead. For example, let’s say we want a list of the first 10000 squares:
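The three versions, reconstructed:

```python
import numpy as np

def loop_squares():
    squares = []
    for i in range(10000):
        squares.append(i ** 2)
    return squares

def listcomp_squares():
    return [i ** 2 for i in range(10000)]

def numpy_squares():
    # Elementwise squaring of the whole array at once.
    return np.arange(10000) ** 2
```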
With the exception of the append, listcomp_squares is basically the same as loop_squares. However, this small change can make a huge difference in runtime.
I’ll note that NumPy is, again, the faster choice for this operation. However, NumPy does not have an optimized function for everything. When this is the case, you should fall back on list comprehensions and not for loops!
Indexing
Let’s end with a more complex example of preferring NumPy and list comprehensions over for loops. One particular application that I frequently need is to run models on all possible settings of several parameters – for example, if I have x=(0, 1, 2) and y=(0, 1), I would want to run my model on all (x, y) pairs in [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]. With many parameters, computing these combinations can become messy.
NumPy gives us some magic that makes it really easy to compute these settings. Importantly, we’re going to use numpy.ix_, which (from the documentation) “takes N 1D sequences and returns N outputs with N dimensions each, such that the shape is 1 in all but one dimension and the dimension with the nonunit shape value cycles through all N dimensions”. So we’ll use numpy.ix_ to create arrays for each dimension/parameter, and then concatenate these arrays together to get the final array of shape (n, d) (where n is the number of combinations, and d is the number of dimensions). The biggest downside is that this function is somewhat difficult to understand without a solid knowledge of NumPy.
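Here is one way the NumPy version might look (my reconstruction: np.broadcast_arrays does the “cycling” after np.ix_ sets up the shapes):

```python
import numpy as np

def numpy_make_grid(*params):
    # np.ix_ returns one array per parameter, shaped so that broadcasting
    # them against each other cycles through every combination.
    mesh = np.broadcast_arrays(*np.ix_(*params))
    # Stack into shape (n, d): one row per combination, one column per
    # parameter.
    return np.concatenate(
        [m[..., None] for m in mesh], axis=-1).reshape(-1, len(params))
```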
We can do the same without using NumPy, but it’s a much larger function (though perhaps easier to understand if you don’t know what numpy.ix_ does). The approach I’ll take with this function is to first determine the number of combinations, create a big list of the appropriate size, and then fill it in by looping over each of the input lists. One benefit to this is that we could use loop_make_grid to make combinations of strings, whereas numpy_make_grid requires numerical combinations.
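A sketch of that approach (the block-repetition arithmetic is my reconstruction of the fill-in step described above):

```python
def loop_make_grid(*params):
    # The total number of combinations is the product of the parameter
    # lengths.
    n = 1
    for p in params:
        n *= len(p)

    # Create the full grid up front, then fill in one column per parameter.
    grid = [[None] * len(params) for _ in range(n)]
    repeat = n
    for col, p in enumerate(params):
        # Each value of this parameter repeats in blocks; the block size
        # shrinks for later (faster-cycling) parameters.
        repeat //= len(p)
        for row in range(n):
            grid[row][col] = p[(row // repeat) % len(p)]
    return grid
```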
Let’s now compare the runtime of these two approaches, making a 1000 by 3 array from three lists of ten numbers each.
And actually timing it, we see that numpy_make_grid is a lot faster than loop_make_grid.
Conclusion
When working with numbers, NumPy is a significantly faster option in many cases. So, if you are doing any form of scientific computing, I highly recommend learning how to use it well! You can find pretty good documentation on using NumPy arrays at http://docs.scipy.org/doc/numpy/user/.
That said, NumPy is not always appropriate. If you’re doing something more procedural – for example, computing a kernel function on a matrix – you may find that NumPy doesn’t have the nice optimized functions you want. In these cases, try to use list comprehensions as much as possible. If you’re smart about using NumPy arrays and list comprehensions, you may find you can get a considerable speed up in your code!