I first heard about Python pandas from a friend at RenTech or AQR in the early summer of last year. At the time, the project was little more than a documentation page and a few wrapper methods around numpy. Since then, pandas has matured into a very useful and featureful addition to numpy, scipy, and matplotlib. As I often share code based on development branches, I thought I would share the steps to build this library from development source.
First, make sure you have the necessary git and development packages. I’ll assume you’re on a Debian flavor like Ubuntu and have apt-get, but the packages are similar for yum/RPM distributions as well.
Install git, compiler, setuptools, and cython with apt-get
$ sudo apt-get install build-essential git python-dev python-setuptools cython
At this point, if you don’t already have numpy >= 1.6, you have a few options. If you don’t need to track the bleeding edge and your distribution ships 1.6 or later, you can use apt-get or pip. If you do need the bleeding edge, you’ll have to install numpy and scipy from source first. I’ll give you the apt-get check and install, as well as the pip option, below. Note that if you use pip, you will first need to install the standard linear algebra and FFT dependencies (BLAS, LAPACK, FFTW).
Install numpy and scipy with apt-get
$ apt-cache show python-numpy python-scipy | grep "^Version" # Check version numbers
$ sudo apt-get install python-numpy python-scipy
Installing numpy and scipy with pip
$ sudo apt-get install python-pip libatlas-dev libblas-dev liblapack-dev libfftw3-dev
$ sudo pip install numpy
$ sudo pip install scipy
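Whichever route you take, it can be handy to confirm from Python itself that the numpy you ended up with meets the 1.6 minimum. The following is only a minimal sketch: the helper names `version_tuple` and `meets_minimum` are made up for this post and are not part of numpy or pandas, and the parsing simply reads the leading numbers of the version string.

```python
import re


def version_tuple(version, parts=2):
    """Return the first `parts` numeric components of a version string.

    Hypothetical helper for illustration: "1.6.0b1" becomes (1, 6).
    Missing components are padded with zeros.
    """
    numbers = [int(n) for n in re.findall(r"\d+", version)[:parts]]
    numbers += [0] * (parts - len(numbers))
    return tuple(numbers)


def meets_minimum(version, minimum=(1, 6)):
    """Return True if `version` is at least `minimum` (default: numpy 1.6)."""
    return version_tuple(version, len(minimum)) >= tuple(minimum)


if __name__ == "__main__":
    try:
        import numpy
    except ImportError:
        print("numpy is not installed")
    else:
        status = "ok" if meets_minimum(numpy.__version__) else "too old"
        print("numpy %s: %s" % (numpy.__version__, status))
```

Comparing tuples of integers rather than raw strings matters here, since a plain string comparison would rank "1.10" below "1.6".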
Once you’ve got numpy installed, you have two ways to get the pandas source: clone it with git or download the release tarball.
Installing from git
$ git clone git://github.com/pydata/pandas.git
$ cd pandas
Installing from website
$ wget http://pandas.pydata.org/pandas-build/pandas-0.8.0b1.tar.gz
$ tar xzf pandas-0.8.0b1.tar.gz
$ cd pandas-0.8.0b1
Finally, we can build the library.
Compiling and installing pandas
$ python setup.py build
$ sudo python setup.py install
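Once we’re done, let’s test that pandas is working. Run the test suite from the source directory first. Then make sure to change directory out of the compilation path before the second test, the version check; otherwise Python will pick up the pandas directory in the source tree instead of the installed copy!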
$ python setup.py nosetests
running nosetests
running build_ext
...
Ran 1962 tests in 33.089s
OK (SKIP=65)
$ cd
$ python
Python 2.6.6 (r266:84292, Dec 26 2010, 22:31:48)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas
>>> pandas.__version__
'0.8.0b1'
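If you are ever unsure which copy of pandas the interpreter picked up, checking where the module was loaded from settles it. Here is a rough sketch of that check, assuming a POSIX-style filesystem; `is_inside` is a hypothetical helper written for this post, not something pandas provides.

```python
import os


def is_inside(path, directory):
    """Return True if `path` lies inside `directory`.

    Hypothetical helper for illustration. Both paths are resolved
    first so that symlinks and relative parts do not confuse the check.
    """
    path = os.path.realpath(path)
    directory = os.path.realpath(directory)
    prefix = directory.rstrip(os.sep) + os.sep
    return path == directory or path.startswith(prefix)


if __name__ == "__main__":
    try:
        import pandas
    except ImportError as exc:
        print("pandas could not be imported: %s" % exc)
    else:
        where = os.path.dirname(pandas.__file__)
        if is_inside(where, os.getcwd()):
            # Imported from the current directory, likely the source tree
            print("warning: pandas was imported from %s" % where)
        else:
            print("pandas %s imported from %s" % (pandas.__version__, where))
```

Run from your home directory, this should report a path under the system's Python packages directory; a warning suggests you are still sitting in the source checkout.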
There you are – a working pandas installation. Enjoy working in Python instead of Python + R whenever you can.