Further to the post I wrote on the MKL performance improvement in NumPy, I have tried to get some figures comparing it to Matlab. Here are some results. Any suggestions to improve the comparison are welcome; I will update the post with the values I collect.
Here is the Matlab script I used to mirror the Python test code:
```matlab
disp('Eig'); tic; data=rand(500,500); eig(data); toc;
disp('Svd'); tic; data=rand(1000,1000); [u,s,v]=svd(data); s=svd(data); toc;
disp('Inv'); tic; data=rand(1000,1000); result=inv(data); toc;
disp('Det'); tic; data=rand(1000,1000); result=det(data); toc;
disp('Dot'); tic; a=rand(1000,1000); b=inv(a); result=a*b-eye(1000); toc;
disp('Done');
```
Each line is linked to the corresponding Python function in my test script (see my MKL post).
The results are interesting. Tests were run with the Matlab R2007a and R2008a versions and compared to EPD 6.1 with NumPy using MKL.
Here are the timings on an Intel dual-core computer running Matlab R2007a:
Eig : Elapsed time is 0.718934 seconds.
Svd : Elapsed time is 17.039340 seconds.
Inv : Elapsed time is 0.525181 seconds.
Det : Elapsed time is 0.200815 seconds.
Dot : Elapsed time is 0.958015 seconds.
And here are the timings on an 8-core Intel Xeon machine running Matlab R2008a:
Eig : Elapsed time is 1.235884 seconds.
Svd : Elapsed time is 25.971139 seconds.
Inv : Elapsed time is 0.277503 seconds.
Det : Elapsed time is 0.142898 seconds.
Dot : Elapsed time is 0.354413 seconds.
Compared to the NumPy/MKL tests, here are the results:
Function | Core2Duo with MKL | Core2Duo R2007a | Speedup using NumPy |
---|---|---|---|
test_eigenvalue | 752 ms | 718 ms | 0.96 |
test_svd | 4608 ms | 17039 ms | 3.70 |
test_inv | 418 ms | 525 ms | 1.26 |
test_det | 186 ms | 200 ms | 1.07 |
test_dot | 666 ms | 958 ms | 1.44 |
Function | Xeon8core with MKL | Xeon8core R2008a | Speedup using NumPy |
---|---|---|---|
test_eigenvalue | 772 ms | 986 ms | 1.28 |
test_svd | 2119 ms | 26081 ms | 12.5 |
test_inv | 153 ms | 230 ms | 1.52 |
test_det | 65 ms | 105 ms | 1.61 |
test_dot | 235 ms | 287 ms | 1.23 |
Is this MKL feature exclusive to Windows? Results on my system were different. Mine is a Macbook Pro, Core 2 duo 2.26GHz, running Snow Leopard. I used EPD 6.2 (32bit) and Matlab R2009b for Mac.
Matlab: eigenvalue=~390ms, svd=~14476ms, inv=~244ms
EPD: eigenvalue=~709ms, svd=~5510ms, inv=~505ms
I forgot to set the number of threads. When set to the max number (2), EPD performed better: eigenvalue=~594ms, svd=~3710ms, inv=~328ms.
Still, except for svd, EPD is slower than Matlab. Perhaps Matlab improved a lot in R2009b?
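On the thread-count point above: one way to pin the number of threads for an MKL-linked NumPy is through the standard MKL/OpenMP environment variables (a sketch; these are read at startup, so they must be set before NumPy is imported):

```python
import os

# MKL and OpenMP read these at library load time, so set them
# before the first "import numpy".
os.environ["MKL_NUM_THREADS"] = "2"
os.environ["OMP_NUM_THREADS"] = "2"

import numpy as np  # the linked BLAS should now use at most 2 threads
```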
Thanks for this. Here is what I get:
Comparison Matlab vs Python linear algebra tests (timings in ms):

Function | Matlab | Python | PythonEPD |
---|---|---|---|
Eigen | 254.49 | 1022.5 | 329.9 |
SVD | 1166.69 | 7375.9 | 1134.4 |
Inv | 141.16 | 1138.9 | 97.8 |
Det | 55.15 | 259.3 | 42.6 |
Dot | 236.89 | 1888.9 | 157.4 |
Machine: Intel® Core™ i7-960 CPU ("Bloomfield") @ 3.20 GHz, 6 GB DDR3 RAM, Ubuntu 10.04 64-bit.
Matlab is R2010b
Python is plain Python from the Ubuntu 10.04 repositories with ATLAS BLAS. NumPy 1.3.0. Python version: 2.6.5 (r265:79063, Apr 16 2010, 13:57:41)
[GCC 4.4.3]
PythonEPD is Enthought Python distribution with Intel MKL BLAS. Numpy 1.4.0. Python version: 2.6.6 |EPD 6.3-2 (64-bit)| (r266:84292, Sep 17 2010, 19:18:23)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-44)]
You are timing this with a call that generates a random matrix on each run; that gives an inaccurate measure of the performance of the function itself.
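One way to address this (a sketch, not the original script): generate the matrix once, outside the timed region, so only the linear-algebra call is measured:

```python
import time
import numpy as np

data = np.random.rand(1000, 1000)  # created once, outside the timed region

t0 = time.perf_counter()
result = np.linalg.inv(data)       # only this call is measured
elapsed = time.perf_counter() - t0
print(f"inv: {elapsed * 1000:.1f} ms")
```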
The results of my benchmark are the following.
NumPy/MKL:
Function | Timing [ms] | MKL / no MKL |
---|---|---|
test_det | 654.6 | 0.28 / 0.61 |
test_svd | 24278.8 | 0.19 / 0.66 |
test_eigenvalue | 2849.6 | 0.26 / 1.18 |
test_dot | 3773.9 | 0.18 / 0.65 |
test_inv | 3212.4 | 0.13 / 0.45 |
Matlab:
Eig: Elapsed time is 0.657280 seconds.
Svd: Elapsed time is 14.668627 seconds.
Inv: Elapsed time is 0.524102 seconds.
Det: Elapsed time is 0.073352 seconds.
Dot: Elapsed time is 0.327279 seconds.
Done
I’m using MKL 10.3 and Matlab R2009b. Any ideas why my results for NumPy are so much worse than yours?
I think Kostia’s result is accurate. Matlab uses Intel’s MKL for all BLAS- and LAPACK-related tasks, so it is very fast. NumPy-MKL, which I take to be the core of the Python setup here, also uses Intel MKL. Still, I found Python a bit slower than Matlab.
The huge difference is because in Matlab you are only calculating the eigenvalues, while in Python/NumPy you are calculating both the eigenvalues and the eigenvectors. To make the comparison fair, you must do one of the following:
1. change numpy.linalg.eig(x) to numpy.linalg.eigvals(x) and leave the Matlab code as it is, OR
2. change eig(x) to [V,D] = eig(x) in Matlab and leave the Python/NumPy code as it is (this may make the Matlab script consume more memory)
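To illustrate option 1, a small sketch: `eigvals` returns only the spectrum, while `eig` also computes the eigenvector matrix, so the two calls do different amounts of work. Both agree on the eigenvalues themselves (their sum equals the trace of the matrix):

```python
import numpy as np

x = np.random.rand(500, 500)

w = np.linalg.eigvals(x)   # eigenvalues only -- comparable to Matlab's eig(data)
w2, v = np.linalg.eig(x)   # eigenvalues AND eigenvectors -- comparable to [V,D] = eig(data)

# Sanity check: the sum of the eigenvalues equals the trace of x
# (imaginary parts cancel for a real matrix).
print(np.isclose(w.sum().real, np.trace(x)))
```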
In my experience, Python/NumPy optimized with MKL (the build provided by Christoph Gohlke) is as fast as, or slightly faster than, Matlab (2011b) optimized with MKL.
Hello linuS, I do not think your opinion is right, because in Matlab 2010b I used [V, D] = eig(X), as in your suggestion (2.), to calculate both the eigenvectors and the eigenvalues. I do not see much practical use in calculating only the eigenvalues. You can see some other results at my blog: http://4fire.wordpress.com/2012/05/07/python-3-2-vs-matlab-and-openblaslapack-on-matrix-multiply-svd-and-eig-tests/
In the post mentioned by the writer,
1. disp('Eig');tic;data=rand(500,500);eig(data);toc;
has been compared with
2. result = numpy.linalg.eig(data)
So, clearly, two different things are being compared here.
Also, for the test you mentioned: is numpy officially supported on Python 3.2? And did you use a numpy distribution optimized with MKL (the Enthought distribution) or the one by Gohlke? Or did you compile numpy yourself? My “opinion” pertains to numpy with MKL.
There should be only a very slight difference in performance, because both Matlab and numpy would be using MKL: each just passes data to and from MKL, which does the actual calculation.
By the way, I have tested this on Matlab 2011b, and there I found only very small differences.
Could you also post the code with which you tested these functions? And please mention the hardware you ran them on.
Hello linuS, in my tests I use numpy 1.6.1-MKL built by Gohlke, because I found that the official numpy build at http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/ is slower (I do not know why). I understand that Matlab’s and numpy’s eig, svd, and matrix multiplication functions are all based on MKL for best performance. My code for Matlab:
```matlab
% test for eig function
% (note: Matlab's first output of eig is the eigenvector matrix,
%  the second is the diagonal matrix of eigenvalues)
N = 1000;
X = rand(N, N, 'single');
tic;
[D, V] = eig(X);
toc;
```
and for Python:

```python
from time import clock
from numpy import matrix, random, float32, linalg, shape

N = 1000
X = matrix(random.rand(N, N).astype(float32))
start = clock()
D, V = linalg.eig(X)
finish = clock()
print("Time for eig", "X" + str(shape(X)), "is", finish - start, "s.")
```
Ah, and I did the tests on my laptop: HP EliteBook 6930p, Win7 32-bit, 4 GB RAM, Core 2 Duo P8600 2.4 GHz.
Thanks for the suggested fixes. I’ll try to take some time to update the scripts and make sure we get accurate results. I do not have access to a Matlab install anymore … but will find one 😉
4fire, do you get similar results when you compare them using float64? NumPy is treating the float32 random matrix as float64 and giving complex64 outputs. Check the dtype of D and V after running the script you provided and it will be obvious.
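A quick way to run that dtype check (a sketch; the exact output dtypes depend on the NumPy version and LAPACK build in use):

```python
import numpy as np

X = np.random.rand(4, 4).astype(np.float32)
D, V = np.linalg.eig(X)

# Inspect whether the eigen-decomposition stayed in single precision
# or was promoted; a general real matrix yields complex eigenvalues.
print(X.dtype, D.dtype, V.dtype)
```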
I am sorry linuS, but I did not clearly understand your comment. I just care about 32-bit float matrices, and I want to find the fastest tool to calculate the eigenvectors and eigenvalues of a matrix. But I also tested with a 64-bit float matrix, and on my machine Matlab 2010b is still faster than Python 3.2 with Numpy-MKL 1.6.2.
I am not sure why 64-bit float matrix operations are significantly slower with Numpy-MKL than with Matlab on your computer. On mine they are almost the same.
I could not figure out a way to carry out the eigenvector computation in 32-bit precision using numpy. As far as I know, numpy.linalg.eig() is treating the 32-bit matrix as a 64-bit one, because the results produced are 64-bit complex numbers.
Let’s wait for the results from dpinte.
This is an interesting article. However, based on my experience with Matlab, I would rather perform each operation many times than just once.
Indeed, I have noticed that the time for one call may vary when you call it repeatedly. So performing the same operation 100 or 1000 times should, in my view, be more accurate, especially if the total time for one operation is below 1 second. This would give a sort of average time.
Averaging over more operations indeed makes more sense. I have also read that Matlab’s tic/toc functions are not the most accurate way to measure time, especially when the measured times are very small.
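On the Python side, the repetition idea can be sketched with the standard timeit module (the function and matrix size here are just an example, not the original benchmark):

```python
import timeit
import numpy as np

data = np.random.rand(500, 500)

# Run the call 5 times per measurement, repeat the measurement 3 times,
# and report the best batch to damp one-off fluctuations.
times = timeit.repeat(lambda: np.linalg.eigvals(data), number=5, repeat=3)
print(f"best of 3: {min(times) / 5 * 1000:.1f} ms per call")
```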