I am translating some code written in R into Python. This is not because someone said Python will replace R, but simply because I need stand-alone versions of functions I wrote in R. Here I write down what I found while learning Python programming with numpy/scipy.

Python by itself does not have enough functionality for a direct translation of R code, so I need the numpy/scipy libraries. With numpy/scipy, Python can do almost everything R can. For example, it can readily apply a function to every element of a vector.

>>> import numpy
>>> x = numpy.array([1, 2, 3, 4])
>>> numpy.exp(x)
array([ 2.71828183,  7.3890561 , 20.08553692, 54.59815003])

Really useful.

However, some R functions have no direct one-call equivalent, even in numpy.

The “outer” function in R creates a matrix of outer products of two vectors. A very nice feature of this function is that it accepts any function that takes two arguments.

# outer product
> outer(1:3, 1:3)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    2    4    6
[3,]    3    6    9

# sum
> outer(1:3, 1:3, FUN="+")
     [,1] [,2] [,3]
[1,]    2    3    4
[2,]    3    4    5
[3,]    4    5    6

# more complicated function
> outer(1:3, 1:3, FUN=function(x, y){choose(x,y)*x/y})
     [,1] [,2] [,3]
[1,]    1  0.0    0
[2,]    4  1.0    0
[3,]    9  4.5    1

Numpy has a similar function, “numpy.outer”, but it computes products only. Another option is the “outer” method of a ufunc: every binary ufunc has one, e.g. “numpy.add.outer”. However, an ordinary user-defined Python function does not have this method. To get it, you can wrap the function with “numpy.frompyfunc”, which returns a ufunc (it calls the Python function element by element and returns an object array, so it is not fast), or write your own ufunc in C, which is probably too laborious (though it should be faster than R’s outer function).
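As a rough sketch of how R-style “outer” behaviour might be reproduced without writing a ufunc in C, the example below tries two routes. The first relies on broadcasting a column vector against a row vector. The second wraps a plain Python function with “numpy.frompyfunc” and calls its “outer” method. The helper name `choose_ratio` is made up for illustration, and `math.comb` (Python 3.8+) is used here as a stand-in for R's “choose” on integers.

```python
import math
import numpy as np

x = np.arange(1, 4)  # 1, 2, 3 (like 1:3 in R)
y = np.arange(1, 4)

# Route 1: broadcasting. A (3, 1) column against a (1, 3) row yields a
# 3x3 grid, so any expression built from vectorized operations works.
via_broadcast = x[:, None] + y[None, :]

# For built-in ufuncs the same result is available from the outer method.
assert np.array_equal(via_broadcast, np.add.outer(x, y))

# Route 2: wrap an arbitrary two-argument Python function as a ufunc.
# choose_ratio is a hypothetical helper mirroring the R example
# function(x, y){choose(x, y) * x / y}.
def choose_ratio(a, b):
    return math.comb(a, b) * a / b

choose_ratio_ufunc = np.frompyfunc(choose_ratio, 2, 1)  # 2 inputs, 1 output

# frompyfunc returns an object array, so convert it to float explicitly.
via_ufunc = choose_ratio_ufunc.outer(x, y).astype(float)

print(via_broadcast)
print(via_ufunc)
```

The broadcasting route runs at numpy speed but needs a function made of vectorized operations. The “frompyfunc” route accepts any Python function, at the cost of an ordinary Python call per element.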

There are also useful functions in numpy that have no direct counterpart in R. For instance,

>>> x = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> numpy.tril(x, k=1)
array([[1, 2, 0],
       [4, 5, 6],
       [7, 8, 9]])
>>> numpy.triu(x, k=1)
array([[0, 2, 3],
       [0, 0, 6],
       [0, 0, 0]])

These functions return the part of a matrix on and below (tril) or on and above (triu) the k-th diagonal, with the remaining elements set to zero. They are particularly useful when you need to cull part of a matrix. R’s upper.tri and lower.tri only return a logical mask of the upper/lower triangle, and writing code that does the same as tril/triu is a bit tricky.

For now, the translated Python code is slightly slower than the R code, for reasons I have not identified, but it is fast enough for ordinary use, and I am satisfied with the power of numpy.
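For comparison, here is a small sketch of the mask-based approach in numpy, which is roughly what one would do in R with upper.tri, next to the direct tril call. It also shows “numpy.triu_indices”, which returns index arrays when positions rather than a culled matrix are needed. The variable names are illustrative only.

```python
import numpy as np

x = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Boolean mask of the strict upper triangle (diagonal excluded),
# similar in spirit to upper.tri(x) in R.
upper_mask = np.triu(np.ones(x.shape, dtype=bool), k=1)

# Mask-based culling: zero everything above the main diagonal.
lower_part = np.where(upper_mask, 0, x)

# numpy.tril does the same thing in a single call.
assert np.array_equal(lower_part, np.tril(x))

# Index form: row and column positions of the strict upper triangle.
rows, cols = np.triu_indices(x.shape[0], k=1)
upper_values = x[rows, cols]

print(lower_part)
print(upper_values)
```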

—–

Apart from functionality, as is often the case with translation between languages, function names are very confusing. In R, the “choose” function calculates binomial coefficients, while “numpy.choose” picks elements from several arrays and constructs a new array. The counterpart of R’s “choose” is “scipy.special.comb” (“scipy.misc.comb” in older SciPy releases). And, confusingly, the “combn” function in R returns all possible combinations of k elements chosen from a vector.
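To make the naming differences concrete, the sketch below puts the three operations side by side. It uses only numpy and the standard library: `math.comb` (Python 3.8+) stands in for a binomial coefficient on integers, and `itertools.combinations` is one possible counterpart of R's “combn”. Neither is mentioned above, so treat them as substitutions chosen for this example.

```python
import math
from itertools import combinations

import numpy as np

# Binomial coefficient, like choose(5, 2) in R.
n_choose_k = math.comb(5, 2)

# numpy.choose is unrelated: for each position i it takes the element
# at position i from the array selected by the index at position i.
picked = np.choose([0, 1, 0], [[10, 20, 30], [40, 50, 60]])

# All 2-element combinations of 1..4. R's combn(4, 2) returns one
# combination per column, hence the transpose.
combos = np.array(list(combinations(range(1, 5), 2))).T

print(n_choose_k)
print(picked)
print(combos)
```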

—

If you are interested in statistical analysis using Python, there is an introductory post here.
