Let's say I have the following R array:
a <- array(1:18, dim = c(3, 3, 2))
r$> a
, , 1

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

, , 2

     [,1] [,2] [,3]
[1,]   10   13   16
[2,]   11   14   17
[3,]   12   15   18
Now I want to have the same array in Python with NumPy. I use
a = np.arange(1, 19).reshape((3, 3, 2))
array([[[ 1,  2],
        [ 3,  4],
        [ 5,  6]],

       [[ 7,  8],
        [ 9, 10],
        [11, 12]],

       [[13, 14],
        [15, 16],
        [17, 18]]])
But somehow the two do not look the same. How can one replicate the same array in Python?
I also tried
a = np.arange(1, 19).reshape((2, 3, 3))
array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])
which is also not identical.
3 Answers
The difference is in the ordering of the array elements.
Arrays can be filled in row-major or column-major order. In row-major order the last index varies fastest; in column-major order the first index varies fastest.
R constructs column-major arrays by default, whereas NumPy uses row-major order by default. That is why the default constructors you used do not give the same result. To fix this, tell NumPy to fill the array in column-major order by passing the Fortran-style order "F":
a = np.reshape(np.arange(1, 19), (3, 3, 2), order="F")
array([[[ 1, 10],
        [ 4, 13],
        [ 7, 16]],

       [[ 2, 11],
        [ 5, 14],
        [ 8, 17]],

       [[ 3, 12],
        [ 6, 15],
        [ 9, 18]]])
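It looks different because NumPy prints one block per index of the first axis, whereas R prints one block per index of the last axis. The underlying data is the same as in R.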
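One way to see this is to print the NumPy array one slice along the last axis at a time, which is how R lays out its output. A minimal sketch, assuming NumPy is imported as np (the loop variable k is only illustrative):

```python
import numpy as np

# Same construction as above: fill in column-major (Fortran) order.
a = np.reshape(np.arange(1, 19), (3, 3, 2), order="F")

# R prints a 3-D array as a[, , 1], a[, , 2], ...; the NumPy equivalent
# of each slice is a[:, :, k] with a zero-based k.
for k in range(a.shape[2]):
    print(f", , {k + 1}")
    print(a[:, :, k])
```

Each printed slice should line up with the corresponding block of the R output.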
If you do indexing you can see that it performs exactly the same:
R:
a[1,1,]
[1] 1 10
Python:
a[0,0,]
>>> array([ 1, 10])
Compare that with the formerly top-rated answer, which does not give you the same values. Its printed output may look like R's, but indexing behaves differently:
b = np.arange(1, 19).reshape((2, 3, 3)).transpose(0,2,1)
b[0,0,]
>>> array([1, 4, 7])
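To make the difference concrete, the sketch below builds both arrays and compares them. It suggests that the transposed version holds the same numbers with the axes in a different order, so R's a[i, j, k] corresponds to b[k, i, j] rather than b[i, j, k]. The name a_f is only illustrative.

```python
import numpy as np

# Column-major construction, indexable like the R array.
a_f = np.arange(1, 19).reshape((3, 3, 2), order="F")

# Reshape-then-transpose construction from the other answer.
b = np.arange(1, 19).reshape((2, 3, 3)).transpose(0, 2, 1)

print(a_f.shape, b.shape)  # (3, 3, 2) (2, 3, 3)

# Moving b's first axis to the end recovers the column-major array.
print(np.array_equal(np.moveaxis(b, 0, -1), a_f))  # True
```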
See also this great vignette by the reticulate team about the differences: https://cran.r-project.org/web/packages/reticulate/vignettes/arrays.html
It does, it just doesn't look the same, because Python doesn't provide structured output like R
– JonasV, 14 hours ago
That is what confused me, I also tried the same, and because the printed versions did not look the same, I assumed it produced something else.
– Avto Abashishvili, 14 hours ago
To reproduce the R array, NumPy must fill the items in column-major order, as R does.
This can be accomplished by passing order='F' to reshape; no transpose is needed.
Here is how I go about it:
import numpy as np
# Create the array in NumPy with the desired shape and order
a = np.arange(1, 19).reshape((3, 3, 2), order='F')
print(a)
This will give you an array in NumPy that matches the order of the R array:
array([[[ 1, 10],
        [ 4, 13],
        [ 7, 16]],

       [[ 2, 11],
        [ 5, 14],
        [ 8, 17]],

       [[ 3, 12],
        [ 6, 15],
        [ 9, 18]]])
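As a quick sanity check, one can compare every element against the value R would store at the same position. For array(1:18, dim = c(3, 3, 2)), R's one-based a[i, j, k] equals i + 3*(j - 1) + 9*(k - 1). The sketch below assumes that formula and checks it with zero-based NumPy indices; the name expected is only illustrative.

```python
import numpy as np

a = np.arange(1, 19).reshape((3, 3, 2), order="F")

# np.indices gives zero-based index grids i, j, k, each of shape (3, 3, 2).
i, j, k = np.indices(a.shape)

# Column-major fill: the first index varies fastest, then the second,
# then the third.
expected = 1 + i + 3 * j + 9 * k

print(np.array_equal(a, expected))  # True
```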