Introduction
I worked on “Port NMatrix to JRuby” in the context of the Google Summer of Code (GSoC) 2016 and I am pleased to announce that NMatrix can now be used in JRuby.
With JRuby NMatrix, a linear algebra library, wraps Apache Commons Math for its most basic functionalities. NMatrix supports dense matrices containing either doubles or Ruby objects as the data type. The performance of JRuby with Apache Commons Maths is quite satisfactory (see below for performance comparisons) even without making use of JRuby threading capabilities.
I have also ported the mixed_models gem, which uses NMatrix heavily at its core, to JRuby. This gem allowed us to test NMatrixJRuby with reallife data.
This blog post summarizes my work on the project with SciRuby, and reports the final status of the project.
The original GSoC proposal, plan and application can be found here. Until merging is complete, commits are available here.
Performance
I have benchmarked some of the NMatrix functionalities. The following plots compare the performance between NMatrixJRuby, NMatrixMRI, and NMatrixMRI using LAPACK/ATLAS libraries. (Note: MRI refers to the reference implementation of Ruby, for those who are new.)
Notes:
 LAPACK and ATLAS aren’t involved in most elementwise operations, such as addition and subtraction.
 NMatrixMRI relies on LAPACK/ATLAS for calculating determinants and LU Decomposition (lud).
Result
For twodimensional matrices, NMatrixJRuby is currently slower than NMatrixMRI for matrix multiplication and matrix decomposition functionalities (calculating determinant and factoring a matrix). NMatrixJRuby is faster than NMatrixMRI for other functionalities of a twodimensional matrix — like addition, subtraction, trigonometric operations, etc.
NMatrixJRuby is a clear winner when we are working with matrices of arbitrary dimensions.
Implementation
Storing ndimensional matrices as onedimensional arrays
The major components of an NMatrix
are shape, elements, dtype and
stype. When initialized, the dense type stores the elements as a onedimensional
array; in the JRuby port, the ArrayRealVector
class is used to store
the elements.
@s
stores elements, @shape
stores the shape of the matrix, while
@dtype
and @stype
store the data type and storage type
respectively. Currently, I have nmatrixjruby implemented only for
:float64
(double) and Ruby :object
data types.
NMatrixMRI uses struct
as a type
to store dim
, shape
, offset
, count
, src
of an NMatrix. ALLOC
and xfree
are used to wrap the NMatrix attributes to C structs
and release the unrequired memory.
Slicing and Rank
Implementing slicing was the toughest part of NMatrixJRuby
implementation. NMatrix@s
stores the elements of a matrix as a
onedimensional array. The elements along any dimension are accessed with the
help of the stride. NMatrix#get_stride
calculates the stride with
the help of the dimension and shape and returns an Array.
1 2 3 4 5 6 7 8 9 10 

NMatrix#[]
and NMatrix#[]=
are thus able to read and write the
elements of a matrix. NMatrix#MRI uses the @s
object which stores
the stride when the nmatrix is initialized.
NMatrix#[]
calls the #xslice
operator which calls #get_slice
operator that use the stride to determine whether we are accessing a
single element or multiple elements. If there are multiple elements,
#dense_storage_get
returns an NMatrix object with the elements along
the dimension.
NMatrixMRI differs from NMatrixJRuby implementation as it makes sure that memory is properly utilized as the memory needs to be properly garbage collected.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 

NMatrix#[]=
calls the #dense_storage_set
operator which calls
#get_slice
operator that use the stride to find out whether we are
accessing a single element or multiple elements. If there are
multiple elements #set_slice
recursively sets the elements of the
matrix then returns an NMatrix object with the elements along the
dimension.
All the relevant code for slicing can be found here.
Enumerators
NMatrixMRI uses the C code for enumerating the elements of a matrix. Just as with slicing, the NMatrixJRuby uses pure Ruby code in place of the C code. Currently, all the enumerators for dense matrices with real datatype have been implemented and are properly functional. Enumerators for objects have not yet been implemented.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 

TwoDimensional Matrices
Linear algebra is mostly about twodimensional matrices. In NMatrix,
when performing calculations in a twodimensional matrix, a onedimensional array
is converted to a twodimensional matrix. A twodimensional matrix is
stored in the JRuby implementation as a BlockRealMatrix
or
Array2DRowRealMatrix
. Each has its own advantages.
Getting a 2D Matrix
1 2 3 4 5 6 7 8 9 10 11 12 13 14 

Convert a 2Dmatrix to 1Darray
1 2 3 4 5 6 7 8 9 10 11 12 13 14 

Why use a Java method instead of Ruby method?
Memory Usage and Garbage Collection: A scientific library is memory intensive and hence, every step counts. The JRuby interpreter doesn’t need to dynamically guess the data type and uses less memory, typically around onetenth of it. If the memory is properly utilized, when the GC kicks in, the GC has to clear less used memory space.
Speed: Using java method greatly improves the speed — by around 1000 times, when compared to using the Ruby method.
Operators
All the operators from NMatrixMRI have been implemented except modulus. The binary operators were easily implemented through Commons Math API and Java Math API.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 

Unary Operators (Trigonometric, Exponentiation and Log operators) have been implemented using #mapToSelf
method that takes a Univariate function
as an argument. #mapToSelf
maps every element of ArrayRealVector object to the Univariate function
, that is passed to it and returns self
object.
1 2 3 4 5 

NMatrix#method(arg) has been implemented using bivariate functions provided by Commons Math API and Java Math API.
1 2 3 4 5 

1 2 3 4 5 6 7 8 9 10 11 12 13 

Decomposition
NMatrixMRI relies on LAPACK and ATLAS for matrix decomposition and
solving functionalities. Apache Commons Math provides a different set
of API for decomposing a matrix and solving an equation. For example,
#potrf
and other LAPACK specific functions have not been implemented
as they are not required at all.
Calculating determinant in NMatrix is tricky where a matrix is reduced either to a lower or upper matrix and the diagonal elements of the matrix are multiplied to get the result. Also, the correct sign of the result (whether positive or negative) is taken into account while calculating the determinant. However, NMatrixJRuby uses Commons Math API to calculate the determinant.
1 2 3 4 5 6 7 

Given below is code that shows how Cholesky decomposition has been implemented by using Commons Math API.
Cholesky Decomposition
1 2 3 4 5 6 7 8 9 10 11 12 13 

Similarly, LU Decomposition and QR factorization have been implemented.
LU Decomposition
QR Factorization
NMatrix#solve
The solve method currently uses LU and Cholesky decomposition.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 

NMatrix#matrix_solve
Suppose we need to solve a system of linear equations:
AX = B
where A is an m×n matrix, B and X are n×p matrices, we need to solve this equation by iterating through B.
NMatrixMRI implements this functionality using NMatrix::BLAS::cblas_trsm
method. However, for NMatrixJRuby, NMatrix#matrix_solve
is the analogous method used.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 

Currently, Hessenberg transformation for NMatrixJRuby has not been implemented.
Other dtypes
I have tried implementing float dtypes using FloatMatrix
class
provide by jblas. jblas was used instead of Commons Math as the
latter uses Field Elements
for Floats and it had some issues
with Reflection
and Type Erasure
.
However, using jblas resulted in errors due to precision.
Code Organisation and Deployment
To minimise conflict with the MRI codebase all the JRuby front end
code has been placed in the /lib/nmatrix/jruby
directory. lib/nmatrix/nmatrix.rb
decides whether to load
nmatrix.so
or nmatrix_jruby.rb
after detecting the Ruby platform.
The added advantage is that the Ruby interpreter must not decide which function to call at runtime. The impact on performance can be seen when programs which intensively use NMatrix for linear algebraic computations (e.g., mixed_models) are run.
Spec Report
After the port; this is the final report that summarizes the number of tests that successfully pass:
NMatrix
Spec file  Total Tests  Success  Failure  Pending  

00_nmatrix_spec  188  139  43  6  
01_enum_spec  17  8  09  0  
02_slice_spec  144  116  28  0  
03_nmatrix_monkeys_spec  12  11  01  0  
elementwise_spec  38  21  17  0  
homogeneous_spec.rb  07  06  01  0  
math_spec  737  541  196  0  
shortcuts_spec  81  57  24  0  
stat_spec  72  40  32  0  
slice_set_spec  6  2  04  0 
Why do some tests fail?
 Complex dtype has not been implemented.
 Sparse matrices (list and yale) have not been implemented.
 Decomposition methods that are specific to LAPACK and ATLAS have not been implemented.
 Integer dtype is not properly assigned to
floor
,ceil
, andround
methods.
Mixed_Models
Spec file  Total Test  Success  Failure  Pending  

Deviance_spec  04  04  0  0  
LMM_spec  195  195  0  0  
LMM_categorical_data_spec.rb  48  45  3  0  
LMMFormula_spec.rb  05  05  0  0  
LMM_interaction_effects_spec.rb  82  82  0  0  
LMM_nested_effects_spec.rb  40  40  0  0  
matrix_methods_spec.rb  52  52  0  0  
ModelSpecification_spec.rb  07  07  0  0  
NelderMeadWithConstraints_spec.rb  08  08  0  0 
Future Work
NMatrix on JRuby offers comparable speeds to MRI. For specific computations it will be possible to leverage the threading support of JRuby and speed up things using multiple cores.
Adding new functionality to NMatrixJRuby will be easy from here. Personally, I am interested to add OpenCL support to leverage the GPU computational capacity available on most machines today.
Conclusion
The main goal of this project was to to gain from the performance JRuby offers, and bring a unified interface for linear algebra between MRI and JRuby.
By the end of GSoC, I have been able to successfully create a linear algebra library, NMatrix for JRuby users, which they can easily run on their machines — unless they want to use complex numbers, at least for now.
I have mixed_models gem simultaneously ported to JRuby. Even here, NMatrixJRuby is very close to NMatrixMRI, considering the performance .
Acknowledgements
I would like to express my sincere gratitude to my mentor Pjotr Prins for the continuous support through the summers, and for his patience, motivation, enthusiasm, and immense knowledge. I could not have imagined having a better advisor and mentor, for this project.
I am very grateful to Google and the Ruby Science Foundation for this golden opportunity.
I am very thankful to Charles Nutter, John Woods, Sameer Deshmukh, Kenta Murata and Alexej Gossmann, who mentored me through the project. It has been a great learning experience.
I thank my fellow GSoC participants Rajith, Lokesh and Gaurav who helped me with certain aspects of my project.