Chapter 20 Integrating with C++ using Rcpp and RcppEigen

20.1 Prerequisite

  • In this chapter, we learn how to integrate C++ using Rcpp and RcppEigen.
  • RcppEigen is a package to use a linear algebra library Eigen with R. The original Eigen library and its documentation is found in their website.
  • Instead of RcppEigen, you may want to use RcppArmadillo. Armadillo is another libear algebra library in C++.
  • We presume that:
    • C/C++ environment is installed to the computer.
    • Rcpp and RcppEigen are installed.
    • The project is created in RStudio from File > New Project > New Directory > R Package using RcppEigen. In the following, I use the project name EmpiricalIO but the name can be as you like.
  • We presume the you have the following folder and file structure from the root directory:
    • main.
      • main.R: all executable statements are written in this file.
    • R.
      • functions.R: all function definitions in R are written in this file.
    • src.
      • functions.cpp: all function definitions in C++ are written in this file.
        • It includes:
        "src/functions.cpp"
        -------------------
        #include <Rcpp.h>
        #include <RcppEigen.h>
        • Inside functions.cpp, avoid using name spaces.
      • Makevars: compilation flags for osx/Linux should be written here.
      • Makevars.win: compilation flags for Windows should be written here.
    • inst.
      • include: header files for external libraries in C/C++ are stored here.
  • Makevars/Makevars.win should be:
Makevars
--------
PKG_CPPFLAGS = -w -std=c++11 -I../inst/include/ -O3
Makevars.win
------------
PKG_CPPFLAGS = -w -std=c++11 -I../inst/include/ -O3
- `-w` is for supprssing some warnings in `Eigen`. If you want to keep warnings shown, this can be removed.
- `-std=c++11` is for using the latest functionalities of C++ (optional).
- `-I../inst/include/` is for setting the header path to `../inst/include/` (optional).

20.2 Workflow

  • To minimize the likelihood of bugs and the time to edit and debug the code, I recommend you to follw the following workflow.
  • This workflow is based on my experience. If you find better workflow, you can overwrite by your own way.
  1. Write a procedure in main/main.R.
  1. Rewrite the procedure to a function in main/main.R.
  1. Execute the function in main/main.R and check that the output is the same with the output of the original
## [1] 0
  1. Cut and paste the function to R/functions.R.
  2. Build the package and load the function from the library.
  3. Check if the function can be executed as in step 3.
  4. Copy the function to src/functions.cpp and comment them out.
// src/functions.cpp
// # compute coefficient-wise square
// compute_square <-
//   function(x) {
//     y <- x^2
//     return(y)
//   }
  1. Write function in C++ by translating the copied and pasted R code.
    • The function name should be consistent with the function name in R. I often put _rcpp to the end of the name.
    • Put // [[Rcpp::export]] above the name of the function if you want to call this function directly from R. Otherwise, the wrapper function to call from R is not created.
    • The class of the inputs and output should be consistent with Rcpp objects. This will be explained later.
// src/functions.cpp
// # compute coefficient-wise square
// compute_square <-
// [[Rcpp::export]]
Eigen::VectorXd compute_square_rcpp(Eigen::VectorXd x) {
  //   function(x) {
  Eigen::VectorXd y = x.array().square();
  //     y <- x^2
  //     return(y)
  //   }
  return(y);
}
  1. Check if clean and rebuild work.
    • If there is a compilation error happens, debug until the compilation succeeds.
    • Sometimes, deleting R/RcppExports.R and R/RcppExports.cpp may be need when re-compile the functions.
  2. Clean up the code by eliminating the copied and pasted R code.
// src/functions.cpp
// compute coefficient-wise square
// [[Rcpp::export]]
Eigen::VectorXd compute_square_rcpp(Eigen::VectorXd x) {
  Eigen::VectorXd y = x.array().square();
  return(y);
}
  1. Now by calling the library, you should be able to use the function written in C++ in R. Check if it returns a valid value and if the output is the same as the output of the R function.
## [1] 0
  1. If there is a run-time bug in the C++ function, you may have to use some debugger for C++ to debug the function.
    • In osx, you can debug the C++ function called from R function in the following way.
    • Open the terminal and run R with the debugger lldb by typing the following command in the terminal:
    # terminal
    R -d lldb
    • Run main/main.R by typing the following command in the terminal:
    # terminal
    run -f main/main.R
    • This should execute the R source code.
    • After library(EmpiricalIO) is read, stop the process by Ctrl + C before the function in question is called. If there is no time gap between them, set Sys.sleep() in R to buy some time.
    • As you stop the process, set the break point at the function in question by typing in the terminal as:
    # terminal
    br s -n compute_square_rcpp
    and then continue the process by typing c in the terminal.
    • For the rest, refer to the documentation of lldb.
    • There is no such an easy way in Windows. You will have to establish an environment in which you can run a cpp file with executable statemtns and call the debugging functions from the file. Then, you can use some debuggers such as gdb to debug the functions inside the cpp file.
  2. If you need to modify the function, first rewrite the R function and then follow the same step to rewrite the C++ function. Never start the debugging from C++ functions.

20.3 Passing R Objects as Rcpp Objects

  • R class corresponds to the following Rcpp class:
R Rcpp
logical Logical
integer Integer
numeric Numeric
complex Complex
character String
Date Date
POSIXct Datetime
  • R data structure corresponds to the following Rcpp data structure:
R Rcpp
vector Vector
matrix Matrix
data.frame DataFrame
list List
  • For example, a numeric vector in R is passed to Rcpp::NumericVector in Rcpp, an integer matrix is to Rcpp::IntegerMatrix, and so on.
## [1]  2.18543791 -0.01282801 -0.30530893 -0.58421346  0.77126850
##          [,1]       [,2]       [,3]       [,4]      [,5]
## [1,] 2.106189 -0.2612649 -0.7788302 -0.4213451  1.218305
## [2,] 0.412157  2.0737836  1.1315325 -1.0217474 -1.799761
  • Let’s write a C++ function that just receives a numeric vector and returns the numeric vector, and receives a numeric matrix and returns the numeric matrix.
// src/functions.cpp
// [[Rcpp::export]]
Rcpp::NumericVector pass_numeric_vector_to_rcpp(Rcpp::NumericVector x) {
  return(x);
}
// [[Rcpp::export]]
Rcpp::NumericMatrix pass_numeric_matrix_to_rcpp(Rcpp::NumericMatrix Y) {
  return(Y);
}
  • Check if you can pass R objects and get the right result:
## [1] 0
## [1] 0
  • Exercise: Write functions pass_integer_vector_to_rcpp, pass_integer_matrix_to_rcpp, pass_list_to_rcpp, pass_data_frame_to_rcpp that receive an integer vector, list, and data frame and just return themselves.
## [1] 1 2 3 4 5
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    1
## $x
## [1]  2.18543791 -0.01282801 -0.30530893 -0.58421346  0.77126850
## 
## $Y
##          [,1]       [,2]       [,3]       [,4]      [,5]
## [1,] 2.106189 -0.2612649 -0.7788302 -0.4213451  1.218305
## [2,] 0.412157  2.0737836  1.1315325 -1.0217474 -1.799761
## 
## $z
## [1] 1 2 3 4 5
##            x1         x2
## 1 -0.30824994 -1.5570357
## 2  0.01551524  1.9231637
## 3 -0.44231772 -1.8568296
## 4 -1.63800773 -2.1061184
## 5 -0.64140116  0.6976485
## [1] 0
## [1] 0
## [1] 0
## [1] 0

20.4 Passing R Objects as Eigen Objects

  • R data structure corresponds to the following Eigen data structure:
R Eigen
vector Eigen::VectorX
matrix Eigen::MatrixX
  • If you pass an integer vector and matrix, the corresponding Eigen objects are Eigen::VectorXi and Eigen::MatrixXi.
  • If you pass an numeric vector and matrix, the corresponding Eigen objects are Eigen::VectorXd and Eigen::MatrixXd.
  • The class of the output can be Eigen class. If you return Eigen::VectorXd, Eigen::MatrixXd, then they are automatically converted to the corresponding R objects.
  • Check if you can pass x, Y, and z as follows:
// src/functions.cpp
// [[Rcpp::export]]
Eigen::VectorXd pass_numeric_vector_to_eigen(Eigen::VectorXd x) {
  return(x);
}
// [[Rcpp::export]]
Eigen::MatrixXd pass_numeric_matrix_to_eigen(Eigen::MatrixXd Y) {
  return(Y);
}
## [1] 0
## [1] 0
  • Exercise: Write functions pass_integer_vector_to_rcpp and pass_integer_matrix_to_rcpp that receive an integer vector and integer matrix and just return themselves.
## [1] 0
## [1] 0
  • If you pass a vector and matrix by Eigen::VectorX and Eigen::MatrixX, the objects are copied to the new objects. This means that the new memory is allocated. If you are going to modify the passed object inside the C++ function, the objects have to be copied. Otherwise, you can just map the objects in the following way so that the new memory is not allocated, whereas you cannot modify the objects in the C++ function.
// src/functions.cpp
// [[Rcpp::export]]
Eigen::VectorXd map_numeric_vector_to_eigen(Eigen::Map<Eigen::VectorXd> x) {
  return(x);
}
// [[Rcpp::export]]
Eigen::MatrixXd map_numeric_matrix_to_eigen(Eigen::Map<Eigen::MatrixXd> Y) {
  return(Y);
}
## [1] 0
## [1] 0
  • I recommend to directly pass R vectors and matrices to Eigen::VectorX and Eigen::MatrixX rather than to Vector and Matrix in Rcpp, because Eigen::VectorX and Eigen::MatrixX have richer methods for linear algebra.

  • R list cannot be directly translated to Eigen objects, but the list of vectors and matrices, and the list of list of R objects can be passed to Eigen in the following way.

// src/functions.cpp
// [[Rcpp::export]]
Rcpp::List pass_list_to_eigen(Rcpp::List L) {
  // double vector
  Eigen::VectorXd x(Rcpp::as<Eigen::VectorXd>(L.at(0)));
  // double matrix
  Eigen::MatrixXd Y(Rcpp::as<Eigen::MatrixXd>(L.at(1)));
  // integer vector
  Eigen::VectorXi z(Rcpp::as<Eigen::VectorXi>(L.at(2)));
  // integer matrix
  Eigen::MatrixXi W(Rcpp::as<Eigen::MatrixXi>(L.at(3)));
  // return
  Rcpp::List output = Rcpp::List::create(x, Y, z, W);
  return(output);
}
## [[1]]
## [1]  2.18543791 -0.01282801 -0.30530893 -0.58421346  0.77126850
## 
## [[2]]
##          [,1]       [,2]       [,3]       [,4]      [,5]
## [1,] 2.106189 -0.2612649 -0.7788302 -0.4213451  1.218305
## [2,] 0.412157  2.0737836  1.1315325 -1.0217474 -1.799761
## 
## [[3]]
## [1] 1 2 3 4 5
## 
## [[4]]
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    1
## [1] 0
  • You can also pass a list with named arguments and return a named list as follows:
// src/funtions.cpp
// [[Rcpp::export]]
Rcpp::List pass_named_list_to_eigen(Rcpp::List L) {
  // double vector
  Eigen::VectorXd x(Rcpp::as<Eigen::VectorXd>(L["x"]));
  // double matrix
  Eigen::MatrixXd Y(Rcpp::as<Eigen::MatrixXd>(L["Y"]));
  // integer vector
  Eigen::VectorXi z(Rcpp::as<Eigen::VectorXi>(L["z"]));
  // integer matrix
  Eigen::MatrixXi W(Rcpp::as<Eigen::MatrixXi>(L["W"]));
  // return
  Rcpp::List output = Rcpp::List::create(
    Rcpp::Named("x") = x, 
    Rcpp::Named("Y") = Y, 
    Rcpp::Named("z") = z, 
    Rcpp::Named("W") = W);
  return(output);
}
## $x
## [1]  2.18543791 -0.01282801 -0.30530893 -0.58421346  0.77126850
## 
## $Y
##          [,1]       [,2]       [,3]       [,4]      [,5]
## [1,] 2.106189 -0.2612649 -0.7788302 -0.4213451  1.218305
## [2,] 0.412157  2.0737836  1.1315325 -1.0217474 -1.799761
## 
## $z
## [1] 1 2 3 4 5
## 
## $W
##      [,1] [,2]
## [1,]    1    1
## [2,]    1    1
## [1] 0
  • You can also access to the column of Rcpp::DataFrame in the similar way.
// src/functions.cpp
// [[Rcpp::export]]
Eigen::VectorXd extract_column_from_data_frame(Rcpp::DataFrame D) {
  // pass column x1 of D to Eigen::VectorXd named x1
  Eigen::VectorXd x1(Rcpp::as<Eigen::VectorXd>(D["x1"]));
  return(x1);
}
## [1] 0
  • This allows us to pass whatever objects in R to a C++ function.
  • If you are planning to translate R functions to C/C++, from the beginning, you should write the R functions in the way inputs and outpus can be passed to C/C++ as above.

20.5 Passing R Objects in Other Objects in C/C++

  • integer and numeric scalars in R can be simply passed to int and double in C/C++.
  • Vectors in R can be passed to std::vector<int> or std::vector<double> objects. This may be helpful if you want to use the methods for std::vector.

20.6 Manipulating Objects in a C++ Function

  • As mentioned above, the best practice is to pass vectors and matrices to Eigen::VectorXd/Eigen::VectorXi and Eigen::MatrixXd/Eigen::MatrixXi rather than Rcpp::NumericVector/Rcpp::IntegerVector and Rcpp::NumericMatrix/Rcpp::IntegerMatrix.
  • The other objects will be passed as Rcpp::DataFrame or Rcpp::List and at the end converted to Eigen::VectorXd/Eigen::VectorXi and Eigen::MatrixXd/Eigen::MatrixXi usin Rcpp::as as explained above.
  • The rest of manipulation will be done using the methods in Eigen. Here I introduce basic operations. For the detail, refer to the online document of Eigen.

20.6.1 Matrix and Vector Arithmetic

  • Addition and subtraction:
    • Binary operator \(+\) as in a + b.
    • Binary operator \(-\) as in a - b.
    • Unary operator \(-\) as in - a.
  • Scalar multiplication and division:
    • Binary operator \(\times\) as in matrix * scalar.
    • Binary operator \(\times\) as in scalar * matrix.
    • Binary operator \(/\) as in matrix / scalar.
  • Transposition:
    • Transposition as in matrix.transpose().
  • Matrix-matrix and matrix-vector multiplication:
    • This part is different from R.
    • Matrix multiplication is as in A * B.
    • Coefficientwise multiplication is explained later but is as in A.array() * B.array().
  • Following the best practice, first write R function for these operations in R/functions.R and run in main/main.R:
  • Next, write these operations in C++ function:
// src/functions.cpp
// [[Rcpp::export]]
Rcpp::List matrix_vector_arithmetic_rcpp(
    Eigen::MatrixXd A, Eigen::MatrixXd B, 
    Eigen::VectorXd v, double c) {
  // addition and subtraction
  Eigen::MatrixXd X1 = A + B;
  Eigen::MatrixXd X2 = A - B;
  Eigen::MatrixXd X3 = - A;
  // scalar multiplication and division
  Eigen::MatrixXd X4 = A * c;
  Eigen::MatrixXd X5 = c * A;
  Eigen::MatrixXd X6 = A / c;
  // transpose
  Eigen::MatrixXd X7 = A.transpose();
  // matrix-matrix and matrix-vector multiplication
  Eigen::MatrixXd X8 = A * B.transpose();
  Eigen::VectorXd X9 = A * v;
  // return 
  Rcpp::List output = 
    Rcpp::List::create(
      Rcpp::Named("X1") = X1,
      Rcpp::Named("X2") = X2,
      Rcpp::Named("X3") = X3,
      Rcpp::Named("X4") = X4,
      Rcpp::Named("X5") = X5,
      Rcpp::Named("X6") = X6,
      Rcpp::Named("X7") = X7,
      Rcpp::Named("X8") = X8,
      Rcpp::Named("X9") = X9);
  return(output);
}
  • Then, we can check if this yields (almost) the same result with the R function:
## [1] 5.551115e-17
  • The output looks like:
## $X1
##             [,1]      [,2]       [,3]      [,4]
## [1,] -0.05067246 0.6761526 -0.2917328 1.6123600
## [2,] -0.12174506 1.9851240 -3.0351683 0.6933911
## 
## $X2
##            [,1]      [,2]      [,3]       [,4]
## [1,] -1.2022352 -2.347410 0.9507484 -0.6375019
## [2,]  0.4890317  1.205438 1.3942315  0.7832583
## 
## $X3
##            [,1]       [,2]       [,3]       [,4]
## [1,]  0.6264538  0.8356286 -0.3295078 -0.4874291
## [2,] -0.1836433 -1.5952808  0.8204684 -0.7383247
## 
## $X4
##           [,1]      [,2]       [,3]     [,4]
## [1,] -1.879361 -2.506886  0.9885233 1.462287
## [2,]  0.550930  4.785842 -2.4614052 2.214974
## 
## $X5
##           [,1]      [,2]       [,3]     [,4]
## [1,] -1.879361 -2.506886  0.9885233 1.462287
## [2,]  0.550930  4.785842 -2.4614052 2.214974
## 
## $X6
##             [,1]       [,2]       [,3]      [,4]
## [1,] -0.20881794 -0.2785429  0.1098359 0.1624764
## [2,]  0.06121444  0.5317603 -0.2734895 0.2461082
## 
## $X7
##            [,1]       [,2]
## [1,] -0.6264538  0.1836433
## [2,] -0.8356286  1.5952808
## [3,]  0.3295078 -0.8204684
## [4,]  0.4874291  0.7383247
## 
## $X8
##           [,1]       [,2]
## [1,] -1.280368 -0.8861152
## [2,]  3.857726  2.3497425
## 
## $X9
## [1] -0.2184706  1.2674165

20.6.2 The Array Class and Coefficient-wise Operations

  • When you want to have coefficient-wise operations, you first use .array() method to convert the object to an array and then apply methods for arrays.

  • The resulting array can be assigned to a matrix object implicitly or explicitly by using .matrix() method.

  • Coefficient-wise multiplication:

    • A * B in R is A.array() * B. array() in Eigen.
  • Other coefficient-wise math functions:

    • Absolute value: A.array().abs().
    • Exponential: A.array().exp().
    • Logarithm: A.array().log().
    • Power: A.array().power(r).
    • For the other methods, refer to the online document.
  • Write functions in R in R/functions.R and run in main/main.R:

  • Write C++ function in src/functions.cpp and call from main/main.R:
// src/functions.cpp
// [[Rcpp::export]]
Rcpp::List coefficientwise_operation_rcpp(
  Eigen::MatrixXd A,
  Eigen::MatrixXd B,
  int r
) {
  // coefficient-wise multiplication
  Eigen::MatrixXd X1 = A.array() * B.array();
  // other coefficient-wise math functions
  Eigen::MatrixXd X2 = A.array().abs();
  Eigen::MatrixXd X3 = A.array().exp();
  Eigen::MatrixXd X4 = A.array().abs().log();
  Eigen::MatrixXd X5 = A.array().pow(r);
  // return
  Rcpp::List output =
    Rcpp::List::create(
      Rcpp::Named("X1") = X1,
      Rcpp::Named("X2") = X2,
      Rcpp::Named("X3") = X3,
      Rcpp::Named("X4") = X4,
      Rcpp::Named("X5") = X5
    );
  return(output);
}
## [1] 2.220446e-16
## $X1
##             [,1]       [,2]       [,3]        [,4]
## [1,] -0.36070042 -1.2632876 -0.2047036  0.54832401
## [2,] -0.05608254  0.6219094  1.8170912 -0.03317559
## 
## $X2
##           [,1]      [,2]      [,3]      [,4]
## [1,] 0.6264538 0.8356286 0.3295078 0.4874291
## [2,] 0.1836433 1.5952808 0.8204684 0.7383247
## 
## $X3
##           [,1]      [,2]      [,3]     [,4]
## [1,] 0.5344838 0.4336018 1.3902836 1.628125
## [2,] 1.2015872 4.9297132 0.4402254 2.092427
## 
## $X4
##            [,1]       [,2]       [,3]       [,4]
## [1,] -0.4676802 -0.1795710 -1.1101553 -0.7186105
## [2,] -1.6947599  0.4670498 -0.1978799 -0.3033716
## 
## $X5
##            [,1]      [,2]      [,3]      [,4]
## [1,] 0.39244438 0.6982752 0.1085754 0.2375871
## [2,] 0.03372487 2.5449208 0.6731684 0.5451234

20.6.3 Solving Linear Least Squares Systems

  • It is often required to solve a linear least squares system \(A \cdot x = b\).
  • Solving using SVD decomposition:
A.bdcSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).solve(b)
  • Solving using QR decomposition:
A.colPivHouseholderQr().solve(b)
  • Write R function in R/functions.R and run in main/main.R:
  • Write C++ function in src/functions.cpp and call in main/main.R:
// src/functions.cpp
// solve least squares using SVD decomposition
// [[Rcpp::export]]
Eigen::MatrixXd solve_least_squares_svd(
  Eigen::MatrixXd A,
  Eigen::MatrixXd B
) {
  Eigen::MatrixXd x = A.bdcSvd(Eigen::ComputeThinU | Eigen::ComputeThinV).solve(B);
  return(x);
}
// solve least squares using QR decomposition
// [[Rcpp::export]]
Eigen::MatrixXd solve_least_squares_qr(
    Eigen::MatrixXd A,
    Eigen::MatrixXd B
) {
  Eigen::MatrixXd x = A.colPivHouseholderQr().solve(B);
  return(x);
}
## [1] 8.881784e-16
## [1] 1.554312e-15
##             [,1]       [,2]       [,3]       [,4]
## [1,]  0.65530863 -1.2441297 -1.0799835  0.5926519
## [2,] -1.34801870  0.1453720  0.7313845 -2.2353988
## [3,]  1.38750763 -0.3405502 -0.7649107  1.5987206
## [4,] -0.06376266 -0.4632165 -0.2296862  0.4681169
##             [,1]       [,2]       [,3]       [,4]
## [1,]  0.65530863 -1.2441297 -1.0799835  0.5926519
## [2,] -1.34801870  0.1453720  0.7313845 -2.2353988
## [3,]  1.38750763 -0.3405502 -0.7649107  1.5987206
## [4,] -0.06376266 -0.4632165 -0.2296862  0.4681169

20.6.4 Accessing to Elements of a Matrix and Vector

  • You can access to the size information and elements of a matrix and vector as follows:

  • Sizes: A.rows() for row numbers and A.cols() for column numbers.

  • Element: A(i, j) for (\(i + 1\), \(j + 1\))-th element.

  • Rows and columns: A.row(i) for \(i + 1\)-th row and A.col(j) for \(j + 1\)-th column.

  • Write R function in R/functions.R and run in main/main.R:

  • Write C++ function in src/functions.cpp and call in main/main.R:
// src/functions.cpp
// [[Rcpp::export]]
Rcpp::List access_rcpp(
    Eigen::MatrixXd A, 
    int i, 
    int j
  ) {
    int I = A.rows();
    int J = A.cols();
    double a_ij = A(i - 1, j - 1);
    Eigen::VectorXd a_i = A.row(i - 1);
    Eigen::VectorXd a_j = A.col(j - 1);
    Rcpp::List output =
      Rcpp::List::create(
        Rcpp::Named("I") = I,
        Rcpp::Named("J") = J,
        Rcpp::Named("a_ij") = a_ij,
        Rcpp::Named("a_i") = a_i,
        Rcpp::Named("a_j") = a_j
      );
    return(output);
  }
## [1] 0
## $I
## [1] 4
## 
## $J
## [1] 4
## 
## $a_ij
## [1] 0.3295078
## 
## $a_i
## [1] -0.6264538  0.3295078  0.5757814 -0.6212406
## 
## $a_j
## [1]  0.3295078 -0.8204684  0.4874291  0.7383247