Documentation ¶
Overview ¶
Package rgo provides a translation between R and Go using cgo.
Rgo helps translate data from R's internal representation (a SEXP) in C to Go objects of standard types (floats, ints, strings, etc.). It contains C functions that help create, modify, and extract data from a SEXP. To take advantage of these functions, the workhorse type of rgo, RSEXP, wraps rgo's internal notion of a C.SEXP.
type RSEXP C.SEXP
Whenever Rgo wants to read data that comes from R, or format data to send back to R, it uses the RSEXP type. Because C types cannot be exported, the notion of a C.SEXP in the user's package is different from that of Rgo's. Therefore, the best way to create an RSEXP is to use the NewRSEXP function.
Sending data from R to Go ¶
Rgo uses R's internal functions to extract data from a SEXP and into a Go typed object. More about these objects can be found in R's documentation at https://cran.r-project.org/doc/manuals/r-release/R-ints.html#SEXPs. In short, everything in R is a SEXP, which is a pointer to a SEXPREC, which in turn contains some header information, attributes, and a pointer to the data itself. A SEXP can point to a SEXPREC of up to a couple dozen types which map to R's types. Rgo only concerns itself with 5 of them:
- REALSXP, akin to a Go slice of float64s and, when containing the dimension attributes, a matrix.
- INTSXP, akin to a Go slice of integers
- CHARSXP, akin to a Go string
- STRSXP, akin to a Go slice of strings
- VECSXP, which is an R list and, when containing the correct attributes, a data frame
In C, the type of data a SEXP points to can be found using the TYPEOF function. It returns an integer, which can be matched to the relevant types based on the constant enumerations declared in this package. As a convenience, Rgo's TYPEOF and LENGTH functions wrap R's functions of the same names.
Rgo contains generic functions which can be used to extract data from a SEXP as a desired type in Go. There are three functions, which relate to R's numeric type, character type, and matrix type. The numeric and character functions take a type parameter so that the resulting Go slice has whichever supported element type is most convenient for the caller. These functions are:
- func AsNumeric[t RNumeric](r RSEXP) ([]t, error)
- func AsCharacter[t RCharacter](r RSEXP) ([]t, error)
- func AsMatrix(r RSEXP) (Matrix, error)
Each of these functions checks the SEXPTYPE of the underlying SEXP and will return an error if it doesn't match the function that was called.
Sending data from Go to R ¶
Sending data from Go to R is done by creating an RSEXP (which will always point to a newly created C.SEXP) from one of the supported Go types:
- func NumericToRSEXP[t RNumeric](in []t) *RSEXP
- func CharacterToRSEXP[t RCharacter](in []t) *RSEXP
- func MatrixToRSEXP(in Matrix) *RSEXP
The SEXPTYPE of the output SEXP from these functions will match the R internal type which makes the most sense.
Because R does not allow functions to have multiple returns, the preferred way to return multiple pieces of data from a function is a list. Therefore, Rgo contains functions to create three types of lists: a generic list, a named list, and a data frame.
- func MakeList(in ...*RSEXP) *RSEXP
- func MakeNamedList(names []string, data ...*RSEXP) (*RSEXP, error)
- func MakeDataFrame(rowNames, colNames []string, dataColumns ...*RSEXP) (*RSEXP, error)
The functions to create named lists and data frames enforce data quality so that valid R objects can be created. This includes enforcing the number of names and objects provided, checking the lengths of all the columns provided in a data frame, and making sure no nested objects (like lists or data frames themselves) are provided as columns for data frames.
In order to send data back to R, it must be a C.SEXP that matches the user's notion of a SEXP, not Rgo's. Therefore, Rgo provides a generic ExportRSEXP function which is used to create a user's C.SEXP that has the same data as the RSEXP that has been made using the rgo library. Callers provide a type parameter, which should always be their C.SEXP:
mySEXP, err := rgo.ExportRSEXP[C.SEXP](rgoRSEXP)
If their provided type is not a C.SEXP (or a *C.SEXP which is also acceptable) an error is returned.
Building Your Package and Calling Functions in R ¶
In order for a package of Go functions to be callable from R, they must take any number of C.SEXP objects as input and return a single C.SEXP object. They also need to be marked to be exported, by including an export statement immediately above the function signature. Note that if there is a space between the comment slashes and the export mark, Go will parse it as a vanilla comment and the function won't be exported.
//export DoubleVector
func DoubleVector(C.SEXP) C.SEXP {}
The Go package must then be compiled to a C shared library:
go build -o <libName>.so -buildmode=c-shared <package>
Finally, the Go functions can be called in R using the .Call function:
output = .Call("DoubleVector", input)
For a more complete demonstration, see the example below, or the demo package at https://github.com/EMurray16/rgo/demo.
Example ¶
The code below contains a functional example of a Go function that can be called from R:
package main

// The shared R headers must be on the include path.
// One way to find them is via the rgo directory.
// Another way is to find them from your local R installation:
//   - Typical Linux: /usr/share/R/include/
//   - Typical MacOS: /Library/Frameworks/R.framework/Headers/
// If all else fails, the required header files are also wherever rgo is located on your computer.
// For example, on my computer all github packages are put in /Go/mod/pkg/github.com/...

// #cgo CFLAGS: -I/Go/mod/pkg/github.com/EMurray16/rgo/Rheader/
// #include <Rinternals.h>
import "C"

import (
	"fmt"

	"github.com/EMurray16/rgo"
)

//export DoubleVector
func DoubleVector(input C.SEXP) C.SEXP {
	// wrap the incoming SEXP as an RSEXP
	r, err := rgo.NewRSEXP(&input)
	if err != nil {
		fmt.Println(err)
		return nil
	}

	// copy the SEXP's data into a slice
	floats, err := rgo.AsNumeric[float64](r)
	if err != nil {
		fmt.Println(err)
		return nil
	}

	// double each element of the slice
	for i := range floats {
		floats[i] *= 2
	}

	// create an RSEXP from the new data, then export it as this package's C.SEXP
	outputRSEXP := rgo.NumericToRSEXP(floats)
	mySEXP, err := rgo.ExportRSEXP[C.SEXP](outputRSEXP)
	if err != nil {
		fmt.Println(err)
		return nil
	}

	return mySEXP
}

// A c-shared library is built from package main, which requires a main function.
func main() {}
Once it is compiled to a shared library and the library is loaded into the R session (for example with dyn.load), the function can be called using R's .Call() interface:
input = c(0, 2.71, 3.14)
output = .Call("DoubleVector", input)
print(output)
The result would look like this:
[1] 0.00 5.42 6.28
Index ¶
- Variables
- func AreMatricesEqual(A, B Matrix) bool
- func AreMatricesEqualTol(A, B Matrix, tolerance float64) bool
- func AsCharacter[t RCharacter](r RSEXP) (out []t, err error)
- func AsNumeric[t RNumeric](r RSEXP) (out []t, err error)
- func ExportRSEXP[t any](r *RSEXP) (out t, err error)
- func LENGTH(r RSEXP) int
- type Matrix
- func AsMatrix(r RSEXP) (out Matrix, err error)
- func CopyMatrix(in Matrix) (out Matrix)
- func CreateIdentity(size int) (*Matrix, error)
- func CreateZeros(Nrow, Ncol int) (*Matrix, error)
- func MatrixAdd(A, B *Matrix) (C *Matrix, err error)
- func MatrixMultiply(A, B *Matrix) (C *Matrix, err error)
- func NewMatrix(Nrow, Ncol int, data []float64) (*Matrix, error)
- func (m *Matrix) AddConstant(c float64)
- func (m *Matrix) AppendCol(data []float64) error
- func (m *Matrix) AppendRow(data []float64) error
- func (m *Matrix) CreateTranspose() *Matrix
- func (m *Matrix) GetCol(ind int) ([]float64, error)
- func (m *Matrix) GetInd(row, col int) (float64, error)
- func (m *Matrix) GetRow(ind int) ([]float64, error)
- func (m *Matrix) MultiplyConstant(c float64)
- func (m *Matrix) SetCol(ind int, data []float64) error
- func (m *Matrix) SetInd(row, col int, data float64) error
- func (m *Matrix) SetRow(ind int, data []float64) error
- type RCharacter
- type RNumeric
- type RSEXP
- func CharacterToRSEXP[t RCharacter](in []t) *RSEXP
- func MakeDataFrame(rowNames, colNames []string, dataColumns ...*RSEXP) (*RSEXP, error)
- func MakeList(in ...*RSEXP) *RSEXP
- func MakeNamedList(names []string, data ...*RSEXP) (*RSEXP, error)
- func MatrixToRSEXP(in Matrix) *RSEXP
- func NewRSEXP(in any) (r RSEXP, err error)
- func NumericToRSEXP[t RNumeric](in []t) *RSEXP
- type RSEXPTYPE
Constants ¶
This section is empty.
Variables ¶
var (
	ImpossibleMatrix = errors.New("matrix size and underlying data length are not compatible")
	SizeMismatch     = errors.New("operation is not possible with given input dimensions")
	InvalidIndex     = errors.New("given index is impossible (ie < 0)")
	IndexOutOfBounds = errors.New("index is out of bounds (ie too large)")
	LengthMismatch   = errors.New("lengths of provided inputs are not the same")
)
All matrix and data frame operations check inputs for validity and will return errors where applicable.
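Because these are package-level error values, callers can test for a specific failure rather than matching on message text. The following is a minimal sketch of that pattern using only the standard library. The names errIndexOutOfBounds, lookup, and isOutOfBounds are hypothetical stand-ins declared here so the sketch runs on its own; whether rgo wraps its errors with extra context is an assumption, and errors.Is matches either way.

```go
package main

import (
	"errors"
	"fmt"
)

// errIndexOutOfBounds is a local stand-in for a package-level sentinel such as
// IndexOutOfBounds; it is redeclared here only so the sketch is self-contained.
var errIndexOutOfBounds = errors.New("index is out of bounds (ie too large)")

// lookup is a hypothetical function that returns the sentinel, wrapped with
// extra context, when the index falls outside the slice.
func lookup(data []float64, ind int) (float64, error) {
	if ind < 0 || ind >= len(data) {
		return 0, fmt.Errorf("index %d with length %d: %w", ind, len(data), errIndexOutOfBounds)
	}
	return data[ind], nil
}

// isOutOfBounds reports whether err is, or wraps, the sentinel.
func isOutOfBounds(err error) bool {
	return errors.Is(err, errIndexOutOfBounds)
}

func main() {
	_, err := lookup([]float64{1, 2, 3}, 3)
	if isOutOfBounds(err) {
		fmt.Println("likely an off-by-one index:", err)
	}
}
```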
var NotASEXP = errors.New("non-SEXP object provided to a function that needs a SEXP")
NotASEXP is returned by NewRSEXP or ExportRSEXP when it cannot coerce the input object into a *C.SEXP.
var TypeMismatch = errors.New("input SEXP type does not match desired output type")
TypeMismatch is most often returned from an AsX method when the caller tries to extract the incorrect type from a SEXP, or when they try to create a SEXP of the wrong type using a Go slice.
var UnsupportedType = errors.New("type provided is not currently supported in Rgo for this operation")
UnsupportedType is returned when a function input is not of a type that Rgo supports, such as when reading data from R that isn't of the simpler types used by Rgo.
Functions ¶
func AreMatricesEqual ¶
AreMatricesEqual returns true if the input matrices are of the same dimension and have identical data vectors. It's important to note that this function uses strict equality - even if elements of two matrices differ by floating point error, it will return false.
func AreMatricesEqualTol ¶
AreMatricesEqualTol is the same as AreMatricesEqual, except the data vectors are checked in relation to the input tolerance allowed. If any elements differ by more than the tolerance, this function will return false.
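The difference between strict and tolerance-based comparison shows up as soon as floating point arithmetic is involved. The sketch below is not rgo's implementation: Matrix is a local copy of the struct layout documented on this page, and equalTol is a hypothetical helper that assumes "differ by more than the tolerance" means an absolute difference.

```go
package main

import (
	"fmt"
	"math"
)

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// equalTol is a hypothetical helper: it reports whether a and b have the same
// dimensions and no pair of elements differs by more than tol in absolute value.
func equalTol(a, b Matrix, tol float64) bool {
	if a.Nrow != b.Nrow || a.Ncol != b.Ncol || len(a.Data) != len(b.Data) {
		return false
	}
	for i := range a.Data {
		if math.Abs(a.Data[i]-b.Data[i]) > tol {
			return false
		}
	}
	return true
}

func main() {
	// Variables force float64 arithmetic at run time; the sum is not exactly 0.3.
	x, y := 0.1, 0.2
	a := Matrix{Nrow: 1, Ncol: 1, Data: []float64{x + y}}
	b := Matrix{Nrow: 1, Ncol: 1, Data: []float64{0.3}}

	fmt.Println("strict:", a.Data[0] == b.Data[0])
	fmt.Println("within 1e-9:", equalTol(a, b, 1e-9))
}
```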
func AsCharacter ¶
func AsCharacter[t RCharacter](r RSEXP) (out []t, err error)
AsCharacter extracts the data from the input RSEXP and returns it as a slice of the given type parameter. The resulting slice contains the same data as the RSEXP, but it is a new copy that can be modified independently. If the underlying data cannot be coerced into string data, the TypeMismatch error is returned.
func AsNumeric ¶
AsNumeric extracts data from the input RSEXP and returns it as a slice of the given type parameter. The resulting slice contains the same data as the RSEXP, but it is a new copy that can be modified independently. If the underlying data cannot be coerced into numeric data, the TypeMismatch error is returned.
func ExportRSEXP ¶
ExportRSEXP converts an input RSEXP object into the caller's provided C.SEXP type. It is used as a final function to prepare data to be sent back to R. The type parameter is constrained only by any, because the rgo package cannot anticipate the exact C.SEXP type used by the caller. Like NewRSEXP, ExportRSEXP checks the type parameter provided using reflection and returns a NotASEXP error if it is not a C.SEXP.
The intent of ExportRSEXP is that it is always called with the user providing their C.SEXP as the type parameter, like so:
mySEXP, err := rgo.ExportRSEXP[C.SEXP](rgoRSEXP)
Types ¶
type Matrix ¶
type Matrix struct {
// The Matrix header - two integers which specify its dimension
Nrow, Ncol int
// The data in a matrix is represented as a single slice of data
Data []float64
}
Matrix is a representation of a matrix in Go that mirrors how matrices are represented in R. The Matrix contains a vector of all the data, and a header of two integers that contain the dimensions of the matrix. The Data vector is organized so that column indices are together, but row indices are not. In other words, the data can be thought of as a concatenation of several vectors, each of which contains the data for one column.
For example, the following Matrix:
Matrix{Nrow: 3, Ncol: 2, Data: []float64{1.1,2.2,3.3,4.4,5.5,6.6}}
will look like this:
[1.1, 4.4
 2.2, 5.5
 3.3, 6.6]
Matrix data is accessed using 0-based indexing, which is natural in Go but differs from R. For example, the 0th row in the example matrix is [1.1, 4.4], while the "1st" row is [2.2, 5.5].
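The column-major layout described above implies a simple index rule: element (row, col) lives at offset col*Nrow + row in Data. The following sketch illustrates that rule on the example matrix. Matrix is a local copy of the documented struct, and at and rowOf are hypothetical helpers, not functions from rgo.

```go
package main

import "fmt"

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// at is a hypothetical helper: in column-major order, element (row, col)
// sits at offset col*Nrow + row in Data. Indices are 0-based.
func at(m Matrix, row, col int) float64 {
	return m.Data[col*m.Nrow+row]
}

// rowOf gathers one row by reading one element from each column.
func rowOf(m Matrix, row int) []float64 {
	out := make([]float64, m.Ncol)
	for col := range out {
		out[col] = at(m, row, col)
	}
	return out
}

func main() {
	m := Matrix{Nrow: 3, Ncol: 2, Data: []float64{1.1, 2.2, 3.3, 4.4, 5.5, 6.6}}
	fmt.Println(rowOf(m, 0)) // [1.1 4.4]
	fmt.Println(rowOf(m, 1)) // [2.2 5.5]
}
```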
func AsMatrix ¶
AsMatrix returns a matrix based on the input RSEXP. All matrices must contain doubles/float64s with a dimension attribute. The data returned by this function is a copy of the data contained in the RSEXP that can be modified independently. If the data in the RSEXP cannot be coerced into a matrix, the TypeMismatch error is returned.
func CopyMatrix ¶
CopyMatrix creates an exact copy of an existing matrix. The copies are independent, so that the output matrix can be changed without changing the input matrix and vice versa.
func CreateIdentity ¶
CreateIdentity creates an identity matrix, which is always square by definition, of the input dimension. An identity matrix is a matrix with all 0s, except for a 1 in each element of the diagonal. If the given size is impossible, it will return an InvalidIndex error.
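As an illustration of the definition, here is a minimal sketch of building an identity matrix in column-major storage. It is not rgo's implementation: Matrix is a local copy of the documented struct, errInvalidIndex stands in for InvalidIndex, and treating only negative sizes as "impossible" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errInvalidIndex is a local stand-in for the package's InvalidIndex error.
var errInvalidIndex = errors.New("given index is impossible (ie < 0)")

// identity is a hypothetical helper. It assumes a negative size is the only
// impossible input, which the documentation does not spell out.
func identity(size int) (Matrix, error) {
	if size < 0 {
		return Matrix{}, errInvalidIndex
	}
	m := Matrix{Nrow: size, Ncol: size, Data: make([]float64, size*size)}
	for i := 0; i < size; i++ {
		// Diagonal element (i, i) sits at offset i*Nrow + i.
		m.Data[i*size+i] = 1
	}
	return m, nil
}

func main() {
	m, err := identity(3)
	fmt.Println(m.Data, err)
}
```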
func CreateZeros ¶
CreateZeros creates a matrix of the given dimensions in which every element is 0. If the given dimensions are nonsensical (negative, for example) it will return an InvalidIndex error.
func MatrixAdd ¶
MatrixAdd adds two matrices. Matrix addition is done by adding each element of the two matrices together, so they must be of identical size. If they are not, a SizeMismatch error will be returned.
func MatrixMultiply ¶
MatrixMultiply performs a matrix multiplication of two matrices. This is not an element-wise multiplication, but a true multiplication as defined in elementary linear algebra. In matrix multiplication, order matters. Two matrices A and B can only be multiplied (as A times B) if A has the same number of columns as B has rows. If the dimensions of the input matrices do not allow for a multiplication, a SizeMismatch error is returned.
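For reference, the textbook product C = A*B requires the left matrix to have as many columns as the right matrix has rows, and produces a matrix with A's row count and B's column count. The sketch below applies that definition to column-major data. It is an illustration under stated assumptions, not rgo's code: Matrix is a local copy of the documented struct, and multiply and errSizeMismatch are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errSizeMismatch is a local stand-in for the package's SizeMismatch error.
var errSizeMismatch = errors.New("operation is not possible with given input dimensions")

// multiply is a hypothetical helper computing C = A*B on column-major data.
func multiply(a, b Matrix) (Matrix, error) {
	if a.Ncol != b.Nrow {
		return Matrix{}, errSizeMismatch
	}
	c := Matrix{Nrow: a.Nrow, Ncol: b.Ncol, Data: make([]float64, a.Nrow*b.Ncol)}
	for j := 0; j < b.Ncol; j++ {
		for i := 0; i < a.Nrow; i++ {
			var sum float64
			for k := 0; k < a.Ncol; k++ {
				// A(i, k) * B(k, j), both read with the col*Nrow + row rule.
				sum += a.Data[k*a.Nrow+i] * b.Data[j*b.Nrow+k]
			}
			c.Data[j*c.Nrow+i] = sum
		}
	}
	return c, nil
}

func main() {
	// a is the 2x3 matrix with rows [1 2 3] and [4 5 6].
	a := Matrix{Nrow: 2, Ncol: 3, Data: []float64{1, 4, 2, 5, 3, 6}}
	// b is a 3x1 column of ones, so a*b holds the row sums of a.
	b := Matrix{Nrow: 3, Ncol: 1, Data: []float64{1, 1, 1}}

	c, err := multiply(a, b)
	fmt.Println(c.Data, err) // [6 15] <nil>

	_, err = multiply(b, b) // 3x1 times 3x1 is not defined
	fmt.Println(err)
}
```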
func NewMatrix ¶
NewMatrix creates a new matrix given a vector of data. The number of rows and columns must be provided, and it assumes the data is already in the order a Matrix should be, with column indexes adjacent. In other words, the data vector should be a concatenation of several vectors, one for each column. NewMatrix makes a copy of the input slice, so that changing the slice later will not affect the data in the matrix. If the provided dimensions don't match the length of the provided data, an ImpossibleMatrix error will be returned.
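The two behaviours described here, the dimension check and the defensive copy, can be sketched in a few lines. This is an illustration rather than rgo's implementation: Matrix is a local copy of the documented struct, and newMatrix and errImpossibleMatrix are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errImpossibleMatrix is a local stand-in for the package's ImpossibleMatrix error.
var errImpossibleMatrix = errors.New("matrix size and underlying data length are not compatible")

// newMatrix is a hypothetical constructor: it rejects dimensions that do not
// match the data length and stores a copy of the input slice.
func newMatrix(nrow, ncol int, data []float64) (*Matrix, error) {
	if nrow < 0 || ncol < 0 || nrow*ncol != len(data) {
		return nil, errImpossibleMatrix
	}
	copied := make([]float64, len(data))
	copy(copied, data)
	return &Matrix{Nrow: nrow, Ncol: ncol, Data: copied}, nil
}

func main() {
	src := []float64{1.1, 2.2, 3.3, 4.4, 5.5, 6.6}
	m, err := newMatrix(3, 2, src)
	if err != nil {
		fmt.Println(err)
		return
	}
	src[0] = 99 // the matrix owns its own copy, so it is unaffected
	fmt.Println(m.Data[0])

	_, err = newMatrix(4, 2, src) // 4*2 != 6
	fmt.Println(err)
}
```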
func (*Matrix) AddConstant ¶
AddConstant adds a constant to every element of a matrix. There is no SubtractConstant method. To subtract a constant N from a matrix, add its negative, -N.
func (*Matrix) AppendCol ¶
AppendCol appends a column onto an existing matrix and updates the dimension metadata accordingly. If the length of the provided column is not equal to the number of rows in the matrix, it will return a SizeMismatch error.
func (*Matrix) AppendRow ¶
AppendRow appends a row onto an existing matrix and updates the dimension metadata accordingly. If the length of the provided row is not equal to the number of columns in the matrix, it will return a SizeMismatch error.
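With column-major storage, appending a column is a plain append to the end of Data, while appending a row means inserting one new value at the end of every column. The sketch below shows one way to do the latter by rebuilding the slice. It is not rgo's implementation: Matrix is a local copy of the documented struct, and appendRow and errSizeMismatch are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// Matrix is a local stand-in that mirrors the fields of rgo's Matrix type.
type Matrix struct {
	Nrow, Ncol int
	Data       []float64
}

// errSizeMismatch is a local stand-in for the package's SizeMismatch error.
var errSizeMismatch = errors.New("operation is not possible with given input dimensions")

// appendRow is a hypothetical helper: it rebuilds Data so that each column
// gains one trailing element taken from row.
func appendRow(m *Matrix, row []float64) error {
	if len(row) != m.Ncol {
		return errSizeMismatch
	}
	out := make([]float64, 0, (m.Nrow+1)*m.Ncol)
	for col := 0; col < m.Ncol; col++ {
		out = append(out, m.Data[col*m.Nrow:(col+1)*m.Nrow]...) // existing column
		out = append(out, row[col])                             // new last element
	}
	m.Data = out
	m.Nrow++
	return nil
}

func main() {
	m := &Matrix{Nrow: 3, Ncol: 2, Data: []float64{1.1, 2.2, 3.3, 4.4, 5.5, 6.6}}
	if err := appendRow(m, []float64{7.7, 8.8}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(m.Nrow, m.Data)
}
```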
func (*Matrix) CreateTranspose ¶
CreateTranspose creates a new matrix which is a transpose of the input matrix. The output matrix is created from a copy of the input matrix such that they can be altered independently.
func (*Matrix) GetCol ¶
GetCol gets the column of the matrix specified by the provided index, using 0-based indexing. The first column of a matrix is index 0, even though it may be more intuitive that it should be 1. If the input index is too big, it will return an IndexOutOfBounds error. If you get this error, there's a good chance it's just an off-by-one error. The resulting slice does not point to the matrix itself, so it can be edited without altering the matrix.
func (*Matrix) GetInd ¶
GetInd returns the value of the element of the matrix at the given row and column.
func (*Matrix) GetRow ¶
GetRow gets the row of the matrix specified by the provided index, using 0-based indexing. The first row of a matrix is index 0, even though it may be more intuitive that it should be 1. If the input index is too big, it will return an IndexOutOfBounds error. If you get this error, there's a good chance it's just an off-by-one error. The resulting slice does not point to the matrix itself, so it can be edited without altering the matrix.
func (*Matrix) MultiplyConstant ¶
MultiplyConstant multiplies each element of a matrix by a constant. There is no DivideConstant method. To divide a matrix by a constant N, multiply it by its reciprocal, 1/N.
func (*Matrix) SetCol ¶
SetCol sets the column of the matrix, specified by the input index, to match the data provided. If the length of the provided column is not the same as the number of rows in the matrix, it will return a SizeMismatch error.
type RCharacter ¶
RCharacter is a type constraint covering the Go types that map well onto R's character type: strings and byte slices.
type RNumeric ¶
RNumeric is a type constraint covering the Go types that map well onto R's numeric types, including both doubles and integers. It includes both float types and all int types, but does not contain unsigned integers because R has no equivalent type.
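To make the idea concrete, here is a sketch of what a constraint of this shape can look like and how a generic function uses it. The exact definition of RNumeric is not shown on this page, so the numeric interface and toFloat64s function below are hypothetical; they only follow the description (signed integers and floats, no unsigned types).

```go
package main

import "fmt"

// numeric is a hypothetical constraint modelled on the description of RNumeric:
// signed integer and float types, with no unsigned integers.
type numeric interface {
	~int | ~int8 | ~int16 | ~int32 | ~int64 | ~float32 | ~float64
}

// toFloat64s converts a slice of any type in the constraint to []float64,
// which is the element type of an R double vector.
func toFloat64s[T numeric](in []T) []float64 {
	out := make([]float64, len(in))
	for i, v := range in {
		out[i] = float64(v)
	}
	return out
}

func main() {
	fmt.Println(toFloat64s([]int32{1, 2, 3}))
	fmt.Println(toFloat64s([]float32{0.5, 1.5}))
	// toFloat64s([]uint{1}) would not compile: uint is outside the constraint.
}
```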
type RSEXP ¶
RSEXP is the workhorse type of the rgo package. It is identical to R's SEXP implementation in C. It is used for any operation that deals with sending, receiving, or modifying data that moves from R to Go or from Go back to R.
The RSEXP type exists because Cgo does not allow packages to export C types. Therefore, the C.SEXP that is defined in the rgo package is different from a C.SEXP in the caller's main package. The RSEXP type acts as a go-between, allowing the use of functions in this package to extract, modify, or create SEXP objects without needing to write additional C code.
func CharacterToRSEXP ¶
func CharacterToRSEXP[t RCharacter](in []t) *RSEXP
CharacterToRSEXP converts a slice of strings (or byte slices) into a C.SEXP, represented by the returned RSEXP data. The R representation will have the same data as the input slice and be the STRSXP type (aka the character type in R).
func MakeDataFrame ¶
MakeDataFrame creates an R data frame from the provided inputs and returns its representing RSEXP object.
It creates a data frame based on the provided data columns (as RSEXP objects) and column names. Users may directly provide row names or an empty slice. If an empty slice is provided, then row names are automatically generated as the row indexes, starting at 1 instead of 0 to be consistent with R.
MakeDataFrame is strict about making sure the provided inputs can create a valid R data frame. The number of column names and data columns provided must match. Likewise, the lengths of all the provided data columns must match and, if row names are provided, match the number of provided row names. If any of these conditions are false, then a LengthMismatch error will be returned with more detail about which condition failed.
MakeDataFrame also checks that the types of all the provided data columns are valid types according to Rgo and that they can be used to create a column in a data frame. If these conditions are not met, an UnsupportedType error will be returned. Right now, this list includes integer vectors, real vectors, and string vectors only. Examples of invalid types include lists, data frames, or other nested SEXP objects.
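The length rules and the default row names can be illustrated without touching any R data. The sketch below is not rgo's code: checkFrame and defaultRowNames are hypothetical helpers, errLengthMismatch stands in for LengthMismatch, and column contents are reduced to their lengths so the example stays self-contained. It does not cover the column type checks.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errLengthMismatch is a local stand-in for the package's LengthMismatch error.
var errLengthMismatch = errors.New("lengths of provided inputs are not the same")

// defaultRowNames returns "1", "2", ..., "n", matching R's 1-based row labels.
func defaultRowNames(n int) []string {
	names := make([]string, n)
	for i := range names {
		names[i] = strconv.Itoa(i + 1)
	}
	return names
}

// checkFrame is a hypothetical validator. colLens holds the length of each
// data column. It returns the row names to use, or a wrapped errLengthMismatch.
func checkFrame(rowNames, colNames []string, colLens []int) ([]string, error) {
	if len(colNames) != len(colLens) {
		return nil, fmt.Errorf("%d column names for %d columns: %w",
			len(colNames), len(colLens), errLengthMismatch)
	}
	nrow := 0
	if len(colLens) > 0 {
		nrow = colLens[0]
	}
	for i, n := range colLens {
		if n != nrow {
			return nil, fmt.Errorf("column %d has length %d, want %d: %w",
				i, n, nrow, errLengthMismatch)
		}
	}
	if len(rowNames) == 0 {
		return defaultRowNames(nrow), nil
	}
	if len(rowNames) != nrow {
		return nil, fmt.Errorf("%d row names for %d rows: %w",
			len(rowNames), nrow, errLengthMismatch)
	}
	return rowNames, nil
}

func main() {
	names, err := checkFrame(nil, []string{"a", "b"}, []int{3, 3})
	fmt.Println(names, err)

	_, err = checkFrame(nil, []string{"a", "b"}, []int{3, 2})
	fmt.Println(err)
}
```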
func MakeList ¶
MakeList creates an R list from the provided inputs and returns its representing RSEXP object. Unlike MakeDataFrame and MakeNamedList, there are no restrictions on the data that is provided.
func MakeNamedList ¶
MakeNamedList creates a named list based on the provided input names and data and returns its representing RSEXP object. Users must provide the names and each element of the list. If the number of names and the number of elements provided do not match, a LengthMismatch error is returned. Elements of a named list may be of any valid SEXP type.
func MatrixToRSEXP ¶
MatrixToRSEXP converts a Matrix into a C.SEXP, represented by the returned RSEXP data. The R representation will have the same data and dimensions as the input Matrix and be of the REALSXP type (aka a double in R).
func NewRSEXP ¶
NewRSEXP creates an RSEXP object from the function input. It attempts to create an RSEXP object from any input, but only succeeds if the provided type is a C.SEXP or a *C.SEXP.
It would be ideal for this function to have a more limited set of input types (like only those that can be coerced to a C.SEXP), but checking the "coercibility" of an input (without knowing the universe of input types a priori) is impossible at compile time in Go.
Generally speaking, a failed type cast in Go results in a panic. NewRSEXP returns useful errors to the fullest extent possible, but guaranteeing no runtime failures or panics is impossible.
NewRSEXP uses a combination of reflection and the unsafe package to coerce the input into Rgo's C.SEXP type. First, it creates an unsafe pointer to the underlying data. Then, it uses reflection to verify that the input type is either a C.SEXP or a *C.SEXP. Then, it performs the coercion. If this fails at any point, it returns a NotASEXP error. If the input is a SEXP, but not one of the types supported by the Rgo package, then it returns an UnsupportedType error.
func NumericToRSEXP ¶
NumericToRSEXP converts a slice of numeric data into a C.SEXP, represented by the returned RSEXP data. The R representation will have the same data as the input slice and be the REALSXP type (aka a double in R). Because the intent of this function is to prepare data to be sent back to R, which largely treats doubles and integers the same, this function cannot return an RSEXP of type INTSXP.
type RSEXPTYPE ¶
type RSEXPTYPE int
RSEXPTYPE is the Go equivalent of R's type enumerations for SEXP types. Package constants match the enumerations used by R and have the same names.
const (
	CHARSXP RSEXPTYPE = 9
	INTSXP  RSEXPTYPE = 13
	REALSXP RSEXPTYPE = 14
	// A STRSXP is a vector of strings, where each element points to a CHARSXP.
	STRSXP RSEXPTYPE = 16
	// VECSXP is a list, which is not obvious from the name. Each element of a VECSXP is a SEXP and can be of any type.
	VECSXP RSEXPTYPE = 19
)
These constants are enumerations of the SEXPTYPEs that are part of R's internals. There are about two dozen in all, but Rgo only supports 5 of them.
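One common use of such an enumeration is a switch that decides how to handle each supported type and rejects everything else. The sketch below redeclares the type and constants locally, with the values listed above, so it runs on its own. The goEquivalent helper is hypothetical, and its strings simply restate the Go analogues given in the overview.

```go
package main

import "fmt"

// RSEXPTYPE and its constants are redeclared locally with the documented values.
type RSEXPTYPE int

const (
	CHARSXP RSEXPTYPE = 9
	INTSXP  RSEXPTYPE = 13
	REALSXP RSEXPTYPE = 14
	STRSXP  RSEXPTYPE = 16
	VECSXP  RSEXPTYPE = 19
)

// goEquivalent is a hypothetical helper mapping a supported type to a short
// description of its Go analogue. The second result is false for any other type.
func goEquivalent(t RSEXPTYPE) (string, bool) {
	switch t {
	case CHARSXP:
		return "string", true
	case INTSXP:
		return "[]int", true
	case REALSXP:
		return "[]float64", true
	case STRSXP:
		return "[]string", true
	case VECSXP:
		return "list", true
	}
	return "", false
}

func main() {
	for _, t := range []RSEXPTYPE{REALSXP, STRSXP, RSEXPTYPE(10)} {
		name, ok := goEquivalent(t)
		fmt.Println(int(t), name, ok)
	}
}
```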