With Repository, in-memory objects do not need to know whether a database is present or absent; they need no SQL interface code and certainly no knowledge of the database schema.
In R, the simplest form of Repository encapsulates data.frame entries persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer. From the caller's point of view, the location (local or remote), the technology and the interface of the data store are obscured.
In situations with multiple data sources.
In situations where the real data store, the one that is used in production, is remote. This allows you to implement a Repository mock with identical queries that runs locally. Then, the mock could be used during development and testing. The mock itself may comprise a sample of the real data store or just fake data.
In situations where the real data store doesn’t exist. Implementing a mock Repository allows you to defer immature decisions about the database technology and/or defer its deployment. In this way, the temporary solution allows you to focus the development effort on the core functionality of the application.
In situations where SQL queries can be represented by meaningful names. For example, Repository$get_efficient_cars() = SELECT * FROM mtcars WHERE mpg > 20
When building stateless microservices.
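To illustrate the idea of a named query served by a local mock, here is a minimal sketch, assuming the mock is backed by the built-in mtcars data rather than a real database. The constructor name new_mock_repository is hypothetical; only get_efficient_cars and its mpg > 20 rule come from the text above.

```r
# Hypothetical mock: a plain list of closures standing in for a Repository object.
new_mock_repository <- function(cars = datasets::mtcars) {
  list(
    # Same meaning as: SELECT * FROM mtcars WHERE mpg > 20
    get_efficient_cars = function() cars[cars$mpg > 20, , drop = FALSE]
  )
}

repository <- new_mock_repository()
efficient_cars <- repository$get_efficient_cars()
```

The caller only sees the name get_efficient_cars; whether the rows come from a data.frame or from an SQL statement is an implementation detail.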
The code of the abstract base class of Repository is:
AbstractRepository <- R6::R6Class("Repository", inherit = Singleton, cloneable = FALSE, public = list(
  #' @description Instantiate an object
  initialize = function() exceptions$not_implemented_error(),
  #' @description Add an element to the Repository.
  add = function(key, value) exceptions$not_implemented_error(),
  #' @description Delete an element from the Repository.
  del = function(key) exceptions$not_implemented_error(),
  #' @description Retrieve an element from the Repository.
  get = function(key) exceptions$not_implemented_error()
))
Tip: By passing the input argument inherit = Singleton, the AbstractRepository inherits the qualities of the Singleton pattern.
The given implementation of AbstractRepository requires you to define four functions:
initialize establishes a database connection of some sort;
add adds one or more domain objects into the database;
del deletes one or more domain objects from the database; and
get retrieves one or more domain objects from the database.
Tip: In general, the Repository pattern requires at least the add and get operations. However, you may rename those operations to fit your context. For example, if you use Repository to access various tables in a database, write_table and read_table might be better names.
Note: It is up to you to devise a policy that defines (A) what to do when the same entity is added to the Repository more than once; and (B) what to do when a query matches no results.
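The two policy decisions can be made explicit in code. The following is a minimal sketch in base R, not part of the package: new_policy_store and its overwrite argument are hypothetical names, and the store is a plain environment rather than a real database.

```r
# Hypothetical helper: a tiny in-memory store that makes both policies explicit.
new_policy_store <- function(null_object = data.frame()) {
  items <- new.env(parent = emptyenv())
  list(
    add = function(key, value, overwrite = TRUE) {
      # Policy (A): replace an existing entity, or refuse with an error.
      if (!overwrite && exists(key, envir = items, inherits = FALSE)) {
        stop("key already exists: ", key)
      }
      assign(key, value, envir = items)
      invisible(TRUE)
    },
    get = function(key) {
      # Policy (B): return a predefined NULL object when nothing matches.
      if (!exists(key, envir = items, inherits = FALSE)) return(null_object)
      base::get(key, envir = items, inherits = FALSE)
    }
  )
}

store <- new_policy_store()
store$add("Mazda RX4", data.frame(mpg = 21))
store$add("Mazda RX4", data.frame(mpg = 22))   # default policy: overwrite
duplicate_rejected <- tryCatch(
  {store$add("Mazda RX4", data.frame(mpg = 23), overwrite = FALSE); FALSE},
  error = function(e) TRUE
)
```

Whichever policy you pick, stating it in one place keeps every implementation of the Repository consistent with the others.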
Each Repository implementation is project specific. The following implementations, one transient and one persistent, are Repositories of car models with their specifications.
From the caller perspective, both implementations behave identically – they have the same queries. Nevertheless, under the hood the two implementations employ different storage approaches.
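Because callers depend only on the shared add, get and del interface, the same calling code can run against either implementation. The following is a minimal sketch of that idea; exercise_repository and new_stub_repository are hypothetical names, and the stub is a base R stand-in used only to keep the example self-contained.

```r
# Hypothetical helper: touches nothing but the add/get/del interface.
exercise_repository <- function(repository) {
  car <- data.frame(uid = "Mazda RX4", mpg = 21)
  repository$add(key = "Mazda RX4", value = car)
  stored <- repository$get(key = "Mazda RX4")
  repository$del(key = "Mazda RX4")
  remaining <- repository$get(key = "Mazda RX4")
  list(stored_rows = nrow(stored), remaining_rows = nrow(remaining))
}

# Hypothetical stand-in: keeps entries in a list captured by the closures.
new_stub_repository <- function() {
  cars <- list()
  list(
    add = function(key, value) { cars[[key]] <<- value; invisible(NULL) },
    del = function(key) { cars[[key]] <<- NULL; invisible(NULL) },
    # An empty data.frame plays the role of the NULL object here.
    get = function(key) if (is.null(cars[[key]])) data.frame() else cars[[key]]
  )
}

result <- exercise_repository(new_stub_repository())
```

Any object exposing the same three operations could be passed to exercise_repository without changing the function.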
collections
Tip: Transient implementations are a temporary solution that is good for testing and rapid prototyping.
Caution: Transient implementations are not recommended in production. Transient storage is lost when a session is restarted, so consider the ramifications of losing all the data put into storage.
First, we define the class constructor, initialize, to establish a transient data storage. In this case we use a dictionary from the collections package.
Second, we define the add, del and get functions that operate on the dictionary.
As an optional step, we define the NULL object. In this case, rather than the reserved word NULL, the NULL object is a data.frame with 0 rows and predefined columns.
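A short sketch of why a 0-row data.frame can be more convenient than the reserved word NULL. It reuses the expression that defines NULL_car in the class definition; the other variable names are illustrative only.

```r
# The NULL object: no rows, but the full set of columns (uid plus the mtcars columns).
NULL_car <- cbind(uid = NA_character_, datasets::mtcars)[0, ]

no_match_rows <- nrow(NULL_car)          # zero rows signal "nothing found"
has_mpg <- "mpg" %in% names(NULL_car)    # the schema is still available

# Downstream code written for a data.frame keeps working without a special case.
still_a_table <- subset(NULL_car, mpg > 20)

# With the reserved word NULL, the column information is gone.
null_has_columns <- !is.null(names(NULL))
```

Callers can therefore test nrow(result) == 0 instead of branching on is.null(result) before every operation.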
TransientRepository <- R6::R6Class(
  classname = "Repository", inherit = R6P::AbstractRepository, public = list(
    initialize = function() {private$cars <- collections::dict()},
    add = function(key, value){private$cars$set(key, value); invisible(self)},
    del = function(key){private$cars$remove(key); invisible(self)},
    get = function(key){return(private$cars$get(key, default = private$NULL_car))}
  ), private = list(
    NULL_car = cbind(uid = NA_character_, datasets::mtcars)[0,],
    cars = NULL
  ))
Adding customised operations is also possible via the R6 set function. The following example adds a query that returns all the objects in the database:
TransientRepository$set("public", "get_all_cars", overwrite = TRUE, function(){
  result <- private$cars$values() %>% dplyr::bind_rows()
  if(nrow(result) == 0) return(private$NULL_car) else return(result)
})
In this example, we use the mtcars dataset with a uid column that uniquely identifies the different cars in the Repository:
library(magrittr) # provides the %>% pipe used in this example
mtcars <- datasets::mtcars %>% tibble::rownames_to_column("uid")
head(mtcars, 2)
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> 2 Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
Here is how the caller uses the Repository:
## Instantiate a repository object
repository <- TransientRepository$new()
## Add two different car specifications to the repository
repository$add(key = "Mazda RX4", value = dplyr::filter(mtcars, uid == "Mazda RX4"))
repository$add(key = "Mazda RX4 Wag", value = dplyr::filter(mtcars, uid == "Mazda RX4 Wag"))
## Get "Mazda RX4" specification
repository$get(key = "Mazda RX4")
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
## Get all the specifications in the repository
repository$get_all_cars()
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> 2 Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
## Delete "Mazda RX4" specification
repository$del(key = "Mazda RX4")
## Get "Mazda RX4" specification
repository$get(key = "Mazda RX4")
#> [1] uid mpg cyl disp hp drat wt qsec vs am gear carb
#> <0 rows> (or 0-length row.names)
DBI
First, we define the class constructor, initialize, to establish an SQLite database.
Second, we define the add, del and get functions that operate on the database.
As an optional step, we define the NULL object. In this case, rather than the reserved word NULL, the NULL object is a data.frame with 0 rows and predefined columns.
PersistentRepository <- R6::R6Class(
  classname = "Repository", inherit = AbstractRepository, public = list(
    #' @param immediate (`logical`) Should queries be committed immediately?
    initialize = function(immediate = TRUE){
      private$immediate <- immediate
      private$conn <- DBI::dbConnect(RSQLite::SQLite(), dbname = ":memory:")
      DBI::dbCreateTable(private$conn, "mtcars", private$NULL_car)
    },
    add = function(key, value){
      car <- private$NULL_car %>% tibble::add_row(value)
      # Replace policy: remove any existing entry with the same key before appending
      self$del(key = key)
      DBI::dbAppendTable(private$conn, "mtcars", car)
      invisible(self)
    },
    del = function(key){
      # sqlInterpolate() quotes the key safely, e.g. when it contains an apostrophe
      statement <- DBI::sqlInterpolate(private$conn, "DELETE FROM mtcars WHERE uid = ?key", key = key)
      DBI::dbExecute(private$conn, statement, immediate = private$immediate)
      invisible(self)
    },
    get = function(key){
      statement <- DBI::sqlInterpolate(private$conn, "SELECT * FROM mtcars WHERE uid = ?key", key = key)
      result <- DBI::dbGetQuery(private$conn, statement)
      if(nrow(result) == 0) return(private$NULL_car) else return(result)
    }
  ), private = list(
    NULL_car = cbind(uid = NA_character_, datasets::mtcars)[0,],
    immediate = NULL,
    conn = NULL)
)
Adding customised operations is also possible via the R6 set function. The following example adds a query that returns all the objects in the database:
PersistentRepository$set("public", "get_all_cars", overwrite = TRUE, function(){
  statement <- "SELECT * FROM mtcars"
  result <- DBI::dbGetQuery(private$conn, statement)
  if(nrow(result) == 0) return(private$NULL_car) else return(result)
})
In this example, we use the mtcars dataset with a uid column that uniquely identifies the different cars in the Repository:
library(magrittr) # provides the %>% pipe used in this example
mtcars <- datasets::mtcars %>% tibble::rownames_to_column("uid")
head(mtcars, 2)
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> 2 Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
Here is how the caller uses the Repository:
## Instantiate a repository object
repository <- PersistentRepository$new()
## Add two different car specifications to the repository
repository$add(key = "Mazda RX4", value = dplyr::filter(mtcars, uid == "Mazda RX4"))
repository$add(key = "Mazda RX4 Wag", value = dplyr::filter(mtcars, uid == "Mazda RX4 Wag"))
## Get "Mazda RX4" specification
repository$get(key = "Mazda RX4")
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.62 16.46 0 1 4 4
## Get all the specifications in the repository
repository$get_all_cars()
#> uid mpg cyl disp hp drat wt qsec vs am gear carb
#> 1 Mazda RX4 21 6 160 110 3.9 2.620 16.46 0 1 4 4
#> 2 Mazda RX4 Wag 21 6 160 110 3.9 2.875 17.02 0 1 4 4
## Delete "Mazda RX4" specification
repository$del(key = "Mazda RX4")
## Get "Mazda RX4" specification
repository$get(key = "Mazda RX4")
#> [1] uid mpg cyl disp hp drat wt qsec vs am gear carb
#> <0 rows> (or 0-length row.names)
Repository at Martin Fowler Blog
Fowler, Martin. 2002. Patterns of enterprise application architecture. Addison-Wesley Longman Publishing Co., Inc.