A small, checkmate-first schema DSL for R data.
schemate can infer schemas from example objects, edit schema documents, save them as JSON, read them back, and validate new inputs against the schema.
The package is meant for package authors and pipeline authors who want a compact R-native schema format without adopting the full JSON Schema vocabulary. A typical workflow is:
- infer a conservative schema with schema_infer();
- edit it with schema_*() authoring verbs;
- save it with schema_write();
- read it back with schema_read();
- validate inputs with schema_validate().
Installation
install.packages("schemate")
Development Version
To get a bug fix or use a feature that has not been released yet, install the development version of schemate from GitHub.
# install.packages("pak")
pak::pak("hongyuanjia/schemate")
Quick Start
The public API uses a single schema_ prefix and works well in pipelines. Start from an example object, infer a conservative schema, then compact it into something easier to edit and review.
library(schemate)
payload <- list(
  items = list(
    list(id = 1L, name = "alpha", label = "Alpha", slug = "alpha"),
    list(id = 2L, name = "beta", label = "Beta", slug = "beta")
  )
)
schema <- payload |>
  schema_infer(keys = "named", arrays = "rest") |>
  schema_compact() |>
  schema_set_desc("$items", "Repository-like result items")
schema
## {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "items": {
## "description": "Repository-like result items",
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "unnamed"
## },
## "rest": {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "id": {
## "check": {
## "kind": "int"
## }
## }
## },
## "groups": [
## {
## "names": ["name", "label", "slug"],
## "check": {
## "kind": "string"
## }
## }
## ]
## }
## }
## }
## }
schema |>
  schema_validate(payload, mode = "test")
schema_validate() defaults to assert mode: invalid input raises an error and valid input is returned invisibly. Other modes are available when you need a message or a boolean result.
bad_payload <- payload
bad_payload$items[[1L]]$id <- "bad"
schema |>
  schema_validate(bad_payload, mode = "check", name = "payload")
schema |>
  schema_validate(bad_payload, mode = "test", name = "payload")
When validating many payloads against the same schema, flatten once and reuse the flattened schema.
flat <- schema_flatten(schema)
schema_validate(flat, payload, mode = "test")
For a data frame example, see the Get started article.
JSON Workflow
Schemas are stored as a compact JSON DSL. The DSL is not JSON Schema; it is a thin representation of checkmate checks, field schemas, local definitions, and combinators. See the Schema DSL article for the complete format reference. schema_read() and schema_write() require the suggested package jsonlite.
path <- tempfile(fileext = ".json")
schema_write(schema, path)
restored <- schema_read(path)
restored
## {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "items": {
## "description": "Repository-like result items",
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "unnamed"
## },
## "rest": {
## "check": {
## "kind": "list"
## },
## "keys": {
## "type": "named"
## },
## "fields": {
## "id": {
## "check": {
## "kind": "int"
## }
## }
## },
## "groups": [
## {
## "names": ["name", "label", "slug"],
## "check": {
## "kind": "string"
## }
## }
## ]
## }
## }
## }
## }
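Because the stored form is plain JSON, a schema document can also be inspected without schemate. The sketch below parses a fragment copied from the printed schema above with jsonlite and walks down to a single check. The variable names are illustrative, and nothing here is part of schemate's API.

```r
library(jsonlite)

# Fragment of the schema printed above, trimmed to the nested "id" field.
schema_json <- '{
  "check": {"kind": "list"},
  "keys": {"type": "named"},
  "fields": {
    "items": {
      "check": {"kind": "list"},
      "keys": {"type": "unnamed"},
      "rest": {
        "check": {"kind": "list"},
        "keys": {"type": "named"},
        "fields": {
          "id": {"check": {"kind": "int"}}
        }
      }
    }
  }
}'

# simplifyVector = FALSE keeps the document as plain nested lists.
doc <- fromJSON(schema_json, simplifyVector = FALSE)

# Walk from the root to the check applied to each item's id.
id_kind <- doc$fields$items$rest$fields$id$check$kind
top_level_fields <- names(doc$fields)
```

Reading the document this way is useful for review tooling or diffs; validation itself should still go through schema_read() and schema_validate().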
restored |>
  schema_validate(payload)
Example schema files ship in the package's extdata directory (inst/extdata in the source tree):
system.file("extdata", "person-schema.json", package = "schemate")
Validation Modes
schema_validate() supports four modes:
| Mode | Return value on success | Return value on failure |
|---|---|---|
| assert | invisibly returns the input | throws an error |
| check | TRUE | diagnostic string |
| test | TRUE | FALSE |
| expect | testthat-style expectation object | expectation failure object |
Use assert inside application code, check when displaying diagnostics, test for control flow, and expect in tests.
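The mode names match checkmate's own function families, so the return values in the table can be observed with plain checkmate calls on a single value. This is a sketch that uses checkmate directly, not schemate's implementation; it assumes each schema_validate() mode behaves like the corresponding checkmate family, as the table describes.

```r
library(checkmate)

bad_id <- "bad"

# test: a plain logical, suited to control flow.
ok <- test_int(bad_id)

# check: TRUE on success, otherwise a diagnostic string.
msg <- check_int(bad_id)

# assert: raises an error on failure. It is caught here so the
# script keeps running and the message can be inspected.
err <- tryCatch(
  assert_int(bad_id, .var.name = "id"),
  error = function(e) conditionMessage(e)
)

# On valid input, assert returns the input invisibly.
good <- assert_int(1L)

if (!ok) {
  message("id rejected: ", msg)
}
```

The expect family is left out because it is only meaningful inside a testthat test.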
Standalone Use
schemate also publishes a generated standalone bundle for packages that want the schema features without depending on schemate at runtime.
usethis::use_standalone("hongyuanjia/schemate", "schema", ref = "standalone")
Relation to Other Tools
schemate is closest in spirit to checkmate: schemas ultimately validate R objects by calling checkmate checks. It adds a schema lifecycle around those checks: infer, edit, serialize, read, and validate.
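For comparison, this is roughly what the Quick Start contract looks like when written by hand with checkmate. validate_payload() is a hypothetical helper, not something schemate generates, and mapping the "int" and "string" kinds to assert_int() and assert_string() is an assumption based on the kind names.

```r
library(checkmate)

payload <- list(
  items = list(
    list(id = 1L, name = "alpha", label = "Alpha", slug = "alpha"),
    list(id = 2L, name = "beta", label = "Beta", slug = "beta")
  )
)

# Hypothetical hand-written equivalent of the inferred schema.
validate_payload <- function(x) {
  assert_list(x, names = "named")
  assert_list(x$items, types = "list")
  for (item in x$items) {
    assert_list(item, names = "named")
    assert_int(item$id)
    for (field in c("name", "label", "slug")) {
      assert_string(item[[field]])
    }
  }
  invisible(x)
}

validate_payload(payload)

# The same invalid input as in the Quick Start is rejected.
bad_payload <- payload
bad_payload$items[[1L]]$id <- "bad"
bad_result <- tryCatch(
  validate_payload(bad_payload),
  error = function(e) "rejected"
)
```

The checks are the same, but the hand-written version lives only in code; the schema form can be inferred, edited, saved as JSON, and reviewed as data.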
pointblank is a better fit for tabular data quality workflows, reporting, and column-oriented validation plans. schemate is deliberately narrower and more structural: it describes R values, R object names, nested lists, JSON-like payloads, and package-facing input contracts. It is not a replacement for JSON Schema or jsonvalidate, which are better choices when you need standards-compliant JSON document validation.
The R validation ecosystem is broad:
- validate captures data validation rules that can be documented, stored, and applied to data sets.
- assertr is designed for assertive data checks inside analysis pipelines.
- data.validator focuses on dataset validation with reporting.
- vetr provides template-based structural checks for R objects.
- testthat is the right home for unit-test expectations; schema_validate(..., mode = "expect") is intended to fit into that style.