This vignette documents the JSON DSL used by schemate.
Most users should start with schema_infer() and the edit
helpers; hand-written DSL is useful for package fixtures, configuration
files, and stable schemas reviewed outside R.
schemate JSON DSL
Goal
This DSL is a human-centric, checkmate-first schema format for
hand-written JSON. It is designed to describe the structure of an
existing R object and to be parsed into a schemate
SchemaDoc object.
This DSL is not a JSON Schema implementation. Its
semantics are derived from checkmate::assert_*() and a
small set of schema combinators.
Minimum mental model
Most schema documents are nested combinations of a few keywords:
| DSL key | Use it for |
|---|---|
| check | the checkmate constraint for the current node |
| fields | exact field-name schemas for named containers |
| groups | multiple sibling fields that share the same schema |
| patterns | regex field-name schemas for otherwise unspecified fields |
| positions | position-specific schemas for unnamed lists |
| rest | otherwise unspecified fields or remaining unnamed-list positions |
| $defs / $ref | manually named reusable schema components |
| all, any, one, not | schema combinators |
For most workflows, start with schema_infer(), inspect
as.list(schema), then use edit verbs to add the few rules
inference cannot know.
Common schema shapes
A scalar schema is just a check node:
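As a minimal sketch, a scalar string schema might look like the following. The kind and the min.chars argument (forwarded to checkmate::assert_string()) are illustrative choices.

```json
{
  "check": { "kind": "string", "min.chars": 1 }
}
```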
A named container uses fields for exact child names and
can use rest for the remaining fields:
{
"check": { "kind": "list" },
"keys": { "type": "named" },
"fields": {
"id": { "check": { "kind": "int", "lower": 1 } }
},
"rest": { "check": { "kind": "string" } }
}
An unnamed JSON-like array uses keys.type = "unnamed".
Use rest for a homogeneous array and positions
when the first positions have tuple-like meaning:
{
"check": { "kind": "list" },
"keys": { "type": "unnamed" },
"rest": { "check": { "kind": "string" } }
}
The rest of this article is a reference for the same building blocks.
The container-child sections explain fields,
groups, patterns, positions, and
rest; the formal-rule sections explain reserved keys,
shorthand syntax, validation order, and serialization order.
When to write DSL JSON by hand
Most users should start from schema_infer(), edit the
result with schema_add_field(),
schema_set_desc(), schema_set_keys(), and
related helpers, then save with schema_write().
Hand-written JSON is useful when:
- a schema is part of a package or analysis configuration;
- a non-R workflow needs to review or patch the schema text;
- you want stable schema fixtures for tests;
- you are translating an existing input contract into a compact R schema.
Prefer small, explicit schemas. Use $defs and
$ref when the same child schema appears repeatedly, and use
groups when several sibling fields share one rule.
Helper mapping
The R helpers create the same JSON shapes documented below:
| R helper | DSL keyword |
|---|---|
| schema_check() | check |
| schema_ref() | $ref |
| schema_all() | all |
| schema_any() | any |
| schema_one() | one |
| schema_not() | not |
| schema_group() | groups[] |
Formal reference
Core principles
- Each schema node has exactly one primary operator.
- The primary operator must be one of:
  - check
  - all
  - any
  - one
  - not
  - $ref
- A check node uses check.kind to select a supported checkmate::assert_*() suffix.
- All keys inside check other than kind are passed directly to assert_<kind>().
- There is no checks keyword in this grammar; use all for conjunction.
- Container kinds are:
  - list
  - data_frame
  - data_table
  - tibble
- fields, groups, patterns, positions, rest, and keys are only valid on check nodes whose check.kind is a container kind.
- positions maps 1-based positions to child schemas for unnamed lists.
- rest is the child schema for otherwise unspecified object fields or unnamed list elements.
- patterns maps regular expressions to child schemas for otherwise unspecified object fields whose names match those patterns.
- groups can batch-assign multiple field names to one shared child schema node.
- A top-level schema document may optionally declare version as a non-empty string representing the document version.
- Root-level $defs can store reusable schema nodes and $ref can reuse them.
- Inside all/any/one/not, a branch may use shorthand check syntax without an explicit check wrapper.
Reserved schema DSL keys
The following keys are reserved by the schema DSL:
- check
- all
- any
- one
- not
- fields
- groups
- patterns
- positions
- rest
- keys
- description
- $defs
- $ref
- version
$defs and version are document-level keys.
They are only valid on the top-level schema document and must not appear
inside nested schema nodes or shorthand combinator branches.
Field names and field-name rules from the validated data do not appear at the node top level. They appear inside:
- fields
- groups[].names
- patterns
- $defs definition names
Top-level schema document
A schema document contains:
- an optional version string
- an optional $defs object
- exactly one root schema node
Example:
{
"version": "1.0.0",
"$defs": {
"text": { "check": { "kind": "string" } }
},
"description": "root alias",
"$ref": "#/$defs/text"
}
version is document metadata only. It describes the
schema document version and does not alter node parsing or validation
semantics.
For canonical serialization via as.list(), top-level
keys are emitted in this order when present:
- version
- $defs
- the root schema node keys
Container check nodes
Only the following check.kind values are treated as
container checks:
- list
- data_frame
- data_table
- tibble
Only these container checks may carry:
- keys
- fields
- groups
All non-container check.kind values must reject these
keys.
Node forms
Check node
A check node has the form:
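In outline, the node is an object with a single check key whose kind names the assertion and whose remaining keys become its arguments. A hedged sketch, assuming a number bounded to the range 0 to 1 (lower and upper are checkmate::assert_number() arguments):

```json
{
  "check": {
    "kind": "number",
    "lower": 0,
    "upper": 1
  }
}
```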
Rules:
- check must be an object.
- check.kind must be a non-empty string.
- check.kind must name a supported checkmate::assert_*() suffix.
- All other keys inside check are forwarded to assert_<kind>().
Examples:
- "string" -> assert_string()
- "list" -> assert_list()
- "choice" -> assert_choice()
- "int" -> assert_int()
- "number" -> assert_number()
- "flag" -> assert_flag()
- "data_frame" -> assert_data_frame()
Unknown kind values should be rejected during schema
parsing.
all
all is a non-empty array of child schemas.
The node is valid only if all child schemas validate successfully.
Example:
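A sketch of a conjunction, assuming a value that must be a non-empty string and must also match a lowercase pattern. The specific constraints are illustrative; min.chars and pattern are checkmate::assert_string() arguments.

```json
{
  "all": [
    { "check": { "kind": "string", "min.chars": 1 } },
    { "check": { "kind": "string", "pattern": "^[a-z]+$" } }
  ]
}
```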
any
any is a non-empty array of child schemas.
The node is valid if at least one child schema validates successfully.
Example:
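A sketch of a disjunction, assuming an identifier that may be given either as a string or as an integer. The choice of kinds is illustrative.

```json
{
  "any": [
    { "check": { "kind": "string" } },
    { "check": { "kind": "int" } }
  ]
}
```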
one
one is a non-empty array of child schemas.
The node is valid if exactly one child schema validates successfully.
Example:
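A sketch of an exclusive choice, assuming a setting that is either a logical flag or a string. The two kinds are chosen so that no value can satisfy both branches.

```json
{
  "one": [
    { "check": { "kind": "flag" } },
    { "check": { "kind": "string" } }
  ]
}
```

With overlapping branches, for instance one int branch with "lower": 0 and another with "upper": 10, a value of 5 would satisfy both branches and the one node would fail.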
not
not is a single child schema.
The node is valid only if the child schema does not validate successfully.
Example:
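A sketch of a negation, assuming a value that must not be a string beginning with tmp_. The pattern is illustrative.

```json
{
  "not": {
    "check": { "kind": "string", "pattern": "^tmp_" }
  }
}
```

On its own this node also accepts non-string values, because they fail the inner check. Combine it with a string check under all when the value must additionally be a string.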
Shorthand check syntax inside combinators
Inside the child positions of:
- all
- any
- one
- not
the DSL allows a shorthand form for a check node:
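A sketch of a shorthand branch, assuming a non-empty string check. It is an object with kind and its arguments at the top level, and it is only meaningful as a child of all, any, one, or not.

```json
{ "kind": "string", "min.chars": 1 }
```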
This is equivalent to:
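For an illustrative shorthand branch such as { "kind": "string", "min.chars": 1 }, the expanded full node would wrap the same keys in check:

```json
{
  "check": { "kind": "string", "min.chars": 1 }
}
```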
Rules for shorthand check objects:
- They must contain kind.
- They may use any supported check.kind, including container kinds such as list, data_frame, data_table, and tibble.
- They must not contain any explicit primary operator keys:
  - check
  - all
  - any
  - one
  - not
  - $ref
- They must not contain any node-level adjunct keys:
  - fields
  - groups
  - keys
  - description
  - $defs
- They are only valid inside combinator child positions.
If a combinator branch needs node-level metadata or adjunct schema features, write it as a full node instead of shorthand.
Allowed examples:
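A sketch of branches that satisfy the shorthand rules, shown inside an any node. Each branch carries only kind plus check arguments; the kinds and arguments are illustrative.

```json
{
  "any": [
    { "kind": "string" },
    { "kind": "int", "lower": 0 },
    { "kind": "choice", "choices": ["a", "b"] },
    { "kind": "list" }
  ]
}
```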
Disallowed examples:
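A sketch of branches that the rules above would reject. The first branch adds description, the second adds fields, and the third adds an explicit check key next to kind. The field and description values are hypothetical.

```json
{
  "any": [
    { "kind": "string", "description": "a label" },
    { "kind": "list", "fields": { "id": { "check": { "kind": "int" } } } },
    { "kind": "string", "check": { "kind": "string" } }
  ]
}
```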
Example:
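A sketch that mixes both styles in one combinator, assuming a hypothetical value that is either a plain string or a named list. The second branch needs description and keys, so it is written as a full node rather than shorthand.

```json
{
  "any": [
    { "kind": "string" },
    {
      "description": "structured value (illustrative)",
      "check": { "kind": "list" },
      "keys": { "type": "named" }
    }
  ]
}
```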
$defs and $ref
The DSL supports a minimal, local-reference reuse mechanism inspired by JSON Schema.
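A sketch of reuse inside a container, assuming a hypothetical text definition shared by two fields. Definitions live in the root-level $defs, and each $ref node carries at most a description beside the reference.

```json
{
  "$defs": {
    "text": { "check": { "kind": "string" } }
  },
  "check": { "kind": "list" },
  "fields": {
    "title": { "$ref": "#/$defs/text" },
    "subtitle": {
      "description": "illustrative second use of the same definition",
      "$ref": "#/$defs/text"
    }
  }
}
```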
fields
fields is only valid on nodes where:
- the primary operator is check
- check.kind is a container kind
fields must be a JSON object whose values are complete
schema nodes. The field name "*" has no special meaning; it
is treated like any other literal field name. Use rest for
otherwise unspecified fields.
Example:
{
"check": { "kind": "list" },
"fields": {
"field1": { "check": { "kind": "string" } },
"field2": { "check": { "kind": "int" } }
}
}
Pattern fields
patterns maps regular expressions to schema nodes.
Patterns are applied only to fields that are not already covered by
fields or groups.
Example:
{
"check": { "kind": "list" },
"patterns": {
"^meta_": { "check": { "kind": "string" } },
"_count$": { "check": { "kind": "int" } }
}
}
If a field matches multiple patterns, it must satisfy every matching schema.
Rest schema
rest is the child schema for otherwise unspecified
fields.
Example:
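A sketch combining rest with fields and patterns on a list node. The field name id and the pattern ^meta_ are hypothetical.

```json
{
  "check": { "kind": "list" },
  "fields": {
    "id": { "check": { "kind": "int" } }
  },
  "patterns": {
    "^meta_": { "check": { "kind": "string" } }
  },
  "rest": { "check": { "kind": "number" } }
}
```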
This means that any field not covered by fields,
groups, or patterns must satisfy the
rest schema.
positions
positions is only valid on unnamed container nodes:
- the primary operator is check
- check.kind is a container kind
- keys.type is "unnamed"
positions is an unnamed array of schema nodes. It
follows JSON Schema prefixItems semantics: a declared
position is validated only when the input actually has that position.
Missing positions do not fail by themselves; use len,
min.len, or max.len inside the
check rule when length matters.
Example, similar to a CrossRef date-parts item:
{
"check": { "kind": "list", "min.len": 1, "max.len": 3 },
"keys": { "type": "unnamed" },
"positions": [
{ "check": { "kind": "int", "lower": 0 } },
{ "check": { "kind": "int", "lower": 1, "upper": 12 } },
{ "check": { "kind": "int", "lower": 1, "upper": 31 } }
]
}
When positions is combined with rest,
declared positions are validated first and all remaining positions are
validated with rest:
{
"check": { "kind": "list" },
"keys": { "type": "unnamed" },
"positions": [
{ "check": { "kind": "string" } },
{ "check": { "kind": "int" } }
],
"rest": { "check": { "kind": "number" } }
}
With keys.type = "unnamed", positions and
rest are allowed. fields, groups,
and patterns are named-object constraints and must not be
mixed with unnamed array semantics.
groups
groups is an optional array available only on nodes
where:
- the primary operator is check
- check.kind is a container kind
Each group item must be a named object with:
- names: a non-empty character vector of field names
- a complete schema node written directly beside names
groups is expanded during parsing into ordinary named
fields that behave exactly like entries written explicitly in
fields.
Rules:
- group names must be unique within a group
- group names must not overlap across groups
- group names must not overlap with explicit fields
- each group item must contain exactly one primary operator in addition to names
- a group item must not wrap its target node inside an anonymous nested list
- a group item must not contain both $ref and check at the same level
- if present, description must be a non-empty string
If a group item provides description, it is copied to
every expanded field in that group and overrides the referenced node’s
own description.
Valid examples:
{
"names": ["name", "label"],
"all": [
{ "$ref": "#/$defs/text" },
{ "check": { "kind": "string" } }
]
}
Groups can also share a container schema. This is useful after compacting API payload schemas where several fields have the same nested structure:
{
"names": ["issued", "created", "published-print"],
"check": { "kind": "list" },
"fields": {
"date-parts": {
"check": { "kind": "list" },
"keys": { "type": "unnamed" },
"rest": {
"check": { "kind": "list" },
"keys": { "type": "unnamed" },
"rest": { "check": { "kind": "int" } }
}
}
}
}
Invalid example (R-list shape):
The invalid form above is rejected because:
- the target node is wrapped in an anonymous nested list
- the inner object contains more than one primary operator
Example:
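A sketch of a complete node that uses groups next to fields, assuming hypothetical field names and a hypothetical text definition. Each group item holds names plus exactly one primary operator, and no name repeats across groups or explicit fields.

```json
{
  "$defs": {
    "text": { "check": { "kind": "string" } }
  },
  "check": { "kind": "list" },
  "fields": {
    "id": { "check": { "kind": "int" } }
  },
  "groups": [
    {
      "names": ["name", "label"],
      "description": "illustrative shared description",
      "$ref": "#/$defs/text"
    },
    {
      "names": ["width", "height"],
      "check": { "kind": "number", "lower": 0 }
    }
  ]
}
```

Following the expansion rule above, this would behave as if name, label, width, and height had each been written out under fields.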
keys
keys is only valid on container check
nodes.
keys encodes arguments for:
checkmate::check_names(names(x), ...)
Scalar form
If keys is a scalar value, it is shorthand for:
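Sketched with the illustrative value "unique" on a list node, the expanded object form places the scalar under type:

```json
{
  "check": { "kind": "list" },
  "keys": { "type": "unique" }
}
```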
That scalar value is therefore interpreted as the type
argument of checkmate::check_names().
Example:
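A sketch of the scalar form on a list node, assuming "unique" as the name-check type (one of the type values accepted by checkmate::check_names()):

```json
{
  "check": { "kind": "list" },
  "keys": "unique"
}
```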
Object form
If keys is an object:
- all keys are passed directly to checkmate::check_names(names(x), ...)
- keys.type is the only way to specify the name-check type
Example:
{
"check": { "kind": "list" },
"keys": {
"type": "unique",
"must.include": ["kind", "value"],
"subset.of": ["kind", "value", "negate"]
}
}
This represents:
checkmate::check_names(
names(x),
type = "unique",
must.include = c("kind", "value"),
subset.of = c("kind", "value", "negate")
)
description
description is metadata only.
It does not affect validation and may be used for documentation, debugging, or pretty-printing.
description may appear on any full schema node and on
$ref nodes.
For canonical serialization via as.list(),
description is emitted first inside ordinary schema nodes.
At the top-level document, version remains first when
present, then root description, then $defs,
then the root operator keys.
Validation order for one node
Validation depends on the node’s primary operator.
check node
Validation of a check node proceeds in this order:
- run the primary assert_<kind>()
- apply optional keys handling
- if check.kind is a named container, recursively validate exact fields, pattern fields, and rest fields
- if check.kind is an unnamed container, recursively validate positions, then rest positions
all
Validate each child schema in order.
- If any child fails, the all node fails immediately.
- If all children pass, the all node passes.
any
Validate child schemas until one succeeds.
- If a child succeeds, the any node passes immediately.
- If all children fail, the any node fails.
one
Validate child schemas while counting successful branches.
- If exactly one child succeeds, the one node passes.
- If zero children succeed, the one node fails.
- If more than one child succeeds, the one node fails.
Formal validation rules
When parsing the JSON DSL itself, the following should be enforced:
- Every node must contain exactly one primary operator from check, all, any, one, not, $ref.
- check must be an object when present.
- check.kind must be a non-empty string.
- check.kind must name a supported checkmate::assert_*() suffix.
- all, any, and one must be non-empty arrays when present.
- not must be a single schema node.
- Child entries of all, any, one, and not must be either:
  - a complete schema node, or
  - a shorthand check object.
- A shorthand check object must contain kind.
- A shorthand check object may use any supported check.kind.
- A shorthand check object must not contain explicit primary operator keys.
- A shorthand check object must not contain node-level adjunct keys.
- fields is only allowed when the node is a check node and check.kind is a container kind.
- groups is only allowed when the node is a check node and check.kind is a container kind.
- fields must be an object when present.
- groups must be an array when present.
- patterns must be an object when present.
- positions must be an unnamed array when present.
- positions requires keys.type = "unnamed".
- If keys.type = "unnamed", fields, groups, and patterns must not be present.
- keys is only allowed on container check nodes.
- keys must be either a scalar or an object.
- If keys is an object, all of its arguments are forwarded to checkmate::check_names().
- keys.check is invalid.
- Each groups[] item must contain names plus a complete schema node.
- The schema node inside groups[] follows the same primary-operator and keyword rules as any other schema node.
- groups[] items must not wrap the target node inside an anonymous nested list.
- version is only allowed at the root schema document.
- version must be a non-empty string when present.
- $defs is only allowed at the root schema document.
- $defs must be an object whose values are complete schema nodes.
- $ref must be a local string reference of the form #/$defs/name.
- A $ref target must exist in the current document’s $defs.
- A $ref node must not contain other schema keys except description.
- description must be a non-empty string when present.
Additional examples
Parameter-like schema
{
"check": { "kind": "list" },
"keys": {
"type": "unique",
"subset.of": ["project", "activity_id", "experiment_id"]
},
"rest": {
"check": { "kind": "list" },
"keys": {
"type": "unique",
"must.include": ["kind", "value"],
"subset.of": ["kind", "value", "negate"]
},
"fields": {
"kind": {
"check": {
"kind": "choice",
"choices": ["facet", "datetime_start", "datetime_stop"]
}
},
"value": {
"check": {
"kind": "atomic",
"any.missing": false
}
},
"negate": {
"check": {
"kind": "flag",
"null.ok": true
}
}
}
}
}
Common mistakes
- Do not mix primary operators in one node. A node cannot contain both check and all, for example.
- Do not put field names directly beside check. Field names belong under fields or groups.
- Do not use JSON Schema keywords such as type, properties, required, or additionalProperties unless they are checkmate arguments inside check or keys.
- Do not expect fields to imply required fields. Add a keys rule such as "must.include": ["id", "name"] when fields must be present.
- Do not use $defs inside nested nodes. Definitions live at the document root.