Validate JSON data using schemas
Introduction
JSON files are a great way to expose configuration in our application without having to change the code. However, we or our users might sometimes enter data that is not expected, so we need some way to validate the input. One way to do this is programmatically, by checking that certain keys are present and hold the expected values. For more complex configurations, where some attributes are optional, this quickly becomes difficult to maintain.
One alternative is to use schemas, as defined by the JSON Schema specification: https://json-schema.org/specification
Let's see a simple example.
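To make the programmatic approach concrete, here is a minimal sketch of a hand-written check. It assumes a configuration with a required name and age and an optional address, which is the same shape used in the example further down. The function validate_person is a hypothetical helper written for this illustration, not part of any library.

```python
def validate_person(data):
    """Return a list of problems found in a person-like dict (hypothetical helper)."""
    if not isinstance(data, dict):
        return ["top-level value must be an object"]

    problems = []
    # Required keys and the Python types we accept for each of them.
    for key, expected in (("name", str), ("age", (int, float))):
        if key not in data:
            problems.append(f"missing required key: {key}")
        # bool is a subclass of int in Python, so it is rejected explicitly.
        elif isinstance(data[key], bool) or not isinstance(data[key], expected):
            problems.append(f"{key} has the wrong type")

    # Every optional key needs its own branch, which is where this gets tedious.
    if "address" in data and not isinstance(data["address"], dict):
        problems.append("address must be an object")
    return problems


print(validate_person({"name": "Joe"}))  # ['missing required key: age']
```

Each new rule (nested objects, recursive structures, forbidden extra keys) adds more branches like these, which is the maintenance cost a schema avoids.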
Dependencies
We are going to use Python's jsonschema library. Install it with:
pip install jsonschema
Validators exist for other languages as well. See the list at https://json-schema.org/implementations#validators
File: schema.json
{
  "type": "object",
  "required": ["name", "age"],
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "number"},
    "children": {
      "type": "array",
      "items": {"$ref": "#"}
    },
    "address": {"$ref": "#/$defs/address"}
  },
  "additionalProperties": false,
  "$defs": {
    "address": {
      "type": "object",
      "properties": {
        "street": {"type": "string"},
        "city": {"type": "string"},
        "state": {"type": "string"}
      },
      "additionalProperties": false
    }
  }
}
As we can see, the schema describes a "person" object with the properties name (string), age (number), children (an array whose items are themselves persons, through the recursive reference "$ref": "#"), and address, an object defined in the $defs section of the schema with the properties street, city, and state (all strings). For a person, name and age are required, and setting "additionalProperties": false rejects any property that is not listed in the schema.
Let's define an object that we will try to validate:
File: data.json
{
  "name": "Joe",
  "age": 25,
  "address": {
    "street": "123 Main St",
    "city": "Springfield",
    "state": "IL"
  }
}
Now let's write the Python code to validate it, saved as validate.py:
import json

from jsonschema import ValidationError, validate

# Load the schema and the document we want to check
with open('schema.json', 'r') as f:
    schema = json.load(f)

with open('data.json', 'r') as f:
    data = json.load(f)

try:
    # validate() raises ValidationError when the data does not match the schema
    validate(data, schema)
    print("Data is valid")
except ValidationError as e:
    print("Data is invalid")
    print(e)
As expected, if we run this script, the data is reported as valid:
python3 validate.py
Data is valid
However, if we edit the JSON and remove the "age" property, the validation fails with a message like:
Data is invalid
'age' is a required property

Failed validating 'required' in schema:
    {'$defs': {'address': {'additionalProperties': False, 'properties': {'city': {'type': 'string'}, 'state': {'type': 'string'}, 'street': {'type': 'string'}}, 'type': 'object'}}, 'additionalProperties': False, 'properties': {'address': {'$ref': '#/$defs/address'}, 'age': {'type': 'number'}, 'children': {'items': {'$ref': '#'}, 'type': 'array'}, 'name': {'type': 'string'}}, 'required': ['name', 'age'], 'type': 'object'}

On instance:
    {'address': {'city': 'Springfield', 'state': 'IL', 'street': '123 Main St'}, 'name': 'Joe'}
The same happens if we change the type of a property, for example by setting name to the integer 1:
Data is invalid
1 is not of type 'string'

Failed validating 'type' in schema['properties']['name']:
    {'type': 'string'}

On instance['name']:
    1
There are many more keywords and validations available. To learn more, visit https://json-schema.org/specification