CSVLinter

JSON schema example

Learn how to use JSON schemas to validate your CSV data structure and enforce data quality rules.

What is a JSON schema?

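A JSON schema defines the structure, data types, and validation rules for your CSV data. In the examples below, the schema describes a single row as an object whose properties are the column names. It helps ensure your data meets specific requirements like required fields, data formats, and value constraints.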

Basic example

Here's a simple schema for validating user data:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "CSV row",
  "type": "object",
  "required": [
    "id",
    "name",
    "email"
  ],
  "properties": {
    "id": {
      "type": "integer"
    },
    "name": {
      "type": "string"
    },
    "email": {
      "type": "string",
      "pattern": "^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}$"
    }
  },
  "additionalProperties": false
}

What this schema validates:

  • Required fields: id, name, and email must be present
  • Data types: id must be an integer, name and email must be strings
  • Email format: email must match the regular expression in pattern, a simplified check for the name@domain.tld shape
  • No extra fields: additionalProperties: false rejects any column that is not listed under properties
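
To make the four rules above concrete, here is a minimal Python sketch that checks one row by hand. It is not how CSVLinter works internally and it is not a full JSON Schema validator: it only understands the keywords used in the basic schema, and it assumes the row has already been converted from CSV text into typed values. The names check_row, TYPE_MAP, and BASIC_SCHEMA are made up for this illustration.

```python
import re

# The basic schema from above, reduced to the keywords this sketch understands.
BASIC_SCHEMA = {
    "type": "object",
    "required": ["id", "name", "email"],
    "properties": {
        "id": {"type": "integer"},
        "name": {"type": "string"},
        "email": {
            "type": "string",
            "pattern": r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$",
        },
    },
    "additionalProperties": False,
}

# JSON Schema type names mapped to Python types (illustrative subset).
TYPE_MAP = {"integer": int, "string": str}


def check_row(row, schema):
    """Return a list of problems; an empty list means the row passed."""
    errors = []
    properties = schema.get("properties", {})

    # required: every listed field must be present.
    for field in schema.get("required", []):
        if field not in row:
            errors.append(f"missing required field: {field}")

    for field, value in row.items():
        rules = properties.get(field)
        if rules is None:
            # additionalProperties: false rejects undeclared columns.
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected field: {field}")
            continue

        expected = TYPE_MAP.get(rules.get("type"))
        # bool is a subclass of int in Python, so exclude it explicitly.
        if expected is not None and (
            not isinstance(value, expected) or isinstance(value, bool)
        ):
            errors.append(f"{field}: expected {rules['type']}")
            continue

        # pattern applies only to strings and is an unanchored search.
        if "pattern" in rules and isinstance(value, str):
            if not re.search(rules["pattern"], value):
                errors.append(f"{field}: does not match pattern")

    return errors


good = {"id": 1, "name": "Ada", "email": "ada@example.com"}
bad = {"id": "1", "email": "not-an-email", "nickname": "x"}

print(check_row(good, BASIC_SCHEMA))  # → []
print(check_row(bad, BASIC_SCHEMA))
```

The second call reports four problems, one for each rule: name is missing, id is a string rather than an integer, email fails the pattern, and nickname is not a declared column.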

Advanced features

String constraints

"name": {
  "type": "string",
  "minLength": 1,
  "maxLength": 100
}

Number ranges

"age": {
  "type": "integer",
  "minimum": 0,
  "maximum": 120
}

Enum values

"status": {
  "type": "string",
  "enum": ["active", "inactive", "pending"]
}
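
The three constraint types above (string length, number range, enum) can be sketched in a few lines of Python. This is only an illustration of what the keywords mean, under the assumption that the value has already been converted to the right type; check_constraints is a hypothetical helper, not part of CSVLinter.

```python
def check_constraints(value, rules):
    """Check one already-typed value against a few constraint keywords."""
    errors = []
    if isinstance(value, str):
        # minLength / maxLength count characters and apply only to strings.
        if "minLength" in rules and len(value) < rules["minLength"]:
            errors.append("shorter than minLength")
        if "maxLength" in rules and len(value) > rules["maxLength"]:
            errors.append("longer than maxLength")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        # minimum / maximum are inclusive bounds and apply only to numbers.
        if "minimum" in rules and value < rules["minimum"]:
            errors.append("below minimum")
        if "maximum" in rules and value > rules["maximum"]:
            errors.append("above maximum")
    # enum: the value must equal one of the listed options
    # (string comparison is case-sensitive).
    if "enum" in rules and value not in rules["enum"]:
        errors.append("not in enum")
    return errors


NAME = {"type": "string", "minLength": 1, "maxLength": 100}
AGE = {"type": "integer", "minimum": 0, "maximum": 120}
STATUS = {"type": "string", "enum": ["active", "inactive", "pending"]}

print(check_constraints("", NAME))          # → ['shorter than minLength']
print(check_constraints(120, AGE))          # → []
print(check_constraints(121, AGE))          # → ['above maximum']
print(check_constraints("Active", STATUS))  # → ['not in enum']
```

Note that 120 passes because minimum and maximum are inclusive, and "Active" fails because enum values must match exactly.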

Date formats

"created_date": {
  "type": "string",
  "format": "date"
}
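
In draft-07, "format": "date" refers to the RFC 3339 full-date form, YYYY-MM-DD. Validators differ in whether they enforce format at all, so it is worth knowing what a strict check looks like. The sketch below is one way to write it in Python; is_full_date is a hypothetical helper name, and this is not necessarily the check CSVLinter performs.

```python
import re
from datetime import date

# RFC 3339 full-date: four-digit year, two-digit month, two-digit day.
FULL_DATE = re.compile(r"([0-9]{4})-([0-9]{2})-([0-9]{2})")


def is_full_date(text):
    """Return True if text is a real calendar date written as YYYY-MM-DD."""
    match = FULL_DATE.fullmatch(text)
    if match is None:
        return False
    year, month, day = (int(part) for part in match.groups())
    try:
        date(year, month, day)  # raises ValueError for dates like 2023-02-30
    except ValueError:
        return False
    return True


print(is_full_date("2024-02-29"))  # → True (leap year)
print(is_full_date("2023-02-29"))  # → False (no such day)
print(is_full_date("2024-2-9"))    # → False (month and day need two digits)
print(is_full_date("29/02/2024"))  # → False (wrong layout)
```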

Complete example

Here's a fuller schema for employee records. The salary, hire_date, and is_active fields are optional because they are not listed in required:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Employee record",
  "type": "object",
  "required": ["id", "first_name", "last_name", "email", "department"],
  "properties": {
    "id": {
      "type": "integer",
      "minimum": 1
    },
    "first_name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 50
    },
    "last_name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 50
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "department": {
      "type": "string",
      "enum": ["engineering", "marketing", "sales", "hr", "finance"]
    },
    "salary": {
      "type": "number",
      "minimum": 0
    },
    "hire_date": {
      "type": "string",
      "format": "date"
    },
    "is_active": {
      "type": "boolean"
    }
  },
  "additionalProperties": false
}
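
Every cell in a CSV file is text, so a value like 42 has to be converted to an integer before a rule such as "type": "integer" can apply. This page does not describe how CSVLinter does that conversion, so the sketch below shows one plausible approach using Python's csv module. The conversion rules are assumptions made for illustration: empty cells are treated as absent fields, booleans are spelled true or false, and every row has the same number of cells as the header. COLUMN_TYPES, coerce_cell, and typed_rows are hypothetical names.

```python
import csv
import io

# Types copied from the employee schema above; only "type" is used here.
COLUMN_TYPES = {
    "id": "integer",
    "first_name": "string",
    "last_name": "string",
    "email": "string",
    "department": "string",
    "salary": "number",
    "hire_date": "string",
    "is_active": "boolean",
}


def coerce_cell(text, json_type):
    """Convert one CSV cell to a Python value, or raise ValueError."""
    if json_type == "integer":
        return int(text)
    if json_type == "number":
        return float(text)
    if json_type == "boolean":
        lowered = text.strip().lower()
        if lowered in ("true", "false"):
            return lowered == "true"
        raise ValueError(f"not a boolean: {text!r}")
    return text  # strings pass through unchanged


def typed_rows(csv_text, column_types):
    """Yield (line_number, typed_row, errors) for each data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # The header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(reader, start=2):
        typed, errors = {}, []
        for column, text in row.items():
            if not text:
                continue  # assumption: an empty cell means the field is absent
            json_type = column_types.get(column, "string")
            try:
                typed[column] = coerce_cell(text, json_type)
            except ValueError:
                errors.append(f"{column}: cannot read {text!r} as {json_type}")
        yield line_number, typed, errors


SAMPLE = """id,first_name,last_name,email,department,salary,hire_date,is_active
1,Ada,Lovelace,ada@example.com,engineering,120000.50,2020-01-15,true
two,Alan,Turing,alan@example.com,engineering,,2021-06-01,yes
"""

for line_number, row, errors in typed_rows(SAMPLE, COLUMN_TYPES):
    print(line_number, row, errors)
```

The first data row converts cleanly. The second reports that id and is_active cannot be converted, and its empty salary cell is simply left out of the row, which is acceptable because salary is optional.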

Tips for writing schemas

  • Start simple: Begin with basic type validation and add constraints gradually
  • Use descriptive titles: The title and description keywords help users understand what each field represents
  • Be specific with patterns: Use regex patterns for formats such as phone numbers and IDs
  • Consider optional fields: List a field under required only if every row must include it
  • Test your schema: Validate it against sample data to ensure it works as expected
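
Alongside testing against sample data, it can help to sanity-check the schema itself. One easy mistake is a typo in required: if a required name has no entry in properties and additionalProperties is false, no row can ever pass. The sketch below looks for that and for inverted bounds. lint_schema is a hypothetical helper written for this page, not a CSVLinter feature.

```python
def lint_schema(schema):
    """Return a list of likely mistakes in a row schema."""
    problems = []
    properties = schema.get("properties", {})

    # A required name that is missing from properties is usually a typo.
    for field in schema.get("required", []):
        if field not in properties:
            problems.append(f"required field {field!r} is not in properties")

    for field, rules in properties.items():
        # Inverted bounds can never be satisfied.
        low, high = rules.get("minimum"), rules.get("maximum")
        if low is not None and high is not None and low > high:
            problems.append(f"{field}: minimum is greater than maximum")
        short, long_ = rules.get("minLength"), rules.get("maxLength")
        if short is not None and long_ is not None and short > long_:
            problems.append(f"{field}: minLength is greater than maxLength")

    return problems


draft = {
    "type": "object",
    "required": ["id", "emial"],  # deliberate typo for the demo
    "properties": {
        "id": {"type": "integer"},
        "email": {"type": "string"},
        "age": {"type": "integer", "minimum": 120, "maximum": 0},
    },
    "additionalProperties": False,
}

for problem in lint_schema(draft):
    print(problem)
```

Run on the draft above, this reports the misspelled emial entry and the swapped age bounds.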