Use Case

Generate Healthcare Test Data (HIPAA-Safe Synthetic Data)

The Problem

Healthcare software needs realistic test data, but using real patient records is a non-starter. HIPAA imposes severe penalties for unauthorized use or disclosure of Protected Health Information (PHI), and development environments are not exempt. De-identification is complex and error-prone: the Safe Harbor method alone requires removing 18 categories of identifiers, and some risk of re-identification always remains.

Most teams resort to absurdly simple test data: five patients named "Test Patient 1" through "Test Patient 5" with identical birth dates. This data does not test date-of-birth validation, insurance eligibility calculations, appointment scheduling logic, or any of the complex business rules that healthcare software depends on.

The Solution: MockHero Synthetic Healthcare Data

MockHero generates completely synthetic data that looks realistic but has zero connection to real individuals. Every name, date of birth, insurance number, and medical record is fabricated from scratch. There is no HIPAA risk because no PHI was ever involved. You get the realism needed for thorough testing without the compliance burden.

Quick Setup

curl -X POST https://api.mockhero.dev/api/v1/generate \
  -H "x-api-key: mh_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
  "tables": [
    {
      "name": "patients",
      "count": 100,
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "mrn", "type": "integer", "params": { "min": 100000, "max": 999999 } },
        { "name": "first_name", "type": "first_name" },
        { "name": "last_name", "type": "last_name" },
        { "name": "date_of_birth", "type": "date" },
        { "name": "gender", "type": "enum", "params": { "values": ["Male","Female","Non-binary","Prefer not to say"] } },
        { "name": "blood_type", "type": "enum", "params": { "values": ["A+","A-","B+","B-","O+","O-","AB+","AB-"] } },
        { "name": "insurance_id", "type": "uuid" },
        { "name": "phone", "type": "phone" },
        { "name": "email", "type": "email" }
      ]
    },
    {
      "name": "providers",
      "count": 15,
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "full_name", "type": "full_name" },
        { "name": "specialty", "type": "enum", "params": { "values": ["Cardiology","Dermatology","Endocrinology","Gastroenterology","Neurology","Oncology","Orthopedics","Pediatrics","Psychiatry","Radiology"] } },
        { "name": "npi", "type": "integer", "params": { "min": 1000000000, "max": 1999999999 } },
        { "name": "email", "type": "email" }
      ]
    },
    {
      "name": "appointments",
      "count": 500,
      "fields": [
        { "name": "id", "type": "uuid" },
        { "name": "patient_id", "type": "ref", "params": { "ref": "patients.id" } },
        { "name": "provider_id", "type": "ref", "params": { "ref": "providers.id" } },
        { "name": "scheduled_at", "type": "datetime" },
        { "name": "type", "type": "enum", "params": { "values": ["initial_consult","follow_up","physical","lab_work","imaging","procedure","telehealth"] } },
        { "name": "status", "type": "enum", "params": { "values": ["scheduled","checked_in","in_progress","completed","cancelled","no_show"] } },
        { "name": "notes", "type": "sentence" }
      ]
    }
  ],
  "format": "json"
}'
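
One caveat about the npi field in the request above: real NPIs end in a Luhn check digit computed over the number prefixed with 80840, so a random integer will usually fail strict NPI validation. If your application validates NPIs, a post-processing step along these lines may help. This is a minimal sketch; npiCheckDigit and toValidNpi are hypothetical helper names, not part of MockHero.

```javascript
// Compute the NPI check digit for a 9-digit base: run the Luhn algorithm
// over the base prefixed with "80840".
function npiCheckDigit(base9) {
  const digits = ("80840" + base9).split("").map(Number);
  let sum = 0;
  // Walk right to left, doubling every other digit starting with the rightmost.
  for (let i = digits.length - 1, double = true; i >= 0; i--, double = !double) {
    let d = digits[i];
    if (double) {
      d *= 2;
      if (d > 9) d -= 9; // same as adding the two digits of the product
    }
    sum += d;
  }
  return (10 - (sum % 10)) % 10;
}

// Hypothetical helper: keep the first 9 digits of a generated integer and
// replace the 10th with a valid check digit. Returns a string.
function toValidNpi(n) {
  const base9 = String(n).slice(0, 9);
  if (!/^\d{9}$/.test(base9)) throw new Error("expected at least 9 digits");
  return base9 + npiCheckDigit(base9);
}

console.log(toValidNpi(1234567890)); // "1234567893"
```

You could map this over data.providers before loading, at the cost of storing npi as a string rather than an integer.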

Step-by-Step Guide

1. Define your healthcare schema

Map your application's patient, provider, and encounter models to MockHero tables. Use enum fields for coded values like blood types, specialties, and appointment statuses to match your real application's constraints.
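
One way to keep the schema aligned with your application is to build enum fields from the same constants the application uses. The sketch below assumes the field shape shown in Quick Setup; BLOOD_TYPES, APPOINTMENT_STATUSES, and enumField are hypothetical names chosen for illustration.

```javascript
// Hypothetical application constants; replace with your own coded values.
const BLOOD_TYPES = ["A+", "A-", "B+", "B-", "O+", "O-", "AB+", "AB-"];
const APPOINTMENT_STATUSES = [
  "scheduled", "checked_in", "in_progress", "completed", "cancelled", "no_show",
];

// Build an enum field definition in the shape used in the Quick Setup request.
function enumField(name, values) {
  if (!Array.isArray(values) || values.length === 0) {
    throw new Error(`enum field "${name}" needs at least one value`);
  }
  // Copy the array so later changes to the constant do not mutate the schema.
  return { name, type: "enum", params: { values: [...values] } };
}

const patientFields = [
  { name: "id", type: "uuid" },
  enumField("blood_type", BLOOD_TYPES),
];
const appointmentFields = [
  { name: "id", type: "uuid" },
  enumField("status", APPOINTMENT_STATUSES),
];
```

Deriving the schema from shared constants means a new status or specialty added to the application shows up in the test data without a second edit.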

2. Get your MockHero API key

Sign up at mockhero.dev/sign-up and copy your API key.

3. Generate the data

Call the MockHero API with your schema. The response includes patients, providers, and appointments with valid referential links between them.
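
If you want to confirm the referential links before loading, a small check like the following can help. It is a sketch that assumes the response shape used in the Postgres example in step 4 (a data object keyed by table name); findOrphanAppointments is a hypothetical helper, and the sample rows are hand-written rather than real API output.

```javascript
// Return appointments whose patient_id or provider_id has no matching row.
function findOrphanAppointments(data) {
  const patientIds = new Set(data.patients.map((p) => p.id));
  const providerIds = new Set(data.providers.map((p) => p.id));
  return data.appointments.filter(
    (a) => !patientIds.has(a.patient_id) || !providerIds.has(a.provider_id)
  );
}

// Hand-written sample with one deliberately broken reference.
const sample = {
  patients: [{ id: "p1" }],
  providers: [{ id: "d1" }],
  appointments: [
    { id: "a1", patient_id: "p1", provider_id: "d1" },
    { id: "a2", patient_id: "p9", provider_id: "d1" }, // dangling patient reference
  ],
};

const orphans = findOrphanAppointments(sample);
console.log(orphans.map((a) => a.id)); // [ 'a2' ]
```

Against a real response the expected result is an empty array; anything else points to a schema mistake such as a ref aimed at the wrong table.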

4. Load into your database

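Use your preferred database driver or ORM to insert the data. The response is plain JSON, so it can be loaded with any driver that accepts parameterized values. For Postgres, using the pg package on Node.js 18 or later (which provides a built-in fetch):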

import pg from "pg";

const { Client } = pg;
const client = new Client({ connectionString: process.env.DATABASE_URL });
await client.connect();

try {
  // Request the dataset; the body is the same schema used in Quick Setup.
  const res = await fetch("https://api.mockhero.dev/api/v1/generate", {
    method: "POST",
    headers: {
      "x-api-key": process.env.MOCKHERO_API_KEY,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ /* schema from above */ }),
  });
  if (!res.ok) {
    throw new Error(`MockHero request failed with HTTP ${res.status}`);
  }
  const { data } = await res.json();

  // Insert patients with a parameterized query, one row per statement.
  for (const p of data.patients) {
    await client.query(
      "INSERT INTO patients (id, mrn, first_name, last_name, date_of_birth, gender, blood_type, insurance_id, phone, email) VALUES ($1,$2,$3,$4,$5,$6,$7,$8,$9,$10)",
      [p.id, p.mrn, p.first_name, p.last_name, p.date_of_birth, p.gender, p.blood_type, p.insurance_id, p.phone, p.email]
    );
  }
  // ... insert providers next, then appointments, so foreign keys resolve
} finally {
  // Close the connection even if the request or an insert fails.
  await client.end();
}
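
For the remaining tables, one statement per row becomes slow as volumes grow. The sketch below builds a single multi-row parameterized INSERT instead. buildInsert is a hypothetical helper, and it assumes the table and column names are trusted constants from your own code, since they are interpolated into the SQL text.

```javascript
// Build one multi-row INSERT with numbered placeholders ($1, $2, ...).
// Table and column names must be trusted constants, never user input.
function buildInsert(table, columns, rows) {
  if (rows.length === 0) throw new Error("no rows to insert");
  const values = [];
  const tuples = rows.map((row, r) => {
    const placeholders = columns.map((col, c) => {
      values.push(row[col]);
      return `$${r * columns.length + c + 1}`;
    });
    return `(${placeholders.join(", ")})`;
  });
  const text =
    `INSERT INTO ${table} (${columns.join(", ")}) VALUES ${tuples.join(", ")}`;
  return { text, values };
}

// Hand-written sample rows standing in for data.providers.
const providerInsert = buildInsert(
  "providers",
  ["id", "full_name"],
  [
    { id: "d1", full_name: "Avery Stone" },
    { id: "d2", full_name: "Jordan Blake" },
  ]
);
console.log(providerInsert.text);
// INSERT INTO providers (id, full_name) VALUES ($1, $2), ($3, $4)
```

The returned object can be passed directly to client.query, which accepts a { text, values } query config. For very large tables, split the rows into chunks so that a single statement does not carry too many parameters.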

5. Run compliance checks

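Confirm that the generated data does not match real individuals. Because MockHero creates data from scratch rather than by modifying real records, any match can only be a coincidence (a common name paired with a plausible birth date, for example). Running and recording the check still gives your compliance team evidence that no PHI entered the test environment.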
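
A coincidence check could look like the sketch below. It assumes you already hold a set of identity keys for real patients in an environment where that data is permitted to live, and that date_of_birth arrives as an ISO date string. identityKey and findCollisions are hypothetical helpers, and the sample values are invented.

```javascript
// Normalize the fields that together could identify a person.
function identityKey(p) {
  return [p.first_name, p.last_name, p.date_of_birth]
    .map((v) => String(v).trim().toLowerCase())
    .join("|");
}

// Return generated patients whose identity key also appears in the real set.
function findCollisions(generatedPatients, realKeys) {
  return generatedPatients.filter((p) => realKeys.has(identityKey(p)));
}

// Invented sample: one generated row happens to collide with a known key.
const realKeys = new Set(["jane|doe|1980-02-14"]);
const generated = [
  { first_name: "Jane", last_name: "Doe ", date_of_birth: "1980-02-14" },
  { first_name: "Arlo", last_name: "Finch", date_of_birth: "1975-09-30" },
];

const collisions = findCollisions(generated, realKeys);
console.log(collisions.length); // 1
```

Rows that collide can simply be dropped or regenerated before the data is loaded.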

Why MockHero vs De-Identified Production Data

  • Zero HIPAA risk — synthetic data has no connection to real patients. No BAA needed for test environments.
  • Complete control — choose exactly which fields, distributions, and volumes you need for testing.
  • Reproducible — pass a seed parameter to get identical data every run, perfect for regression testing.
  • Fast — generate 500 appointments in seconds, not hours of data masking and review.
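
To illustrate the reproducibility point, the sketch below builds the options for a seeded request. The text above only says that a seed parameter exists; placing it as a top-level seed field in the request body is an assumption, so check the API reference before relying on it. buildGenerateRequest is a hypothetical helper.

```javascript
// Build fetch options for a seeded generate request.
// Assumption: "seed" is a top-level field of the request body.
function buildGenerateRequest(schema, seed, apiKey) {
  return {
    method: "POST",
    headers: {
      "x-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ ...schema, seed }),
  };
}

// A trimmed-down schema in the shape used in Quick Setup.
const schema = {
  tables: [
    { name: "patients", count: 100, fields: [{ name: "id", type: "uuid" }] },
  ],
  format: "json",
};

const seededRequest = buildGenerateRequest(schema, 42, "mh_your_api_key");
console.log(JSON.parse(seededRequest.body).seed); // 42
```

Passing seededRequest as the second argument to fetch with the generate URL should then return the same rows on every run, which lets regression tests assert on specific values.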

Get Started

Generate HIPAA-safe healthcare test data today. Sign up free at mockhero.dev and get 1,000 rows per month.

MockHero Team

Guides and tutorials for generating realistic test data with the MockHero API.
