Information Governance

🛡 Data Protection · DPIA · DSPT · Audit Logging · Safe Sharing
🎯 Why this matters now

The NHS shift to digital, preventative, community-oriented care only works when data is handled safely. This page gives IG leads a concrete, repeatable way to embed privacy, security, and auditability into open‑source projects from day one.


👤 Role snapshot​

You protect patient data and ensure compliance with NHS, legal, and ethical frameworks for data use, while enabling safe, timely access for care and analytics.

See also: Secrets & .env · GitHub · Docker · Evidence.dev · FastAPI · AWS · Azure

🧰 Core NHS toolchain​

  • DSP Toolkit (DSPT) for organisational assurance
  • DPIA/Caldicott templates & approvals workflow
  • NHS Mail / M365 for secure sharing and controls

🔗 Open‑source augmentations​

  • Audit logging: Python JSON logs; SQL audit tables; optional ELK stack.
  • Secure transfer: SFTP; S3/MinIO with server‑side encryption and bucket policies.
  • Policy docs: MkDocs/Docusaurus for living guidance and runbooks.
  • Data masking: SQL functions; hash/pseudonymise; small‑number suppression.
  • IG checklists: GitHub Issue templates and PR gates.

⚙️ 90‑minute quickstart​

Goal: create an IG‑ready starter repo with a DPIA checklist, secrets setup, audit logging, masking helpers, and a secure transfer example.

1) Project guardrails (choose template style)​

.github/ISSUE_TEMPLATE/ig-checklist.yml
name: IG Checklist
description: Minimum IG checks for new or updated data projects
title: "IG: <project/feature>"
labels: ["IG"]
body:
  - type: checkboxes
    attributes:
      label: Data classification
      options:
        - label: No direct patient identifiers used in development data
        - label: DPIA reference recorded in README
        - label: Data dictionary updated (fields, types, source, owner)
  - type: checkboxes
    attributes:
      label: Security
      options:
        - label: Secrets are in env/secret store (not in code or Git)
        - label: Access is least-privilege (service accounts, RBAC)
        - label: Audit logs enabled for read/write operations
  - type: checkboxes
    attributes:
      label: Sharing & outputs
      options:
        - label: Small-number suppression applied where required
        - label: Aggregated outputs only; no free-text PHI in logs
        - label: Data retention and deletion plan documented

2) Secrets & configuration​

.env.example (placeholders only; copy to .env, fill in real values, and do not commit .env)
# SQL Server
SQLSERVER_SERVER=YOURSERVER
SQLSERVER_DATABASE=NHS_Analytics

# S3/MinIO (optional)
S3_ENDPOINT=https://s3.example.local
S3_BUCKET=nhs-secure
S3_ACCESS_KEY=
S3_SECRET_KEY=

# App/API
API_KEY=rotate_me
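
Application code then needs to read these values without ever hard-coding them. The following is a minimal sketch, assuming the variables have already been exported into the process environment (for example from an uncommitted .env file). `Settings` and `load_settings` are hypothetical names introduced for illustration; the variable names and placeholder values are the ones in .env.example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Names taken from .env.example; the placeholder values must never reach a real run
REQUIRED = ("SQLSERVER_SERVER", "SQLSERVER_DATABASE", "API_KEY")
PLACEHOLDERS = {"", "rotate_me", "YOURSERVER"}


@dataclass(frozen=True)
class Settings:  # hypothetical container, not part of the original page
    sqlserver_server: str
    sqlserver_database: str
    api_key: str
    s3_endpoint: Optional[str] = None
    s3_bucket: Optional[str] = None

    def __repr__(self) -> str:
        # Mask the API key so it cannot leak through logs or tracebacks
        return (f"Settings(sqlserver_server={self.sqlserver_server!r}, "
                f"sqlserver_database={self.sqlserver_database!r}, api_key='***')")


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment and fail fast on missing or placeholder values."""
    bad = [name for name in REQUIRED if env.get(name, "") in PLACEHOLDERS]
    if bad:
        raise RuntimeError("Missing or placeholder settings: " + ", ".join(bad))
    return Settings(
        sqlserver_server=env["SQLSERVER_SERVER"],
        sqlserver_database=env["SQLSERVER_DATABASE"],
        api_key=env["API_KEY"],
        s3_endpoint=env.get("S3_ENDPOINT") or None,
        s3_bucket=env.get("S3_BUCKET") or None,
    )
```

Failing at start-up when a value is missing or still a placeholder is a design choice: it surfaces configuration mistakes before any data is touched.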

3) Audit logging (pick Python or SQL)​

audit.py
import json
import logging
import time
import uuid
from typing import Optional

# Write each audit entry as one JSON object per line
logger = logging.getLogger("audit")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(message)s'))
logger.addHandler(handler)
logger.setLevel(logging.INFO)

def audit(event: str, dataset: str, actor: str = "service", count: Optional[int] = None, purpose: str = "analytics"):
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),  # UTC timestamp
        "event": event,  # read/write/export/delete
        "dataset": dataset,
        "actor": actor,  # service/user id (no PHI)
        "purpose": purpose,
        "count": count,
        "corr_id": str(uuid.uuid4())  # a new ID is generated for every entry
    }
    logger.info(json.dumps(entry))

if __name__ == "__main__":
    audit("read", "dbo.vw_PracticeKPI", actor="svc-analytics", count=1203)
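
Because audit.py generates a fresh corr_id on every call, entries from the same pipeline run cannot be joined on that field. One possible variation, sketched below, fixes a single ID per run and reuses it for each step. `make_audit_trail` and its `emit` parameter are hypothetical names, not part of the original script; the entry fields are the same as above.

```python
import json
import time
import uuid
from typing import Callable, Optional


def make_audit_trail(emit: Callable[[str], None] = print, corr_id: Optional[str] = None):
    """Return an audit function whose entries all share one correlation ID."""
    run_id = corr_id or str(uuid.uuid4())

    def audit(event: str, dataset: str, actor: str = "service",
              count: Optional[int] = None, purpose: str = "analytics") -> dict:
        entry = {
            "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "event": event,
            "dataset": dataset,
            "actor": actor,
            "purpose": purpose,
            "count": count,
            "corr_id": run_id,  # same value for every step of this run
        }
        emit(json.dumps(entry))  # one JSON object per line
        return entry

    return audit


if __name__ == "__main__":
    audit = make_audit_trail()
    audit("read", "dbo.vw_PracticeKPI", actor="svc-analytics", count=1203)
    audit("export", "out/aggregates.csv", actor="svc-analytics", count=57)
```

Filtering the log on one corr_id then returns the read and the export that belong to the same run.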

4) Masking & suppression​

examples/masking.sql
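-- Pseudonymise an identifier
-- Add the non-identifying columns you need explicitly: SELECT * would return the raw NHS_NUMBER next to its hash.
-- An unsalted hash of a 10-digit number can be reversed by enumeration, so treat this as a demo only.
SELECT HASHBYTES('SHA2_256', CAST(NHS_NUMBER AS VARBINARY(32))) AS nhs_hash
FROM dbo.patient_demo;

-- Redact free text entirely (LEFT(note, 0) returns an empty string)
SELECT practice_id, LEFT(note, 0) AS note_redacted -- store empty in exports
FROM dbo.notes_export;

-- Small-number suppression example: counts below 5 are returned as NULL
SELECT org_code,
       CASE WHEN COUNT(*) < 5 THEN NULL ELSE COUNT(*) END AS count_suppressed
FROM dbo.rare_event
GROUP BY org_code;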
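
The same two ideas can be applied in Python before an export is written. This is a minimal sketch under stated assumptions: `pseudonymise` and `suppressed_counts` are hypothetical helper names, the key would come from a secret store rather than from code, and the threshold of 5 simply mirrors the SQL example. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, so the digest cannot be rebuilt by enumerating identifiers without the key.

```python
import hashlib
import hmac
from collections import Counter
from typing import Dict, Iterable, Optional


def pseudonymise(identifier: str, key: bytes) -> str:
    """Return a keyed SHA-256 digest (HMAC) of an identifier as a hex string."""
    normalised = identifier.replace(" ", "")  # "943 476 5919" and "9434765919" map to one value
    return hmac.new(key, normalised.encode("utf-8"), hashlib.sha256).hexdigest()


def suppressed_counts(org_codes: Iterable[str], threshold: int = 5) -> Dict[str, Optional[int]]:
    """Count rows per organisation, replacing counts below the threshold with None."""
    counts = Counter(org_codes)
    return {org: (n if n >= threshold else None) for org, n in counts.items()}
```

Returning None for a suppressed count, as the SQL returns NULL, keeps the row visible in the output while hiding the value.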

5) Secure transfer (SFTP or S3/MinIO)​

sftp_upload.sh
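#!/usr/bin/env bash
set -euo pipefail
# Batch mode (-b) cannot prompt for a password, so key-based authentication must already be set up.
# SFTP_USER and SFTP_HOST must be set in the environment; set -u aborts the script if either is missing.
sftp -b - "$SFTP_USER@$SFTP_HOST" <<EOF
put out/aggregates.csv secure/inbox/aggregates.csv
bye
EOF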
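
For the S3/MinIO route, the page does not show the upload script, so the following is only a sketch of one way to do it in Python. It assumes the third-party boto3 package is installed and that the S3_* variables from .env.example are set; `s3_client_kwargs` and `upload_encrypted` are hypothetical names. The upload requests server-side encryption (SSE-S3, AES-256), which the target server must be configured to support.

```python
import os
from typing import Dict, Mapping


def s3_client_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build keyword arguments for boto3.client("s3", ...) from the .env.example names."""
    return {
        "endpoint_url": env["S3_ENDPOINT"],
        "aws_access_key_id": env["S3_ACCESS_KEY"],
        "aws_secret_access_key": env["S3_SECRET_KEY"],
    }


def upload_encrypted(path: str, key: str, env: Mapping[str, str] = os.environ) -> None:
    """Upload a file and ask the server to encrypt it at rest."""
    import boto3  # third-party; imported here so the helper above works without it

    client = boto3.client("s3", **s3_client_kwargs(env))
    client.upload_file(
        path,
        env["S3_BUCKET"],
        key,
        ExtraArgs={"ServerSideEncryption": "AES256"},
    )
```

A call such as `upload_encrypted("out/aggregates.csv", "inbox/aggregates.csv")` would send the same aggregate file as the SFTP example; the object key here is illustrative.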

▶️ Run (local demo)​

python audit.py
python export_guard.py
bash sftp_upload.sh # or
bash s3_upload.sh

🗓️ Week‑one build (repeatable, safe)​

Day 1 — Scope & DPIA

  • Confirm purpose, legal basis, data flows; log DPIA ref in README.
  • Create data dictionary and owners.

Day 2 — Secrets & roles

  • Move secrets to env/secret store; use least‑privilege service accounts.
  • Add audit logs (Python or SQL) to all read/write steps.

Day 3 — Masking & suppression

  • Add SQL/Python masking and small‑number rules to exports.
  • Add “data last updated” and sample size to reports.

Day 4 — Transfer & retention

  • Configure SFTP or S3/MinIO with encryption; document retention/deletion.
  • Add PR checks requiring IG checklist completion.
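
A PR check for checklist completion could be as small as the sketch below. It assumes the checklist appears in the PR or issue description as Markdown task-list items ("- [ ]" and "- [x]") and that the CI job passes the description text in; how that text is fetched is left out. `unchecked_items` and `gate` are hypothetical names.

```python
import re
from typing import List

# Matches Markdown task-list items such as "- [ ] text" or "- [x] text"
TASK = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def unchecked_items(body: str) -> List[str]:
    """Return the text of every unticked task-list item in a description."""
    items = []
    for line in body.splitlines():
        match = TASK.match(line)
        if match and match.group(1) == " ":
            items.append(match.group(2).strip())
    return items


def gate(body: str) -> int:
    """Return a process exit code: 0 if a checklist exists and is complete, 1 otherwise."""
    has_checklist = any(TASK.match(line) for line in body.splitlines())
    missing = unchecked_items(body)
    for item in missing:
        print(f"IG checklist item not ticked: {item}")
    if not has_checklist:
        print("No IG checklist found in the description")
    return 0 if has_checklist and not missing else 1
```

Treating a missing checklist as a failure, not a pass, prevents the gate from being bypassed by deleting the template text.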

Day 5 — Review & share

  • Peer review with IG + product team; publish a living guidance page (Docusaurus/MkDocs).

🛡️ Always‑on IG checklist​

  • De‑identified/synthetic data in development examples
  • No secrets in code or Git; rotate keys regularly
  • Least‑privilege access; role‑based permissions; audit logs enabled
  • Aggregation first; small‑number suppression in outputs
  • DPIA reference and data dictionary kept up to date
  • Avoid PHI in logs, tickets, and commit messages
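
Part of the last item can be automated. The sketch below scans text (a log line, ticket, or commit message) for ten-digit strings that pass the Modulus 11 check-digit test used for NHS numbers. It is a partial safeguard under stated assumptions: it will not catch names, addresses, or other identifiers, and some unrelated ten-digit numbers will also pass the check. `is_valid_nhs_number` and `find_nhs_numbers` are hypothetical names.

```python
import re
from typing import List

# Ten digits, optionally grouped 3-3-4 with spaces or hyphens
CANDIDATE = re.compile(r"\b(\d{3})[ -]?(\d{3})[ -]?(\d{4})\b")


def is_valid_nhs_number(digits: str) -> bool:
    """Apply the Modulus 11 check to a 10-digit string."""
    if len(digits) != 10 or not digits.isdigit():
        return False
    # Weight the first nine digits 10, 9, ..., 2 and sum the products
    total = sum(int(d) * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    # A computed check digit of 10 means the number is invalid
    return check != 10 and check == int(digits[9])


def find_nhs_numbers(text: str) -> List[str]:
    """Return candidate NHS numbers in text that pass the check-digit test."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = "".join(match.groups())
        if is_valid_nhs_number(digits):
            hits.append(digits)
    return hits
```

Run as a pre-commit hook or log filter, a non-empty result from `find_nhs_numbers` would be a reason to block the commit or drop the line for review.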

📏 Measuring impact​

  • Coverage: % projects with completed IG checklist & DPIA reference
  • Security: zero committed secrets; time to rotate compromised keys
  • Auditability: % pipelines logging reads/writes with correlation IDs
  • Timeliness: time from request → approved secure share
  • Quality: leakage incidents (target: zero); suppression rule adherence

📚 References & next​

See also: Secrets & .env · Docker · GitHub · Evidence.dev · FastAPI · AWS · Azure
