Skip to main content

SQL β€” NHS Quickstart

πŸ“š ANSI SQL Β· T-SQL (SQL Server) Β· PostgreSQL Β· DuckDB Β· Views Β· Indexing Β· RLS
Why SQL in the NHS

SQL is the lingua franca for NHS data: warehouses, reporting marts, and analytics tools all speak it. Writing portable SQL means the same logic can run on SQL Server in your Trust, PostgreSQL in research, or DuckDB on your laptop.

Great for: BI Analysts Β· Data Scientists Β· Developers Β· Data Engineers.


βš™οΈ 10-minute connect (pick your engine)​

Client: SSMS or VS Code (SQL Tools / mssql).

Connection: SERVER=HOSTNAME; Trusted_Connection=Yes; Encrypt=Yes;

Named instance: SERVER=HOST\INSTANCE

Smoke test
SELECT SUSER_SNAME() AS whoami, DB_NAME() AS dbname, SYSDATETIMEOFFSET() AS now_utc;

πŸš€ β€œHello NHS” queries (portable patterns)​

Top practices by appointments (past 30d)

SELECT practice_id, COUNT(*) AS total_appointments
FROM appointments
WHERE start_time >= (CURRENT_DATE - INTERVAL '30 day')
GROUP BY practice_id
ORDER BY total_appointments DESC
FETCH FIRST 20 ROWS ONLY;

Reusable view (attendance rate)

CREATE VIEW vw_attendance AS
SELECT practice_id,
AVG(CASE WHEN attended THEN 1.0 ELSE 0.0 END) AS attendance_rate
FROM appointments
GROUP BY practice_id;

πŸ”’ Parameterised access (safe from apps)​

Avoid string concatenation. Use bound parameters from your app layer.

Python + SQLAlchemy
from sqlalchemy import create_engine, text
import pandas as pd

engine = create_engine("mssql+pyodbc:///?odbc_connect=...") # or postgresql+psycopg / duckdb://
q = text("""
SELECT practice_id, total_appointments
FROM vw_PracticeKPI_Monthly
WHERE month = :month
ORDER BY total_appointments DESC
""")
df = pd.read_sql(q, engine, params={"month":"2025-07"})

⚑ Indexing 101 (make dashboards fast)​

Choose indexes for WHERE, JOIN, and ORDER BY columns used by your dashboards.

-- Filter by start_time; group by practice_id
CREATE INDEX IX_appts_start_practice
ON dbo.appointments(start_time, practice_id);

Materialisation for speed

  • Use a scheduled job to populate a summary table (preferred).
  • Indexed views can help but have trade-offs; test carefully.

🧭 Row-level security (RLS) & access control​

Restrict users to the rows they are allowed to see.

  1. Create a predicate function that returns 1/0 for allowed rows.
  2. Attach a security policy to the table.
-- Example sketch; adapt to your Trust context
CREATE SCHEMA sec;
GO
CREATE FUNCTION sec.fn_limit_practice(@practice_id nvarchar(20))
RETURNS TABLE WITH SCHEMABINDING AS
RETURN SELECT 1 AS allow
WHERE @practice_id IN (SELECT practice_id FROM dbo.user_practices WHERE user_name = USER_NAME());
GO
CREATE SECURITY POLICY sec.policy_practice
ADD FILTER PREDICATE sec.fn_limit_practice(practice_id) ON dbo.appointments
WITH (STATE = ON);

πŸ•’ Temporal/audit patterns​

  • SQL Server: system-versioned temporal tables for history; or CDC if enabled by your DBA.
  • PostgreSQL: use audit triggers (e.g., created_at, valid_from/valid_to) or extensions (e.g., pgaudit); materialise snapshots.
  • DuckDB: append-only datasets with version columns; keep dated Parquet partitions.

πŸ“¦ Safe extracts for analysis​

Prefer columnar Parquet for extracts, regenerated on a schedule.

Extract β†’ Parquet (any engine)
import pandas as pd
# df = pd.read_sql(..., engine)
df.to_parquet("out/kpi_monthly.parquet", index=False)

Point Dash/Evidence.dev at this file for fast, reproducible loads.


πŸ›‘ IG & safety checklist​

  • Follow least-privilege: read-only roles for analysts; separate service accounts for apps.
  • Use parameterised queries; avoid ad-hoc concatenation from user input.
  • Apply small-number suppression and avoid free-text PHI in exports.
  • Log who ran what (e.g., via app layer) and keep a KPI definitions register.
  • Store connection strings in a secret store; never in code or Git.

πŸ“ Measuring impact​

  • Latency: query β†’ dashboard render time; nightly refresh duration.
  • Stability: failed refreshes; long-running query count.
  • Quality: validation pass rate; divergence between view vs. source.
  • Reusability: number of teams using the same views/materialised outputs.

πŸ”— See also​

See also: Python Β· DuckDB Β· Dash Β· Evidence.dev Β· Docker Β· GitHub Β· Secrets & .env

What’s next?

You’ve completed the Learn β€” SQL stage. Keep momentum: