
BI Analyst

πŸ’Ό Business Intelligence Β· SQL Server Β· Power BI Β· Dash/Evidence.dev
🎯 Why this matters now

The 10-Year Health Plan pushes three shifts:

  • Hospital β†’ Community
  • Analogue β†’ Digital
  • Sickness β†’ Prevention

Powered by five technologies:

  1. Data
  2. AI
  3. Genomics
  4. Wearables
  5. Robotics

This path focuses on data + AI foundations you can ship now: cleaner datasets, faster KPIs, and reproducible dashboards that support the NHS App as a β€œdigital front door”.


πŸ‘€ Role snapshot

You turn raw Trust data into decisions: operational KPIs, elective recovery metrics, access/flow, and population-health signals. The core stack is SQL Server β†’ Power BI/Excel. Add lightweight open-source tooling to:

  • Automate extracts & refreshes (Python)
  • Prototype dashboards without extra licences (Dash/Evidence.dev)
  • Version analysis safely (Git/GitHub)
  • Package shareable artefacts (CSV/Parquet + docs)
See also: SQL Server Β· Python Β· Dash Β· Evidence.dev Β· Secrets & .env

🎯 Outcomes to target (aligned to the Plan)

  • Digital access: self-service bookings, secure messages, NHS App engagement
  • Flow & access: median wait, DNA rate, discharge readiness
  • Community shift: neighbourhood/virtual ward coverage
  • Prevention: recall completeness for long-term conditions (LTCs); risk-triggered outreach
  • Productivity: refresh latency; % reports under version control; analyst hours saved

βš™οΈ 90-minute quickstart​

Goal: publish a stable KPI view + tiny dashboard from SQL Server.

1) T-SQL view (β‰ˆ15 min)

dbo.vw_PracticeKPI.sql

```sql
CREATE OR ALTER VIEW dbo.vw_PracticeKPI AS
WITH recent AS (
    SELECT
        ap.practice_id,
        ap.attended,
        -- PERCENTILE_CONT is window-only in T-SQL, so compute the per-practice median first
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY DATEDIFF(MINUTE, ap.booked_at, ap.start_time))
            OVER (PARTITION BY ap.practice_id) AS median_wait_minutes
    FROM dbo.appointments AS ap
    WHERE ap.start_time >= DATEADD(DAY, -30, SYSUTCDATETIME())
)
SELECT
    practice_id,
    COUNT(*) AS total_appointments,
    AVG(CASE WHEN attended = 1 THEN 1.0 ELSE 0.0 END) AS attendance_rate,
    MAX(median_wait_minutes) AS median_wait_minutes
FROM recent
GROUP BY practice_id;
GO
```

Tip: Add or confirm indexes aligned to the view's filters (e.g., on start_time) for fast refresh.


2) Python extract β†’ Parquet (β‰ˆ25 min)

```python
# extract_kpi.py
import os
import urllib.parse

import pandas as pd
from dotenv import load_dotenv
from sqlalchemy import create_engine, text

load_dotenv()

params = urllib.parse.quote_plus(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    f"SERVER={os.getenv('SQLSERVER_SERVER')};"
    f"DATABASE={os.getenv('SQLSERVER_DATABASE')};"
    "Trusted_Connection=Yes;Encrypt=Yes;"
)
engine = create_engine(f"mssql+pyodbc:///?odbc_connect={params}")

sql = text("""
    SELECT practice_id, total_appointments, attendance_rate, median_wait_minutes
    FROM dbo.vw_PracticeKPI
""")
df = pd.read_sql(sql, engine)

os.makedirs("out", exist_ok=True)
df.to_parquet("out/kpi.parquet", index=False)  # needs a Parquet engine such as pyarrow
print(df.head())
```

.env (local only; do not commit):

```text
SQLSERVER_SERVER=YOURSERVER
SQLSERVER_DATABASE=NHS_Analytics
```

3) Minimal dashboard (β‰ˆ25 min)

A) Dash (Python)

```python
# app.py
import pandas as pd
import plotly.express as px
import dash
from dash import dcc, html

df = pd.read_parquet("out/kpi.parquet")
fig = px.bar(
    df,
    x="practice_id",
    y="total_appointments",
    title="Appointments by Practice (30d)",
)

app = dash.Dash(__name__)
app.layout = html.Div([html.H2("NHS KPI (Demo)"), dcc.Graph(figure=fig)])

if __name__ == "__main__":
    app.run(debug=True)  # app.run_server is deprecated in current Dash releases
```

B) Evidence.dev (SQL + Markdown)

File: src/pages/kpi.md

---
title: NHS KPI
---

```sql kpi
SELECT practice_id, total_appointments, attendance_rate, median_wait_minutes
FROM dbo.vw_PracticeKPI
ORDER BY total_appointments DESC
```

# Appointments by Practice

<Table data={kpi} />

Run

```bash
python extract_kpi.py   # refresh the Parquet extract
python app.py           # serve the Dash app locally
```

(The Evidence.dev page also needs a SQL Server data source configured in the Evidence project before it will run.)

πŸ—“οΈ Week-one build (repeatable, safe)​

Day 1 – Data contracts

  • One authoritative view per KPI; add created_utc, source_system, short definition.
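A lightweight, versionable way to hold those definitions is a small metadata record saved next to the outputs; the sketch below is illustrative (the field names are examples, not a Trust standard).

```python
# kpi_contract.py: write a small, versionable definition record per KPI (illustrative)
import json
import os
from datetime import datetime, timezone

contract = {
    "kpi": "attendance_rate",
    "definition": "Attended appointments / total appointments, trailing 30 days",
    "source_system": "dbo.vw_PracticeKPI",
    "owner": "BI team",
    "created_utc": datetime.now(timezone.utc).isoformat(),
}

os.makedirs("out", exist_ok=True)
with open("out/attendance_rate.contract.json", "w") as f:
    json.dump(contract, f, indent=2)
```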

Day 2 – Automation

  • Schedule extract_kpi.py; write to read-only share (Parquet/CSV).
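If the Trust allows scheduled jobs on an analytics box, one minimal pattern is a thin wrapper that logs each run and exits non-zero on failure so Task Scheduler or cron can flag it; the file names below are illustrative.

```python
# run_extract.py: thin scheduler wrapper around extract_kpi.py (names are illustrative)
import logging
import subprocess
import sys

logging.basicConfig(
    filename="extract.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

# Run the extract as a child process so a failure can be logged and surfaced
result = subprocess.run([sys.executable, "extract_kpi.py"], capture_output=True, text=True)
if result.returncode == 0:
    logging.info("extract succeeded")
else:
    logging.error("extract failed: %s", result.stderr.strip())

# A non-zero exit code lets Task Scheduler / cron flag the failed run
sys.exit(result.returncode)
```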

Day 3 – Dashboard + governance

  • Dash/Evidence.dev page; show β€œData last updated”; inline KPI definitions.
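For the β€œData last updated” stamp, one simple option (assuming the quickstart's Parquet extract) is to surface the extract file's modified time:

```python
# last_updated.py: build a "Data last updated" banner from the extract file (sketch)
from datetime import datetime
from pathlib import Path

from dash import html


def last_updated_banner(path: str = "out/kpi.parquet") -> html.Div:
    """Small banner showing when the Parquet extract was last written."""
    stamp = datetime.fromtimestamp(Path(path).stat().st_mtime)
    return html.Div(f"Data last updated: {stamp:%Y-%m-%d %H:%M}")
```

Add the returned component to app.layout next to the graph.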

Day 4 – Version control & review

  • Put SQL + Python + dashboard in GitHub; enable PRs + basic CI.
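Basic CI can start as a single pytest check that the published extract still honours the KPI definitions; a hedged example using the quickstart's file layout:

```python
# test_kpi_extract.py: minimal checks a CI job could run against the extract (sketch)
import pandas as pd


def test_kpi_extract_is_sane():
    df = pd.read_parquet("out/kpi.parquet")
    # Every practice should appear once, with counts and rates in valid ranges
    assert df["practice_id"].is_unique
    assert (df["total_appointments"] > 0).all()
    assert df["attendance_rate"].between(0, 1).all()
```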

Day 5 – IG & quality guardrails

  • Synthetic data in dev; secrets from Trust store in prod.
  • Validation: row counts vs yesterday; rate thresholds.
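A sketch of those checks, assuming yesterday's extract is snapshotted alongside today's (the snapshot name is illustrative):

```python
# validate_extract.py: compare today's extract with yesterday's before publishing (sketch)
import sys

import pandas as pd

today = pd.read_parquet("out/kpi.parquet")
yesterday = pd.read_parquet("out/kpi_yesterday.parquet")  # illustrative snapshot name

problems = []

# Row counts should not swing wildly day to day
if abs(len(today) - len(yesterday)) > 0.1 * max(len(yesterday), 1):
    problems.append("row count changed by more than 10%")

# Rates must stay within plausible bounds
if not today["attendance_rate"].between(0, 1).all():
    problems.append("attendance_rate outside [0, 1]")

if problems:
    print("Validation failed:", "; ".join(problems))
    sys.exit(1)
print("Validation passed")
```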

🧰 Open-source augmentations (pick 2)

| Need | Add this | Why |
| --- | --- | --- |
| Faster ad-hoc analysis | DuckDB | Fast SQL over CSV/Parquet; zero infra. |
| Thin, shareable reports | Evidence.dev | SQL + Markdown β†’ static site; auditable. |
| Lightweight APIs | FastAPI | Serve one KPI endpoint for internal tools. |
| Reproducibility | Git + GitHub | Track changes; PR reviews; simple CI. |

See also: DuckDB Β· Evidence.dev Β· FastAPI Β· Git Β· GitHub
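To get a feel for the DuckDB row, a minimal sketch that queries the quickstart's Parquet extract directly (assumes the duckdb package is installed):

```python
# adhoc_duckdb.py: ad-hoc SQL over the Parquet extract, no server required (sketch)
import duckdb

con = duckdb.connect()  # in-memory database
top = con.execute(
    """
    SELECT practice_id, total_appointments, attendance_rate
    FROM 'out/kpi.parquet'
    ORDER BY total_appointments DESC
    LIMIT 10
    """
).df()
print(top)
```

And for the FastAPI row, a minimal internal endpoint over the same extract (the route and response shape are illustrative):

```python
# kpi_api.py: expose the KPI extract as one internal endpoint (sketch)
import pandas as pd
from fastapi import FastAPI

app = FastAPI(title="KPI API (demo)")


@app.get("/kpi")
def get_kpi():
    # Re-read on each request so the endpoint tracks the scheduled extract
    df = pd.read_parquet("out/kpi.parquet")
    return df.to_dict(orient="records")
```

Run it locally with uvicorn kpi_api:app --reload.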


πŸ›‘οΈ IG & safety checklist​

  • Use synthetic or de-identified samples in dev.
  • Keep secrets in a secret store; never in code or git.
  • Document KPI logic (what, why, source, owner).
  • Aggregate before export; apply suppression rules.

See also: Secrets & .env
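The aggregation/suppression bullet can be mechanised before anything leaves the analytics environment; the threshold below is a placeholder, so apply your local IG / disclosure-control rules:

```python
# suppress.py: small-number suppression before an aggregate export (sketch)
import pandas as pd

SUPPRESSION_THRESHOLD = 5  # placeholder; use the value set by local IG policy


def suppress_small_counts(df: pd.DataFrame, count_col: str) -> pd.DataFrame:
    """Blank counts below the threshold so small groups are not disclosed."""
    out = df.copy()
    small = out[count_col] < SUPPRESSION_THRESHOLD
    out[count_col] = out[count_col].mask(small)  # suppressed cells become NaN
    return out


kpi = pd.read_parquet("out/kpi.parquet")
suppress_small_counts(kpi, "total_appointments").to_csv("out/kpi_export.csv", index=False)
```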


πŸ“ Measuring impact​

  • Latency: source load β†’ dashboard refresh (target < 30 min).
  • Adoption: views/week; decisions/actions logged.
  • Quality: % successful refreshes; validation pass rate.
  • Productivity: hours saved via automation; PR cycle time.
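One rough way to evidence the latency target, assuming the load job records when the source finished loading; the source timestamp below is hard-coded purely for illustration:

```python
# latency_check.py: minutes from source load to extract refresh (sketch)
from datetime import datetime, timezone
from pathlib import Path

# Assumption: the warehouse load completion time is known (e.g. logged by the load job);
# it is hard-coded here only to keep the sketch self-contained.
source_loaded_at = datetime(2025, 1, 6, 7, 0, tzinfo=timezone.utc)

extract_refreshed_at = datetime.fromtimestamp(
    Path("out/kpi.parquet").stat().st_mtime, tz=timezone.utc
)

latency_minutes = (extract_refreshed_at - source_loaded_at).total_seconds() / 60
print(f"Refresh latency: {latency_minutes:.1f} min (target < 30)")
```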