Git - NHS Quickstart
Reproducibility and auditability are information governance (IG) wins. Git tracks every change to SQL, notebooks, and dashboards, supports peer review through pull requests, and lets you roll back safely.
Great for: Everyone (BI Analyst · Data Scientist · Developer · Data Engineer · IG).
10-minute install & setup
Windows: install Git for Windows. Then:
git --version
git config --global user.name "Your Name"
git config --global user.email "you@nhs.net"
git config --global core.autocrlf true
macOS / Linux: install via Homebrew / apt / dnf. Then:
git --version
git config --global user.name "Your Name"
git config --global user.email "you@nhs.net"
git config --global core.autocrlf input
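To confirm the settings took effect, it can help to read them back. This is a minimal sketch; `show_git_setup` is a made-up helper name for this example, not a Git command.

```shell
# Read back the global settings configured above.
# show_git_setup is a hypothetical helper name, not part of Git.
show_git_setup() {
  for key in user.name user.email core.autocrlf; do
    # --get prints the value, or exits non-zero when the key is unset
    value=$(git config --global --get "$key" 2>/dev/null) || value="<not set>"
    printf '%s = %s\n' "$key" "$value"
  done
}

show_git_setup
```

Any key reported as `<not set>` means the matching `git config --global` command has not been run for the current user.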
"Hello NHS" workflow (new repo)
mkdir nhs-report && cd nhs-report
git init -b main        # Git 2.28+: start on a branch named main
echo "# NHS Report" > README.md
echo ".env" >> .gitignore
git add .
git commit -m "init: report skeleton"
Create a feature branch:
git switch -c feature/add-kpi-sql
Make changes, then commit:
git add sql/vw_PracticeKPI.sql
git commit -m "feat(sql): add vw_PracticeKPI (30d median wait, attendance)"
Connect to a remote
GitHub (NHS org): create an empty repo, then:
git remote add origin https://github.com/ORG/nhs-report.git
git push -u origin main                   # publish the base branch first (it may be called master on older setups)
git push -u origin feature/add-kpi-sql
Azure DevOps / GitLab: use the HTTPS URL provided by your platform, then:
git remote add origin https://dev.azure.com/ORG/nhs-report/_git/nhs-report
git push -u origin main                   # publish the base branch first
git push -u origin feature/add-kpi-sql
Open a Pull Request for review, then merge to main.
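The push-and-track step can be rehearsed without network access. The sketch below uses a local bare repository as a stand-in for the hosted remote. The temporary paths, the demo identity, and the placeholder SQL file are assumptions made for illustration only.

```shell
# Rehearse "connect to a remote" offline: a bare repo plays the hosted remote.
set -e
demo=$(mktemp -d)
git init -q --bare "$demo/remote.git"          # stand-in for GitHub / Azure DevOps

git init -q "$demo/nhs-report"
cd "$demo/nhs-report"
git symbolic-ref HEAD refs/heads/main          # name the first branch main (works on older Git too)
git config user.name "Demo User"               # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "# NHS Report" > README.md
git add README.md
git commit -q -m "init: report skeleton"

git remote add origin "$demo/remote.git"
git push -q -u origin main                     # publish the base branch first

git switch -q -c feature/add-kpi-sql
mkdir -p sql
echo "-- placeholder view" > sql/vw_PracticeKPI.sql
git add sql/vw_PracticeKPI.sql
git commit -q -m "feat(sql): add vw_PracticeKPI"
git push -q -u origin feature/add-kpi-sql      # -u records the upstream for later pull/push

git branch -vv                                 # shows each branch with the remote branch it tracks
```

With a real platform the only difference is the URL passed to `git remote add`; the Pull Request itself is opened in the platform's web interface.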
Repo layout (analytics-friendly)
nhs-report/
  README.md
  .gitignore
  sql/
    views/
    seeds/          # small synthetic data
  notebooks/
  src/              # scripts (python/R)
  dashboards/       # dash/shiny/evidence
  tests/
Keep KPI definitions in README.md (what, why, source, owner).
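The layout above can be scaffolded in one go. This sketch assumes `views/` and `seeds/` sit under `sql/`, and it uses `.gitkeep` placeholder files, which are a common convention rather than a Git feature (Git does not track empty directories).

```shell
# Create the folder skeleton in the current directory.
# Assumption: views/ and seeds/ are subfolders of sql/.
root="nhs-report"
for dir in sql/views sql/seeds notebooks src dashboards tests; do
  mkdir -p "$root/$dir"
  touch "$root/$dir/.gitkeep"     # placeholder so the empty folder can be committed
done
touch "$root/README.md" "$root/.gitignore"

find "$root" -type d | sort       # list the directories that now exist
```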
.gitignore for NHS projects
Create .gitignore:
# secrets & creds
.env
*.key
*.pem
# local data
/data/*
!data/README.md
# notebooks caches
.ipynb_checkpoints/
# python
__pycache__/
*.pyc
.venv/
# R
.Rhistory
.Rproj.user
# node
node_modules/
dist/
Add a data/README.md explaining how to obtain data. Never commit confidential data.
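Before relying on the ignore rules, it is worth checking what they actually catch. The sketch below tries a subset of the rules in a throwaway repository; the file names are invented for the demonstration.

```shell
# Check which files the ignore rules catch, in a throwaway repo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q

# A subset of the rules listed above
printf '%s\n' '.env' '*.pem' '/data/*' '!data/README.md' > .gitignore

mkdir -p data
touch .env server.pem data/extract.csv data/README.md report.sql   # invented file names

echo "Ignored:"
git ls-files --others --ignored --exclude-standard
echo "Still visible to git add:"
git ls-files --others --exclude-standard
```

`data/README.md` should appear in the second list only: `/data/*` ignores the folder's contents and the `!data/README.md` line re-includes that one file.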
Large files (CSV/Parquet)
Prefer linking to large files, or regenerating them with scripts, over committing them. If you must version big non-sensitive assets:
- Use Git LFS (only for non-confidential, public-suitable files).
- Or store in a secure bucket/share and reference in the README with checksums.
git lfs install
git lfs track "*.parquet"
git add .gitattributes
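For the secure-share option, a checksum file kept in the repo lets anyone confirm they fetched the right copy. This is a sketch with a synthetic stand-in file; the file names are invented. `sha256sum` is the GNU coreutils tool, and on macOS `shasum -a 256` is the usual equivalent.

```shell
# Record a SHA-256 checksum for a file stored outside the repo,
# then verify a local copy against it.
set -e
work=$(mktemp -d)
cd "$work"

# Synthetic stand-in for a large extract (invented columns and values)
printf 'practice_code,median_wait_days\nA0001,12\n' > kpi_extract.csv

sha256sum kpi_extract.csv > kpi_extract.csv.sha256   # commit this small file, not the data
cat kpi_extract.csv.sha256

sha256sum -c kpi_extract.csv.sha256                  # prints "kpi_extract.csv: OK" when the file matches
```

If the data file changes, `sha256sum -c` reports a failure and exits non-zero, which makes it easy to use as a first step in a pipeline.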
Pre-commit checks (optional but recommended)
Use pre-commit hooks to catch secrets and lint notebooks/SQL.
pip install pre-commit
.pre-commit-config.yaml (example):
repos:
  - repo: https://github.com/zricethezav/gitleaks
    rev: v8.18.4
    hooks:
      - id: gitleaks
  - repo: https://github.com/nbQA-dev/nbQA
    rev: 1.9.0
    hooks:
      - id: nbqa-black
        additional_dependencies: [black==24.4.0]
      - id: nbqa-isort
Then:
pre-commit install
IG & safety checklist
- Never commit secrets (.env, keys, connection strings).
- Use synthetic/de-identified samples in the repo.
- Document data sources and suppression rules.
- Protect main with branch protection + PR reviews.
- Record approvals in PRs for an audit trail.
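As a quick manual complement to the first checklist item, staged file names can be screened before committing. This is only a sketch: `staged_secret_files` is a hypothetical helper, its patterns mirror the .env / *.key / *.pem examples, and it checks file names only, so it does not replace a content scanner such as gitleaks.

```shell
# List staged files whose names look like secrets.
# staged_secret_files is a hypothetical helper, not a Git command.
staged_secret_files() {
  git diff --cached --name-only | grep -E '(^|/)\.env$|\.(key|pem)$'
}

# Demo in a throwaway repo
demo=$(mktemp -d)
cd "$demo"
git init -q
touch .env report.sql
git add -f .env report.sql        # -f only so the demo works even if .env is ignored globally

if staged_secret_files; then
  echo "Unstage the files listed above before committing."
else
  echo "No secret-like file names staged."
fi
```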
Everyday commands
git status # what changed
git add <path> # stage changes
git commit -m "msg" # commit staged changes
git switch -c <name> # create/switch branch
git pull --rebase # update your branch
git push # publish to remote
git log --oneline --graph --decorate --all
Releases & tags
Tag a snapshot of main used in production dashboards/pipelines:
git tag -a v0.1.0 -m "first KPI draft"
git push origin v0.1.0
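To check which tagged snapshot a checkout corresponds to, the tag can be read back. The sketch below runs in a throwaway repository with a placeholder identity, so it skips the push.

```shell
# Tag a commit, then read the tag back, in a throwaway repo.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"            # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "# NHS Report" > README.md
git add README.md
git commit -q -m "init: report skeleton"

git tag -a v0.1.0 -m "first KPI draft"

git tag -n                                  # list tags with the first line of each message
git describe --tags --exact-match           # prints v0.1.0 when HEAD sits exactly on the tag
```

Annotated tags (`-a`) store the tagger, date, and message, which is useful when a tag doubles as a release record.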
Measuring impact
- Review coverage: % of merges via PR.
- Recovery: time to revert a bad change.
- Security: zero secrets committed; gitleaks clean.
- Reproducibility: runs from clean clone with documented steps.
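The Recovery measure depends on reverting being routine. A minimal sketch of undoing a bad commit with `git revert`, run in a throwaway repository; the SQL content, commit messages, and identity are invented for the demonstration.

```shell
# Undo a bad commit with git revert, which adds a new commit instead of
# rewriting history, so the audit trail stays intact.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo User"            # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "SELECT 1 AS kpi;" > kpi.sql
git add kpi.sql
git commit -q -m "feat(sql): add kpi"

echo "SELECT 0 AS kpi;" > kpi.sql           # the "bad" change
git commit -q -am "fix(sql): wrong threshold"

git revert --no-edit HEAD                   # new commit that reverses the last one
git log --oneline                           # three commits: add, bad change, revert
cat kpi.sql                                 # back to: SELECT 1 AS kpi;
```

On a protected main branch the same revert would normally go through a Pull Request like any other change.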
What's next?
You've completed the Learn - Git stage. Keep momentum: