EDAIC Statistics Made Simple: The High-Yield Essentials
Master the small but predictable EDAIC statistics syllabus with plain-English explanations of data types, sensitivity, specificity, p-values, confidence intervals and study design—easy marks for Part 1.

Statistics questions in the EDAIC Part 1 written examination are among the most predictable and high-yield topics you will encounter. The syllabus is narrow, the question stems follow recognisable patterns, and the marks are there for the taking if you invest a few focused hours. This guide distils the EDAIC statistics essentials into plain English, covering everything from types of data through to sensitivity, specificity, p-values, confidence intervals and basic study design: the core material that appears year after year in EDAIC Part 1 Paper A.
Why Statistics Matters in EDAIC Part 1
Statistics sits within the EDAIC basic sciences domain of Paper A, alongside physics, clinical measurement and equipment. It typically accounts for a handful of Multiple True/False (MTF) statements per sitting: not a large proportion, but enough to make a material difference to your overall score. Because the syllabus is finite and the question types recur, statistics offers a better return on study time than almost any other topic. You do not need a degree in mathematics. You need to recognise the terminology, understand the concepts at the level expected of a safe clinician interpreting research, and practise applying that knowledge to MTF stems.
Types of Data and Distributions
Categorical vs Continuous Data
Data are broadly either categorical (qualitative) or continuous (quantitative). Categorical data describe qualities or groups: nominal data have no inherent order (blood group A, B, AB, O), while ordinal data have a meaningful sequence (ASA grade I–V, pain score 0–10). Continuous data are measured on a scale and can take any value within a range: examples include heart rate, blood pressure, and drug concentration. Quantitative data can also be discrete, taking whole-number values only (for example, a count of intubation attempts).
Knowing which type of data you are dealing with determines the correct statistical test. Categorical data are analysed with chi-squared or Fisher's exact test; continuous data with t-tests, ANOVA or non-parametric equivalents depending on distribution.
Normal (Gaussian) Distribution
Many physiological variables follow a normal distribution: a symmetrical, bell-shaped curve defined by its mean and standard deviation (SD). In a normal distribution, approximately 68% of values lie within ±1 SD of the mean, approximately 95% within ±2 SD (strictly, ±1.96 SD), and 99.7% within ±3 SD. This "68–95–99.7 rule" appears regularly in EDAIC questions.
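The 68–95–99.7 figures can be checked numerically. The sketch below is a minimal illustration, assuming a perfectly normal distribution; the helper name `fraction_within` is hypothetical and chosen only for this example.

```python
import math


def fraction_within(k: float) -> float:
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    # For a standard normal variable Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within +/-{k} SD: {fraction_within(k):.1%}")
```

Running it gives roughly 68.3%, 95.4% and 99.7%. Calling `fraction_within(1.96)` returns almost exactly 0.95, which is why ±1.96 SD is the strict figure behind the "95% within ±2 SD" shorthand.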
When data are not normally distributed—skewed by outliers or bounded at one end—non-parametric tests (Mann–Whitney U, Wilcoxon signed-rank) are more appropriate than parametric tests that assume normality.
Exam tip: If a question describes data as "normally distributed," expect statements about mean, SD and parametric tests. If it mentions "skewed" or "non-parametric," look for median, interquartile range and rank-based tests.
Descriptive vs Inferential Statistics
Descriptive statistics summarise and present data: measures of central tendency (mean, median, mode) and measures of spread (range, interquartile range, standard deviation, variance). The mean is sensitive to outliers; the median is more robust for skewed data. Standard deviation quantifies variability around the mean; variance is the square of the standard deviation.
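To see why the mean is sensitive to outliers while the median is not, here is a small sketch using Python's standard `statistics` module. The heart-rate values are invented for illustration, and `summarise` is a hypothetical helper name.

```python
import statistics

# Hypothetical heart rates (beats per minute), invented for illustration.
heart_rates = [62, 64, 66, 68, 70]
with_outlier = heart_rates + [150]  # one tachycardic outlier added


def summarise(data):
    """Return the mean, median, sample SD and sample variance of data."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "sd": statistics.stdev(data),
        "variance": statistics.variance(data),
    }


for label, data in (("no outlier", heart_rates), ("with outlier", with_outlier)):
    s = summarise(data)
    print(f"{label}: mean={s['mean']:.1f} median={s['median']:.1f} "
          f"sd={s['sd']:.1f} variance={s['variance']:.1f}")
```

A single extreme value moves the mean from 66 to 80, while the median shifts only from 66 to 67. In both cases the variance equals the SD squared.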
Inferential statistics allow us to draw conclusions about a population from a sample. This is where hypothesis testing, p-values, confidence intervals and study design come into play. The EDAIC expects you to understand the logic of inference—what a p-value tells you, what a confidence interval represents, and how sample size affects precision—rather than to perform calculations by hand.
Sensitivity, Specificity and Predictive Values
These are the bread and butter of diagnostic test evaluation and appear in almost every EDAIC sitting.
Definitions
- Sensitivity (true positive rate): the proportion of patients with the disease who test positive. High sensitivity means few false negatives; a sensitive test is good for ruling out disease when negative (SnNout: Sensitivity, Negative, rule out).
- Specificity (true negative rate): the proportion of patients without the disease who test negative. High specificity means few false positives; a specific test is good for ruling in disease when positive (SpPin: Specificity, Positive, rule in).
- Positive predictive value (PPV): the probability that a patient with a positive test actually has the disease. PPV depends on prevalence: the higher the prevalence, the higher the PPV.
- Negative predictive value (NPV): the probability that a patient with a negative test is truly disease-free. NPV also depends on prevalence: the lower the prevalence, the higher the NPV.
The 2×2 Table
Construct a simple table:
| | Disease Present | Disease Absent |
|---|---|---|
| Test Positive | True Positive (TP) | False Positive (FP) |
| Test Negative | False Negative (FN) | True Negative (TN) |
- Sensitivity = TP / (TP + FN)
- Specificity = TN / (TN + FP)
- PPV = TP / (TP + FP)
- NPV = TN / (TN + FN)
Key point: Sensitivity and specificity are intrinsic properties of the test and do not change with disease prevalence. Predictive values are extrinsic and vary with prevalence.
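The formulas above, and the effect of prevalence on predictive values, can be sketched in a few lines of Python. This is an illustrative example only: the test characteristics (90% sensitivity and specificity), the population size and the function names are all assumptions made for the demonstration.

```python
def table_from_prevalence(prevalence, sensitivity, specificity, n=10_000):
    """Build expected 2x2 counts (TP, FP, FN, TN) for a population of n."""
    diseased = n * prevalence
    healthy = n - diseased
    tp = diseased * sensitivity   # diseased patients who test positive
    fn = diseased - tp            # diseased patients who test negative
    tn = healthy * specificity    # healthy patients who test negative
    fp = healthy - tn             # healthy patients who test positive
    return tp, fp, fn, tn


def diagnostic_metrics(tp, fp, fn, tn):
    """Apply the four 2x2 table formulas."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical test applied at two different prevalences.
for prevalence in (0.01, 0.50):
    m = diagnostic_metrics(*table_from_prevalence(prevalence, 0.90, 0.90))
    print(f"prevalence={prevalence:.0%}: "
          f"sens={m['sensitivity']:.2f} spec={m['specificity']:.2f} "
          f"PPV={m['ppv']:.3f} NPV={m['npv']:.3f}")
```

With these assumed numbers, sensitivity and specificity stay at 0.90 in both populations, but the PPV is only about 8% at 1% prevalence and 90% at 50% prevalence, while the NPV falls from about 99.9% to 90%.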
Types of Error and Statistical Power
Type I and Type II Errors
When testing a hypothesis, two kinds of error are possible:
- Type I error (α): rejecting the null hypothesis when it is actually true—a false positive conclusion. The significance level (commonly 0.05) sets the acceptable risk of Type I error. A p-value < 0.05 means the probability of observing the data (or more extreme) if the null hypothesis were true is less than 5%.
- Type II error (β): failing to reject the null hypothesis when it is false—a false negative conclusion. The probability of correctly rejecting a false null hypothesis is the power of the study, defined as 1 − β. Adequate power (typically 80% or 90%) requires sufficient sample size.
Increasing sample size reduces Type II error and increases power. Lowering the significance threshold (e.g. from 0.05 to 0.01) reduces Type I error but increases the risk of Type II error unless sample size is increased accordingly.
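The trade-off between sample size, significance threshold and power can be illustrated with a rough calculation. The sketch below assumes a two-sided comparison of two group means with a common, known SD and uses a normal approximation; the scenario (a true difference of 5 mmHg with an SD of 10 mmHg) and the name `approx_power` are hypothetical. The EDAIC does not expect you to perform this calculation, only to know the direction of each effect.

```python
from statistics import NormalDist


def approx_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) for a two-sided, two-sample comparison.

    Normal approximation with a common, known SD. The negligible chance of
    rejecting the null hypothesis in the wrong direction is ignored.
    """
    z = NormalDist()                          # standard normal distribution
    se = sd * (2 / n_per_group) ** 0.5        # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)         # about 1.96 when alpha = 0.05
    return z.cdf(abs(delta) / se - z_crit)


# Hypothetical scenario: true difference 5 mmHg, SD 10 mmHg.
for n in (20, 64, 100):
    print(f"n per group={n}: power at alpha 0.05 = {approx_power(5, 10, n):.2f}")

print(f"n per group=64: power at alpha 0.01 = {approx_power(5, 10, 64, alpha=0.01):.2f}")
```

Power rises as the sample grows, and tightening the threshold from 0.05 to 0.01 at a fixed sample size lowers it, which is the same as saying the risk of Type II error goes up.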
P-Values and Confidence Intervals
The P-Value
The p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true. It is not the probability that the null hypothesis is true, nor the probability that the result occurred by chance alone. A p-value < 0.05 is conventionally taken as evidence against the null hypothesis, but it does not measure the size or clinical importance of an effect.
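A concrete calculation can make the definition less abstract. The sketch below assumes an invented scenario: 20 patients each try two antiemetics and 15 prefer drug A. Under the null hypothesis of no preference, each patient is equally likely to favour either drug, so the p-value is an exact binomial tail probability. The function name `binomial_p_value` is hypothetical.

```python
from math import comb


def binomial_p_value(successes, n, two_sided=True):
    """Exact p-value under the null hypothesis that P(success) = 0.5."""
    # "At least as extreme" means at least as far from n/2 as the observed
    # count, so work with the larger of the two counts.
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    if not two_sided:
        return tail                 # tail in the observed direction only
    return min(1.0, 2 * tail)       # the null distribution is symmetric


p = binomial_p_value(15, 20)
print(f"two-sided p-value for 15 of 20: {p:.4f}")
```

The result is about 0.041: if there were truly no preference, a split at least as lopsided as 15 to 5 would arise in roughly 4% of such studies. It says nothing about how strong or clinically important the preference is.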
Confidence Intervals
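A 95% confidence interval (CI) is a range of values within which we are 95% confident the true population parameter lies. More formally, if the study were repeated many times, about 95% of the intervals calculated in this way would contain the true value. If a 95% CI for a difference between two means does not include zero, the difference is statistically significant at the 0.05 level. For a ratio such as an odds ratio or relative risk, the equivalent null value is 1. Confidence intervals convey both statistical significance and the precision of the estimate: a narrow CI indicates a precise estimate (typically a large sample), while a wide CI indicates uncertainty (typically a small sample).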
Exam tip: When a question states "the 95% CI for the mean difference is 2.1 to 5.3 mmHg," you can immediately conclude that p < 0.05 because the interval excludes zero.
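As a rough illustration of where such an interval comes from, the sketch below computes a 95% CI for a difference between two means from summary statistics. It uses a normal approximation, which is an assumption that suits large samples (small samples would normally use the t distribution). All the numbers are invented, chosen so that the larger study yields an interval resembling the one in the exam tip.

```python
from statistics import NormalDist


def mean_difference_ci(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Normal-approximation CI for the difference between two group means."""
    diff = mean_a - mean_b
    se = (sd_a ** 2 / n_a + sd_b ** 2 / n_b) ** 0.5   # SE of the difference
    z = NormalDist().inv_cdf(0.5 + level / 2)          # about 1.96 for 95%
    return diff - z * se, diff + z * se


# Hypothetical systolic pressures (mmHg): same means and SDs, different n.
for n in (192, 12):
    low, high = mean_difference_ci(123.7, 8.0, n, 120.0, 8.0, n)
    excludes_zero = low > 0 or high < 0
    print(f"n per group={n}: 95% CI {low:.1f} to {high:.1f} mmHg, "
          f"excludes zero: {excludes_zero}")
```

With 192 patients per group the interval is 2.1 to 5.3 mmHg and excludes zero. With only 12 per group the same point estimate gives a much wider interval that includes zero, so the difference would not be statistically significant.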
Basic Study Design
Understanding study types and their strengths and limitations is essential for interpreting the literature and answering EDAIC questions on evidence-based practice.
Observational Studies
- Case report / case series: descriptive; no control group; useful for rare events but cannot establish causation.
- Cross-sectional study: a snapshot at one point in time; measures prevalence; cannot determine temporal sequence.
- Case-control study: compares patients with a condition (cases) to those without (controls), looking back at exposures; efficient for rare diseases; prone to recall and selection bias.
- Cohort study: follows a group over time, comparing exposed vs unexposed; can establish temporal sequence; prospective cohorts are less prone to bias than retrospective ones.
Interventional Studies
- Randomised controlled trial (RCT): participants are randomly allocated to intervention or control; the gold standard for establishing causation; randomisation minimises confounding.
- Blinding: single-blind (participant unaware), double-blind (participant and investigator unaware), triple-blind (participant, investigator and analyst unaware); reduces bias.
- Crossover trial: each participant receives both intervention and control in random order; controls for inter-individual variation; requires a washout period.
Systematic Reviews and Meta-Analysis
A systematic review uses explicit, reproducible methods to identify, appraise and synthesise all relevant studies on a question. A meta-analysis pools numerical data from multiple studies to produce a summary estimate with greater statistical power. Both sit at the top of the evidence hierarchy when well conducted.
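One common way to pool results is fixed-effect, inverse-variance weighting, in which more precise studies count for more. The source does not specify a pooling method, so treat the sketch below as one assumed approach; the trial estimates, standard errors and the name `pooled_estimate` are invented for illustration.

```python
def pooled_estimate(estimates, standard_errors):
    """Fixed-effect, inverse-variance pooled estimate and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]   # precise studies weigh more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = (1 / total) ** 0.5
    return pooled, pooled_se


# Hypothetical mean differences (mmHg) and standard errors from three trials.
estimates = [3.0, 5.0, 4.0]
standard_errors = [2.0, 2.0, 1.0]

pooled, pooled_se = pooled_estimate(estimates, standard_errors)
print(f"pooled estimate = {pooled:.2f} mmHg, SE = {pooled_se:.2f}")
```

The pooled standard error (about 0.82) is smaller than that of any single trial, which is the sense in which a meta-analysis gains statistical power.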
Frequently Asked Questions
What is the difference between sensitivity and positive predictive value?
Sensitivity is the proportion of diseased patients who test positive—an intrinsic test property. Positive predictive value is the proportion of positive tests that are true positives—it depends on disease prevalence and changes with the population tested.
How does sample size affect a confidence interval?
Larger samples produce narrower (more precise) confidence intervals. A study with a small sample will have a wide CI, reflecting greater uncertainty about the true population parameter, even if the point estimate is the same.
What does a p-value of 0.03 mean?
It means that if the null hypothesis were true, there is a 3% probability of observing a result at least as extreme as the one obtained. It does not mean there is a 3% chance the null hypothesis is true, nor that the finding is clinically important.
Why is randomisation important in a clinical trial?
Randomisation tends to distribute known and unknown confounding variables evenly between groups, so that any observed difference in outcome can reasonably be attributed to the intervention rather than to baseline imbalances.
Final Thoughts
The EDAIC statistics syllabus is small, focused and eminently learnable. Master the definitions (sensitivity, specificity, PPV, NPV, Type I and Type II error, p-value, confidence interval) and practise applying them to MTF stems. Understand the hierarchy of study designs and the logic of hypothesis testing. A few hours of deliberate practice will translate directly into marks on Paper A, and the concepts will serve you throughout your career as you appraise the literature and apply evidence to patient care.