The average systolic blood pressure (SBP) of adult American males is 120 mmHg, with a standard deviation of 15 mmHg.
Amphetamine is a drug used to treat ADHD; however, we suspect it may have a side effect that increases SBP.
To determine whether the drug affects SBP, we conduct a hypothesis test.
We collect data from treated individuals to test this hypothesis.
We use a power analysis to determine how many participants are needed to reliably detect the drug’s potential side effect on SBP.
Hypothesis testing is a statistical method used to make inferences about a population based on a sample. It involves the following steps:
State a null and an alternative hypothesis
Null Hypothesis (\(H_0\)): The drug has no effect on SBP; that is, the mean SBP of the treated group is 120 mmHg.
Alternative Hypothesis (\(H_{1}\)): The drug has a side effect; that is, the mean SBP of the treated group is greater than 120 mmHg.
\[ \begin{array}{ll} H_0 : & \mu_{treated} = 120 \\ H_1 : & \mu_{treated} > 120 \end{array} \]
Select a statistical test
In this case, we can use a one-sample z-test or one-sample t-test to compare the sample mean to the known population mean under \(H_0\).
Collect data and calculate the test statistic
We gather a sample from the treated group, compute the sample mean (\(\bar{x}\)), and use it to calculate the test statistic for the chosen test.
Draw a conclusion
Based on the test statistic, we assess whether the sample provides enough evidence to conclude that the null hypothesis should be rejected in favor of the alternative.
In hypothesis testing, our decisions are based on random sample data, which introduces uncertainty. As a result, we might sometimes make mistakes. These mistakes are classified as Type I and Type II errors.
Type I error (\(\alpha\)): Rejecting the null hypothesis when it is actually true. Also known as a false positive.
Type II error (\(\beta\)): Failing to reject the null hypothesis when the alternative hypothesis is actually true. Also known as a false negative.
\[\text{Power} = 1 - \beta\]
In this example, we assume the population standard deviation is known (\(\sigma = 15\)), so we use a one-sample z-test to construct our test.
We take a random sample of size \(n\) from the treated group and calculate the sample mean \(\bar{x}\).
We then standardize the sample mean to form a test statistic called the z-statistic:
\[z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}\]
This z-statistic follows a standard normal distribution under the null hypothesis, assuming the population standard deviation \(\sigma\) is known.
We set a significance level \(\alpha\) (e.g., 0.05), to control the probability of making a Type I error — rejecting the null hypothesis when it is actually true.
The critical value for the test is the value beyond which we reject \(H_0\):
\[\Pr(z > z_{1 - \alpha}) = \alpha\]
The power of a statistical test is the probability of correctly rejecting the null hypothesis when the alternative hypothesis is true.
In our example, the null hypothesis states that the mean SBP is 120 mmHg, while the alternative hypothesis states that it is greater than 120 mmHg.
If the true mean is actually 130 mmHg, then the power of the test is the probability that the test statistic falls in the rejection region, correctly leading us to reject the null hypothesis.
Figure 2 shows the distribution of the test statistic under the null hypothesis and under two alternative hypotheses, where the true mean SBP is assumed to be either 130 mmHg or 135 mmHg.
From the graph in the previous slides, we observe that under the alternative hypothesis, the distribution of the test statistic is shifted to the right, depending on the value of \(\mu_1\).
More specifically, the distribution of the test statistic under the alternative is normal with mean:
\[ \delta = \frac{\mu_{1} - \mu_0}{\sigma/\sqrt{n}} \] and standard deviation of 1.
The power of the test is the probability that the test statistic under the alternative distribution exceeds the critical value \(z_{1 - \alpha}\).
By setting the power to \(1 - \beta\) and using the cumulative distribution function (CDF) of the standard normal distribution, we can express the required sample size as:
\[ n = \left[\frac{z_{1 - \alpha} - z_{\beta}}{(\mu_{1} - \mu_0)/\sigma}\right]^2 \]
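To see where this formula comes from, set the power equal to \(1 - \beta\) and solve for \(n\):
\[ 1 - \beta = \Pr\left(Z > z_{1-\alpha} - \delta\right) \quad\Rightarrow\quad \delta = \frac{\mu_{1} - \mu_0}{\sigma/\sqrt{n}} = z_{1-\alpha} - z_{\beta}, \]
where \(z_{\beta} = -z_{1-\beta}\); rearranging gives the sample-size formula above.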
Effect size measures how large a difference or relationship is — not just whether it exists.
In power analysis, effect size is crucial. It directly affects how large your sample needs to be to detect a meaningful effect.
Effect sizes can be unscaled or scaled.
For example, in the one-sample mean test, the unscaled effect size is simply the raw difference between the true mean and the mean under the null:
\[ d = \mu_{1} - \mu_0 \]
and the scaled effect size standardizes this difference by dividing by the standard deviation:
\[ d_{scaled} = \frac{\mu_{1} - \mu_0}{\sigma} \]
There are many different ways to define effect size, depending on the type of test and the context of the analysis.
A low-powered study may fail to detect real effects — leading to false negatives.
A high-powered study gives you greater confidence that real effects will be detected when they exist.
Power analysis helps you choose the minimum sample size needed to detect meaningful effects with a desired level of certainty.
Component | Description |
---|---|
\(\boldsymbol{\alpha}\) (alpha) | Type I error rate (false positive rate) |
\(\boldsymbol{\beta}\) (beta) | Type II error rate (false negative rate) |
Effect size | Magnitude of the effect or difference being tested |
Sample size | Larger sample sizes increase precision and power |
Variability | Lower variability makes effects easier to detect |
The maximum probability of making a Type I error; commonly set at 0.05 for a two-sided test.
Other choices should be justified. For example, in rare diseases where participants are hard to find, a higher significance level (e.g., 0.1) may be appropriate.
May also need adjustment for multiple comparisons.
Commonly used values are \(1 - \beta = 0.8\) or \(0.9\).
If a study cannot be repeated, consider \(1 - \beta = 0.95\).
Power below 0.8 should be justified. Is it worth even conducting the study then?
The effect size should reflect a clinically or scientifically meaningful difference.
A study should not be conducted if the expected difference is not important or relevant to the research question.
For sample size calculations, use either the smallest clinically relevant difference or the expected difference, whichever is larger.
If no established threshold for clinical relevance exists, use domain expertise to define an appropriate effect size.
Nuisance parameters are not the focus of the primary research question but still affect power and sample size calculations.
These values can often be obtained from prior studies, but they typically vary or may be poorly estimated.
Therefore, it is important to account for their uncertainty by conducting a sensitivity analysis to examine how the required sample size changes with different assumptions.
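As a minimal sketch, we can recompute the required sample size from the z-test formula above for several plausible values of \(\sigma\) (the means, target power, and candidate \(\sigma\) values here are illustrative assumptions):
# Sensitivity of the required sample size to the assumed standard deviation
mu0 <- 120; mu1 <- 130                  # illustrative means
sigmas <- c(12, 15, 18)                 # plausible values of sigma
n_required <- ((qnorm(0.95) + qnorm(0.80)) / ((mu1 - mu0) / sigmas))^2
ceiling(n_required)                     # required n for each assumed sigma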
Generally, when comparing treatments, balanced groups will produce the most efficient test (smallest variance) and will thus require the smallest sample size overall.
Efficiency loss is usually modest if the allocation ratio isn’t too extreme (e.g., 2:1 is often acceptable).
Additionally, if dropouts or missing data are expected, sample size should be increased to maintain power.
In this example, we visualize the power curve for a two-sample t-test using R.
\[ \begin{array}{ll} H_0 : & \mu_1 - \mu_2 = 0 \\ H_1 : & \mu_1 - \mu_2 \neq 0 \end{array} \]
The power curve shows how the power of the test changes with sample size. The effect size varies by color (\(d = 0.2, 0.5, 0.7, 0.9\), from small to large effect sizes), and the significance level is set to 0.05 and 0.1.
Effect size (Cohen’s d) is the difference between the means divided by the pooled standard deviation:
\[ d = \frac{\mu_1 - \mu_2}{S_{pooled}} \]
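A minimal sketch of R code that could produce such a power curve, assuming the `pwr` package and showing only \(\alpha = 0.05\) (the sample-size range and colors are illustrative):
library(pwr)
n_values <- seq(5, 200, by = 5)            # sample sizes per group (illustrative)
d_values <- c(0.2, 0.5, 0.7, 0.9)          # effect sizes, small to large
plot(NULL, xlim = range(n_values), ylim = c(0, 1),
     xlab = "Sample size per group", ylab = "Power")
for (i in seq_along(d_values)) {
  power_i <- sapply(n_values, function(n)
    pwr.t.test(n = n, d = d_values[i], sig.level = 0.05,
               type = "two.sample", alternative = "two.sided")$power)
  lines(n_values, power_i, col = i)
}
legend("bottomright", legend = paste("d =", d_values),
       col = seq_along(d_values), lty = 1)
abline(h = 0.8, lty = 2)                   # reference line at 80% power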
pwr Package
The `pwr` package provides analytical power calculations for many common tests.
\[ \begin{array}{ll} H_0 : & \mu = 120 \\ H_1 : & \mu > 120 \end{array} \]
Power at \(\mu=130\),
Given: \(\sigma=15\), \(n = 16\), \(\alpha=0.05\).
library(pwr)
# Population standard deviation
sigma <- 15
#mu under null hypothesis
mu0 <- 120
#mu under alternative hypothesis
mu1 <- 130
# Effect size
d <- (mu1 - mu0) / sigma
# Power calculation for given sample size
pwr.norm.test(d = d, n = 16, sig.level = 0.05, alternative="greater")
Mean power calculation for normal distribution with known variance
d = 0.6666667
n = 16
sig.level = 0.05
power = 0.8465653
alternative = greater
# Required sample size for a given power
pwr.norm.test(d = d, power=0.8, sig.level = 0.05, alternative="greater")
Mean power calculation for normal distribution with known variance
d = 0.6666667
n = 13.91076
sig.level = 0.05
power = 0.8
alternative = greater
The sample size should be rounded up.
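For example, rounding the result above up with `ceiling()`:
# Round the calculated sample size up to the next whole participant
ceiling(pwr.norm.test(d = d, power = 0.8, sig.level = 0.05,
                      alternative = "greater")$n)
[1] 14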
# Power function of the same test for different values of mu (i.e., different effect sizes)
mu <- seq(120, 140, length.out = 100)
# Effect size
d <- (mu - mu0) / sigma
plot(d,
pwr.norm.test(d = d, n = 16, sig.level = 0.05, alternative = "greater")$power,
type = "l", xlab = "Effect size (Cohen's d)", ylab = "Power", ylim = c(0,1))
abline(h = 0.05, lty = 2)
abline(h = 0.80, col = "blue", lty = 2)
Medium effect size (Cohen’s d = 0.5)
# Calculating n for two sided, two sample t-test
pwr.t.test(d = 0.5, power = 0.8, sig.level = 0.05,
type = "two.sample", alternative = "two.sided")
Two-sample t test power calculation
n = 63.76561
d = 0.5
sig.level = 0.05
power = 0.8
alternative = two.sided
NOTE: n is number in *each* group
One-Way ANOVA
Three groups, medium effect size (Cohen’s f = 0.25)
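The output below comes from solving for \(n\), presumably with a call of the form:
# Solve for n: 3 groups, Cohen's f = 0.25, alpha = 0.05, power = 0.8
pwr.anova.test(k = 3, f = 0.25, sig.level = 0.05, power = 0.8)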
Balanced one-way analysis of variance power calculation
k = 3
n = 52.3966
f = 0.25
sig.level = 0.05
power = 0.8
NOTE: n is number in each group
#Power Calculation for correlation Test
pwr.r.test(r = 0.3, sig.level = 0.05, power = 0.8, alternative = "two.sided")
approximate correlation power calculation (arctangh transformation)
n = 84.07364
r = 0.3
sig.level = 0.05
power = 0.8
alternative = two.sided
Analytical methods rely on simplifying assumptions:
Simulations are helpful when:
Benefits of simulation-based power analysis:
1. Identify the hypothesis and the parameter of interest.
2. Define the data-generating process.
3. Simulate data using `rnorm()`, `rbinom()`, etc.
4. Fit the model.
5. Evaluate significance.
6. Repeat the process many times (e.g., 1,000 simulations), using `replicate()` or loops to repeat the simulation.
7. Estimate power.
For example, if we collect 1,000 p-values, the estimated power is:
\[ \text{Power} = \dfrac{ \sum_{i=1}^{1000} I(p_i < \alpha) } {1000 } \] Where \(I(p_i<\alpha)\) is an indicator function for the \(i\)-th simulation that equals 1 if the p-value is less than \(\alpha\), and 0 otherwise.
- `rnorm(n, mean, sd)` — normal distribution
- `rbinom(n, size, prob)` — binomial outcomes
- `runif(n, min, max)` — uniform distribution
- `sample(x, size, replace = TRUE)` — random sampling
- `t.test(x, y)` — for comparing means
- `lm(y ~ x)` — linear regression
- `glm(y ~ x, family = binomial)` — logistic regression
- `for` loops — repeat steps manually or by condition
- `replicate(n, expr)` — repeat a simulation `n` times
- `set.seed()` — for reproducibility
- `mean()`, `sd()`, `summary()` — for descriptive summaries

Turn blocks of code into functions to avoid repetition and improve readability.
my_function <- function(arg1, arg2) { ... }
Use a `for` loop:
for (i in 1:n) {
# repeat this code n times
}
nsim <- 1000
results <- numeric(nsim)
for (i in 1:nsim) {
results[i] <- my_function(arg1, arg2)
}
Or use `replicate()` for compact syntax:
results <- replicate(nsim, my_function(arg1, arg2))
We write a function to sample data from a normal distribution and calculate the mean.
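The function itself is not shown here; a minimal sketch, assuming the name and arguments used in the call below (`sample_mean(n, mean, sd)`):
# Draw a sample of size n from a normal distribution and return its mean
sample_mean <- function(n, mean = 0, sd = 1) {
  x <- rnorm(n, mean = mean, sd = sd)
  mean(x)
}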
replicate()
We use the `replicate()` function to repeat a simulation multiple times.
Here, we simulate systolic blood pressure (SBP) under the null hypothesis (\(\mu = 120\)) 1,000 times, and calculate the sample mean each time.
# Set seed for reproducibility
set.seed(101)
# Replicate 1000 times
nsim <- 1000
x_bars <- replicate(nsim, sample_mean(100, mean = 120, sd = 15))
x_bars[1:6]
[1] 119.4421 119.3700 120.0311 119.6180 116.8302 120.7440
The standard deviation of these simulated sample means (the empirical standard error) should closely approximate the theoretical value \(15/\sqrt{100} = 1.5\).
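A quick check, using the `x_bars` simulated above:
# Empirical standard error: standard deviation of the simulated sample means
sd(x_bars)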
Question: If we repeatedly take samples of size \(n = 10\) given the true population mean is 120 and conduct a hypothesis test each time, in what proportion of tests will the p-value be less than 0.05?
Hint: This proportion approximates the Type I error rate (\(\alpha\)).
# Function to sample data and calculate z-value and p-value
sample_p_value <- function(n, mu0 = 0, mu1 = 0, sd = 1) {
# Generate random sample from population with normal distribution
data <- rnorm(n, mu1, sd)
# Compute the one-sample z-statistic
z_value <- (mean(data) - mu0) / (sd / sqrt(n))
# p-value: probability that a standard normal exceeds z_value
p_value <- pnorm(z_value, lower.tail = FALSE)
return(p_value)
}
#set seed
set.seed(101)
#set the number of simulations
nsim <- 1000
#replicate "nsim" times
p_values <- replicate(nsim, sample_p_value(10, mu0 = 120, mu1 = 120, sd = 15))
#power is the proportion of p-values less than 0.05
mean(p_values < 0.05)
[1] 0.039
Simulate the statistical power when the true population mean is 125, the null hypothesis mean is 120, and the sample size is 100. Use `nsim = 1000` repetitions.
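One possible solution, mirroring the pattern above (a `for`-loop version of the same calculation appears below):
set.seed(101)
# Simulate 1000 p-values with true mean 125, null mean 120, n = 100
p_values <- replicate(1000, sample_p_value(100, mu0 = 120, mu1 = 125, sd = 15))
# Estimated power: proportion of p-values below 0.05
mean(p_values < 0.05)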
[1] 0.959
We use `for` loops to iterate over multiple parameters, conditions, or simply to repeat the same process many times. A loop can be used instead of `replicate()`.
`for` loops are more flexible and allow for complex operations, but they are generally less efficient than `replicate()` for simple tasks.
set.seed(101)
# Set number of simulations
nsim <- 1000
# Initialize an empty vector to store p-values
p_values <- rep(NA, nsim)
# Run the simulation
for (i in 1:nsim) {
p_values[i] <- sample_p_value(n = 100, mu0 = 120, mu1 = 125, sd = 15)
}
# Estimate power as the proportion of p-values less than 0.05
mean(p_values < 0.05)
[1] 0.959
We want to estimate the power of a two-sample t-test for the difference in BP between two groups, treatment and control, using simulation.
From previous studies, we know that the standard deviation is approximately 15 mmHg and is expected to be similar between the two groups.
We aim to detect a 5 mmHg difference in systolic blood pressure between the treatment and control groups.
These are the steps for the power simulation:
Step 1: Define the Hypothesis and Identify the Test Statistic:
Null hypothesis (\(H_0\)): \(\mu_1 - \mu_2 = 0\) (no difference in means)
Alternative hypothesis (\(H_1\)): \(\mu_1 - \mu_2 \neq 0\) (there is a difference)
Step 2: Define the data-generating process:
Population means: \(\mu_1 = 120\), \(\mu_2 = 125\)
Common standard deviation: \(\sigma = 15\)
Sample size per group: \(n = 50\)
Data are generated from normal distributions
Step 3: We simulate data
We simulate one dataset and store it in a data.frame.
The R code is as follows:
set.seed(111)
# Parameters
n <- 50
mu1 <- 120
mu2 <- 125
sd <- 15
# Simulate data for both groups
group1 <- rnorm(n, mu1, sd)
group2 <- rnorm(n, mu2, sd)
# Create data frame
dat <- data.frame(BP = c(group1, group2),
treatment = rep(c("Yes", "No"), each = n))
dat[1:6, ]
BP treatment
1 123.52831 Yes
2 115.03896 Yes
3 115.32564 Yes
4 85.46482 Yes
5 117.43686 Yes
6 122.10417 Yes
Step 4: Conduct two-sample t-test and extract p-value
# two-sample t-test and return p-value
test_result <- t.test(BP ~ treatment, var.equal = TRUE,
conf.level = 0.95, data = dat)
test_result
Two Sample t-test
data: BP by treatment
t = 3.2839, df = 98, p-value = 0.00142
alternative hypothesis: true difference in means between group No and group Yes is not equal to 0
95 percent confidence interval:
4.13435 16.76217
sample estimates:
mean in group No mean in group Yes
127.5318 117.0836
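The p-value can be extracted from the returned test object:
test_result$p.value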
[1] 0.001420294
Step 5 & 6: Repeat the Process and Evaluate Significance
We repeat this simulation 1000 times and store the resulting p-values.
#set seed
set.seed(111)
# Set parameters
n <- 50 # sample size for each group
mu1 <- 120 # mean of group 1
mu2 <- 125 # mean of group 2 (5 mmHg difference)
sd <- 15 # standard deviation of both groups
# Number of simulations
nsim <- 1000
#initiate an empty vector for p-values
p_values <- rep(NA, nsim)
for(i in 1:nsim) {
# Generate two random samples from normal distributions
group1 <- rnorm(n, mu1, sd)
group2 <- rnorm(n, mu2, sd)
# Create a data frame
dat <- data.frame(BP = c(group1, group2), treatment = rep(c("Yes", "No"), each = n))
# two-sample t-test and return p-value
test_result <- t.test(BP ~ treatment, var.equal = TRUE,
conf.level = 0.95, data = dat)
#extract p-value
p_values[i] <- test_result$p.value
}
Step 7: Estimate power
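Following the earlier examples, the estimated power is the proportion of simulated p-values below 0.05:
# Estimate power as the proportion of p-values less than 0.05
mean(p_values < 0.05)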
Let’s create a reusable function that simulates the power of a two-sample t-test, given input parameters such as sample size, effect size, standard deviation, and significance level.
This function makes it easy to explore how power changes across different scenarios.
To make the function more flexible, we include extra options:
1- Use the `alpha` argument when calculating power (instead of a hardcoded 0.05).
2- Include optional reproducibility with set.seed().
3- Add the option var.equal.
# Function to simulate power for a two-sample t-test
sample_t_test_power <- function(n, mu1 = 0, mu2 = 0, sd = 1,
alpha = 0.05, var.equal = TRUE,
nsim = 1000, seed = NULL) {
# Set seed for reproducibility
if (!is.null(seed)) set.seed(seed)
# Preallocate vector to store p-values
p_values <- rep(NA, nsim)
# Run simulation
for (i in 1:nsim) {
group1 <- rnorm(n, mu1, sd)
group2 <- rnorm(n, mu2, sd)
dat <- data.frame(BP = c(group1, group2),
treatment = rep(c("Yes", "No"), each = n))
result <- t.test(BP ~ treatment, data = dat,
var.equal = var.equal, conf.level = 1 - alpha)
p_values[i] <- result$p.value
}
# Return estimated power
mean(p_values < alpha)
}
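For example, the earlier scenario (a 5 mmHg difference with \(\sigma = 15\) and \(n = 50\) per group) could be run as:
# Estimated power for a 5 mmHg difference, n = 50 per group
sample_t_test_power(n = 50, mu1 = 120, mu2 = 125, sd = 15,
                    alpha = 0.05, nsim = 1000, seed = 111)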
Using `replicate()` (optional)
# Function to simulate a two-sample t-test and return p-value
sample_t_test_p_value <- function(n, mu1 = 120, mu2 = 125, sd = 15) {
group1 <- rnorm(n, mu1, sd)
group2 <- rnorm(n, mu2, sd)
t.test(group1, group2, var.equal = TRUE)$p.value
}
# Simulation setup
set.seed(111)
n_sim <- 1000
n <- 50 # sample size per group
# Run simulations using replicate
p_values <- replicate(n_sim, sample_t_test_p_value(n))
# Estimate power as proportion of p < 0.05
power_estimate <- mean(p_values < 0.05)
power_estimate
[1] 0.402
# Plot the distribution of t statistics from the two-sample t-test
# Function to simulate two-sample t-test and return p-value
sample_t_test_t_value <- function(n, mu1 = 0, mu2 = 0, sd = 1) {
# Generate two random samples from normal distributions
group1 <- rnorm(n, mu1, sd)
group2 <- rnorm(n, mu2, sd)
# Conduct two-sample t-test
t_test_result <- t.test(group1, group2, alternative = "two.sided")
return(t_test_result$statistic)
}
# Set parameters
n <- 25 # sample size for each group
mu1 <- 120 # mean of group 1
mu2 <- 125 # mean of group 2 (5 mmHg difference)
sd <- 15 # standard deviation of both groups
# Number of simulations
n_sim <- 1000
# Simulate t statistics from the two-sample t-test
t_values_t_test <- replicate(n_sim, sample_t_test_t_value(n, mu1, mu2, sd))
hist(t_values_t_test , breaks = 30, main = "Distribution of T statistics from Two-Sample t-Test", xlab = "T", col = "lightblue")
Sample size is a key input in power simulations and must be set at the beginning of each simulation.
To determine the sample size required for a desired level of power, we run simulations across a range of sample sizes.
We then plot the estimated power for each sample size to visualize how power increases with sample size.
This approach helps identify the minimum sample size needed to achieve a target power (e.g., 0.8 or 0.9).
We can write a loop to simulate the power for different sample sizes, or apply the simulation function over a vector of sample sizes with `sapply()`.
# Let n be from 10 to 200 for each group
n_values <- 10:200
#use vectorized operation sapply
powers <- sapply(n_values, function(n) sample_t_test_power(n = n, mu1 = 120, mu2 = 125, sd = 15, nsim = 1000))
plot(powers ~ n_values, xlab = "Sample size", ylab = "Power")
abline(h = 0.8, col = "red", lty = 2)
In the two-sample t-test example, we generated data from two normal distributions and compared group means.
In this example, we simulate the same comparison using a linear regression model.
When the outcome is continuous and the exposure is binary (e.g., treatment vs. control), linear regression can be used to estimate the difference in means.
The model is specified as:
\[ Y_i = \beta_0 + \beta_1 X_i + \epsilon_i \]
Where:
\(Y_i\) is the outcome variable
\(X_i\) is a binary exposure variable
\(\beta_0\) is the intercept
\(\beta_1\) is the coefficient for the exposure variable
\(\epsilon_i\) is the error term
In terms of \(\mu_1\) and \(\mu_2\), the above equation becomes:
\[ Y_i = \mu_1 + (\mu_2 - \mu_1) X_i + \epsilon_i \]
This is equivalent to
\[ Y_i = \mu_1(1 - X_i) + \mu_2 X_i + \epsilon_i\]
Where \(\mu_1\) is the mean for the control group and \(\mu_2\) is the mean for the treatment group.
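As a quick illustration (reusing the `dat` data frame simulated earlier), fitting this model with `lm()` reproduces the equal-variance two-sample comparison: the coefficient for the treatment indicator is the difference in group means, and its t-test matches the two-sample t-test.
# Fit the linear model; the treatment coefficient is the mean difference
fit <- lm(BP ~ treatment, data = dat)
summary(fit)$coefficients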
# Function to simulate data for linear regression and return p-value
power_lm_p_value <- function(n, mu1 = 0, mu2 = 0, sd = 1) {
# Generate binary exposure variable
X <- rbinom(n, 1, 0.5) # 50% chance of being in either group
# Generate outcome variable based on the linear model
Y <- mu1 * (1 - X) + mu2 * X + rnorm(n, 0, sd)
# Fit linear regression model
lm_result <- lm(Y ~ X)
#extract p-value for the exposure variable
p_value <- summary(lm_result)$coefficients["X", "Pr(>|t|)"]
return(p_value) # p-value for the exposure variable
}
# Set parameters
n <- 300 # total sample size (both groups combined)
mu1 <- 120 # mean of group 1
mu2 <- 125 # mean of group 2 (5 mmHg difference)
sd <- 15 # standard deviation of both groups
# Number of simulations
n_sim <- 1000
# Simulate p-values for linear regression
p_values_lm <- replicate(n_sim, power_lm_p_value(n, mu1, mu2, sd))
# Calculate power as the proportion of p-values less than 0.05
power_lm <- mean(p_values_lm < 0.05)
power_lm
[1] 0.822
# Function with a covariate (e.g., weight)
power_lm_p_value_cov <- function(n, mu1 = 0, mu2 = 0, b1 = 0, sd = 1) {
X <- rbinom(n, 1, 0.5)
w <- rnorm(n, 170, 20)
Y <- mu1 * (1 - X) + mu2 * X + b1 * w + rnorm(n, 0, sd)
lm_result <- lm(Y ~ X + w)
summary(lm_result)$coefficients["X", "Pr(>|t|)"]
}
# Parameters
b1 <- 5
p_values_cov <- replicate(n_sim, power_lm_p_value_cov(n, mu1, mu2, b1, sd))
power_lm_cov <- mean(p_values_cov < 0.05)
power_lm_cov
[1] 0.822
In this section, we extend power simulation methods to more advanced scenarios, including:
Paired t-test: Simulating power for detecting a mean difference in paired data (e.g., before vs. after treatment).
Logistic Regression with a Binary Predictor: Estimating power to detect an association between a binary exposure and a binary outcome.
Poisson Regression: Estimating power for count data modeled with Poisson regression (e.g., number of events per time unit).
The paired t-test is used when measurements are taken on the same subject before and after a treatment, or when observations are naturally paired (e.g., left vs. right eye).
To simulate paired data, we need to account for the correlation between paired observations, which often requires simulating from a multivariate normal distribution.
This makes the simulation slightly more complex than for independent samples, but still manageable using R functions such as `MASS::mvrnorm()`.
# Simulate power for a paired t-test using multivariate normal data
# Requires the MASS package for mvrnorm()
library(MASS)
power_paired_ttest <- function(mu1 = 0, mu2 = 1, # Means of the two conditions
sigma1 = 1, sigma2 = 1, # Standard deviations
rho = 0.5, # Correlation between paired values
alpha = 0.05, # Significance level
n = 20, # Number of paired observations
nsim = 1000, # Number of simulations
varequal = TRUE, # Whether variances are assumed equal (for completeness)
seed = 12345) {
# Compute the covariance from correlation and standard deviations
cov12 <- rho * sigma1 * sigma2
# Create covariance matrix
Sigma <- matrix(c(sigma1^2, cov12,
cov12, sigma2^2), nrow = 2)
# Set random seed for reproducibility
set.seed(seed)
# Initialize vector to store whether null was rejected
reject <- logical(nsim)
for (i in seq_len(nsim)) {
# Simulate n paired observations from bivariate normal
y <- mvrnorm(n, mu = c(mu1, mu2), Sigma = Sigma)
# Paired t-test
p <- t.test(y[, 1], y[, 2], paired = TRUE)$p.value
# Store whether the null hypothesis is rejected
reject[i] <- (p < alpha)
}
# Calculate power
power <- mean(reject)
cat("Estimated power:", round(power, 4), "\n")
return(power)
}
# Example usage
power_paired_ttest(mu1 = 0, mu2 = 1, sigma1 = 1, sigma2 = 1, rho = 0.5, nsim = 10000)
Estimated power: 0.9892
[1] 0.9892
Estimated power: 0.85
[1] 0.85
We simulate binary outcome data using the logistic regression model:
\[ \log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X \]
Where:
\(p\) is the probability that the binary outcome equals 1
\(\beta_0\) is the intercept on the log-odds (logit) scale
\(\beta_1\) is the effect size: the log odds ratio for the binary predictor \(X\)
To estimate power:
# Function to estimate power for logistic regression
power_logistic <- function(p0 = 0.3, # probability of success when x = 0
b0 = 0, # intercept (logit scale)
b1 = 1, # effect size (log odds ratio)
n = 40, # sample size per simulation
alpha = 0.05, # significance level
nsim = 1000) { # number of simulations
# Create a large synthetic population
popsize <- 1e6
pop <- data.frame(x = rep(0:1, popsize / 2))
# Compute log-odds and convert to probabilities
pop$logodds <- b0 + b1 * pop$x
pop$p <- exp(pop$logodds)/(1+exp(pop$logodds)) # logistic transformation
# Simulate binary outcome from Bernoulli trials
pop$y <- rbinom(popsize, size = 1, prob = pop$p)
# Print average probabilities for x = 0 and x = 1
cat("Mean p for x=0:", mean(pop$p[pop$x == 0]), "\n")
cat("Mean p for x=1:", mean(pop$p[pop$x == 1]), "\n")
# Initialize storage for results
reject <- numeric(nsim)
b1_sim <- numeric(nsim)
# Simulation loop
for (i in 1:nsim) {
samp <- pop[sample(1:popsize, size = n), ] # draw sample
fit <- glm(y ~ x, data = samp, family = binomial) # fit model
b1_sim[i] <- coef(fit)["x"] # store estimated coefficient
# store whether null was rejected
reject[i] <- summary(fit)$coefficients["x", "Pr(>|z|)"] < alpha
}
# Compute and print estimated power and average coefficient
power <- mean(reject)
cat("Estimated power:", round(power, 4), "\n")
cat("Average estimated b1:", round(mean(b1_sim), 3), "\n")
}
# Run the power simulation
power_logistic(nsim = 1000)
Mean p for x=0: 0.5
Mean p for x=1: 0.7310586
Estimated power: 0.31
Average estimated b1: 1.161
Mean p for x=0: 0.5
Mean p for x=1: 0.7310586
Estimated power: 0.638
Average estimated b1: 1.011
We simulate count data using a Poisson regression model:
\[ \log(\lambda_i) = \beta_0 + \beta_1 X_i \]
Where:
\(\lambda_i\) is the expected count for observation \(i\)
\(\beta_0\) is the intercept on the log scale
\(\beta_1\) is the effect size: the log rate ratio for the binary predictor \(X_i\)
power_poisson <- function(b0 = 0, b1 = 1,
n = 40, alpha = 0.05, nsim = 1000) {
# Step 1: Create a large synthetic population
popsize <- 1e6
pop <- data.frame(x = rep(0:1, popsize / 2)) # Binary predictor (0 or 1)
# Step 2: Define expected counts (lambda) based on Poisson regression model
pop$lambda <- exp(b0 + b1 * pop$x)
# Step 3: Generate Poisson-distributed outcomes for the entire population
pop$y <- rpois(popsize, lambda = pop$lambda)
# Print average counts by group (to show expected difference)
cat("Mean y for x = 0:", mean(pop$y[pop$x == 0]), "\n")
cat("Mean y for x = 1:", mean(pop$y[pop$x == 1]), "\n")
# Step 4: Initialize vectors to track results
reject <- rep(0, nsim) # 1 if null hypothesis is rejected
b1_sim <- rep(0, nsim) # Stores estimated b1 from each simulation
# Step 5: Run simulations
for (i in 1:nsim) {
# Sample n individuals from the population
samp <- pop[sample(1:popsize, size = n), ]
# Fit Poisson regression model to sampled data
m <- glm(y ~ x, data = samp, family = "poisson")
# Store estimated coefficient for x
b1_sim[i] <- coef(m)["x"]
# Check if p-value for x is less than alpha
reject[i] <- summary(m)$coefficients["x", 4] < alpha
}
# Step 6: Estimate power as proportion of simulations with significant result
power <- sum(reject) / nsim
cat("Power is", power, "\n")
# Also report average estimated effect size (b1)
cat("Mean b1 is", mean(b1_sim), "\n")
}
# Run the function with default parameters
power_poisson(nsim = 1000)
Mean y for x = 0: 0.998118
Mean y for x = 1: 2.721098
Power is 0.983
Mean b1 is 1.030355