6.3: Parametric Hypothesis Tests

Open In Colab  


Hypothesis tests and confidence intervals are two examples of statistical inference. There is an unknown population parameter, and we would like estimate or assess claims about the parameter. We collect sample data, and:

We can apply Monte Carlo resampling (bootstrapping) methods and/or parametric methods (using the Central Limit Theorem) to model a sampling distribution that is the foundation for constructing a confidence interval. Similarly, for hypothesis tests we have both resampling and parametric methods for measuring the significance of test statistics.

What is Significant Enough?


The general process form performing a hypothesis test is informally:

  1. State the null and alternative hypotheses in terms of population parameter(s).
  2. Collect data from a sample and calculate a test statistic.
  3. Calculate the p-value to measure the significance of the test statistic.
  4. Make a conclusion (if possible).
  5. Clearly summarize the results for a general audience.

We have discussed Steps 1 and 2 and used resampling methods as one method to calculate p-values in Step 3. Refer to Appendix A for a summary of the steps outlined above. Before investigating parametric methods for computing p-values, let’s discuss steps and 4 and 5:

How do we use p-values to decide whether the evidence is significant enough to reject \(H_0\) and accept the claim we hoped to prove in \(H_a\)?

  • The smaller the p-value, the stronger the evidence contradicting \(H_0\).
  • How small does the p-value need to be in order to claim the evidence is strong enough to reject \(H_0\)?

The Significance Level


The significance level of a test, denoted \({\color{dodgerblue}{\alpha}}\), is the value we choose that is used to determine whether the p-value is small enough to claim the result is statistically significant and reject \(H_0\).

  • If p-value \(\leq \alpha\), we reject \(H_0\).
  • If p-value \(> \alpha\), we do not reject \(H_0\).

Generally speaking, \(H_0\) is a claim we currently accept as true. \(H_a\) is some new and interesting result that if true would contradict the currently accepted belief in \(H_0\). We typically require compelling evidence, beyond a “reasonable doubt”, to reject the currently accepted claim in \(H_0\) in favor of a new and competing claim in \(H_a\).

  • The default significance level is typically 5% (or \(\alpha = 0.05\)).
  • Some other (less) common significance levels are \(\alpha = 0.1\), \(0.01\) or \(0.001\).
  • The smaller we set \(\alpha\), the more certainty we require to reject \(H_0\).
Note

The significance level is not something we compute. We choose the significance level for the test, and the significance level should be determined prior to our analysis. Do not first calculate the p-value, and then retroactively choose the significance level to ensure the result is significant.

Summarizing the Results


There are two possible results with hypothesis tests:

  • If p-value \(\leq \alpha\), the test is statistically significant.

    • There is strong enough evidence to reject \(H_0\).
    • And thus, we accept the competing claim in \(H_a\).
  • If p-value \(> \alpha\), the test is not statistically significant and there is not sufficient evidence to reject \(H_0\):

    • We fail to reject \(H_0\) (which is different from accepting \(H_0\)).
    • The test is inconclusive regarding the claims in \(H_0\) and \(H_a\).

In the end, we want to be sure we communicate the results clearly, in proper context, to a more general audience that may not have an advanced background in statistics and mathematics.

Question 1


Tropical Cyclone Structure

Credit: Kelvinsong, CC BY 3.0, via Wikimedia Commons

Pressure is a common measurement used to characterize the strength of a storm. The lower the storm pressure, the higher the wind speeds, and the more dangerous the storm. Let \(\mu\) denote the mean pressure (in millibars) of all storms in the North Atlantic. Suppose we set up the following hypothesis to test claims about the value of \(\mu\).

  • \(H_0\): \(\mu = 950\). The mean pressure of all storms in the North Atlantic is 950 millibars.
  • \(H_a\): \(\mu \ne 950\). The mean pressure of all storms in the North Atlantic is not 950 millibars.

We collect a random sample of \(n\) storm pressure observations with a sample mean larger than 950 millibars that has a p-value equal to \(0.012\).

Question 1a


Summarize the results if we perform the hypothesis test using a 5% significance level. Be sure to explain in the context of the example using terminology a more general audience would understand.

Solution to Question 1a






Question 1b


Summarize the results if we perform the hypothesis test using a 10% significance level. Be sure to explain in the context of the example using terminology a more general audience would understand.

Solution to Question 1b






Question 1c


Summarize the results if we perform the hypothesis test using a 1% significance level. Be sure to explain in the context of the example using terminology a more general audience would understand.

Solution to Question 1c






Question 1d


Suppose we instead we want to show the mean storm pressure is greater than 950 millibars.

  • \(H_0\): \(\mu = 950\). The mean pressure of all storms in the North Atlantic is 950 millibars.
  • \(H_a\): \(\mu > 950\). The mean pressure of all storms in the North Atlantic is greater than 950 millibars.

Assume we still have the same sample of size \(n\) with the same sample mean (which is greater than 950 millibars) as in Question 1. Recall this sample has a p-value equal to \(0.012\) for a two-tailed test. Using the same sample, we now test the one-tailed hypotheses instead.

  • What would be the p-value for this same sample be if we use this one-tailed test instead?
  • Summarize the result of the one-tailed test in practical terms if we use a significance level of 5%.
  • Summarize the result of the one-tailed test in practical terms if we use a significance level of 1%.

Solution to Question 1d






Test For a Single Proportion


Question 2


Paul the Oracle Octopus

Credit: Mbz1, CC BY-SA 3.0, via Wikimedia Commons

In the 2010 World Cup in Germany, Paul the octopus (aka the Oracle Octopus) correctly predicted the correct outcome in all 8 of the matches he predicted. Paul beat out his rival Mani, Singapore’s psychic parakeet, who predicted six matches in a row before missing a prediction. Below is an excerpt from a recent article1 about animal clairvoyance and the World Cup.

No FIFA World Cup would be complete without “psychic” animals predicting the winners, and Qatar 2022 has been no exception. From “clairvoyant” camels to “mystic” elephants and “cryptic” rats, a range of animals – big and small – have tried their paws, hooves and tentacles at predicting the score line.

It all started with Paul, the “psychic” octopus. The eight-tentacled icon put TV pundits to shame with an incredible string of correct World Cup winner predictions from his glass tank at the Aquarium Sea Life Centre in Oberhausen, Germany. The tentacled tipster had an incredible success rate: he correctly predicted eight world cup matches at South Africa’s tournament in 2010, including Spain beating the Netherlands in the World Cup final.

In 2010, Paul the Octopus “predicted” 8 matches and made 8 correct predictions. Is this evidence that Paul actually has psychic powers?

Question 2a


Set up the null and alternative hypotheses in terms of the proportion of all World Cup matches ever played that Paul correctly predicts. State hypotheses both in words and using appropriate notation.

Solution to Question 2a







Question 2b


How unusual would this be if Paul was just randomly guessing? Compute the p-value corresponding to this sample of \(n=8\) predictions all of which are correct.

Solution to Question 2b







Question 2c


Summarize the result of the test to a general audience if a 5% significance level is chosen.

Solution to Question 2c







p-value for a Single Proportion


Let \(X\) be a binomial random variable and let \(p_0\) denote the value of \(p\) claimed in \(H_0\). If we observe \(X=x\) successes out of a sample of \(n\) independent and identical trials, then we can find the p-value using the binomial null distribution \({\color{dodgerblue}{X \sim \mbox{Binom}(n, {\color{tomato}{p_0}})}}\).

Test for a Single Mean: Known \(\sigma^2\)


Question 3


The mean height of all adult males in the United Kingdom is claimed2 to be \(68.5\) inches (\(5\) foot \(8.5\) inches or \(173.9\) cm) with a standard deviation of \(2.5\) inches (or \(6.35\) cm). A physician suspects males in her town seem to be taller than average when compared to the population of all adult males in the UK. She collects data from a random sample of \(n=25\) adult males from the town and calculates the mean height of the sample is \(69.25\) inches.

Question 3a


Set up the null and alternative hypotheses (both in words and using appropriate notation) to test the physician’s claim that adult males in the town are taller than the national average height for all adult males in the UK.

Solution to Question 3a


  • \(H_0\): ??

  • \(H_a\): ??




Question 3b


Compute the test statistic.

Solution to Question 3b







Question 3c


What is a reasonable null distribution to use to perform this test? Standardize the test statistic (give the \(z\)-score) from Question 3b. Interpret the meaning of the standardized test statistic.

Solution to Question 3c







Question 3d


Compute the p-value and interpret the meaning in practical terms.

Solution to Question 3d







Question 3e


Shade area(s) under the graph of a null distribution corresponding to the p-value. Either make an informal sketch on paper or see Appendix B to plot in R.

Solution to Question 3e







Question 3f


  • If a 5% significance level is chosen, summarize the result in practical terms.
  • If a 10% significance level is chosen, summarize the result in practical terms.

Solution to Question 3f







p-value for a Single Mean: Known \(\sigma^2\)


Suppose a random sample size \(n\) is picked from a population with known population variance \(\sigma^2\) but unknown mean \(\mu\). If we are doing a hypothesis test on a single mean with null claim \({\color{tomato}{H_0: \mu = \mu_0}}\), then as long as the population is symmetric or the sample size is large enough \((n \geq 30)\), we can use the Central Limit Theorem for means to:

  • Model the null distribution with the sampling distribution \({\color{dodgerblue}{\overline{X} \sim N \left( {\color{tomato}{\mu_0}}, \frac{\sigma}{\sqrt{n}} \right)}}\).
  • Calculate the standardized test statistic which is the \(z\)-score of the sample mean:

\[{\color{dodgerblue}{z = \frac{\mbox{sample stat}- {\color{tomato}{\mbox{null claim}}}}{\mbox{SE}(\overline{X})} = \dfrac{\bar{x} - {\color{tomato}{\mu_0}}}{\frac{\sigma}{\sqrt{n}}}}}.\]

Test for a Single Mean: Unknown \(\sigma^2\)


As with confidence intervals, when estimating an unknown population mean, we often do not know the population variance. Nevertheless, we can still conduct a hypothesis test on a single mean, but there will be some additional uncertainty due to our need to estimate \(\sigma^2\). Below we work through an example using the storms data frame in the dplyr package to devise a method for computing p-values under these circumstances.

Picking a Random Sample of Storm Pressures


The storms data set is from the NOAA Hurricane Best Track Data. We will perform a hypothesis test to test claims about the mean storm pressure, so we will need to analyze the variable pressure.

  • Run the code cell below to load the dplyr package (which should already be installed).
library(dplyr)  # load dplyr package
  • Run the code cell below to pick a random sample of \(n=32\) storm pressures from storms.
my.sample <- sample(storms$pressure, size=32, replace=FALSE)

Question 4


It is claimed3 that the average pressure of all North Atlantic storms is 950 millibars. You believe this claim is inaccurate and would like to show the average pressure of all storms is not 950 millibars.

Question 4a


Set up the null and alternative hypotheses both in words and using appropriate notation.

Solution to Question 4a


  • \(H_0\): ??

  • \(H_a\): ??




Question 4b


Compute the test statistic.

Solution to Question 4b







Question 4c


What is a reasonable standardized null distribution to use to perform this test? Standardize the test statistic from Question 4b and interpret its meaning.

Solution to Question 4c







Question 4d


Compute the p-value and interpret the meaning in practical terms.

Solution to Question 4d







Question 4e


Shade area(s) under the graph of a null distribution corresponding to the p-value in Question 4d. Either make an informal sketch on paper or see Appendix B to plot in R.

Solution to Question 4e







Question 4f


If a 5% significance level is chosen, summarize the result in practical terms.

Solution to Question 4f







p-value for a Single Mean: Unknown \(\sigma^2\)


Suppose a random sample size \(n\) is picked from a population with unknown population mean and variance. If we are doing a hypothesis test on a single mean with null claim \(H_0: \mu = \mu_0\), then as long as the population is symmetric or the sample size is large enough \((n \geq 30)\):

  • The standardized test statistic is called the t-test statistic:

\[{\large \boxed{{\color{dodgerblue}{{\color{tomato}{t}} = \frac{\mbox{sample stat}-\mbox{null claim}}{\mbox{SE}(\overline{X})} = \dfrac{\bar{x} - \mu_0}{\frac{{\color{tomato}{s}}}{\sqrt{n}}}}}}}.\]

  • The null distribution is the distribution of t-test statistics that we model using a \(t\)-distribution with \(n-1\) degrees of freedom.

In R, we can use the command t.test(x, mu = [null], alt = [direction]).

  • Sample data is stored in the vector x.
  • Set the option mu equal to the value, \(\mu_0\), claimed in \(H_0\).
  • Set the option alt based on the inequality used in \(H_a\).
    • Use "greater" for right-tailed test.
    • Use "less" for left-tailed test.
    • Use "two.sided" for a two-tailed test.
    • If you do not indicate any alt option, the default is a two-tailed test.

Question 5


Check your results for the hypothesis test in Question 4 using the t.test() function.

Solution to Question 5


Fill in the options for the t.test() function in the code cell below.

t.test(??)




Question 6


The output of t.test() gives both a p-value and a 95% confidence interval (by default). Let’s interpret the confidence interval and see if we obtain a result that is consistent with our summary in Question 4f.

Question 6a


Based on the output of your code in Question 5, what is a 95% confidence interval for the mean pressure of all North Atlantic storms?

Solution to Question 6a


A 95% confidence interval for the mean pressure of all storms is from ?? millibars to ?? millibars.



Question 6b


Based on the 95% confidence interval Question 6a, is 950 millibars (the null claim for \(\mu\) in \(H_0\)) a plausible estimate for \(\mu\)? Is this consistent with your answer in Question 4f? Explain why or why not.

Solution to Question 6b





Connection Between CI’s and Two-Tailed Tests


If we are performing a two-tailed hypothesis test using a significance level \(\alpha=0.05\), then we can reject the null hypothesis if either:

  • The \(p\)-value of our sample is less than or equal to \(\alpha=0.05\), or
  • The value \(\mu_0\) claimed in \(H_0\) is NOT inside a 95% confidence interval.
  • The two statements above are equivalent to one another. We do not need to check both!

We can adjust the statements above for a hypothesis test performed at other significance levels. For example, if we are conducting a two-tailed test at a 1% significance level, we can use 99% confidence interval instead of calculating a p-value.

Caution

Confidence intervals include the middle 95% of samples by excluding the most extreme values in both tails, each with area \(\frac{\alpha}{2}\). If we are performing a one-tailed test, then we only focus on area in one tail of the null distribution and if the area in one tail is less than or equal to \(\alpha\). Thus, we cannot interpret confidence intervals in the same fashion for one-tailed tests.

Appendix A: Summary of Hypothesis Tests


  1. State the hypotheses and identify (from the alternative claim in \(H_a\)) if it is a one or two-tailed test.

    • \(H_0\) is the “boring” claim. Express using an equal sign \(=\).
    • \(H_a\) is the claim we want to show is likely true. Use inequality sign (\(>\), \(<\), or \(\ne\)).
    • State both \(H_0\) and \(H_a\) in terms of population parameters such as \(\mu\) and \(p\).
  2. Compute the test statistic.

    • If the observed sample contradicts the null claim, the result is significant.
    • A standardized test statistic measures how many SE’s the observed stat is from the null claim.
    • A standardized test statistic with a large absolute value is supporting evidence to reject \(H_0\).
  3. Using the null distribution, compute the p-value. The p-value is the probability of getting a sample with a test statistic as or more extreme than the observed sample assuming \(H_0\) is true.

    • The p-value is the area in one or both tails beyond the test statistic.
    • The p-value is a probability, so we have \(0 < \mbox{p-value} < 1\).
    • The smaller the p-value, the stronger the evidence to reject \(H_0\).
  4. Based on the significance level, \(\alpha\), make a decision to reject or not reject the null hypothesis

    • If p-value \(\leq \alpha\), we reject \(H_0\).
    • If p-value \(> \alpha\), we do not reject \(H_0\).
  5. Summarize the results in practical terms, in the context of the example.

    • If we reject \(H_0\), this means there is enough evidence to support the claim in \(H_a\).
    • If we do not reject \(H_0\), this means there is not evidence to reject \(H_0\) nor support \(H_a\). The test is inconclusive.

Appendix B: Illustrating p-values in R


The plots below require the package ggplot2 that is loaded in the code cell below. Be sure to first run the code cell below to load ggplot2 before running any of the code cells that follow.

Warning

If you receive an error running the code cells below, it is possible you have not installed the ggplot2 package. Run the command install.packages("ggplot2") in the R console and try running the code cells below again.

library(ggplot2)

Illustrating p-values: Known Population Variance


If we are performing a hypothesis test on a single mean for a population whose variance is known, then we can either use:

  • The null distribution is \({\color{dodgerblue}{\overline{X} \sim N \left( \mu_0, \frac{\sigma}{\sqrt{n}} \right)}}\) with test statistic is \(\bar{x}\), or
  • The standardized normal distribution \(Z \sim N(0,1)\) with standardized test statistic

\[{\color{dodgerblue}{z = \dfrac{\bar{x} - \mu_0}{\frac{\sigma}{\sqrt{n}}}}}.\]

  • In the code cell below, enter values for the mean and standard error of the null distribution as well as the test statistic.
null.mean <- ??  # mean of the null distribution
null.se <- ??  # standard error of the null distribution
test.stat <- ??  # test statistic

Two-Tailed Test: Known Variance


To illustrate the p-value for a two-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined null.mean, null.se and test.stat.
  • Run the code cell below. No edits are needed.
################################################
# for a two-tailed test run the code cell below
################################################
x.diff <- abs(null.mean - test.stat)
lower.x <- null.mean - x.diff
upper.x <- null.mean + x.diff

end.diff <- max(x.diff, 4*null.se)
xmax <- null.mean + end.diff
xmin <- null.mean - end.diff


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(lower.x, upper.x)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, lower.x)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(upper.x, xmax)) +
  geom_vline(xintercept = c(lower.x, upper.x), linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=c(lower.x,  upper.x)) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Left-Tailed Test: Known Variance


To illustrate the p-value for a left-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined null.mean, null.se and test.stat.
  • Run the code cell below. No edits are needed.
################################################
# for a left-tailed test run the code cell below
################################################
xmin <- min(null.mean - 4*null.se, test.stat)
xmax <- null.mean + 4*null.se


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(test.stat, xmax)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, test.stat)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Right-Tailed Test: Known Variance


To illustrate the p-value for a right-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined null.mean, null.se and test.stat.
  • Run the code cell below. No edits are needed.
################################################
# for a right-tailed test run the code cell below
################################################
xmax <- max(null.mean + 4*null.se, test.stat)
xmin <- null.mean - 4*null.se


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = NA,
            xlim = c(xmin, test.stat)) +
  geom_area(stat = "function",   fun = dnorm,
            args = list(mean = null.mean, sd = null.se),
            color = "black", fill = "firebrick2",
            xlim = c(test.stat, xmax)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Illustrating p-values: Unknown Population Variance


If we are performing a hypothesis test on a single mean for a population whose variance is unknown, then we use \(\mathbf{t_{n-1}}\), a \(t\)-distribution with \(n-1\) degrees of freedoms, for the null distribution and have t-test statistic

\[{\color{dodgerblue}{{\color{tomato}{t}} = \dfrac{\bar{x} - \mu_0}{\frac{{\color{tomato}{s}}}{\sqrt{n}}}}}.\]

  • In the code cell below, enter the value of the t-test statistic and the degrees of freedom.
test.stat <- ??  # t-test statistic
deg.free <- ??  # degrees of freedom  

Two-Tailed Test: Unknown Variance


To illustrate the p-value for a two-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined test.stat and deg.free.
  • Run the code cell below. No edits are needed.
################################################
# for a two-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
test.stat <- abs(test.stat)
xmax <- max(end.t, test.stat)
xmin <- -xmax


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(-test.stat, test.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, -test.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(test.stat, xmax)) +
  geom_vline(xintercept = c(-test.stat, test.stat), linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=c(-test.stat,  test.stat)) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Left-Tailed Test: Unknown Variance


To illustrate the p-value for a left-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined test.stat and deg.free.
  • Run the code cell below. No edits are needed.
################################################
# for a left-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
xmin <- min(-end.t, test.stat)
xmax <- end.t


ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(test.stat, xmax)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(xmin, test.stat)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Right-Tailed Test: Unknown Variance


To illustrate the p-value for a right-tailed test:

  • Be sure you have already loaded the ggplot2 package and defined test.stat and deg.free.
  • Run the code cell below. No edits are needed.
################################################
# for a right-tailed test run the code cell below
################################################
v <- deg.free
end.t <- qt(0.9997, v)
xmax <- max(end.t, test.stat)
xmin <- -end.t

ggplot(NULL, aes(c(xmin, xmax))) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = NA,
            xlim = c(xmin, test.stat)) +
  geom_area(stat = "function",   fun = dt,
            args = list(df = v),
            color = "black", fill = "firebrick2",
            xlim = c(test.stat, xmax)) +
  geom_vline(xintercept = test.stat, linetype="dashed",
                color = "firebrick2", linewidth = 1) +
  labs(x = "test statistic", y = "") +
  scale_y_continuous(breaks = NULL) +
  scale_x_continuous(breaks=test.stat) +
  geom_hline(yintercept=0) +
  theme_bw() +
  theme(axis.text.x=element_text(size=15, color = "firebrick2"))

Creative Commons License

Statistical Methods: Exploring the Uncertain by Adam Spiegler is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.


  1. “The ‘psychic’ animals predicting who will win the World Cup”, ABC News, accessed July 13, 2023.↩︎

  2. “Height, Weight, and Body Mass of the British Population Since 1820” by Roderick Floud, National Bureau of of Economic Research, October 1998.↩︎

  3. The University of Arizona Hydrology and Atmospheric Sciences, accessed July 13, 2023.↩︎