A confidence interval is a range of values that is likely to contain a population parameter with a certain level of confidence.
This tutorial explains how to calculate the following confidence intervals in SAS:
1. Confidence Interval for a Population Mean
2. Confidence Interval for a Difference in Population Means
Let’s jump in!
Example 1: Confidence Interval for Population Mean in SAS
Suppose we have the following dataset that contains the height (in inches) of a random sample of 12 plants that all belong to the same species:
/*create dataset*/
data my_data;
input Height;
datalines;
14
14
16
13
12
17
15
14
15
13
15
14
;
run;
/*view dataset*/
proc print data=my_data;
run;
Suppose we would like to calculate a 95% confidence interval for the true population mean height of this species.
We can use the following code in SAS to do so:
/*generate 95% confidence interval for population mean*/
proc ttest data=my_data alpha=0.05;
var Height;
run;
The value for Mean shows the sample mean and the values under 95% CL Mean show the 95% confidence interval for the population mean.
From the output we can see that the 95% confidence interval for the mean height of plants in this population is [13.4624 inches, 15.2042 inches].
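As a cross-check, the same interval can be reproduced by hand with the usual formula: sample mean ± t critical value × (standard deviation / √n), using n − 1 degrees of freedom. The following is a minimal sketch, not part of the proc ttest output. It assumes the my_data dataset created above, and the names stats, ci_check, n_obs, avg, sd, t_crit, margin, lower, and upper are illustrative choices.

```sas
/*summarize the sample: size, mean, and standard deviation*/
proc means data=my_data noprint;
var Height;
output out=stats n=n_obs mean=avg std=sd;
run;

/*apply mean +/- t * sd/sqrt(n) with n-1 degrees of freedom*/
data ci_check;
set stats;
t_crit = tinv(0.975, n_obs - 1); /*two-sided 95% critical value*/
margin = t_crit * sd / sqrt(n_obs);
lower = avg - margin;
upper = avg + margin;
keep n_obs avg sd t_crit lower upper;
run;

/*view the manually calculated interval*/
proc print data=ci_check;
run;
```

The lower and upper values should agree with the 95% CL Mean values reported by proc ttest. For a different confidence level, both the alpha value in proc ttest and the 0.975 in tinv would need to change together (for example, alpha=0.10 and 0.95 for a 90% interval).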
Example 2: Confidence Interval for Difference in Population Means in SAS
Suppose we have the following dataset that contains the height (in inches) of a random sample of plants that belong to two different species:
/*create dataset*/
data my_data2;
input Species $ Height;
datalines;
A 14
A 14
A 16
A 13
A 12
A 17
A 15
A 14
A 15
A 13
B 15
B 14
B 19
B 19
B 17
B 18
B 20
B 19
B 17
B 15
;
run;
/*view dataset*/
proc print data=my_data2;
run;
Suppose we would like to calculate a 95% confidence interval for the difference in population mean height between species A and species B.
We can use the following code in SAS to do so:
/*sort data by Species (optional: the class statement in proc ttest does not require sorted data)*/
proc sort data=my_data2;
by Species;
run;
/*generate 95% confidence interval for difference in population means*/
proc ttest data=my_data2 alpha=0.05;
class Species;
var Height;
run;
The first table we need to look at in the output is Equality of Variances, which tests whether the variances of the two groups are equal.
Since the p-value in this table is not less than .05, we do not have sufficient evidence that the variances differ, so we can proceed as if the two groups have equal variances.
Thus, we can look at the row that uses the Pooled variance to find the 95% confidence interval for the difference in population means.
From the output we can see that the 95% confidence interval for the difference in population means (species A minus species B) is [-4.6895 inches, -1.3105 inches].
This tells us we can be 95% confident that the true difference between the mean height of plants in species A and the mean height of plants in species B is between -4.6895 inches and -1.3105 inches.
Since 0 is not in this confidence interval, this indicates that there is a statistically significant difference between the two population means.
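To see where the pooled interval comes from, the calculation can be worked by hand from the data above. This is an illustrative sketch rather than SAS output: the sums of squared deviations (20.1 for species A, 38.1 for species B) are computed directly from the 20 listed heights, and 2.1009 is the standard t critical value for a two-sided 95% interval with 18 degrees of freedom.

```latex
\begin{aligned}
\bar{x}_A &= 14.3, \qquad \bar{x}_B = 17.3, \qquad n_A = n_B = 10 \\
s_A^2 &= \frac{20.1}{9} \approx 2.2333, \qquad s_B^2 = \frac{38.1}{9} \approx 4.2333 \\
s_p^2 &= \frac{(n_A - 1)\,s_A^2 + (n_B - 1)\,s_B^2}{n_A + n_B - 2}
       = \frac{20.1 + 38.1}{18} \approx 3.2333 \\
\mathrm{SE} &= \sqrt{s_p^2 \left( \frac{1}{n_A} + \frac{1}{n_B} \right)}
       = \sqrt{3.2333 \times 0.2} \approx 0.8042 \\
\mathrm{CI} &= (\bar{x}_A - \bar{x}_B) \pm t_{0.975,\,18} \cdot \mathrm{SE}
       = -3.0 \pm 2.1009 \times 0.8042 \approx [-4.6895,\; -1.3105]
\end{aligned}
```

The interval is centered on the observed difference of -3.0 inches, and its sign is negative because species A is subtracted first and has the smaller sample mean.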
Additional Resources
The following tutorials explain how to perform other common tasks in SAS:
How to Perform a One Sample t-Test in SAS
How to Perform a Two Sample t-Test in SAS
How to Perform a Paired Samples t-Test in SAS