How to Create a Relative Frequency Histogram in R

by Erma Khan

A relative frequency histogram is a graph that displays the relative frequencies of values in a dataset.

This tutorial explains how to create a relative frequency histogram in R by using the histogram() function from the lattice package, which uses the following syntax:

histogram(x, type)

where:

  • x: data
  • type: the scale to use on the y-axis; options are 'percent', 'count', and 'density'.
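
To make the three options concrete, here is a minimal sketch that draws the same values with each type. The vector x is a hypothetical sample made up for illustration, not the dataset used in the rest of this tutorial.

```r
library(lattice)

# Hypothetical sample values, used only for illustration
x <- c(2, 3, 3, 5, 6, 6, 7, 8, 10, 12)

# Same data, three different y-axis scales
p_percent <- histogram(x, type = 'percent')  # percent of observations per bin
p_count   <- histogram(x, type = 'count')    # number of observations per bin
p_density <- histogram(x, type = 'density')  # bar areas sum to 1

# lattice plots are objects; print one to draw it
print(p_percent)
```

Only the y-axis scale changes between the three plots; the shape of the bars stays the same as long as the bins have equal width.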

Default Histogram

First, load the lattice package:

library(lattice)

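By default, the histogram() function creates a relative frequency histogram with percent of total along the y-axis:

#create data
#(the assignment of values to data is missing here; data is assumed to be a numeric vector of points per game)

#create relative frequency histogram
histogram(data)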

Relative frequency histogram in R
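
To see what the percentages on the y-axis represent, the relative frequencies can also be computed by hand. The sketch below uses base R's hist() with plot = FALSE on a hypothetical vector x (made up for illustration). Note that hist() may choose different bins than lattice does by default, so the numbers illustrate the calculation rather than reproduce the plot above.

```r
# Hypothetical sample values, used only for illustration
x <- c(2, 3, 3, 5, 6, 6, 7, 8, 10, 12)

# Bin the data without drawing anything
h <- hist(x, plot = FALSE)

# Relative frequency = bin count / total number of observations
rel_freq <- h$counts / sum(h$counts)

# Express as percentages, the unit lattice uses on the y-axis by default
pct <- 100 * rel_freq

# One row per bin: lower edge, upper edge, count, percent
freq_table <- data.frame(lower   = head(h$breaks, -1),
                         upper   = tail(h$breaks, -1),
                         count   = h$counts,
                         percent = pct)
print(freq_table)
```

The relative frequencies always sum to 1 (and the percentages to 100), which is what distinguishes this kind of histogram from one that shows raw counts.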

Modifying the Histogram

We can modify the histogram to include a title, different axis labels, and a different color using the following arguments:

  • main: the title
  • xlab: the x-axis label
  • ylab: the y-axis label
  • col: the fill color to use in the histogram

For example:

#modify the histogram
histogram(data,
          main='Points per Game by Player',
          xlab='Points per Game',
          col='steelblue')

Relative frequency histogram in R using lattice package
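
The example above leaves the y-axis label at its default. As a small sketch, here is the same call with ylab set as well; the vector x and the label text are hypothetical and only for illustration.

```r
library(lattice)

# Hypothetical sample values, used only for illustration
x <- c(2, 3, 3, 5, 6, 6, 7, 8, 10, 12)

# Set the title, both axis labels, and the fill color
p <- histogram(x,
               main = 'Points per Game by Player',
               xlab = 'Points per Game',
               ylab = 'Relative Frequency (%)',
               col  = 'steelblue')

# Draw the plot
print(p)
```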

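Modifying the Number of Bins

We can specify the number of bins to use in the histogram using the breaks argument. As with the hist() function in base R, a single number is treated as a suggestion, so the number of bins actually drawn can differ slightly: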

#modify the number of bins
histogram(data,
          main='Points per Game by Player',
          xlab='Points per Game',
          col='steelblue',
          breaks=15)

Relative frequency histogram with adjusted bins in R

The more bins you specify, the more granular the view of your data. Conversely, the fewer bins you specify, the more aggregated the data becomes:

#modify the number of bins
histogram(data,
          main='Points per Game by Player',
          xlab='Points per Game',
          col='steelblue',
          breaks=3)

Relative frequency histogram in R

Related: Use Sturges’ Rule to identify the optimal number of bins to use in a histogram.
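
As a rough sketch of how Sturges' Rule could feed into the breaks argument: the rule suggests ceiling(log2(n) + 1) bins for n observations, and base R provides it as nclass.Sturges(). The vector x below is hypothetical and only for illustration.

```r
library(lattice)

# Hypothetical sample values, used only for illustration
x <- c(2, 3, 3, 5, 6, 6, 7, 8, 10, 12)

# Sturges' Rule by hand: ceiling(log2(n) + 1)
n <- length(x)
k_manual <- ceiling(log2(n) + 1)

# The same rule via the built-in helper from grDevices
k_builtin <- nclass.Sturges(x)

# Use the suggested number of bins in the histogram
p <- histogram(x,
               xlab = 'Points per Game',
               col = 'steelblue',
               breaks = k_manual)
print(p)
```

Sturges' Rule is only one heuristic; it is a reasonable starting point, and the bin count can still be adjusted by eye afterwards.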
