Plotting distributions (ggplot2)

    Solution

    This sample data will be used for the examples below:
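
    A minimal sketch of such a data set, assuming only what the examples below rely on: a data frame named dat with a factor column cond and a numeric column rating. The seed, the group labels, the group sizes, and the distribution parameters are illustrative assumptions.

```r
library(ggplot2)

# Hypothetical sample data: two conditions with 200 ratings each.
# Seed, labels, sizes, and distribution parameters are assumptions.
set.seed(1234)
dat <- data.frame(
  cond   = factor(rep(c("A", "B"), each = 200)),
  rating = c(rnorm(200), rnorm(200, mean = 0.8))
)

# Inspect the first few rows
head(dat)
```

    Any data frame with the same two columns should work with the plotting calls that follow.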

    The qplot function is supposed to make the same graphs as ggplot, but with a simpler syntax. In practice, however, it’s often easier to just use ggplot, because the options for qplot can be more confusing to use.

        ## Basic histogram from the vector "rating". Each bin is .5 wide.
        ## These both result in the same output:
        ggplot(dat, aes(x=rating)) + geom_histogram(binwidth=.5)
        # qplot(dat$rating, binwidth=.5)

        # Draw with black outline, white fill
        ggplot(dat, aes(x=rating)) +
            geom_histogram(binwidth=.5, colour="black", fill="white")

        # Density curve
        ggplot(dat, aes(x=rating)) + geom_density()

        # Histogram overlaid with kernel density curve
        # (after_stat(density) needs ggplot2 >= 3.3.0; older code wrote ..density..)
        ggplot(dat, aes(x=rating)) +
            geom_histogram(aes(y=after_stat(density)),  # Histogram with density instead of count on y-axis
                           binwidth=.5,
                           colour="black", fill="white") +
            geom_density(alpha=.2, fill="#FF6666")  # Overlay with transparent density plot

    Add a line for the mean:
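
    One way to do this, sketched under the assumption that the mean is computed up front and passed to geom_vline as a fixed xintercept. The stand-in data and the name rating_mean are hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the sample data (values are illustrative)
set.seed(1234)
dat <- data.frame(rating = c(rnorm(200), rnorm(200, mean = 0.8)))

# Compute the overall mean once, ignoring any NA values
rating_mean <- mean(dat$rating, na.rm = TRUE)

# Histogram with a dashed red vertical line at the mean
p <- ggplot(dat, aes(x = rating)) +
  geom_histogram(binwidth = 0.5, colour = "black", fill = "white") +
  geom_vline(xintercept = rating_mean, colour = "red", linetype = "dashed")
p
```

    Computing the mean outside the plot keeps the value available for reuse, for example in a label or caption.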

    Histograms and density plots can also compare groups, here the levels of cond:

        # Overlaid histograms
        ggplot(dat, aes(x=rating, fill=cond)) +
            geom_histogram(binwidth=.5, alpha=.5, position="identity")

        # Interleaved histograms
        ggplot(dat, aes(x=rating, fill=cond)) +
            geom_histogram(binwidth=.5, position="dodge")

        # Density plots
        ggplot(dat, aes(x=rating, colour=cond)) + geom_density()

        # Density plots with semi-transparent fill
        ggplot(dat, aes(x=rating, fill=cond)) + geom_density(alpha=.3)

    Adding a line for each group mean requires first creating a separate data frame with the means:
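
    A sketch of that step, using the names cdat and rating.mean that the facet example on this page refers to. Building the summary with base R's aggregate is an assumption made here to avoid extra packages, and the stand-in data is hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the sample data (values are illustrative)
set.seed(1234)
dat <- data.frame(
  cond   = factor(rep(c("A", "B"), each = 200)),
  rating = c(rnorm(200), rnorm(200, mean = 0.8))
)

# One row per condition, holding that condition's mean rating
cdat <- aggregate(rating ~ cond, data = dat, FUN = mean)
names(cdat)[names(cdat) == "rating"] <- "rating.mean"

# Overlaid histograms with a dashed line at each group mean
p <- ggplot(dat, aes(x = rating, fill = cond)) +
  geom_histogram(binwidth = 0.5, alpha = 0.5, position = "identity") +
  geom_vline(data = cdat, aes(xintercept = rating.mean, colour = cond),
             linetype = "dashed")
p
```

    Passing data=cdat to geom_vline makes that layer draw one line per row of the summary instead of one per observation.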

    Using facets:

        ggplot(dat, aes(x=rating)) + geom_histogram(binwidth=.5, colour="black", fill="white") +
            facet_grid(cond ~ .)

        # With mean lines, using cdat from above
        # (linewidth replaces size for lines in ggplot2 >= 3.4.0)
        ggplot(dat, aes(x=rating)) + geom_histogram(binwidth=.5, colour="black", fill="white") +
            facet_grid(cond ~ .) +
            geom_vline(data=cdat, aes(xintercept=rating.mean),
                       linetype="dashed", linewidth=1, colour="red")

    See Facets (ggplot2) for more details.

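    It’s also possible to add the mean by using stat_summary, which needs a plot with both x and y aesthetics, such as a box plot of rating by cond:

        # Add a diamond at the mean, and make it larger
        # (the box plot base layer is assumed; fun replaces fun.y in ggplot2 >= 3.3.0)
        ggplot(dat, aes(x=cond, y=rating)) + geom_boxplot() +
            stat_summary(fun=mean, geom="point", shape=5, size=4)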