Data Visualization

Leykun Getaneh (MSc)

NDMC, EPHI


July 21 - 25, 2025

What is data visualization?

  • Data visualization is the presentation of data in a pictorial or graphical format.
  • A data visualization tool is the software that generates this presentation.
  • Effective data visualization gives users intuitive means to
    • interactively explore and analyze data,
    • identify interesting patterns,
    • infer correlations and causalities, and
    • support sense-making activities.
  • Good visual presentation tends to enhance the message of the visualization.

What is data visualization?

  • What are the key principles, methods, and concepts required to visualize data for publications, reports, or presentations?

  • The effectiveness of data visualization depends on several factors:

  • What would you like to communicate?

  • Who is your audience? Researchers? Journalists? General public? Grant reviewers?

  • What is the best way to represent your data and your message?

    • Is it through a box plot?
    • Should you use blue or red?
    • What scale should you use?
    • Should you add or should you remove information?

Important packages to create figures

A few packages to create figures in R are

  • ggplot2 for the grammar of graphics
  • cowplot for composing ggplots
  • ggtext for advanced text rendering
  • ggthemes for additional themes
  • grid for creating graphical objects
  • gridExtra for additional functions for grid graphics
  • patchwork for multi-panel plots
  • ggiraph for interactive visualizations
  • highcharter for interactive visualizations
  • plotly for interactive visualizations

The basic components of a plot using the ggplot2 package

  • ggplot2 is a system for declaratively creating graphics, based on the Grammar of Graphics.

  • You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

Why ggplot2?

  • A grammar of graphics is a grammar used to describe and create a wide range of statistical graphics.
  • The promise of a grammar for graphics.
  • Easy to manage, save, etc.
  • Graphs are composed of layers.
  • Easy to add stuff to existing graphs.
  • With ggplot2 it takes less work to make beautiful and eye-catching graphics.
  • Enables the creation of reproducible visualization patterns.
  • Publication quality & beyond

ggplot2 mechanics: the basics

A ggplot is built up from a few basic elements:

  1. Data: The raw data that you want to plot.
  2. Geometries geom_: The geometric shapes that will represent the data.
  3. Aesthetics aes(): Aesthetics of the geometric and statistical objects, such as position, color, size, shape, and transparency
  4. Scales scale_: Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors.
  5. Statistical transformations stat_: Statistical summaries of the data, such as quantiles, fitted curves, and sums.
  6. Coordinate system coord_: The transformation used for mapping data coordinates into the plane of the data rectangle.
  7. Facets facet_: The arrangement of the data into a grid of plots.
  8. Visual themes theme(): The overall visual defaults of a plot, such as background, grids, axes, default typeface, sizes and colors.
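
As a minimal sketch, the eight elements listed above can be seen together in a single plot. It assumes the palmerpenguins data used later in this course; the object name p_elements and the specific palette, limits, and theme are illustrative choices only.

```r
library(ggplot2)
library(palmerpenguins)

p_elements <- ggplot(
  data = penguins,                              # 1. data
  mapping = aes(x = bill_length_mm,             # 3. aesthetics
                y = bill_depth_mm,
                colour = species)
) +
  geom_point(na.rm = TRUE) +                    # 2. geometry
  stat_smooth(method = "lm", formula = y ~ x,   # 5. statistical transformation
              se = FALSE, na.rm = TRUE) +
  scale_colour_brewer(palette = "Dark2") +      # 4. scale
  coord_cartesian(xlim = c(30, 60)) +           # 6. coordinate system
  facet_wrap(~ island) +                        # 7. facets
  theme_minimal()                               # 8. theme

p_elements
```

Each line adds one component, so any of them can be removed or swapped without touching the others.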

Components of the layered grammar

  • Layer
    • Data
    • Mapping
    • Statistical transformation (stat)
    • Geometric object (geom)
    • Position adjustment (position)
  • Scale
  • Coordinate system (coord)
  • Faceting (facet)

Data

  • Data defines the source of the information to be visualized.
  • Must be a data.frame (or a tibble)
  • Gets pulled into the ggplot() object

Aesthetics (aes()) (a.k.a. mapping)

  • x, y: variables
  • colour: colours the lines and outlines of geometries
  • fill: fill colour of geometries
  • group: groups based on the data
  • shape: shape of points, an integer value 0 to 25, or NA
  • linetype: type of line, an integer value 0 to 6 or a string
  • size: size of elements, a non-negative numeric value
  • alpha: changes the transparency, a numeric value 0 to 1
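
The difference between mapping an aesthetic to a variable and setting it to a fixed value is easy to miss, so here is a small sketch of both. It assumes the palmerpenguins data; the names p_mapped and p_set and the chosen values are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Mapped: colour and shape vary with the species variable, so they go inside aes()
p_mapped <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(colour = species, shape = species), na.rm = TRUE)

# Set: every point gets the same fixed values, so they go outside aes()
p_set <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(colour = "steelblue", shape = 17, size = 2, alpha = 0.6,
             na.rm = TRUE)

p_mapped
p_set
```

Only the mapped version produces a legend, because only there does the aesthetic carry information from the data.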

Data

Code
# data and aesthetics
ggplot(data, mapping = aes(x, y, ...))
  • shape values

[Figure: shape values]
  • line type values

[Figure: line type values]

Geometries (geom_*()) function

The general syntax is:

  • ggplot(data = data, mapping = aes(mappings)) + geom_function()

  • Geom Components

    Geom                     Description                   Input
    geom_histogram           Histograms                    Continuous x
    geom_bar                 Bar plot with frequencies     Discrete x
    geom_point               Points/scatterplots           Discrete/continuous x and y
    geom_boxplot             Box plot                      Discrete x and continuous y
    geom_smooth              Function line based on data   x and y
    geom_line                Line plots                    Discrete/continuous x and y
    geom_abline              Reference line                intercept and slope values
    geom_hline, geom_vline   Reference lines               yintercept or xintercept

geom_*() functions

Positions

  • geom_bar(position = "<position>")
  • When we have aesthetics mapped, how are they positioned?
  • bar: dodge, fill, stack (default)
  • point: jitter
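
The position options above can be compared side by side with a short sketch. It assumes the palmerpenguins data; the object names and the jitter width are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

base_bar <- ggplot(penguins, aes(x = island, fill = species))

p_stack <- base_bar + geom_bar(position = "stack")  # default: counts stacked
p_dodge <- base_bar + geom_bar(position = "dodge")  # bars side by side
p_fill  <- base_bar + geom_bar(position = "fill")   # stacked proportions (0 to 1)

# Jitter adds a little random horizontal noise so overlapping points become visible
p_jitter <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  geom_point(position = position_jitter(width = 0.2, height = 0),
             na.rm = TRUE)

p_stack
p_dodge
p_fill
p_jitter
```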

Facets

facet_grid vs facet_wrap

  • facet_grid() lays out panels in a grid defined by row and/or column variables; with a single variable the facets run in one direction (horizontal or vertical)
  • facet_wrap() simply places the facets next to each other and wraps them according to the provided number of columns and/or rows.

The following table describes how facet formulas work in facet_grid() and facet_wrap():

Type   Formula               Description
Grid   facet_grid(. ~ x)     Facet horizontally across x values
Grid   facet_grid(y ~ .)     Facet vertically across y values
Grid   facet_grid(y ~ x)     Facet 2-dimensionally
Wrap   facet_wrap(~ x)       Facet across x values
Wrap   facet_wrap(~ x + y)   Facet across x and y values

Statistics and Coordinates

  • Statistics (stat_*()) computed on the data.
    • stat_*()-like functions perform computations such as means, counts, linear models, and other statistical summaries of data.
  • Coordinates (coord_*()) establish representation rules to print the data
    • coord_cartesian() for the Cartesian plane;
    • coord_polar() for circular plots;
    • coord_map() for different map projections.
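
A brief sketch of a stat and two coordinate systems in use, assuming the palmerpenguins data. The object names and the zoom limits are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# stat_summary() computes the mean body mass per species before drawing
p_stat <- ggplot(penguins, aes(x = species, y = body_mass_g)) +
  stat_summary(fun = mean, geom = "point", size = 3, na.rm = TRUE)

# coord_cartesian() zooms in without dropping data outside the limits
p_zoom <- p_stat + coord_cartesian(ylim = c(3000, 6000))

# coord_polar() redraws a stacked bar as a circular plot
p_polar <- ggplot(penguins, aes(x = "", fill = species)) +
  geom_bar(width = 1) +
  coord_polar(theta = "y")

p_stat
p_zoom
p_polar
```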

Themes

Code
plot + theme_gray(base_size = 11, base_family = "")
  • The theme controls the overall appearance of the ggplot visualization.
  • ggplot2 offers several predefined themes that can be quickly applied to the ggplot object. See the details in the Themes section.

Practice with ggplot2

  1. Create a simple plot object: plot.object <- ggplot()
  2. Add geometric layers: plot.object <- plot.object + geom_*()
  3. Add appearance layers: plot.object <- plot.object + coord_*() + theme()
  4. Repeat steps 2 and 3 until satisfied, then print: plot.object or print(plot.object)
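
The four steps can be sketched as follows. The built-in mtcars data is used here only as a stand-in so the example runs on its own; it is not part of the course material.

```r
library(ggplot2)

# Step 1: create a plot object (mtcars ships with R; used only as a stand-in)
plot.object <- ggplot(mtcars, aes(x = wt, y = mpg))

# Step 2: add a geometric layer
plot.object <- plot.object + geom_point()

# Step 3: add appearance layers
plot.object <- plot.object + coord_cartesian() + theme_bw()

# Step 4: print the finished plot
print(plot.object)
```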

Practice with ggplot2

  • dataset to practice: palmerpenguins

We will use the palmerpenguins data set:

This data set contains size measurements for three penguin species observed on three islands in the Palmer Archipelago, Antarctica.

Let us take a look at the variables in the penguins data set:

Code
library(palmerpenguins)
data(penguins)
str(penguins)  # inspect the variables and their types

Practice with ggplot2

  • species, island, and sex are factor variables,
  • the bill measurements (length and depth) are numeric variables,
  • flipper length and body mass are integer variables.
  • Prepare data for ggplot2
  • ggplot2 requires the data to be an object of class data.frame or tibble (common in the tidyverse).
Code
library(tibble)
class(penguins) # all set!
[1] "tbl_df"     "tbl"        "data.frame"
Code
peng <- as_tibble(penguins) # acceptable
class(peng)
[1] "tbl_df"     "tbl"        "data.frame"

Practice with ggplot2

More complex plots in ggplot2 require the long data frame format.
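
As a sketch of what the long format means in practice, the two bill measurements can be reshaped so that one column names the measurement and another holds its value. This assumes the tidyr package; the column names measurement and value_mm are illustrative.

```r
library(ggplot2)
library(tidyr)
library(palmerpenguins)

# Reshape the two bill measurements from wide (one column each)
# to long (one row per penguin and measurement)
penguins_long <- pivot_longer(
  penguins,
  cols = c(bill_length_mm, bill_depth_mm),
  names_to = "measurement",
  values_to = "value_mm"
)

# In long format a single variable can drive the facets
ggplot(penguins_long, aes(x = species, y = value_mm, fill = species)) +
  geom_boxplot(na.rm = TRUE) +
  facet_wrap(~ measurement, scales = "free_y")
```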

  • Scientific questions about penguins

  • Is there a relationship between the length and the depth of bills?

  • Do the sizes of the bill and flipper vary together?

  • How are these measures distributed among the three penguin species?

How can we graphically address these questions with ggplot2?

ggplot() layers

Code
library(ggplot2)
ggplot(data = penguins)

Code
ggplot(data = penguins, aes(x = bill_length_mm, y = bill_depth_mm))

Code
ggplot(data = penguins,
       aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

Code
ggplot(data = penguins,
       aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point() +  facet_wrap(~species) +
  coord_trans(x = "log10", y = "log10")

Let us explore how some of this data is structured by species:

Code
ggplot(data = penguins,               # Data
       aes(x = bill_length_mm,        # Your X-value
           y = bill_depth_mm,         # Your Y-value
           col = species)) +          # Aesthetics
  geom_point(size = 5, alpha = 0.8) + # Point
  geom_smooth(method = "lm")         # Linear regression

Customize Our Plot

  • Here are some key aspects you can customize:

Axes, Titles and Legends

Title and axes components: changing size, colour and face

Change Axis Titles: Axes, Titles and Legends

  • Customizing Axis Labels with labs()
    • used to modify plot labels, including x-axis, y-axis, and plot title.
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")

Axes, Titles and Legends

  • xlab() and ylab(): These functions specifically set the x-axis and y-axis labels, respectively.
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color=species)) + 
  xlab("Bill length (mm)")+ ylab("Bill depth (mm)")

Axes, Titles and Legends

Increasing Space Between Axis and Axis Titles

  • element_text(): While primarily used in theme() for overall theme customization, element_text() can be used to specify text properties such as size, color, and font face for axis labels.
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
      theme(axis.title = element_text(size = 15, face= "italic"))

Axes, Titles and Legends

  • Use vjust to change the vertical alignment; it typically ranges between 0 and 1 but can extend beyond this range.
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(axis.title.x = element_text(vjust = 0, size = 15),
        axis.title.y = element_text(vjust = 2, size = 15))

Axes, Titles and Legends

  • To adjust the space on the y-axis, change the right margin, not the bottom margin.
  • The face argument can be set to bold, italic, or bold.italic:
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(axis.title= element_text(color= "sienna", size= 15, face= "bold"),
        axis.title.y = element_text(face = "bold.italic"))

Axes, Titles and Legends

  • angle rotates any text element; hjust and vjust adjust its position horizontally (0 = left, 1 = right) and vertically (0 = bottom, 1 = top):
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
theme(axis.text.x = element_text(angle =50, vjust= 1, hjust =1, size= 12))

Axes, Titles and Legends

  • element_blank() is used to remove axis text and ticks:
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y = bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(axis.ticks.y = element_blank(), axis.text.y = element_blank())

Axes, Titles and Legends

  • element_blank() removes a theme element entirely; axis titles can also be removed by setting them to NULL or to empty quotes "" in the labs() function:
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
 labs(x = NULL, y = "")

💡 Using NULL removes the element, while empty quotes "" keep the space for the axis title but print nothing.

Adding Title

  • To customize titles in ggplot2, you can use a combination of ggtitle(), labs(), and theme() functions. Below is a list of the main functions and their key arguments for title customization:

Main Functions and Arguments

  1. ggtitle(): sets the text for the main title.
    • Example: ggtitle("Main Title")
  2. labs():
    • title: The text for the main title.
    • subtitle: The text for the subtitle.
    • caption: The text for the caption.
    • tag: The text for a tag.
    • Example: labs(title = "Main Title", subtitle = "Subtitle", caption = "Caption", tag = "Fig. 1")

Adding Title

  3. theme(): Customize the appearance of the text elements.
    • plot.title, plot.subtitle, plot.caption, plot.tag: Customize the main title, subtitle, caption, and tag text. Examples:
    • theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5))
    • theme(plot.subtitle = element_text(size = 12, hjust = 0.5))
    • theme(plot.caption = element_text(size = 10, hjust = 0))
    • theme(plot.tag = element_text(size = 8, hjust = 1))
    • element_text(face, size, family, hjust, vjust, margin, lineheight): Control the font face, size, family, alignment, margin, and line height.

Adding Title

Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color = species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)", 
           title = "Relationship between bill length and depth", 
           subtitle = "for different penguin species", 
           caption = "scatter plot", tag = "Fig. 1") 

Bold Title and Margin

Code
ggplot(data = penguins) +
  geom_point(aes( x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)",  
           title = "Relationship between bill length and depth")+ 
  theme(plot.title = element_text(face = "bold", 
                                  margin =margin(10,0,10,0), size= 14)) 

Legends

  • One nice thing about ggplot2 is that it adds a legend by default when mapping a variable to an aesthetic. You can see that by default the legend title is what we specified in the color argument:
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y = bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")

Legends

The main functions and methods to customize legends in ggplot2:

  • To Turn Off the Legend: we can use the following code

    • theme(legend.position = "none")
    • guides(color = "none")
    • scale_color_discrete(guide = "none")
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
 theme(legend.position = "none")

Legends

Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+  
  guides(color = "none")

To Remove Legend Titles

  • theme(legend.title = element_blank())
  • scale_color_discrete(name = NULL)
  • labs(color = NULL)
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
theme(legend.title = element_blank())

To Change Legend Position

  • theme(legend.position = "top")
  • theme(legend.position = c(x, y), legend.background = element_rect(fill = "transparent")) to add the legend inside the plot
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
theme(legend.position = "top")

To Place the Legend Inside the Plot

Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color = species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
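  # Note: in ggplot2 >= 3.5.0 a numeric legend.position is deprecated in favour of
  # legend.position = "inside" together with legend.position.inside = c(.15, .15)
  theme(legend.position = c(.15, .15),
        legend.background = element_rect(fill = "transparent"))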

Change Order of Legend Keys

  • factor(penguins$species, levels = c("Chinstrap", "Gentoo", "Adelie"))
Code
library(dplyr)
penguins1 <- penguins %>%
  mutate(species=factor(species, levels=c("Chinstrap", "Gentoo","Adelie")))
ggplot(data = penguins1) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")

theme

Default theme: The default theme is theme_gray().

theme

  • Each predefined theme takes two main arguments: the base font size (base_size) and the font family (base_family).
  • base_size is a number, and base_family is a string (e.g. "serif", "sans", "mono").
  • In addition, the ggthemes package offers additional predefined themes.
  • We will start with 8 predefined themes provided by ggplot2:
  • plot + theme_gray()
  • plot + theme_bw()
  • plot + theme_linedraw()
  • plot + theme_light()
  • plot + theme_dark()
  • plot + theme_minimal()
  • plot + theme_classic()
  • plot + theme_void()
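
A short sketch of applying a predefined theme with its two arguments, assuming the palmerpenguins data. The names p_theme and p_serif are illustrative, and the "serif" family is assumed to be available on the graphics device.

```r
library(ggplot2)
library(palmerpenguins)

p_theme <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(na.rm = TRUE)

# base_size scales all text; base_family picks the typeface
p_serif <- p_theme + theme_bw(base_size = 14, base_family = "serif")
p_serif

# theme_set() changes the default theme for every plot made afterwards
old_theme <- theme_set(theme_minimal(base_size = 12))
p_theme
theme_set(old_theme)  # restore the previous default
```

Setting the theme once with theme_set() is convenient when every figure in a report should share the same look.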

theme

  • theme() has many arguments to control and modify individual components of a plot theme, including:
  • all line, rectangular, text and title elements
  • aspect ratio of the panel
  • axis title, text, ticks, and lines
  • legend background, margin, text, title, position, and more
  • panel aspect ratio, border, and grid lines

Backgrounds & Grid Lines

The plot background is customized by modifying elements of the theme() function in ggplot2. The key elements are:

  • Changing the Panel Background Color

The panel background refers to the area where the data is plotted.

  • panel.background: Adjusts the background color and outline of the panel area.
  • theme(panel.background = element_rect(fill = "#64D2AA", color = "#64D2AA", linewidth = 2))
Code
ggplot(data = penguins) +
  geom_point(aes(x= bill_length_mm, y= bill_depth_mm, color= body_mass_g)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(panel.background = element_rect(
    fill = "#64D2AA", color = "#64D2AA", linewidth = 2))

Changing the Panel Border Color

The panel border is an overlay on top of the panel.background which outlines the panel.

  • panel.border: Sets the border properties of the panel.

    theme(panel.border = element_rect(fill = "#64D2AA99", color = "#64D2AA", linewidth = 2))

Code
ggplot(data = penguins) +
    geom_point(aes(x= bill_length_mm, y=bill_depth_mm, color=body_mass_g)) +
          labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
      theme(panel.border = element_rect(
        fill = "#64D2AA99", color = "#64D2AA",linewidth = 2))

Changing Grid Lines

Grid lines help in referencing the data points against the axes.

  • panel.grid: Changes properties for all grid lines.
  • panel.grid.major: Changes properties for major grid lines.
  • panel.grid.minor: Changes properties for minor grid lines.
  • panel.grid.major.x and panel.grid.major.y: Change properties for major grid lines on the x and y axes separately.
  • panel.grid.minor.x and panel.grid.minor.y: Change properties for minor grid lines on the x and y axes separately.
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(panel.grid.major = element_line(color = "gray10",linewidth = .5),       
        panel.grid.minor = element_line(color = "gray70", linewidth = .25))

Changing Grid Lines

Code
ggplot(data = penguins) +
  geom_point(aes(x= bill_length_mm, y= bill_depth_mm, color= body_mass_g)) +
      labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
  theme(panel.grid.major = element_line(linewidth = .5, linetype= "dashed"),
        panel.grid.minor = element_line(linewidth = .25, linetype= "dotted"), 
        panel.grid.major.x = element_line(color = "red1"),       
        panel.grid.major.y = element_line(color = "blue1"), 
        panel.grid.minor.x = element_line(color = "red4"), 
        panel.grid.minor.y = element_line(color = "blue4"))

Removing Grid Lines

Grid lines can be selectively removed.

  • element_blank(): Used to remove specific theme elements.

    • theme(panel.grid.minor = element_blank())
    • theme(panel.grid = element_blank())
Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color = species)) +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")+ 
  theme(panel.grid = element_blank())

Customizing multi-panel plots

When creating multi-panel plots in ggplot2, several functions and theme elements are available to customize their appearance. The main functions and options are:

Creating Facets with facet_grid and facet_wrap

  • facet_wrap(~ variable):
  • Creates a ribbon of panels based on a single variable.
Code
ggplot(data = penguins) +
  geom_point(mapping = aes(x = bill_length_mm, y = bill_depth_mm,
                           colour = species)) +
  facet_wrap(~ species, scales = "free")

facet_grid(rows ~ columns):

  • Creates a grid of panels based on two variables.
Code
ggplot(data = penguins) +
  geom_point(mapping = aes(x = bill_length_mm, y = bill_depth_mm,
                           colour = species)) +
  facet_grid(year ~ species, scales = "free")
  • Customizing Layout of Facets
  • ncol and nrow:
    • Control the number of columns and rows in facet_wrap.

facet_wrap

Code
ggplot(data = penguins) +
  geom_point(aes(x = bill_length_mm, y= bill_depth_mm, color= species)) +
          labs(x = "Bill length (mm)", y = "Bill depth (mm)")+
      theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust= 1)) + 
  facet_wrap( ~ species+sex, ncol = 3)
  • scales:
  • Allows axes to have free scales with scales = "free" or control specific axis with scales = "free_x" or scales = "free_y".
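
The effect of the scales, ncol, and nrow arguments can be compared with a small sketch, assuming the palmerpenguins data. The object names are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

p_facet <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(na.rm = TRUE)

# Shared axes (default): panels are directly comparable
p_fixed <- p_facet + facet_wrap(~ species, nrow = 1)

# Free y-axis only: each panel gets its own y range, x stays shared
p_free_y <- p_facet + facet_wrap(~ species, nrow = 1, scales = "free_y")

# Both axes free, arranged in two columns
p_free <- p_facet + facet_wrap(~ species, ncol = 2, scales = "free")

p_fixed
p_free_y
p_free
```

Free scales make within-panel patterns easier to see but make comparisons across panels harder, so the choice depends on the message of the figure.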

Combining Different Plots

  • patchwork package:

    • Combine multiple plots with simple syntax.
    • p1 + p2, p1 / p2, (g + p2) / p1
  • cowplot package:

    • Another package for combining multiple plots.

Code
library(cowplot) 
# plot_grid(plot_grid(g, p1), p2, ncol = 1)

Combining Different Plots

  • gridExtra package:
    • Provides functions to arrange multiple plots.
Code
library(gridExtra) 
# grid.arrange(g, p1, p2, layout_matrix = rbind(c(1, 2), c(3, 3)))
  • Custom layout with patchwork:
    • Define complex layouts using a design matrix.
Code
# layout <- "AABBBB#\nAACCDDE\n##CCDD#\n##CC###"  # one row of the design per line
# p2 + p1 + p1 + g + p2 + plot_layout(design = layout)
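
The commented calls above refer to plot objects g, p1, and p2 that are not defined on these slides. A runnable sketch with patchwork, using three illustrative penguin plots under those names, might look like this:

```r
library(ggplot2)
library(patchwork)
library(palmerpenguins)

# g, p1 and p2 are illustrative plots; any ggplot objects would do
g <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm,
                          colour = species)) +
  geom_point(na.rm = TRUE)
p1 <- ggplot(penguins, aes(x = species, fill = species)) +
  geom_bar()
p2 <- ggplot(penguins, aes(x = body_mass_g)) +
  geom_histogram(bins = 30, na.rm = TRUE)

p1 + p2                        # side by side
p1 / p2                        # stacked vertically
combined <- (g + p2) / p1      # two plots on top, one below
combined

# Add panel tags (A, B, C) to the combined figure
combined + plot_annotation(tag_levels = "A")
```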

Colors

Several functions and techniques are highlighted for customizing colors in ggplot2 plots.

  • color and fill Arguments: Define the outline color (color) and the filling color (fill) of plot elements.

    • geom_point(color = "steelblue", size = 2)
    • geom_point(shape = 21, size = 2, stroke = 1, color = "#3cc08f", fill = "#c08f3c")
Code
# default
p <- ggplot(penguins, aes( x = bill_length_mm, y = bill_depth_mm, colour= species)) +
  geom_point() + labs(x = "Bill length (mm)", y = "Bill depth (mm)") 

Colors

Code
p +  
  geom_point(shape= 21, size=2, stroke=1, color= "#3cc08f", fill="#c08f3c")
  • scale_color_* and scale_fill_* Functions: Modify colors when they are mapped to variables. These functions differ based on whether the variable is categorical (qualitative) or continuous (quantitative).

Qualitative Variables:

  • scale_color_manual and scale_fill_manual: Manually specify colors for categorical variables.

scale_color_manual(values = c("dodgerblue4", "darkolivegreen4", "darkorchid3", "goldenrod1"))
Code
p + scale_color_manual(values=c("dodgerblue4", "darkorchid3", "goldenrod1"))
  • scale_color_brewer and scale_fill_brewer: Use predefined color palettes from ColorBrewer.

scale_color_brewer(palette = "Set1")

Code
p+  scale_color_brewer(palette = "Set1")

Quantitative Variables:

  • scale_color_gradient and scale_fill_gradient: Apply a sequential gradient color scheme for continuous variables.

  • scale_color_gradient(low = "darkkhaki", high = "darkgreen")

Code
p2 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm,
                           colour = body_mass_g)) +
  geom_point() +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")
p2 + scale_color_gradient(low = "darkkhaki", high = "darkgreen")

Quantitative Variables:

  • scale_color_viridis_c and scale_fill_viridis_c: Use the Viridis color palettes, which are perceptually uniform and suitable for colorblind viewers.

scale_color_viridis_c(option = "inferno")

Code
p2 + scale_color_viridis_c(option = "inferno")

Lines

  • geom_hline(): Adds horizontal lines to a plot at specified y-axis values.

    yintercept: A numeric vector indicating where to draw the horizontal lines. geom_hline(yintercept = c(12, 23))

Code
p + geom_hline(yintercept = c(12, 23))
  • geom_vline(): Adds vertical lines to a plot at specified x-axis values.
    • xintercept: A numeric vector or aesthetic mapping for x-axis intercepts.
    • color, linewidth, linetype: Aesthetics for customizing the appearance of the line.

Lines

geom_vline(aes(xintercept = 45), linewidth = 1.5, color = "firebrick", linetype = "dashed")

Code
p + geom_vline(aes(xintercept = 45), linewidth = 1.5, 
               color = "firebrick", linetype = "dashed")
  • geom_abline(): Adds lines with a specified slope and intercept to a plot.

    • intercept: The intercept of the line.
    • slope: The slope of the line.
    • color, linewidth: Aesthetics for customizing the appearance of the line. geom_abline(intercept = coefficients(reg)[1], slope = coefficients(reg)[2], color = "darkorange2", linewidth = 1.5)
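
The geom_abline() call above uses an object named reg that is not defined on the slide. One plausible reading, shown here as an assumption, is a simple linear model of bill depth on bill length fitted with lm():

```r
library(ggplot2)
library(palmerpenguins)

# reg is fitted here for illustration: bill depth as a linear function of bill length
reg <- lm(bill_depth_mm ~ bill_length_mm, data = penguins)

ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point(colour = "gray40", alpha = 0.5, na.rm = TRUE) +
  geom_abline(intercept = coefficients(reg)[1],
              slope = coefficients(reg)[2],
              color = "darkorange2", linewidth = 1.5) +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")
```

Because this model pools all three species, the fitted line describes the overall trend only; per-species fits can look quite different.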

Adding a Linear Fit

Though the default is a LOESS or GAM smoothing, it is also easy to add a standard linear fit:

Code
p + geom_point(color = "gray40", alpha = .5) + 
  stat_smooth(method = "lm", se = FALSE, color = "firebrick", linewidth = 1.3)+ 
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")

Interactive Plots

  • Interactive plots in R enhance the user experience by providing dynamic and visually appealing graphics. Several libraries can be used in combination with ggplot2, or on their own, to create interactive visualizations.

Plot.ly is a tool for creating online, interactive graphics and web apps. The plotly package in R allows you to easily convert your ggplot2 plots into interactive plots.

Code
library(plotly)
ggplotly(p)
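
As a self-contained sketch of the same conversion, the example below rebuilds a static scatter plot and restricts the hover text with the tooltip argument of ggplotly(). The object names p_static and p_interactive are illustrative.

```r
library(ggplot2)
library(plotly)
library(palmerpenguins)

p_static <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm,
                                 colour = species)) +
  geom_point(na.rm = TRUE) +
  labs(x = "Bill length (mm)", y = "Bill depth (mm)")

# tooltip limits the hover text to the listed aesthetics
p_interactive <- ggplotly(p_static, tooltip = c("x", "y", "colour"))
p_interactive
```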

Create different plots using geom_*()

Code
p1 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm, colour = species)) +
  geom_point()
p2 <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm, colour = species)) +
  geom_density2d()
p3 <- ggplot(penguins, aes(x = species, fill = island)) +
  geom_bar()
p4 <- ggplot(penguins, aes(x = species, y = bill_depth_mm, fill = species)) +
  geom_boxplot()

library(patchwork)
p1 + p2 + p3 + p4

Create different plots using geom_*()

This is a blank plot, before we add any geom_* to represent variables in the dataset.

Code
ggplot(penguins)

Bar plot using geom_bar()

  • Bar chart of the number of penguins by species. We would like to know how many penguins of each species are in this dataset.
Code
ggplot(penguins, aes(x = species, fill = species)) +
  geom_bar() + labs(title = "Number of Penguins by Species",
       x = "Species",  y = "Count", fill = "Species") + theme_minimal()

Bar plot using geom_bar()

  • Number of Penguin species on each Island
Code
ggplot(data = penguins)+ geom_bar(mapping=aes(x=island, fill=species))+
  labs(title="Population of Penguin species on each Island", y="Number of penguins")+
theme(text=element_text(size=14))

Bar plot using geom_bar()

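  • Bar chart of mean body mass by species and sex.
Code
# stat = "summary" with fun = "mean" draws one bar per group at the group mean;
# stat = "identity" would overplot one bar per penguin
ggplot(penguins, aes(x = species, y = body_mass_g, fill = sex)) +
  geom_bar(stat = "summary", fun = "mean", position = "dodge") +
  labs(title = "Mean Body Mass by Species and Sex",
       x = "Species", y = "Mean Body Mass (g)", fill = "Sex") +
  theme_minimal()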

Histograms: geom_histogram()

A histogram is an accurate graphical representation of the distribution of numeric data. There is only one aesthetic required: the x variable.

Code
ggplot(penguins,
       aes(x = bill_length_mm)) + geom_histogram() +
  ggtitle("Histogram of penguin bill length")
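
The shape of a histogram depends on the bin width, so it is usually worth setting it explicitly rather than relying on the default of 30 bins. The sketch below uses a 1 mm bin width, which is an arbitrary choice for illustration, and overlays the three species.

```r
library(ggplot2)
library(palmerpenguins)

# binwidth is in data units (here millimetres); 1 mm is an arbitrary choice
p_hist <- ggplot(penguins, aes(x = bill_length_mm, fill = species)) +
  geom_histogram(binwidth = 1, alpha = 0.6, position = "identity",
                 na.rm = TRUE) +
  labs(title = "Histogram of penguin bill length",
       x = "Bill length (mm)", y = "Count")

p_hist
```

With position = "identity" the species histograms overlap instead of stacking, which is why a partial transparency (alpha) is set.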

Boxplot: geom_boxplot()

  • Boxplot of body mass distribution of penguins by species
Code
ggplot(penguins, aes(x = species, y = body_mass_g, fill = species)) +
  geom_boxplot() +
  labs(title = "Body Mass Distribution of Penguins by Species",
       x = "Species",   y = "Body Mass (g)", fill = "Species") +
  theme_minimal()

Boxplot: geom_boxplot()

Code
ggplot(data = penguins,
       aes(x = species, y = bill_length_mm, fill = species)) +
  geom_boxplot() + labs(title = "Boxplot")

Boxplot: geom_boxplot()

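  • Boxplot with annotations: geom_boxplot() and geom_signif()
Code
library(ggsignif)
ggplot(data = penguins, aes(x = species, y= bill_length_mm, fill = species)) +
  geom_boxplot() +
  # assumed comparison pair for illustration; geom_signif() uses a Wilcoxon test by default
  geom_signif(comparisons = list(c("Adelie", "Chinstrap")),
              map_signif_level = TRUE)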

Time series or line plot

You can use geom_line() for line plots or time series plot to display values over time.

Code
ggplot(economics, aes(date, unemploy)) +
  geom_line(color = "blue") +
  theme_bw()

Time series plot

Code
ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line(color = "#1f77b4", linewidth = 1) +  # Classic blue color
  labs(
    title = "US Unemployment Over Time",
    subtitle = "Number of unemployed (in thousands)",
    x = "Year",
    y = "Unemployed (thousands)",
    caption = "Source: US Economic Time Series Data"
  ) +
  scale_x_date(date_breaks = "5 years", date_labels = "%Y") +
  scale_y_continuous(labels = scales::comma) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 16, face = "bold"),
    plot.subtitle = element_text(size = 12, color = "gray50"),
    panel.grid.major = element_line(color = "gray90", linewidth = 0.2),
    panel.grid.minor = element_blank(),
    plot.background = element_rect(fill = "white", color = NA)
  )

Resources