Accessible Data Visualizations

Dr. Mine Dogucu

Improving Data Visualizations

Packages for Today

library(hellodatascience)
library(tidyverse)

Saving a Plot

base_bar_plot <-
  ggplot(
    data = atus_college, 
    aes(
      x = employment, 
      fill = enrollment
    )
  ) +
  geom_bar(position = "dodge")

base_bar_plot 
Grouped bar chart showing student counts by employment status and enrollment. The x-axis, labeled 'employment', lists employment categories (Full Time, Part Time, NA) and the y-axis, labeled 'count' shows count. Bars are colored by enrollment status (Full Time vs Part Time), allowing comparison of enrollment within each employment group.
Figure 1: Side-by-side bar plot of employment broken down by enrollment status

Labels

improved_bar_plot <- 
  base_bar_plot +
  labs(
    title = "Student Count by Employment and Enrollment Status",
    subtitle = "Most students work part-time while enrolled full-time",
    x = "Employment Status",
    y = "Number of Students",
    fill = "Course Enrollment",
    caption = "Data source: American Time Use Survey 2024 (n = 312 students)"
  )

improved_bar_plot
Same bar chart as previous with better labels. The main title is Student Count by Employment and Enrollment Status and the subtitle says that most students work part-time while enrolled full-time. Labels for the x-axis, y-axis, and legend are 'Employment Status', 'Number of Students' and 'Course Enrollment' respectively. Caption notes Data source: American Time Use Survey 2024 (n = 312 students). There are more students enrolled full time than part time for each level of employment status. The full time enrolled students with unknown employment status have the highest count at about 80 students.
Figure 2: Bar plot of employment broken down by enrollment status with added labels

Themes

improved_bar_plot <-
  improved_bar_plot +
  theme_minimal() 

improved_bar_plot
Same bar chart as previous, with a white background instead of gray.
Figure 3: Bar plot of employment broken down by enrollment status using theme_minimal()

Other themes include, but are not limited to, theme_bw(), theme_light(), theme_dark(), and theme_classic().
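To compare themes, the same saved plot can be combined with each one in turn. The sketch below uses a small made-up data frame (toy_students is an assumption for illustration, not the atus_college data) so that it runs on its own.

```r
library(ggplot2)

# Hypothetical toy data, standing in for atus_college
toy_students <- data.frame(
  employment = c("Full Time", "Part Time", "Part Time", "Full Time"),
  enrollment = c("Full Time", "Full Time", "Part Time", "Part Time")
)

toy_bar_plot <-
  ggplot(
    data = toy_students,
    aes(x = employment, fill = enrollment)
  ) +
  geom_bar(position = "dodge")

# Each theme is added to the same saved plot
toy_bar_plot + theme_bw()
toy_bar_plot + theme_classic()

# Complete themes also accept base_size to scale all text at once
toy_classic_plot <- toy_bar_plot + theme_classic(base_size = 14)
toy_classic_plot
```

Because the plot is saved as an object first, switching themes only requires changing the last line.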

Using the theme() function

improved_bar_plot <-
  improved_bar_plot + 
  theme(
    plot.title = element_text(size = 18),
    plot.subtitle = element_text(size = 14),
    axis.title = element_text(size = 12), 
    axis.text = element_text(size = 10), 
    legend.title = element_text(size = 11),
    legend.text = element_text(size = 9), 
    plot.caption = element_text(size = 8)
  ) +
  labs(title = "Student Count by Employment and\nEnrollment Status")

improved_bar_plot
Same bar plot as previous with larger font size and title broken across two lines.
Figure 4: Bar plot of employment broken down by enrollment status with bigger font sizes

Changing legend location

improved_bar_plot <-
  improved_bar_plot +
  theme(legend.position = "bottom")

improved_bar_plot
Same plot as previous with legend horizontally listed at bottom of plot instead of vertically listed on right side of plot.
Figure 5: Bar plot of employment broken down by enrollment status with legend at the bottom
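Other legend positions work the same way. The sketch below, which again uses hypothetical toy data rather than atus_college, shows the "top" and "left" values of legend.position.

```r
library(ggplot2)

# Hypothetical toy data, not the atus_college data
toy_students <- data.frame(
  employment = c("Full Time", "Part Time", "Part Time"),
  enrollment = c("Full Time", "Full Time", "Part Time")
)

toy_bar_plot <-
  ggplot(toy_students, aes(x = employment, fill = enrollment)) +
  geom_bar(position = "dodge")

# Place the legend above the plot
toy_top_legend <- toy_bar_plot + theme(legend.position = "top")
toy_top_legend

# Place the legend to the left of the plot
toy_bar_plot + theme(legend.position = "left")
```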

Colors

Color palette display arranged in a 3-by-3 grid of colored squares. Each square is labeled with a named color, including lightcyan3, grey25, peachpuff, chartreuse3, khaki1, hotpink, cornsilk2, lavenderblush, and purple3.

Figure 6: Nine colors randomly selected from R colors as listed by the colors() function

Hex Codes

Color palette grid showing nine colored squares, each labeled with its hexadecimal color code, including #B4CDCD, #404040, #FFDAB9, #66CD00, #FFF68F, #FF69B4, #EEE8CD, #FFF0F5, and #7D26CD.

Figure 7: Hex codes for the randomly selected nine colors
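The link between a named R color and its hex code can be checked in base R. The sketch below converts four of the color names from the grid above; col2rgb() and rgb() are base R functions, while the object names are chosen for illustration.

```r
# Named colors taken from the grid above
color_names <- c("lightcyan3", "grey25", "peachpuff", "hotpink")

# col2rgb() returns a matrix with rows red, green, and blue (values 0 to 255)
rgb_values <- col2rgb(color_names)

# rgb() turns those values back into hex strings
hex_codes <- rgb(
  red = rgb_values["red", ],
  green = rgb_values["green", ],
  blue = rgb_values["blue", ],
  maxColorValue = 255
)
names(hex_codes) <- color_names

hex_codes
```

Each pair of hex digits encodes one channel, so "#FF69B4" (hotpink) means red 255, green 105, and blue 180.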

Okabe-Ito Color Palette

palette.colors(palette = "Okabe-Ito")
[1] "#000000" "#E69F00" "#56B4E9" "#009E73" "#F0E442" "#0072B2" "#D55E00"
[8] "#CC79A7" "#999999"

Okabe-Ito Color Palette

Three-by-three grid of color swatches labeled with hexadecimal codes, including #000000 (black), #E69F00 (gold), #56B4E9 (light blue), #009E73 (teal), #F0E442 (yellow), #0072B2 (dark blue), #D55E00 (orange), #CC79A7 (pink), and #999999 (gray), intended as a reference color palette.

Figure 8: Hex codes of colors in the Okabe-Ito palette
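Rather than typing hex codes by hand, colors can be picked from the palette by position. This is a minimal sketch, assuming R 4.0.0 or later (where palette.colors() is available); positions 4 and 8 correspond to the teal and pink shown above, and the object names are illustrative.

```r
# Drop any names so the elements can be relabeled below
okabe_ito <- unname(palette.colors(palette = "Okabe-Ito"))

# Pick the fourth (teal) and eighth (pink) colors for the two enrollment levels
enrollment_colors <- c(
  "Full Time" = okabe_ito[4],
  "Part Time" = okabe_ito[8]
)

enrollment_colors
```

A named vector like enrollment_colors can be passed to the values argument of scale_fill_manual().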

Changing Colors

improved_bar_plot <-
  improved_bar_plot +
  scale_fill_manual(
    values = c(
      "Full Time" = "#009E73",
      "Part Time" = "#CC79A7"
    )
  )

improved_bar_plot
Most recent bar plot with different colors chosen for the bars.
Figure 9: Bar plot of employment broken down by enrollment status using Okabe-Ito colors

Look at how far we have come - Default plot

Original first bar plot from beginning of the chapter.

Figure 10: Default bar plot

Look at how far we have come - Improved plot

Most recent bar plot with different colors chosen for the bars.

Figure 11: Improved bar plot

Customizing Scatterplots

simple_scatterplot <- 
  ggplot(
    data = atus_college,
    aes(
      x = time_alone,
      y = weekly_earnings
    )
  ) +
  geom_point() +
  labs(
    x = "Time Alone (minutes)",
    y = "Weekly Earnings ($)"
  ) +
  theme_bw()

simple_scatterplot
Scatter plot showing the relationship between time spent alone and weekly earnings. Each point represents an individual. Time alone (in minutes) is on the x-axis and weekly earnings are on the y-axis. Earnings vary widely at all levels of time alone, with no strong linear trend.
Figure 12: Scatterplot of time spent alone and weekly earnings

Customizing Scatterplots

colored_scatterplot <- 
  simple_scatterplot +
  geom_point(
    color = "aquamarine4",
    size = 4,
    shape = "square"
  )

colored_scatterplot
Same scatterplot as previous with points as green squares instead of black circles.
Figure 13: Scatterplot with customized appearance of points

Facets

simple_scatterplot +
  facet_grid(rows = vars(employment), cols = vars(enrollment))
Faceted version of the same scatter plot of weekly earnings versus time alone, split into a grid by employment status in rows (Full Time, Part Time, and NA) and enrollment status in columns (Full Time and Part Time). Each panel shows individual data points, allowing comparison of earnings patterns within each combination of employment and enrollment. There are no obvious trends in either Full Time Enrollment plot. In the part time employment plots, weekly earnings are mostly between $0 and $1,000. The NA employment plots are empty.

Figure 14: Time spent alone versus weekly earnings broken by employment and enrollment status using facets

Facets - Rows only

simple_scatterplot +
  facet_grid(rows = vars(employment))
Vertically faceted scatter plot of the same data showing weekly earnings versus time alone for three employment categories: Full Time, Part Time, and NA. Full-time workers tend to have higher earnings overall, while part-time workers cluster at lower earnings levels.

Figure 15: Time spent alone versus weekly earnings broken by employment status
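Facets can also be laid out as columns only, or wrapped with facet_wrap(). The sketch below uses a small hypothetical data frame (toy_earnings, with made-up values) in place of atus_college.

```r
library(ggplot2)

# Hypothetical toy data with made-up values
toy_earnings <- data.frame(
  time_alone = c(60, 120, 240, 300, 90, 180),
  weekly_earnings = c(900, 1100, 400, 350, 1000, 500),
  employment = c(
    "Full Time", "Full Time", "Part Time",
    "Part Time", "Full Time", "Part Time"
  )
)

toy_scatterplot <-
  ggplot(
    data = toy_earnings,
    aes(x = time_alone, y = weekly_earnings)
  ) +
  geom_point()

# Columns only: one panel per employment level, side by side
toy_cols_plot <- toy_scatterplot + facet_grid(cols = vars(employment))
toy_cols_plot

# facet_wrap() arranges the panels into rows and columns automatically
toy_wrap_plot <- toy_scatterplot + facet_wrap(vars(employment))
toy_wrap_plot
```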

Data Verbalization

Screen Readers

An assistive technology is any form of technology (software or device), such as a walking stick or a wheelchair, that helps people with disabilities perform certain activities.

A screen reader is a form of assistive technology that allows blind and visually impaired users, and people with other disabilities (e.g., dyslexia), to read what is on their computer screen.

If you have never heard a screen reader read text aloud, you can listen to this audio.

Alternate (alt) Text

A screen reader can read text, but it cannot read non-text elements such as images.

Instead, a screen reader would say something along the lines of “Image.png”.

To make an image accessible to screen reader users, we can rely on alternate text.

Alternate text or alt text describes contents of an image and can be read by screen readers.

Alt text in ggplot

improved_bar_plot +
  labs(
    alt = "Barplot showing student count by employment and enrollment status. The y-axis shows number of students from 0 to 80, and the x-axis shows employment status (Full Time, Part Time, NA). Each employment status is broken down by full time or part time enrollment status. The plot displays six bars: students enrolled full time with full time employment (approximately 70 students), part time employment (approximately 75 students), and unknown employment (approximately 80 students) as well as students enrolled part time with full time employment (approximately 60 students), part time employment (approximately 18 students), and unknown employment (approximately 15 students). Data source: American Time Use Survey 2024 (n = 312 students)."
  )
Same well-polished bar plot of employment status by course enrollment.

Figure 16: Bar plot of employment broken down by enrollment status with added alternative text in the background
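Since the alt text does not appear on the plot itself, it can help to check what was stored. A minimal sketch, assuming a version of ggplot2 that provides get_alt_text() (3.3.4 or later); the toy plot and its alt text are made up for illustration.

```r
library(ggplot2)

toy_alt <- "Scatterplot of three points rising from lower left to upper right."

toy_plot <-
  ggplot(
    data = data.frame(x = 1:3, y = c(2, 4, 6)),
    aes(x = x, y = y)
  ) +
  geom_point() +
  labs(alt = toy_alt)

# Retrieve the alt text stored in the plot object
get_alt_text(toy_plot)
```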

Alt text is invisible

The alt text is never displayed in the front end.

The alt text is stored on the back end of a web page and is readable by screen readers.

If you are reading these slides using a screen reader, you should hear the alt text for the plot in the previous slide.

Good alt text practice

You should always write alt text for all images, whether or not they are made with ggplot, R, or Quarto, and whether the work is for STATS 6, another course, or outside of coursework.

Effective alt text communicates the axis labels, the variables represented, and the ranges and values for each variable and group as seen in the visualization.

Most importantly, alt text should convey the overall message of the visualization.

Writing the overall message of the visualization is crucial and practicing this skill will help you take the time to understand the visualizations you work with in depth.
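One way to keep the values in alt text consistent with the plot is to compute them from the data and write only the overall message by hand. The sketch below is one possible approach, not a required workflow; toy_students and all object names are hypothetical.

```r
# Hypothetical toy data, not the atus_college data
toy_students <- data.frame(
  employment = c("Full Time", "Full Time", "Part Time", "Part Time", "Part Time"),
  enrollment = c("Full Time", "Part Time", "Full Time", "Full Time", "Part Time")
)

# Count students in each employment-by-enrollment group
group_counts <- as.data.frame(table(toy_students), responseName = "n")

# One short description per bar
bar_descriptions <- sprintf(
  "%s employment with %s enrollment (n = %d)",
  as.character(group_counts$employment),
  as.character(group_counts$enrollment),
  group_counts$n
)

# The overall message still has to be written by a person who read the plot
overall_message <-
  "Students employed part time and enrolled full time form the largest group."

alt_text <- paste0(
  "Bar plot of student counts by employment and enrollment status. Bars: ",
  paste(bar_descriptions, collapse = "; "),
  ". ",
  overall_message
)

alt_text
```

The resulting string can be passed to labs(alt = alt_text).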

Data Sonification

Saving Simple Plot

x <- 1:10
y <- 1:10

simple_plot <-
  ggplot(
    data = data.frame(x = x, y = y),
    aes(x = x,  y = y)) +
  geom_point()

simple_plot
Simple scatter plot showing a perfect positive linear relationship between x and y. Points increase steadily from lower left to upper right, indicating that y increases as x increases.
Figure 17: Simple scatterplot to show how sonify() function works

Sonification

sonify::sonify(x, y)

If you were to listen to the output of the sonify() function, you would hear a series of ten tones, each one slightly higher in pitch than the last. The rising pitch directly corresponds to the rising y values as x increases. This auditory experience mirrors the visual experience of seeing the points rise from lower left to upper right, providing an alternative way to perceive the linear relationship between x and y.
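The idea of mapping values to pitch can be illustrated in base R without producing any sound. The sketch below is not how sonify() is implemented; value_to_frequency() is a hypothetical helper, and the 440 to 880 Hz range is an assumption chosen for illustration.

```r
x <- 1:10
y <- 1:10

# Hypothetical helper: linearly rescale values into a frequency range (in Hz).
# Assumes the values are not all equal (otherwise the range is zero).
value_to_frequency <- function(values, low_hz = 440, high_hz = 880) {
  value_range <- range(values)
  low_hz + (values - value_range[1]) / diff(value_range) * (high_hz - low_hz)
}

frequencies <- value_to_frequency(y)

# Larger y values map to higher frequencies, that is, higher pitch
round(frequencies)
```

The smallest y value maps to the lowest frequency and the largest to the highest, so an increasing relationship is heard as rising pitch.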

Data Tactualization

Data tactualization refers to producing a data visualization in a form that can be touched. The video shows a tactile boxplot, which one can touch, being printed using a Swell machine.