10 Best Practices for Effective Data Visualization: Simplicity


updated March 8th, 2023

This is a long read on best practices in data visualisation, which will be periodically updated. I will try to supplement each post with R code examples.

List of best practices:

Keep it simple
Reveal, not conceal
Highlight the essential information
Label and title the visualization
Provide context
Consider the audience
Test and iterate
Use appropriate scales
Provide interactivity
Use appropriate colours

First and foremost:

Keep it simple

Why Keeping Charts Simple is Critical for Effective Data Visualization

Introduction: the Importance of Simplicity in Data representation

Data visualization is a powerful tool for communicating complex information to others. However, if done poorly, visualizations can be confusing and even misleading. One of the most critical principles of effective data visualization is to keep it simple. This means avoiding clutter and focusing on the key insights that the data can provide.

INFERIOR Example of categorical data visualisation: stacked bar plot

Consider this code and the resulting stacked-looking bar plot, in which the Profit bars are drawn on top of the Sales bars.

# Create a cluttered (in my view) bar chart
library(ggplot2)
data <- data.frame(
  region = c("North", "South", "East", "West"),
  sales = c(100, 200, 150, 175),
  profit = c(50, 100, 75, 80)
)
# Each geom_col() layer holds one bar per region, so position = "dodge" has
# nothing to dodge: the Profit bars are drawn over the Sales bars.
ggplot(data, aes(x = region)) +
    geom_col(aes(y = sales, fill = "Sales"), position = "dodge") +
    geom_col(aes(y = profit, fill = "Profit"), position = "dodge") +
    scale_fill_manual(name = "", values = c("Sales" = "#F8766D", "Profit" = "#00BA38")) +
    labs(title = "Sales and Profit by Region") +
    theme_minimal() +
    theme(legend.position = "bottom")

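Cluttered labelling and stacked structures make comparisons harder: in a stacked bar, only the bottom segment starts from a common baseline, so the remaining segments are difficult to compare across bars.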

A cleaner alternative

Removing unnecessary labels and grouping the categories visually helps the viewer focus on the message rather than on chart furniture.
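As a minimal sketch of what removing chart furniture can look like, the code below plots a single variable in one neutral colour, writes the values directly on the bars, and drops the grid lines and the now redundant y-axis text. It reuses the illustrative sales figures from the example above; the names sales_data and p_clean are my own and are only assumptions for this illustration.

```r
library(ggplot2)

# Illustrative regional sales figures, copied from the example above
sales_data <- data.frame(
    region = c("North", "South", "East", "West"),
    sales = c(100, 200, 150, 175)
)

# One variable, one colour, bars sorted from largest to smallest
p_clean <- ggplot(sales_data, aes(x = reorder(region, -sales), y = sales)) +
    geom_col(fill = "grey40") +
    geom_text(aes(label = sales), vjust = -0.5) +  # value labels above the bars
    labs(title = "Sales by Region", x = NULL, y = NULL) +
    theme_minimal() +
    theme(
        panel.grid = element_blank(),   # drop all grid lines
        axis.text.y = element_blank()   # values are already printed on the bars
    )
p_clean
```

Whether direct labels or an axis works better depends on how many bars there are; with only four values, labels are arguably the lighter option.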

Another good way to visualise the means of categorical data – grouped bar plots

There are several reasons to choose grouped bar plots. Firstly, grouped bar charts allow viewers to easily compare the values of different variables within each group, as each variable is represented by a separate bar. Secondly, they are often more visually appealing and less cluttered than stacked bar charts.

library(ggplot2)
library(reshape2)
data <- data.frame(
    region = c("North", "South", "East", "West"),
    Sales = c(100, 200, 150, 175),
    Profit = c(50, 100, 75, 80)
)
# Reshape to long format: one row per region and variable (Sales or Profit)
melted_data <- melt(data, id.vars = "region")
# Dodged bars place Sales and Profit side by side within each region
ggplot(melted_data, aes(x = region, y = value, fill = variable)) +
    geom_col(position = "dodge") +
    scale_fill_manual(values = c("Sales" = "#619CFF", "Profit" = "#00BA38")) +
    labs(title = "Sales and Profit by Region", x = "Region", y = "USD") +
    theme_minimal() +
    theme(legend.position = "bottom") +
    guides(fill = guide_legend(title = NULL))

Stacked bar charts can become crowded and difficult to read when there are too many variables, while grouped bar charts provide a clearer and more organized way to display the data.

Faceted plots are often a better choice than stacked bar charts because each facet shows one variable in its own panel, so every bar starts from the same baseline.

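library(ggplot2)
# Reshape `data` (defined in the grouped bar plot example) to long format.
# "group" is either Sales or Profit
data_melt <- tidyr::gather(data, key = "group", value = "value", -region)

# One panel per group; within a panel there is a single bar per region
ggplot(data_melt, aes(x = region, y = value, fill = group)) +
    geom_col() +
    facet_wrap(~ group, nrow = 1)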

Tailoring Visualizations to underlying data: do not overcomplicate

Using overly complex visualization types for simple data is a common mistake that can hinder the audience’s ability to accurately interpret the information.
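To illustrate matching the chart type to the data, here is a hedged sketch: four profit values need nothing more elaborate than a sorted dot plot. The figures are the illustrative ones used earlier in this post; the names profit_data and p_dot are assumptions introduced for this example.

```r
library(ggplot2)

# Illustrative profit figures, copied from the earlier examples
profit_data <- data.frame(
    region = c("North", "South", "East", "West"),
    profit = c(50, 100, 75, 80)
)

# A sorted dot plot: one point per region, ordered by value
p_dot <- ggplot(profit_data, aes(x = profit, y = reorder(region, profit))) +
    geom_point(size = 3) +
    expand_limits(x = 0) +  # keep zero on the axis so distances stay honest
    labs(title = "Profit by Region", x = "USD", y = NULL) +
    theme_minimal()
p_dot
```

Sorting the regions by value does most of the work here: the ranking is readable at a glance without any extra encoding.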

Accessibility should be a key consideration when designing data visualizations to ensure that the information can be effectively communicated to all members of the intended audience.
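One concrete accessibility step is to pick colours that remain distinguishable for viewers with colour vision deficiency. The sketch below assumes R 4.0 or later, where the Okabe-Ito palette is available through palette.colors() in base R, and applies two of its colours to the grouped bar chart. The names long_data, fill_colours and p_access are my own, and the larger base font size is a suggestion rather than a rule.

```r
library(ggplot2)

# Illustrative figures from the earlier examples, already in long format
long_data <- data.frame(
    region = rep(c("North", "South", "East", "West"), times = 2),
    variable = rep(c("Sales", "Profit"), each = 4),
    value = c(100, 200, 150, 175, 50, 100, 75, 80)
)

# Okabe-Ito colours ship with base R (>= 4.0); skip black and use the next two
okabe_ito <- palette.colors(3, palette = "Okabe-Ito")[2:3]
fill_colours <- setNames(unname(okabe_ito), c("Sales", "Profit"))

p_access <- ggplot(long_data, aes(x = region, y = value, fill = variable)) +
    geom_col(position = "dodge") +
    scale_fill_manual(values = fill_colours) +
    labs(title = "Sales and Profit by Region", x = "Region", y = "USD", fill = NULL) +
    theme_minimal(base_size = 14)  # slightly larger text for readability
p_access
```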

Using Software Tools for interactivity

Using software tools that are specifically designed for data visualization, such as plotly, can help to ensure that the resulting visualization is both effective and aesthetically pleasing.

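library(plotly)
# temp_data is not defined in this post. It is assumed to be a data frame with
# a `city` column followed by numeric temperature columns in positions 2 to 7.
plot_ly(z = as.matrix(temp_data[, 2:7]),
        x = colnames(temp_data[, 2:7]),
        y = temp_data$city,
        type = "surface",
        colors = "RdYlBu")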
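In keeping with the simplicity principle, interactivity does not require a 3D chart. The following sketch, which assumes the plotly package is installed, builds a plain interactive bar chart in which hovering over a bar shows its value. The data are the illustrative sales figures from earlier in this post, and the names sales_data and fig are assumptions for this example.

```r
library(plotly)

# Illustrative regional sales figures, copied from the earlier examples
sales_data <- data.frame(
    region = c("North", "South", "East", "West"),
    sales = c(100, 200, 150, 175)
)

# A two-dimensional bar chart; plotly adds hover tooltips and zooming by default
fig <- plot_ly(sales_data, x = ~region, y = ~sales, type = "bar") %>%
    layout(
        title = "Sales by Region",
        xaxis = list(title = "Region"),
        yaxis = list(title = "USD")
    )
fig
```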

March 3, 2023


Maxim Bespalov

