Data Visualization in R

Last Updated : 21 Mar, 2024

Data visualization is the technique of conveying insights from data through visual cues such as graphs, charts, and maps. It makes large quantities of data easier to understand intuitively and so supports better decisions about that data.

Data Visualization in R Programming Language

Popular data visualization tools include Tableau, Plotly, R, Google Charts, Infogram, and Kibana. These platforms differ in capabilities, functionality, and use cases, and each requires a different skill set. This article discusses the use of R for data visualization.

R is a language designed for statistical computing, graphical data analysis, and scientific research. It is often preferred for data visualization because its packages offer flexibility while requiring little code.

Consider the following airquality data set for visualization in R:

Ozone  Solar.R  Wind  Temp  Month  Day
   41      190   7.4    67      5    1
   36      118   8.0    72      5    2
   12      149  12.6    74      5    3
   18      313  11.5    62      5    4
   NA       NA  14.3    56      5    5
   28       NA  14.9    66      5    6
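
Before plotting, it can help to confirm what the data set contains. The sketch below loads airquality (it ships with R in the datasets package), previews the first rows, and counts missing values per column. The variable name na_counts is only illustrative.

```r
# Load the built-in data set and preview it
data(airquality)
head(airquality)   # first six rows
str(airquality)    # column types and dimensions

# Count missing values (NA) in each column
na_counts <- colSums(is.na(airquality))
print(na_counts)
```

Columns with many NA values, such as Ozone, will show gaps or dropped points in the plots that follow.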

Types of Data Visualizations

Some of the various types of visualizations offered by R are:

Bar Plot

There are two types of bar plots, horizontal and vertical, which represent data points as horizontal or vertical bars whose lengths are proportional to the value of the data item. They are generally used to plot a numeric value for each category or observation. Setting the horiz parameter of barplot() to TRUE gives a horizontal bar plot; setting it to FALSE (the default) gives a vertical one.

Example 1: 

# Horizontal Bar Plot for
# Ozone concentration in air
barplot(airquality$Ozone,
        main = 'Ozone Concentration in air',
        xlab = 'ozone levels', horiz = TRUE)


Output:

Example 2: 

# Vertical Bar Plot for
# Ozone concentration in air
# (bars are vertical, so the ozone levels are labelled on the y-axis)
barplot(airquality$Ozone, main = 'Ozone Concentration in air',
        ylab = 'ozone levels', col = 'blue', horiz = FALSE)


Output:

Bar plots are used for the following scenarios:

  • To perform a comparative study between the various data categories in the data set.
  • To analyze the change of a variable over time in months or years.
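
Both scenarios above can be sketched with one plot: a comparison of categories that are also time periods. The example below is an illustration, not part of the original examples. It assumes the mean is a reasonable summary and uses the illustrative name monthly_ozone.

```r
data(airquality)

# Mean ozone per month, ignoring missing readings
monthly_ozone <- tapply(airquality$Ozone, airquality$Month,
                        mean, na.rm = TRUE)

# Replace month numbers (5 to 9) with abbreviated month names
names(monthly_ozone) <- month.abb[as.integer(names(monthly_ozone))]

# One bar per month
barplot(monthly_ozone,
        main = "Mean ozone concentration by month",
        xlab = "Month", ylab = "Mean ozone (ppb)",
        col = "steelblue")
```

Aggregating first gives one bar per category, which is usually easier to compare than one bar per observation.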

Histogram

A histogram resembles a bar chart in that it uses bars of varying height to represent the distribution of data. In a histogram, however, continuous values are grouped into consecutive intervals called bins, and the height of each bar shows how many values fall in that bin. The bin width can be varied.

Example: 

# Histogram for Maximum Daily Temperature
data(airquality)

# "\n" starts a new line in the title
hist(airquality$Temp,
     main = "La Guardia Airport's\nMaximum Temperature (Daily)",
     xlab = "Temperature (Fahrenheit)",
     xlim = c(50, 125), col = "yellow",
     freq = TRUE)


Output:

For a histogram, the parameter xlim specifies the range of the x-axis within which the values are displayed.
The parameter freq controls the y-axis: when set to TRUE, the bars show frequencies (counts) of values in each bin; when set to FALSE, the y-axis shows probability densities, scaled so that the total area of the histogram adds up to one.
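
The effect of freq = FALSE can be checked numerically. The sketch below relies on the fact that hist() returns an object with breaks and density components. The names h and total_area are illustrative.

```r
data(airquality)

# freq = FALSE puts probability densities on the y-axis
h <- hist(airquality$Temp, freq = FALSE,
          main = "Daily maximum temperature (density scale)",
          xlab = "Temperature (Fahrenheit)", col = "yellow")

# Area of each bar = density * bin width; the areas sum to one
total_area <- sum(h$density * diff(h$breaks))
print(total_area)
```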

Histograms are used in the following scenarios: 

  • To verify an equal and symmetric distribution of the data.
  • To identify deviations from expected values.

Box Plot

A box plot presents a statistical summary of the data graphically. It depicts the minimum and maximum data points, the median, the first and third quartiles, and the interquartile range.

Example: 

# Box plot for average wind speed
data(airquality)

# "\n" starts a new line in the title
boxplot(airquality$Wind,
        main = "Average wind speed\nat La Guardia Airport",
        xlab = "Miles per hour", ylab = "Wind",
        col = "orange", border = "brown",
        horizontal = TRUE, notch = TRUE)


Output:

Multiple box plots can also be generated at once through the following code:

Example: 

# Multiple Box plots, each representing
# an Air Quality Parameter
# (R indexes from 1, so columns 1:4 are Ozone, Solar.R, Wind, Temp)
boxplot(airquality[, 1:4],
        main = 'Box Plots for Air Quality Parameters')


Output:

Box plots are used in the following scenarios:

  • To give a comprehensive statistical description of the data through a visual cue.
  • To identify outliers, that is, points that lie far outside the interquartile range (by default, more than 1.5 times the IQR beyond the box).
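
The numbers behind a box plot can also be inspected directly. This sketch uses boxplot.stats() from base R on the Wind column. The name wind_stats is illustrative.

```r
data(airquality)

# boxplot.stats() returns the values that boxplot() draws
wind_stats <- boxplot.stats(airquality$Wind)

# Five values: lower whisker, lower hinge, median,
# upper hinge, upper whisker
print(wind_stats$stats)

# Points beyond the whiskers, drawn as individual outliers
print(wind_stats$out)
```

The hinges are close to the first and third quartiles, so the box spans roughly the interquartile range.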

Scatter Plot

A scatter plot is composed of many points on a Cartesian plane. Each point denotes the value taken by two parameters and helps us easily identify the relationship between them.

Example: 

# Scatter plot for Ozone Concentration per month
data(airquality)

plot(airquality$Ozone, airquality$Month,
     main = "Scatterplot Example",
     xlab = "Ozone Concentration in parts per billion",
     ylab = "Month of observation", pch = 19)


Output:

Scatter Plots are used in the following scenarios: 

  • To show whether an association exists between bivariate data.
  • To measure the strength and direction of such a relationship.
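
Strength and direction can be quantified alongside the plot. The sketch below pairs Temp with Ozone, a different pair of columns from the example above, chosen here only for illustration. It computes the Pearson correlation with cor() and overlays a least-squares line.

```r
data(airquality)

# Pearson correlation, using only rows where both values are present
r <- cor(airquality$Temp, airquality$Ozone, use = "complete.obs")
print(r)

plot(airquality$Temp, airquality$Ozone,
     main = "Ozone vs. temperature",
     xlab = "Temperature (Fahrenheit)",
     ylab = "Ozone (ppb)", pch = 19)

# Add a fitted line to show the direction of the trend
abline(lm(Ozone ~ Temp, data = airquality), col = "red")
```

A value of r near 1 or -1 indicates a strong linear relationship, and its sign gives the direction.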

Heat Map

A heat map is a graphical representation of a matrix in which colors encode the cell values. The heatmap() function is used to plot a heat map.

Syntax: heatmap(data)

Parameters: data: a numeric matrix whose rows and columns hold the values to plot

Return: This function draws a heat map and invisibly returns a list that includes the row and column orderings it used (rowInd and colInd).

# Set seed for reproducibility
set.seed(110)

# Create example data: 25 random values for a 5 x 5 matrix
data <- matrix(rnorm(25, 0, 5), nrow = 5, ncol = 5)

# Row and column names
colnames(data) <- paste0("col", 1:5)
rownames(data) <- paste0("row", 1:5)

# Draw a heatmap
heatmap(data)


Output:
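
As a variation on the example above, a heat map can summarize real data. This sketch, added for illustration, plots the correlation matrix of the four measurement columns of airquality. Clustering and scaling are switched off through the Rowv, Colv, and scale arguments so that the colors map directly to the correlations. The name cor_mat is illustrative.

```r
data(airquality)

# Correlations between Ozone, Solar.R, Wind and Temp,
# using all available pairs of observations
cor_mat <- cor(airquality[, 1:4], use = "pairwise.complete.obs")
print(round(cor_mat, 2))

# Rowv = NA and Colv = NA suppress reordering and dendrograms;
# scale = "none" keeps the raw correlation values
heatmap(cor_mat, Rowv = NA, Colv = NA, scale = "none",
        symm = TRUE, main = "Correlation heat map")
```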

Map visualization in R

Here we use the maps package to display geographical maps in R. Install it first with:

install.packages("maps")

# Read the dataset; read.csv() already returns a data frame
# (the file is assumed to have 'lat' and 'lng' columns)
df <- read.csv("worldcities.csv")

# Load the required library and draw the world map
library(maps)
map(database = "world")

# Mark points on the map: x is longitude, y is latitude
points(x = df$lng[1:500], y = df$lat[1:500], col = "red")


Output:
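
If the CSV file is not at hand, a similar plot can be sketched with the world.cities data set that ships with the maps package. This is an alternative data source, not the one used in the example above. Its coordinate columns are named lat and long, and capital == 1 marks national capitals. The code is wrapped in a check so that it is skipped when maps is not installed.

```r
if (requireNamespace("maps", quietly = TRUE)) {
  # Load the city data bundled with the maps package
  data("world.cities", package = "maps")

  # Keep national capitals only
  capitals <- subset(world.cities, capital == 1)

  # Draw the base map, then add the cities
  maps::map(database = "world")

  # x is longitude, y is latitude
  points(x = capitals$long, y = capitals$lat,
         col = "red", pch = 20, cex = 0.6)
}
```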

3D Graphs in R 

Here we use the persp() function, which draws a perspective plot of a surface over the x-y plane, giving a 3D view of the surface.

Syntax: persp(x, y, z)

Parameters: x and y are vectors giving the locations of the grid lines along the x- and y-axes, and z is a matrix giving the height of the surface at each (x, y) grid point.

Return Value: persp() returns the viewing transformation matrix for projecting 3D coordinates (x, y, z) into the 2D plane using homogeneous 4D coordinates (x, y, z, t).

# Define the surface: height of a cone at the point (x, y)
cone <- function(x, y) {
  sqrt(x ^ 2 + y ^ 2)
}

# Prepare the grid and evaluate the surface on it
x <- y <- seq(-1, 1, length.out = 30)
z <- outer(x, y, cone)

# Plot the 3D surface,
# adding a title and labeling the z-axis
persp(x, y, z,
      main = "Perspective Plot of a Cone",
      zlab = "Height",
      theta = 30, phi = 15,
      col = "orange", shade = 0.4)


Output:
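
The viewing transformation matrix returned by persp() can be used to add further elements to the plot. The sketch below is an illustrative extension of the cone example. It projects the cone's apex at (0, 0, 0) into 2D with trans3d() from grDevices and marks it. The names pmat and apex are illustrative.

```r
# Same cone surface as in the example above
cone <- function(x, y) sqrt(x ^ 2 + y ^ 2)
x <- y <- seq(-1, 1, length.out = 30)
z <- outer(x, y, cone)

# Keep the 4 x 4 viewing transformation matrix
pmat <- persp(x, y, z, theta = 30, phi = 15,
              col = "orange", shade = 0.4,
              main = "Cone with apex marked")

# Project the 3D point (0, 0, cone(0, 0)) onto the 2D plot
apex <- trans3d(0, 0, cone(0, 0), pmat)
points(apex, pch = 19, col = "blue")
```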

Advantages of Data Visualization in R: 

R has the following advantages over other tools for data visualization: 

  • R offers a broad collection of visualization libraries along with extensive online guidance on their usage.
  • R also offers data visualization in the form of 3D models and multipanel charts.
  • Through R, we can easily customize our data visualization by changing axes, fonts, legends, annotations, and labels.
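
The multipanel and customization points above can be sketched with base graphics alone. The example assumes a two-panel layout and an overall-median reference line chosen purely for illustration. The name old_par is illustrative.

```r
data(airquality)

# Two panels side by side; keep the old settings to restore later
old_par <- par(mfrow = c(1, 2))

# Panel 1: histogram with a custom title, label and color
hist(airquality$Temp, main = "Temperature",
     xlab = "Fahrenheit", col = "yellow")

# Panel 2: one box per month, via the formula interface
boxplot(Ozone ~ Month, data = airquality,
        main = "Ozone by month",
        xlab = "Month", ylab = "Ozone (ppb)", col = "lightblue")

# Annotation: dashed line at the overall median, with a legend
abline(h = median(airquality$Ozone, na.rm = TRUE), lty = 2, col = "red")
legend("topright", legend = "Overall median",
       lty = 2, col = "red", bty = "n")

# Restore the previous layout
par(old_par)
```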

Disadvantages of Data Visualization in R:

R also has the following disadvantages: 

  • R is generally preferred for data visualization only when the work is done on an individual standalone machine.
  • Data visualization in R can be slow for large amounts of data compared with some other tools.

Application Areas: 

  • Presenting analytical conclusions of the data to the non-analyst departments of your company.
  • Health monitoring devices use data visualization to track anomalies in blood pressure, cholesterol, and other indicators.
  • To discover repeating patterns and trends in consumer and marketing data.
  • Meteorologists use data visualization for assessing prevalent weather changes throughout the world.
  • Real-time maps and geo-positioning systems use visualization for traffic monitoring and estimating travel time.

