The ggplot2 package | Creating beautiful, useful line charts
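About this series: R is a powerful language widely used for data analysis and statistical computing; its development began in the 1990s. Thanks to the efforts of enthusiasts around the world, RStudio followed: an integrated development environment built around R that offers a better user experience than R's basic text editor. Generous package contributions from a growing community of data scientists and users have made R increasingly popular worldwide, and packages such as MASS, SparkR, and ggplot2 keep making data manipulation, visualization, and computation more powerful. R is both a language and an environment for statistical analysis and graphics. It is free, open-source software that is part of the GNU project, and it combines statistical analysis and graphical display in one tool. It runs on UNIX, Windows, and macOS and ships with a convenient built-in help system. Compared with other statistical software, R's development in academia started early, and it suits researchers in fields such as biology and medicine.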
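I really enjoy doing data visualization in R, and I keep learning and practising how to create beautiful, useful graphics with it. I like to try ggplot2 and the extension packages built on it, and I read related material (books, blogs, and code) to guide myself toward graphics that convey information, deliver value, and look good. This post shows how to create beautiful, useful line charts with ggplot2 and related packages. A line chart is a common chart type that shows a trend at a glance (rising, falling, fluctuating, or flat). For example, line charts are often used for time series, showing how an indicator, or the same indicator across several groups, changes over time.
This post explains, step by step, how to build a beautiful and usable line chart with ggplot2.
First, load the required R packages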
if (!require("pacman")) install.packages("pacman")
# Call p_load through the pacman namespace so it also works right after a fresh install
pacman::p_load(ggplot2, ggthemes, dplyr, readr, showtext)
Next, prepare the data. We use an export dataset as the example
# Prepare the dataset
chilean_exports <- "year,product,export,percentage
2006,copper,4335009500,81
2006,others,1016726518,19
2007,copper,9005361914,86
2007,others,1523085299,14
2008,copper,6907056354,80
2008,others,1762684216,20
2009,copper,10529811075,81
2009,others,2464094241,19
2010,copper,14828284450,85
2010,others,2543015596,15
2011,copper,15291679086,82
2011,others,3447972354,18
2012,copper,14630686732,80
2012,others,3583968218,20
2013,copper,15244038840,79
2013,others,4051281128,21
2014,copper,14703374241,78
2014,others,4251484600,22
2015,copper,13155922363,78
2015,others,3667286912,22
"
exports_data <- read_csv(chilean_exports)
# Inspect the data structure
str(exports_data)
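Before plotting, it can help to confirm that the columns agree with each other. The sketch below is an optional check, not part of the original walkthrough: it rebuilds a three-year excerpt of the data above as a plain data frame (the names `exports_excerpt`, `yearly_total`, and `share_check` are introduced here for illustration) and recomputes each product's share of the yearly total.

```r
# Three-year excerpt of the article's data, rebuilt as a plain data frame.
exports_excerpt <- data.frame(
  year       = c(2006, 2006, 2007, 2007, 2008, 2008),
  product    = c("copper", "others", "copper", "others", "copper", "others"),
  export     = c(4335009500, 1016726518, 9005361914, 1523085299,
                 6907056354, 1762684216),
  percentage = c(81, 19, 86, 14, 80, 20)
)

# Total exports per year, aligned row by row with the data.
yearly_total <- ave(exports_excerpt$export, exports_excerpt$year, FUN = sum)

# Share of each product in its year's total, rounded to a whole percent.
exports_excerpt$share_check <- round(100 * exports_excerpt$export / yearly_total)

# TRUE when the recomputed shares match the percentage column.
all(exports_excerpt$share_check == exports_excerpt$percentage)
```

If the last line returns FALSE, the export and percentage columns disagree and the data deserve a closer look before charting.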
Third, using this data, draw and refine a line chart step by step to show how exports of each product type change over the years
3.1 Draw a basic line chart
p1 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line()
p1
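Think about how this basic chart could be modified and refined so that it becomes more attractive and useful, to the point where it is ready to publish.
3.2 Adjust the line width
The lines in the chart are too thin, so they need to be thicker.
p2 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5)
p2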
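A note on versions: from ggplot2 3.4.0 onward, the `size` argument for lines is deprecated in favour of `linewidth`, so the code above may print a deprecation warning on a recent installation. The following is a minimal sketch of one way to handle both cases; `exports_excerpt`, `width_arg`, `line_layer`, and `p2_sketch` are names made up for this example, and the data are a two-year excerpt of the dataset above.

```r
library(ggplot2)

# Two-year excerpt of the article's data.
exports_excerpt <- data.frame(
  year    = c(2006, 2006, 2007, 2007),
  product = c("copper", "others", "copper", "others"),
  export  = c(4335009500, 1016726518, 9005361914, 1523085299)
)

# Pick the argument name that the installed ggplot2 version expects.
width_arg <- if (utils::packageVersion("ggplot2") >= "3.4.0") "linewidth" else "size"

# Build geom_line(linewidth = 1.5) or geom_line(size = 1.5) accordingly.
line_layer <- do.call(geom_line, setNames(list(1.5), width_arg))

p2_sketch <- ggplot(exports_excerpt,
                    aes(x = year, y = export, colour = product)) +
  line_layer
p2_sketch
```

When only a recent ggplot2 needs to be supported, writing `geom_line(linewidth = 1.5)` directly is simpler.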
3.3 Change the legend labels and position
exports_data <- exports_data %>%
  mutate(product = factor(product, levels = c("copper", "others"),
                          labels = c("Copper ", "Pulp wood, Fruit, Salmon & Others")))
p3 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank()) # legend at the bottom, laid out horizontally, without a title
p3
The goal is for the legend to convey its message more effectively.
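The relabelling step relies on how `factor()` pairs `levels` with `labels` by position. A small base R sketch, using a made-up four-element vector `product` rather than the full dataset, shows the effect:

```r
# Raw codes as they appear in the product column.
product <- c("copper", "others", "copper", "others")

# levels lists the raw codes; labels gives the display text, matched by position.
product_f <- factor(product,
                    levels = c("copper", "others"),
                    labels = c("Copper ", "Pulp wood, Fruit, Salmon & Others"))

levels(product_f)      # the display labels that the legend will show
as.integer(product_f)  # the underlying codes keep the order given in levels
```

Because the legend follows the factor's level order, reordering `levels` (with `labels` reordered to match) is also how the legend entries can be rearranged.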
3.4 Change the x-axis ticks
The default x-axis ticks are not what we want to show; the axis should display the full sequence of years.
p4 <- p3 + scale_x_continuous(breaks = seq(2006, 2015, 1))
p4
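The breaks do not have to be hard-coded. As a sketch, they can be derived from the range of the year column, so the axis keeps one tick per year if the data grow; `years`, `year_breaks`, and `year_axis` are illustrative names, and `years` stands in for `exports_data$year`.

```r
library(ggplot2)

# Years covered by the article's dataset (stand-in for exports_data$year).
years <- 2006:2015

# One break per calendar year, computed from the data range.
year_breaks <- seq(min(years), max(years), by = 1)

# A scale object that can be added to any plot with year on the x-axis.
year_axis <- scale_x_continuous(breaks = year_breaks)
```

Adding `year_axis` to `p3` should give the same chart as `p4` above for this dataset.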
3.5 Set the axis labels and the title
Axis labels say what the x-axis and y-axis represent, and the title states the subject of the chart.
p5 <- p4 +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million")
p5
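One thing to watch: the y-axis label reads "USD million", while the export column appears to hold plain dollar amounts (an assumption based on the "($)" in the title and the size of the values). If so, the tick labels need to be divided by one million to match the axis label. A minimal sketch, where the helper `to_millions` and the excerpt data frame are introduced only for this example:

```r
library(ggplot2)

# Two-year excerpt of the article's data.
exports_excerpt <- data.frame(
  year    = c(2006, 2006, 2007, 2007),
  product = c("copper", "others", "copper", "others"),
  export  = c(4335009500, 1016726518, 9005361914, 1523085299)
)

# Convert dollars to whole millions and add thousands separators.
to_millions <- function(x) {
  format(round(x / 1e6), big.mark = ",", scientific = FALSE, trim = TRUE)
}

p5_sketch <- ggplot(exports_excerpt,
                    aes(x = year, y = export, colour = product)) +
  geom_line() +
  scale_y_continuous(labels = to_millions) +  # relabel ticks only; the data stay untouched
  labs(x = "Year", y = "USD million")
p5_sketch

to_millions(c(4335009500, 1016726518))
```

Only the tick labels change here; an alternative is to divide the column itself with `mutate(export = export / 1e6)` before plotting.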
3.6 Adjust the line colours
Choosing colours is an art; the palettes of other well-made charts are a good place to learn from.
colour <- c("#5F9EA0", "#E1B378")
p6 <- p5 + scale_color_manual(values = colour)
p6
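An unnamed colour vector is matched to the factor levels by position. As a sketch of a slightly safer variant, a named vector ties each colour to a specific level, so the mapping survives a reordering of the levels. The names must match the labels set in step 3.3 exactly, including the trailing space in "Copper "; `colour_by_product` and `colour_scale` are names introduced here.

```r
library(ggplot2)

# Each colour is bound to a factor level by name rather than by position.
colour_by_product <- c(
  "Copper "                           = "#5F9EA0",
  "Pulp wood, Fruit, Salmon & Others" = "#E1B378"
)

# A manual colour scale that can be added to p5 in place of the unnamed version.
colour_scale <- scale_colour_manual(values = colour_by_product)
```

Here `p5 + colour_scale` should reproduce `p6`; `scale_colour_manual()` and `scale_color_manual()` are the same function under two spellings.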
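3.7 Set up the fonts
# font_add() registers a font file under a family name; the .ttf files must be available on this machine
font_add("Tahoma", "Tahoma.ttf")
font_add("Roboto Condensed", "RobotoCondensed-Regular.ttf")
# Let showtext render text in all subsequent plots
showtext_auto()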
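`font_add()` stops with an error when it cannot locate the font file, which is likely on a machine where Tahoma or Roboto Condensed is not installed. The sketch below wraps the call so that a missing file falls back to a generic family instead of aborting the script; `register_font` is a hypothetical helper written for this example, not a showtext or sysfonts function.

```r
# Try to register a font file; return the family name to use in theme().
register_font <- function(family, file, fallback = "sans") {
  tryCatch({
    sysfonts::font_add(family, regular = file)
    family
  }, error = function(e) {
    # Registration failed (typically because the file was not found).
    message("Could not register '", family, "': ", conditionMessage(e),
            " Using '", fallback, "' instead.")
    fallback
  })
}

title_family <- register_font("Roboto Condensed", "RobotoCondensed-Regular.ttf")
title_family
```

The returned value can then be passed to `element_text(family = title_family)` in the theme calls that follow.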
3.8 Apply a theme
1) Black-and-white theme
p7_1 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  scale_colour_manual(values = colour) +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank())
p7_1
2) The Economist magazine theme
p7_2 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  theme_economist() + scale_colour_economist() +
  theme(axis.line.x = element_line(size = .5, colour = "black"),
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(family = "Roboto Condensed"),
        text = element_text(family = "Roboto Condensed"))
p7_2
3) The FiveThirtyEight website theme
p7_3 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  theme_fivethirtyeight() + scale_colour_fivethirtyeight() +
  theme(axis.title = element_text(family = "Roboto Condensed"),
        legend.position = "bottom", legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(family = "Roboto Condensed"),
        legend.text = element_text(family = "Roboto Condensed"),
        text = element_text(family = "Roboto Condensed"))
p7_3
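The three themed charts repeat the same data, geometry, scale, and label code. One possible way to cut the repetition is to build the shared part once and add a theme on top. This sketch uses themes that ship with ggplot2 so that it runs without ggthemes; `base_plot`, `legend_bottom`, and `variants` are names introduced here, and the data are a two-year excerpt.

```r
library(ggplot2)

# Two-year excerpt of the article's data.
exports_excerpt <- data.frame(
  year    = c(2006, 2006, 2007, 2007),
  product = c("copper", "others", "copper", "others"),
  export  = c(4335009500, 1016726518, 9005361914, 1523085299)
)

# Everything the themed variants have in common.
base_plot <- ggplot(exports_excerpt,
                    aes(x = year, y = export, colour = product)) +
  geom_line() +
  scale_x_continuous(breaks = 2006:2007) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity",
       x = "Year", y = "USD million")

# Legend tweaks shared by all variants.
legend_bottom <- theme(legend.position = "bottom",
                       legend.direction = "horizontal",
                       legend.title = element_blank())

# Add the complete theme first, then the tweaks, so the tweaks are not overwritten.
variants <- list(
  bw      = base_plot + theme_bw() + legend_bottom,
  minimal = base_plot + theme_minimal() + legend_bottom
)
variants$bw
```

Themes from ggthemes, such as `theme_economist()`, can be slotted in the same way, together with their matching colour scales.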
4) A custom theme
The code for the custom theme is the last block of the complete script.
Complete code:
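#################################
# Beautiful and useful line charts with the ggplot2 package
# RUser
# 2021-03-06
#################################
# Load the required R packages
if (!require("pacman")) install.packages("pacman")
# Call p_load through the pacman namespace so it also works right after a fresh install
pacman::p_load(ggplot2, ggthemes, dplyr, readr, showtext)
# Prepare the dataset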
chilean_exports <- "year,product,export,percentage
2006,copper,4335009500,81
2006,others,1016726518,19
2007,copper,9005361914,86
2007,others,1523085299,14
2008,copper,6907056354,80
2008,others,1762684216,20
2009,copper,10529811075,81
2009,others,2464094241,19
2010,copper,14828284450,85
2010,others,2543015596,15
2011,copper,15291679086,82
2011,others,3447972354,18
2012,copper,14630686732,80
2012,others,3583968218,20
2013,copper,15244038840,79
2013,others,4051281128,21
2014,copper,14703374241,78
2014,others,4251484600,22
2015,copper,13155922363,78
2015,others,3667286912,22
"
exports_data <- read_csv(chilean_exports)
# Inspect the data structure
str(exports_data)
# Draw a basic line chart
p1 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line()
p1
# Adjust the line width
p2 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5)
p2
# Change the legend labels and position
exports_data <- exports_data %>%
  mutate(product = factor(product, levels = c("copper", "others"),
                          labels = c("Copper ", "Pulp wood, Fruit, Salmon & Others")))
p3 <- ggplot(data = exports_data,
             aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank()) # legend at the bottom, laid out horizontally, without a title
p3
# Change the x-axis ticks
# using scale_x_continuous()
p4 <- p3 + scale_x_continuous(breaks = seq(2006, 2015, 1))
p4
# Set the axis labels and the chart title
p5 <- p4 +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million")
p5
# Adjust the colours
colour <- c("#5F9EA0", "#E1B378")
p6 <- p5 + scale_color_manual(values = colour)
p6
# Set up the fonts (the .ttf files must be available on this machine)
font_add("Tahoma", "Tahoma.ttf")
font_add("Roboto Condensed", "RobotoCondensed-Regular.ttf")
showtext_auto()
# Apply a theme
# 1) Black-and-white theme
p7_1 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  scale_colour_manual(values = colour) +
  theme_bw() +
  theme(legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank())
p7_1
# 2) The Economist magazine theme
p7_2 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  theme_economist() + scale_colour_economist() +
  theme(axis.line.x = element_line(size = .5, colour = "black"),
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(family = "Roboto Condensed"),
        text = element_text(family = "Roboto Condensed"))
p7_2
# 3) The FiveThirtyEight website theme
p7_3 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  theme_fivethirtyeight() + scale_colour_fivethirtyeight() +
  theme(axis.title = element_text(family = "Roboto Condensed"),
        legend.position = "bottom", legend.direction = "horizontal",
        legend.title = element_blank(),
        plot.title = element_text(family = "Roboto Condensed"),
        legend.text = element_text(family = "Roboto Condensed"),
        text = element_text(family = "Roboto Condensed"))
p7_3
# 4) Build a custom theme
colour <- c("#40b8d0", "#b2d183")
p7_4 <- ggplot(data = exports_data,
               aes(x = year, y = export, colour = product)) +
  geom_line(size = 1.5) +
  scale_x_continuous(breaks = seq(2006, 2015, 1)) +
  labs(title = "Composition of Exports to China ($)",
       subtitle = "Source: The Observatory of Economic Complexity") +
  labs(x = "Year", y = "USD million") +
  scale_colour_manual(values = colour) +
  theme(panel.border = element_rect(colour = "black", fill = NA, size = .5),
        axis.text.x = element_text(colour = "black", size = 10),
        axis.text.y = element_text(colour = "black", size = 10),
        legend.key = element_rect(fill = "white", colour = "white"),
        legend.position = "bottom", legend.direction = "horizontal",
        legend.title = element_blank(),
        panel.grid.major = element_line(colour = "#d3d3d3"),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        plot.title = element_text(size = 14, family = "Tahoma", face = "bold"),
        text = element_text(family = "Tahoma"))
p7_4
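The custom theme settings can also be packaged as a function so that other charts can reuse them. This is a sketch under a few assumptions: `theme_exports` is a hypothetical name, the font family is passed in as the `base_family` argument (defaulting to the generic "sans") instead of being hard-coded to Tahoma, and the border width is left at its default to stay independent of the `size`/`linewidth` rename in recent ggplot2 versions.

```r
library(ggplot2)

# Reusable version of the custom theme; base_family is the font family to use.
theme_exports <- function(base_family = "sans") {
  theme(
    panel.border     = element_rect(colour = "black", fill = NA),
    axis.text.x      = element_text(colour = "black", size = 10),
    axis.text.y      = element_text(colour = "black", size = 10),
    legend.key       = element_rect(fill = "white", colour = "white"),
    legend.position  = "bottom",
    legend.direction = "horizontal",
    legend.title     = element_blank(),
    panel.grid.major = element_line(colour = "#d3d3d3"),
    panel.grid.minor = element_blank(),
    panel.background = element_blank(),
    plot.title       = element_text(size = 14, family = base_family, face = "bold"),
    text             = element_text(family = base_family)
  )
}

# Two-year excerpt of the article's data, used only to demonstrate the theme.
exports_excerpt <- data.frame(
  year    = c(2006, 2006, 2007, 2007),
  product = c("copper", "others", "copper", "others"),
  export  = c(4335009500, 1016726518, 9005361914, 1523085299)
)

p_custom <- ggplot(exports_excerpt,
                   aes(x = year, y = export, colour = product)) +
  geom_line() +
  scale_colour_manual(values = c("#40b8d0", "#b2d183")) +
  theme_exports()
p_custom
```

Once Tahoma has been registered with `font_add()`, calling `theme_exports("Tahoma")` should match the look of `p7_4`.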