R Visual. - Bubbles and Bar Charts on China Map

This post continues to explore what we can do on a map with ggplot2 and ggvis.

About visualization on a map:

The data to be used:

Preparation

Load packages and set the language environment:

library(maptools)
library(dplyr)
library(ggplot2)
library(ggvis)
Sys.setlocale("LC_ALL", "chinese")

## [1] "LC_COLLATE=Chinese (Simplified)_People's Republic of China.936;LC_CTYPE=Chinese (Simplified)_People's Republic of China.936;LC_MONETARY=Chinese (Simplified)_People's Republic of China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic of China.936"

Load and clean the raw data to get two data frames, cnmapdf and cap_coord, as we did in the previous visualization article.

As we are going to draw bubbles and bar charts for the provinces, we need to add a bit more information to cap_coord.

# Province-level car ownership by year
car_prov <- read.csv("car_ownership.csv", stringsAsFactors = FALSE)
# One row per provincial capital, with the 2013 figure (for the bubbles)
cap_bubble <- cap_coord %>%
  plyr::join(subset(car_prov, year == 2013), by = "prov_en") %>%
  na.omit()
# Four cities, with one column per year: Y2003, Y2008, Y2013 (for the bars)
cap_bar <- cap_coord %>%
  plyr::join(car_prov, by = "prov_en") %>%
  na.omit() %>%
  filter(city_en %in% c("Beijing", "Shanghai", "Guangzhou", "Chengdu")) %>%
  mutate(year = paste("Y", year, sep = "")) %>%
  tidyr::spread(year, no_car_k)

Bubbles on A Map

Bubbles with A Single Color - ggplot2

A bubble is simply a point whose size is controlled by the car ownership figure.

ggplot() +
  geom_polygon(data = cnmapdf, aes(long, lat, group = group),
               fill = "skyblue", colour = "grey") +
  geom_point(data = cap_bubble, aes(cap_long, cap_lat, size = no_car_k),
             shape = 21, fill = "red", alpha = .5) +
  scale_size_area(max_size = 20)

vis3-plot1

Bubbles with Effect of Heat Map - ggplot2

Calculate the growth rate of private car ownership from 2008 to 2013 and map it to the fill aesthetic of geom_point().

# Growth from 2008 to 2013. This assumes column 3 is no_car_k and that both
# subsets list the provinces in the same order as cap_bubble.
cap_bubble["growth"] <- subset(car_prov, year == 2013)[, 3, drop = FALSE] /
  subset(car_prov, year == 2008)[, 3, drop = FALSE] - 1
ggplot() +
  geom_polygon(data = cnmapdf, aes(long, lat, group = group),
               fill = "skyblue", colour = "grey") +
  # alpha is a constant, so it is set outside aes()
  geom_point(data = cap_bubble, shape = 21, alpha = .5,
             aes(cap_long, cap_lat, size = no_car_k, fill = growth)) +
  scale_fill_gradient(low = "red", high = "yellow") +
  scale_size_area(max_size = 20)

vis3-plot2
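
The growth calculation above divides two subsets position by position, so it depends on both years listing the provinces in the same order. A minimal sketch of a more defensive variant matches rows by province name instead. The data frame here is made-up toy data, and the province names "A" and "B" are hypothetical; only the column names prov_en, year and no_car_k come from the article.

```r
# Toy stand-in for car_ownership.csv (all values are made up)
car_prov <- data.frame(
  prov_en  = c("A", "B", "B", "A"),      # note the different order per year
  year     = c(2008, 2008, 2013, 2013),
  no_car_k = c(100, 200, 500, 150)
)

y2008 <- subset(car_prov, year == 2008, select = c(prov_en, no_car_k))
y2013 <- subset(car_prov, year == 2013, select = c(prov_en, no_car_k))

# Match rows by province name rather than by position
growth <- merge(y2008, y2013, by = "prov_en",
                suffixes = c("_2008", "_2013"))
growth$growth <- growth$no_car_k_2013 / growth$no_car_k_2008 - 1
growth
```

The resulting growth column could then be joined to cap_bubble by prov_en, in the same way car_prov was joined to cap_coord.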

The bubbles with transparency cannot display the difference clearly when the growth rates are similar across provinces. Let’s try smaller bubbles without specifying the transparency. We can also remove the legend to keep the map simple and clear.

ggplot() +
  geom_polygon(data = cnmapdf, aes(long, lat, group = group),
               fill = "skyblue", colour = "grey") +
  geom_point(data = cap_bubble, shape = 21, colour = "white",
             aes(cap_long, cap_lat, size = no_car_k, fill = growth)) +
  scale_fill_gradient(low = "red", high = "yellow") +
  scale_size_area(max_size = 15) +
  theme(legend.position = "none")

vis3-plot3

It seems that from 2008 to 2013, Hainan and Gansu were the two provinces with the fastest growth.

Bubbles by ggvis

cnmapdf %>%
  group_by(group) %>%
  ggvis(x = ~long, y = ~lat) %>%
  layer_paths(fill := "skyblue", stroke := "grey") %>%
  layer_points(data = cap_bubble, x = ~cap_long, y = ~cap_lat,
               fill = ~growth, size = ~no_car_k, stroke := "white") %>%
  scale_numeric("fill", range = c("red", "yellow")) %>%
  scale_numeric("size", range = c(50, 500)) %>%
  hide_legend(scales = c("fill", "size")) %>%
  add_axis("x", title = "Longitude") %>%
  add_axis("y", title = "Latitude")

vis3-plot4

To remove the legend, ggvis provides hide_legend(). Within this function, you can specify the names of the scales whose legends should be hidden.

Bar Charts on A Map

Bar Charts by ggplot2

Placing a bar chart is more complicated than plotting a bubble at a given spot.
To locate a bubble, you only need to give aes() its coordinates, which are independent of the bubble's size and color.
For a bar chart, we need at least two pairs of x and y values: one for the location of the chart and the other for the height and width of each bar. In other words, we need the coordinates of the four corners of every bar.

Since a bar chart takes up more space and looks more complicated than a bubble, we plot only 4 cities.

p1 <- ggplot() +
  geom_polygon(data = cnmapdf, aes(long, lat, group = group),
               fill = "beige", colour = "grey") +
  geom_errorbar(data = cap_bar, size = 3, colour = "brown",
                alpha = .8, width = 0,
                aes(x = cap_long - 1, ymin = cap_lat,
                    ymax = cap_lat + Y2003 / Y2013 * 3)) +
  geom_errorbar(data = cap_bar, size = 3, colour = "red",
                alpha = .8, width = 0,
                aes(x = cap_long, ymin = cap_lat,
                    ymax = cap_lat + Y2008 / Y2013 * 3)) +
  geom_errorbar(data = cap_bar, size = 3, colour = "orange",
                alpha = .9, width = 0,
                aes(x = cap_long + 1, ymin = cap_lat,
                    ymax = cap_lat + Y2013 / Y2013 * 3)) +
  geom_text(aes(85, 24), colour = "brown",
            label = "Car Ownership - 2003") +
  geom_text(aes(85, 22), colour = "red",
            label = "Car Ownership - 2008") +
  geom_text(aes(85, 20), colour = "orange",
            label = "Car Ownership - 2013")
p1

vis3-plot5
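
To make the bar geometry concrete, here is a small sketch that computes the four corners of each bar for a single city in plain R. The city and its figures are made up, and bar_height, bar_width, offsets and bars are illustrative names that do not appear in the original code. Each bar is scaled relative to the 2013 value, as in the plot above.

```r
# Made-up figures for one hypothetical city
city <- data.frame(cap_long = 116, cap_lat = 40,
                   Y2003 = 50, Y2008 = 100, Y2013 = 200)

bar_height <- 3   # height of the tallest (2013) bar, in degrees of latitude
bar_width  <- 1   # width of each bar, in degrees of longitude

years   <- c("Y2003", "Y2008", "Y2013")
offsets <- c(-1, 0, 1) * bar_width       # left, centre and right bar

# One row per bar: left/right edges, shared baseline, scaled top edge
bars <- data.frame(
  year = years,
  xmin = city$cap_long + offsets - bar_width / 2,
  xmax = city$cap_long + offsets + bar_width / 2,
  ymin = city$cap_lat,
  ymax = city$cap_lat + unname(unlist(city[years])) / city$Y2013 * bar_height
)
bars
```

The bars all start at the city's latitude and sit side by side around its longitude, so only ymax changes with the data.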

We could add more details, e.g. the average yearly growth rate from 2003 to 2013.

# Find r such that x * r^10 = y by bisection (assumes y >= x > 0)
root10find <- function(x, y) {
  low <- 0
  high <- y / x
  ans <- (low + high) / 2
  while (abs(x * ans^10 - y) > 0.001) {
    if (x * ans^10 < y) {
      low <- ans
    } else {
      high <- ans
    }
    ans <- (high + low) / 2
  }
  return(ans)
}
# Average yearly growth from 2003 to 2013 for each city
for (i in seq_len(nrow(cap_bar))) {
  cap_bar[i, "growth_10y"] <- root10find(cap_bar[i, "Y2003"],
                                         cap_bar[i, "Y2013"]) - 1
}
p1 +
  ggtitle("Yearly Growth of Private Vehicles") +
  geom_text(data = cap_bar,
            aes(cap_long, cap_lat - 0.5,
                label = paste(prov_en, ": ",
                              round(growth_10y * 100, 2),
                              "%", sep = "")))

vis3-plot6
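
The bisection search in root10find() approximates a tenth root numerically. The same quantity also has a closed form, (end / start)^(1 / years) - 1, which is vectorised and needs no loop. The helper below is a sketch, not part of the original code, and the name cagr is hypothetical.

```r
# Compound annual growth rate between a start and an end value
cagr <- function(start, end, years = 10) {
  (end / start)^(1 / years) - 1
}

# Doubling every year for 10 years multiplies the value by 2^10 = 1024,
# so the yearly growth rate is 1 (i.e. 100%)
cagr(1, 1024)

# Works element-wise on vectors as well
cagr(c(1, 100), c(1024, 100))
```

Applied to the article's data, this would read cap_bar$growth_10y <- cagr(cap_bar$Y2003, cap_bar$Y2013), and it should agree with root10find() up to that function's tolerance.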

Bar Charts by ggvis

The same logic applies when plotting with ggvis.

year_labels <- data.frame(long = c(80, 80, 80),
                          lat = c(24, 22, 20),
                          label = c("Car Ownership - 2003",
                                    "Car Ownership - 2008",
                                    "Car Ownership - 2013"),
                          fill = c("brown", "red", "orange"))
cnmapdf %>%
  group_by(group) %>%
  ggvis(x = ~long, y = ~lat, fill := "beige", stroke := "grey") %>%
  layer_paths() %>%
  add_axis("x", title = "longitude") %>%
  add_axis("y", title = "latitude") %>%
  layer_rects(data = cap_bar, fill := "brown", stroke := "white",
              x = ~cap_long - 1.5, x2 = ~cap_long - .5,
              y = ~cap_lat, y2 = ~cap_lat + Y2003 / Y2013 * 3) %>%
  layer_rects(data = cap_bar, fill := "red", stroke := "white",
              x = ~cap_long - .5, x2 = ~cap_long + .5,
              y = ~cap_lat, y2 = ~cap_lat + Y2008 / Y2013 * 3) %>%
  layer_rects(data = cap_bar, fill := "orange", stroke := "white",
              x = ~cap_long + .5, x2 = ~cap_long + 1.5,
              y = ~cap_lat, y2 = ~cap_lat + Y2013 / Y2013 * 3) %>%
  layer_text(data = year_labels, x = ~long, y = ~lat,
             text := ~label, stroke := ~fill) %>%
  scale_nominal("fill", range = c("brown", "red", "orange")) %>%
  layer_text(data = cap_bar, x = ~cap_long - .5, y = ~cap_lat - 1,
             text := ~prov_en, stroke := "black", align := "right") %>%
  layer_text(data = cap_bar, x = ~cap_long + .5, y = ~cap_lat - 1,
             text := ~round(growth_10y * 100, 2),
             stroke := "black", align := "left")

vis3-plot7

I tried to combine the last two layer_text() calls into one (the chunk below) and append a “%” to the percentage, but the plot shows undefined where a label such as Beijing 14.78% should appear. I still need to figure out why layer_text() does not handle the nested expression.

layer_text(data = cap_bar, x = ~cap_long - .5, y = ~cap_lat - 1,
           text := ~paste(prov_en, ": ", round(growth_10y * 100, 2),
                          "%", sep = ""),
           fill := "black", align := "center")
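
One possible workaround, not verified against ggvis here, is to build the label as an ordinary column first, so that layer_text() only has to look up a variable, as it does for year_labels above. A minimal sketch in plain R follows. The data frame cap_bar_demo is hypothetical: it reuses the Beijing figure mentioned above, and the Shanghai value is made up.

```r
# Hypothetical stand-in for cap_bar with only the columns the label needs
cap_bar_demo <- data.frame(
  prov_en    = c("Beijing", "Shanghai"),
  growth_10y = c(0.1478, 0.1),
  stringsAsFactors = FALSE
)

# Build the full label once, outside the plotting call
cap_bar_demo$label <- paste0(cap_bar_demo$prov_en, ": ",
                             round(cap_bar_demo$growth_10y * 100, 2), "%")
cap_bar_demo$label
```

With such a column in cap_bar, the text property could presumably be written as text := ~label, mirroring the year_labels layer.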

Reference

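  1. Liu Wanxiang (刘万祥): Drawing Business Charts with ggplot: Heat-Map Bubbles on a Map of China (ggplot绘制商务图表–热力气泡中国地图)
  2. Liu Wanxiang (刘万祥): Drawing Business Charts with ggplot: Mini Bar Charts on a Map (ggplot绘制商务图表–地图上的迷你柱形图)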