Many models in Operations Research needs a distance matrix between nodes in a network. If the distance matrix is based on road distances it can be found using R. Let us try to compute the distances between a set for zip codes in Denmark. First we load all zip codes for Jutland in Denmark:
options(stringsAsFactors = F)
zip<-read.csv2("zip_jutland.csv")
head(zip)
## Zip Area ## 1 5320 Agedrup ## 2 6753 Agerbæk ## 3 6534 Agerskov ## 4 8961 Allingåbro ## 5 6051 Almind ## 6 8592 Anholt
dim(zip)
## [1] 376 2
Note that we may have different zip codes for the same area, so we may use a single zip code for an area (this will reduce the number of calculations).
Using Google maps
First option is to use Google maps. We use the package ggmap
to find the distance between two areas:
library(ggmap)
set<-1:5 # areas under considration
zip[set,]
## Zip Area ## 1 5320 Agedrup ## 2 6753 Agerbæk ## 3 6534 Agerskov ## 4 8961 Allingåbro ## 5 6051 Almind
distanceMat<-matrix(NA, nrow=length(set), ncol = length(set))
for(i in 1:length(set)) {
for(j in 1:length(set)) {
if (i==j) distanceMat[i,j] <- 0
else distanceMat[i,j] <- mapdist(paste(zip$Zip[set[i]],", Denmark", sep=""),
paste(zip$Zip[set[j]], ", Denmark", sep=""),
mode = 'driving', output = 'simple')$km
}
}
colnames(distanceMat)<-zip$Zip[set]
rownames(distanceMat)<-zip$Zip[set]
distanceMat
## 5320 6753 6534 8961 6051 ## 5320 0.000 135.248 137.949 208.902 82.221 ## 6753 136.759 0.000 71.860 178.424 58.829 ## 6534 139.355 71.859 0.000 202.201 61.425 ## 8961 210.037 179.297 201.797 0.000 146.069 ## 6051 83.623 58.106 60.807 146.469 0.000
library(reshape2)
dat<-melt(distanceMat)
colnames(dat)<-c("from","to","km")
head(dat)
## from to km ## 1 5320 5320 0.000 ## 2 6753 5320 136.759 ## 3 6534 5320 139.355 ## 4 8961 5320 210.037 ## 5 6051 5320 83.623 ## 6 5320 6753 135.248
We use the zip as input to Google’s geocode
function. We might have used the area name instead; however, it is not as stable since Google’s geocode
function don’t like e.g. æ, ø and å. Two formats are returned which can be saved: a distance matrix and a long format.
Google’s API calculate shortest path distances based on Google maps. As a result the distances are not symmetric. If you can accept a small error replace if (i==j) distanceMatrix[i,j] <- 0
with if (i>j) next
.
Note that the Google maps API limits to 2500 element queries a day. That is, for large distance matrices you have a problem. Your usage can be checked using:
distQueryCheck()
## 2480 distance queries remaining.
Using Bing maps
Bing maps don’t have a limit on the number of queries. However, it seems to be quite unstable (returning service unavaliable sometimes). We use the package taRifx.geo
library(taRifx.geo)
options(BingMapsKey="<Your Bing Maps API key>")
set<-1:5 # areas under considration
zip[set,]
## Zip Area ## 1 5320 Agedrup ## 2 6753 Agerbæk ## 3 6534 Agerskov ## 4 8961 Allingåbro ## 5 6051 Almind
distanceMatBing<-matrix(NA, nrow=length(set), ncol = length(set))
for (trys in 1:10) { # number of times we try to connect
for(i in set) {
for(j in set) {
if (i>j) {distanceMatBing[i,j] <- distanceMatBing[j,i]; next} # assume symetric distances
if (!is.na(distanceMatBing[i,j])) next # value already calculated
if (i==j) {distanceMatBing[i,j] <- 0; next}
result <- try(georoute( paste(c(zip$Zip[i], zip$Zip[j]), " Danmark", sep=""), returntype="distance", service="bing" ) )
if (class(result) != "try-error") {
distanceMatBing[i,j] <- result[1,1]
}
}
}
}
colnames(distanceMatBing)<-zip$Zip[set]
rownames(distanceMatBing)<-zip$Zip[set]
distanceMatBing
## 5320 6753 6534 8961 6051 ## 5320 0.000 135.750 138.689 211.720 83.231 ## 6753 135.750 0.000 81.983 179.541 59.115 ## 6534 138.689 81.983 0.000 204.505 61.913 ## 8961 211.720 179.541 204.505 0.000 148.726 ## 6051 83.231 59.115 61.913 148.726 0.000
dat<-melt(distanceMatBing)
colnames(dat)<-c("from","to","km")
head(dat)
## from to km ## 1 5320 5320 0.000 ## 2 6753 5320 135.750 ## 3 6534 5320 138.689 ## 4 8961 5320 211.720 ## 5 6051 5320 83.231 ## 6 5320 6753 135.750
Note I have added a loop where we try to calculate the distance 10 times due to the unstable API. If we compare with the distances calculated using Google maps we see that there may be differences depending on which map engine we use.
I used the above code to find all distances between zips in jutland. The distance matrix is avaliable on GitHub.
Other solutions and resources
Note this way to calculate the distance matrix is brute force. A much better approach is to reoptimize the calculations using a All-to-All shortest path algorithm.
- Bing maps routes
- Google distance matrix API
- Mapquest distance matrix API
- taRifx.geo R package
- OSRM server API (see table) – Only report travel times. See also this discussion
- Geographic Distance Matrix Generator for spherical distances