# Distance matrix calculations in R

Many models in Operations Research needs a distance matrix between nodes in a network. If the distance matrix is based on road distances it can be found using R. Let us try to compute the distances between a set for zip codes in Denmark. First we load all zip codes for Jutland in Denmark:

options(stringsAsFactors = F)

##    Zip       Area
## 1 5320    Agedrup
## 2 6753    Agerbæk
## 3 6534   Agerskov
## 4 8961 Allingåbro
## 5 6051     Almind
## 6 8592     Anholt

dim(zip)

## [1] 376   2


Note that we may have different zip codes for the same area, so we may use a single zip code for an area (this will reduce the number of calculations).

First option is to use Google maps. We use the package ggmap to find the distance between two areas:

library(ggmap)
set<-1:5   # areas under considration
zip[set,]

##    Zip       Area
## 1 5320    Agedrup
## 2 6753    Agerbæk
## 3 6534   Agerskov
## 4 8961 Allingåbro
## 5 6051     Almind

distanceMat<-matrix(NA, nrow=length(set), ncol = length(set))
for(i in 1:length(set)) {
for(j in 1:length(set)) {
if (i==j) distanceMat[i,j] <- 0
else distanceMat[i,j] <- mapdist(paste(zip$Zip[set[i]],", Denmark", sep=""), paste(zip$Zip[set[j]], ", Denmark", sep=""),
mode = 'driving', output = 'simple')$km } } colnames(distanceMat)<-zip$Zip[set]
rownames(distanceMat)<-zip$Zip[set] distanceMat  ## 5320 6753 6534 8961 6051 ## 5320 0.000 135.248 137.949 208.902 82.221 ## 6753 136.759 0.000 71.860 178.424 58.829 ## 6534 139.355 71.859 0.000 202.201 61.425 ## 8961 210.037 179.297 201.797 0.000 146.069 ## 6051 83.623 58.106 60.807 146.469 0.000  library(reshape2) dat<-melt(distanceMat) colnames(dat)<-c("from","to","km") head(dat)  ## from to km ## 1 5320 5320 0.000 ## 2 6753 5320 136.759 ## 3 6534 5320 139.355 ## 4 8961 5320 210.037 ## 5 6051 5320 83.623 ## 6 5320 6753 135.248  We use the zip as input to Google’s geocode function. We might have used the area name instead; however, it is not as stable since Google’s geocode function don’t like e.g. æ, ø and å. Two formats are returned which can be saved: a distance matrix and a long format. Google’s API calculate shortest path distances based on Google maps. As a result the distances are not symmetric. If you can accept a small error replace if (i==j) distanceMatrix[i,j] <- 0 with if (i>j) next. Note that the Google maps API limits to 2500 element queries a day. That is, for large distance matrices you have a problem. Your usage can be checked using: distQueryCheck()  ## 2480 distance queries remaining.  ## Using Bing maps Bing maps don’t have a limit on the number of queries. However, it seems to be quite unstable (returning service unavaliable sometimes). We use the package taRifx.geo library(taRifx.geo) options(BingMapsKey="<Your Bing Maps API key>") set<-1:5 # areas under considration zip[set,]  ## Zip Area ## 1 5320 Agedrup ## 2 6753 Agerbæk ## 3 6534 Agerskov ## 4 8961 Allingåbro ## 5 6051 Almind  distanceMatBing<-matrix(NA, nrow=length(set), ncol = length(set)) for (trys in 1:10) { # number of times we try to connect for(i in set) { for(j in set) { if (i>j) {distanceMatBing[i,j] <- distanceMatBing[j,i]; next} # assume symetric distances if (!is.na(distanceMatBing[i,j])) next # value already calculated if (i==j) {distanceMatBing[i,j] <- 0; next} result <- try(georoute( paste(c(zip$Zip[i], zip$Zip[j]), " Danmark", sep=""), returntype="distance", service="bing" ) ) if (class(result) != "try-error") { distanceMatBing[i,j] <- result[1,1] } } } } colnames(distanceMatBing)<-zip$Zip[set]
rownames(distanceMatBing)<-zip\$Zip[set]
distanceMatBing

##         5320    6753    6534    8961    6051
## 5320   0.000 135.750 138.689 211.720  83.231
## 6753 135.750   0.000  81.983 179.541  59.115
## 6534 138.689  81.983   0.000 204.505  61.913
## 8961 211.720 179.541 204.505   0.000 148.726
## 6051  83.231  59.115  61.913 148.726   0.000

dat<-melt(distanceMatBing)
colnames(dat)<-c("from","to","km")

##   from   to      km
## 1 5320 5320   0.000
## 2 6753 5320 135.750
## 3 6534 5320 138.689
## 4 8961 5320 211.720
## 5 6051 5320  83.231
## 6 5320 6753 135.750


Note I have added a loop where we try to calculate the distance 10 times due to the unstable API. If we compare with the distances calculated using Google maps we see that there may be differences depending on which map engine we use.

I used the above code to find all distances between zips in jutland. The distance matrix is avaliable on GitHub.

## Other solutions and resources

Note this way to calculate the distance matrix is brute force. A much better approach is to reoptimize the calculations using a All-to-All shortest path algorithm.