Measuring the distance between two points on earth is probably more complicated than you think (at least it is more complicated than I thought). Calculating the distance between places is not as easy as getting a ruler and measuring on the map. The earth is a sphere, and not a very good sphere at that. The centrifugal force during rotation causes the earth to be a bit fatter at the equator, about 21km wider around the equator compared to measuring the radius at the poles.
The motivation behind wanting to measure distance between two points is because I want to calculate how far I have cycled using the GPX files my Garmin produces. Having not done this before I thought I'd investigate the best way to measure distance between locations.
There are a few different ways to calculate the distance between two locations, each with its own downsides. This stack exchange answer explains in much more detail than I could the error margins associated with modelling the earth as a sphere vs using a projection. There are three ways of calculating distance, I'll describe them in order of increasing accuracy. All of these calculations do not use an elevation component, I'll discuss this in another post.
Straight Line Distance
This measures the straight line distance between two points. Ignoring the sphericalness of the earth by projecting the coordinates onto a plane. The calculation can have issues when points become further apart or are close to the poles.
Spherical Distance
Greater Circle or Haversine distances assume the earth is a perfect sphere and uses the mean radius of the earth, 6371km, for calculations. This can have an error of ~0.5% because the earth isn't a perfect circle.
Ellipsoidal Distance
Uses more accurate ellipsoidal models which give better estimations of the earth's shape. You can use different projections for this, the projection that gives the best estimations for the entire planet is WGS-84. However most countries have a projection which works best for locations within its boundary, such as EPSG:27700 for the UK or EPSG:4993 for Laos. These calculations can be accurate to within a cm or so.
One common algorithm to calculate an ellipsoidal distance is the Vincenty algorithm. However this also has its drawbacks, being slower than the haversine algorithm and also being unstable for near antipodal points. This has meant that some python libraries have moved away from using it in favour of other calculation methods. This post has a good description on how to implement it yourself.
Locations
To find out how accurate different calculation methods are, let us pick some interesting locations and find the distance between them.
(44.420552, 6.736644) to (44.321590, 6.806830) (Second highest paved route through the alps)
(24.137328, 49.108738) to (24.138247, 51.311338) (Longest straight road in the world)
(71.170631, 25.784861) to (60.167217, 24.953351) (Start of EuroVelo 11 to Helsinki)
(-33.437776, -70.650450) to (33.437776, 109.349550) (Antipodal points, Santiago, Chile to Xi'An, China)
Calculations
There are a few different python libraries that can be used to measure distance. You can also implement some of the algorithms yourself.
haversine library, a single purpose library. If you want to calculate the haversine distance but don't want to implement the formula yourself, this library will do it for you.
geopy . A general purpose geocoding library. Can be used as a client to a multitude of different geocoding web services. It's distance method used to use the Vincenty formula to calculate the ellipsoidal distance however it was replaced with this method due to the problems Vincenty has with antipodal points.
pyproj. Wraps the PROJ library which can be used to perform transformations between different projections.
vincenty, a single purpose library. If you want to calculate the vincenty distance but don't want to implement the formula yourself, this library will do it for you.
Examples
Below are some examples using the formulas directly implemented in python or examples using the libraries listed above.
Straight Line Distance
Straight Line Distance
\(d=r\sqrt{(\varphi_2-\varphi_1)^2 + (\cos(\frac{\varphi_2-\varphi_1}{2})(\lambda_2-\lambda_1))^2}\)
Which when converted to python looks like this
from math import cos, radians, sqrt
def calculate_flat_surface_distance(
location_1: tuple[float, float], location_2: tuple[float, float]
) -> float:
latitude_1, longitude_1 = location_1
latitude_1, longitude_1 = math.radians(latitude_1), math.radians(longitude_1)
latitude_2, longitude_2 = location_2
latitude_2, longitude_2 = math.radians(latitude_2), math.radians(longitude_2)
difference_latitude = latitude_2 - latitude_1
difference_longitude = longitude_2 - longitude_1
# Radius of earth in kilometers
radius = 6371
distance = radius * sqrt(
(
difference_latitude ** 2
+ (cos(difference_latitude / 2) * difference_longitude) ** 2
)
)
return distance
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = calculate_flat_surface_distance(*location_pair)
print(round(distance, 2))
13.49
244.92
1226.98
18282.88
Haversine
Wikipedia Formula
Using the haversine formula to calculate the greater circle distance between the two points.
\(d = 2r \arcsin(\sqrt{\sin^2(\frac{\varphi_2-\varphi_1}{2})+\cos\varphi_1\cdot\cos\varphi_2\cdot\sin^2(\frac{\lambda_2-\lambda_1}{2})})\)
from math import asin, cos, radians, sin, sqrt
def calculate_haversine_distance(
location_1: tuple[float, float], location_2: tuple[float, float]
) -> float:
latitude_1, longitude_1 = location_1
latitude_1, longitude_1 = radians(latitude_1), radians(longitude_1)
latitude_2, longitude_2 = location_2
latitude_2, longitude_2 = radians(latitude_2), radians(longitude_2)
difference_latitude = latitude_2 - latitude_1
difference_longitude = longitude_2 - longitude_1
# Radius of earth in kilometers
radius = 6371
distance = (
2
* radius
* asin(
sqrt(
(
sin(difference_latitude / 2) ** 2
+ cos(latitude_1)
* cos(latitude_2)
* sin(difference_longitude / 2) ** 2
)
)
)
)
return distance
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = calculate_haversine_distance(*location_pair)
print(round(distance, 2))
12.34
223.5
1224.09
20015.09
Haversine Library
Calculate the Haversine distance using the python library haversine
from haversine import haversine
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = haversine(*location_pair)
print(round(distance, 2))
12.34
223.5
1224.09
20015.11
Greater Circle with Geopy
Calculate greater circle distance with geopy
import geopy.distance
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = geopy.distance.great_circle(*location_pair)
print(round(distance.km, 2))
12.34
223.5
1224.09
20015.12
Ellipsoidal Distance
Geodesic Distance with Geopy
Calculate geodesic distance with using geopy.
import geopy.distance
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = geopy.distance.geodesic(*location_pair)
print(round(distance.km, 2))
12.34
223.88
1227.45
20003.93
Geodesic distance with Pyproj
Calculate geodesic distance with using pyproj on the WG-84 projection.
from pyproj import Geod
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
g = Geod(ellps='WGS84')
for (lat_1, lon_1), (lat_2, lon_2) in location_pairs:
azimuth_1, azimuth_2, distance = g.inv(lon_1, lat_1, lon_2, lat_2)
print(round(distance / 1000, 2))
12.34
223.88
1227.45
20003.93
Vincenty library
Calculate the Vincenty distance using the python library vincenty
from vincenty import vincenty
location_pairs = [
[(44.420552, 6.736644), (44.321590, 6.806830)],
[(24.137328, 49.108738), (24.138247, 51.311338)],
[(71.170631, 25.784861), (60.167217, 24.953351)],
[(-33.437776, -70.650450), (33.437776, 109.349550)]
]
for location_pair in location_pairs:
distance = vincenty(*location_pair)
rounded_distance = round(distance, 2) if distance else distance
print(rounded_distance)
12.34
223.88
1227.45
None
Conclusion
The data
| Google's Distance |
Straight Line Distance |
Haversine (Wiki Formula) |
Haversine (Haversine Library) |
Greater Circle (Geopy) |
Geodesic (Geopy) |
Geodesic (pyproj) |
Vincenty (Vincenty Library) |
| 12.33 |
13.49 |
12.34 |
12.34 |
12.34 |
12.34 |
12.34 |
12.34 |
| 223.53 |
244.92 |
223.5 |
223.5 |
223.5 |
223.88 |
223.88 |
223.88 |
| 1,224.09 |
1226.98 |
1224.09 |
1224.09 |
1224.09 |
1227.45 |
1227.45 |
1227.45 |
| 20,015.12 |
18282.88 |
20015.09 |
20015.11 |
20015.12 |
20003.93 |
20003.93 |
None |
As can be seen, unsurprisingly, different libraries that use the same calculation method produce the same results. You can see that over larger distances the calculation methods that assume a perfectly spherical earth diverge from ones which use a projection. Assuming the world is flat comes with a lot of different problems, incorrectly measuring the distance between points is one of them. When measuring the distance between the antipodal points, the Vincenty library returns None as the formula does not converge for antipodal points.
Next Steps
This was a quick overview of some different ways of calculating distance using python. Next I'm going to look at how fast these methods are as well as looking at how much of a difference it makes when you introduce an elevation component.