Version info: Code for this page was tested in Stata 12.1.
This code fragment page shows an example using Mata to write functions and loop through a dataset to calculate the sum distance between lattitude and longitude points within particular id groups. We use a built in Stata dataset for convenience even though it is not real lattitude and longitude data. The distance formula used is haversine. The Mata function mhaversine stands for matrix haversine because it takes a matrix as input.
mata
real matrix mhaversine(real matrix x)
/* assumes data matrix with points */
/* lat1 lon1 (from) lat2 lon2 (to) */
{
d2r = pi() / 180
earth_Rmeters = 6372797.560856 /* earth radius in meters */
lat1 = x[, 1]
lat2 = x[, 3]
lon1 = x[, 2]
lon2 = x[, 4]
rdLat = (lat1 - lat2) * d2r
rdLon = (lon1 - lon2) * d2r
latH = sin(rdLat * .5):^2
lonH = sin(rdLon * .5):^2
tmp = cos(lat1 * d2r) :* cos(lat2 * d2r)
tmp = 2 * asin(sqrt(latH :+ tmp :* lonH))
results = earth_Rmeters :* tmp /* distance in meters */
return(results)
}
void foobar(string scalar varlist, ///
string scalar selector, ///
string scalar output) {
x = st_data(., varlist, selector)
results = I(rows(x))
for(i=1; i<=rows(x); i++) {
dat = (J(rows(x), 1, x[i, ]), x)
results[i, ] = mhaversine(dat)'
}
finalresults = rowsum(results)
st_store(., output, selector, finalresults)
}
end
sysuse auto
gen mata_id = .
gen results = .
egen id = group(displacement gear_ratio)
summarize id, meanonly
forvalues ind=1/`r(max)' {
replace mata_id = id == `ind'
mata foobar("weight length", "mata_id", "results")
}
list results in 1/10
+----------+
| results |
|----------|
1. | 0 |
2. | 0 |
3. | 0 |
4. | 3360138 |
5. | 7731513 |
|----------|
6. | 5559471 |
7. | 0 |
8. | 3360138 |
9. | 4.60e+07 |
10. | 1.34e+07 |
+----------+
