I recently faced the challenge of geocoding existing locations. The location table contains about 10k entries, around 75% of which have useful address data attached. The table contains mainly addresses from the UK, but also from the US, Canada and Germany.
I was looking for a tool that would take my 10k entries and attach lat/long values wherever possible. If the address data wasn’t accurate enough, I wanted at least the next possible level of accuracy for the lat/long values. So if an entry didn’t have a street and house number but did have a valid city/postcode, I was happy with just getting the lat/long values for, let’s say, the West End in Glasgow. If the city was the only thing available, then the lat/long of Glasgow city centre would do as well.
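The "next possible level of accuracy" idea can be sketched as a list of progressively coarser queries to hand to whichever geocoder you use. This is only a rough illustration: the field names (`street`, `postcode`, `city`, `country`) and the function name are my assumptions, not something any of the tools below require.

```python
def fallback_queries(entry):
    """Return address queries for one entry, from most to least specific.

    `entry` is a dict with the hypothetical keys street, postcode, city
    and country. A level is skipped when its leading field is empty.
    """
    levels = [
        ("street", "postcode", "city", "country"),  # full address
        ("postcode", "city", "country"),            # area level, e.g. the West End
        ("city", "country"),                        # city centre
    ]
    queries = []
    for fields in levels:
        lead = (entry.get(fields[0]) or "").strip()
        if not lead:
            continue  # nothing useful at this level, try the next coarser one
        parts = [(entry.get(f) or "").strip() for f in fields]
        query = ", ".join(p for p in parts if p)
        if query not in queries:
            queries.append(query)
    return queries


# Sample entry without a street: only the postcode and city levels remain.
sample = {"street": "", "postcode": "G12 8QQ", "city": "Glasgow", "country": "UK"}
print(fallback_queries(sample))
```

You would try the queries in order and keep the first one the geocoder can resolve.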
One reason why I am so desperate to get at least some degree of geocoding is that I will be working on a radius-based search. Basically something like “I am in Glasgow on Byres Road and I would like to see all entries within 5 miles”. Get the idea?
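To make the radius idea concrete, here is a minimal sketch of such a search using the haversine formula for great-circle distance. It is not what I ended up building, just an illustration: the function names, the `lat`/`lng` keys and the coordinates (rough values for Byres Road, Glasgow city centre and Edinburgh) are all assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(origin, entries, radius_miles):
    """Return the entries that have coordinates within radius_miles of origin."""
    lat0, lon0 = origin
    return [
        e for e in entries
        if e.get("lat") is not None and e.get("lng") is not None
        and haversine_miles(lat0, lon0, e["lat"], e["lng"]) <= radius_miles
    ]


# Approximate, illustrative coordinates only.
byres_road = (55.8745, -4.2930)
entries = [
    {"name": "Glasgow city centre", "lat": 55.8609, "lng": -4.2514},
    {"name": "Edinburgh", "lat": 55.9533, "lng": -3.1883},
    {"name": "Not geocoded yet", "lat": None, "lng": None},
]
print([e["name"] for e in within_radius(byres_road, entries, 5)])
```

With 10k rows a plain loop like this is fine. For much larger tables you would probably want to pre-filter with a bounding box in the database first.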
There are a couple of commercial tools out there which at least look quite nice. I could have also developed my own little application to geocode the data, for example by using the Yahoo geocoder. However, I didn’t want to spend any money or too much time on writing a custom tool for the task.
After loads of research I came across two tools which I would like to recommend. One of them is batchgeocode.com. One way of geocoding your data is to use the website, where you can copy/paste your data into a text box and click a button. That’s it. Really simple, and it also gives you information/feedback about the accuracy, which I think is really handy.
The accuracy levels below indicate how accurate each geocode was. If you find that a large number of your geocodes end up as APPROXIMATE, check the formatting and completeness of your addresses.
- ROOFTOP (most accurate) – indicates that the returned result reflects a precise geocode.
- RANGE_INTERPOLATED – indicates that the returned result reflects an approximation (usually on a road) interpolated between two precise points (such as intersections). Interpolated results are generally returned when rooftop geocodes are unavailable for a street address.
- GEOMETRIC_CENTER – indicates that the returned result is the geometric center of a result such as a polyline (for example, a street) or polygon (region).
- APPROXIMATE (least accurate) – indicates that the returned result is approximate, usually the center of the zip code.
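Once the results are back, a quick tally of these levels shows how much of the data needs another look. The snippet below is a small sketch: it assumes each result row is a dict with a `location_type` key holding one of the four labels above, and the function names are made up for illustration.

```python
from collections import Counter

# Ordered from most to least accurate, as in the list above.
ACCURACY_ORDER = ["ROOFTOP", "RANGE_INTERPOLATED", "GEOMETRIC_CENTER", "APPROXIMATE"]


def accuracy_summary(rows):
    """Count how many rows fall into each accuracy level."""
    counts = Counter(row["location_type"] for row in rows)
    return {level: counts.get(level, 0) for level in ACCURACY_ORDER}


def needs_review(rows, worst_acceptable="GEOMETRIC_CENTER"):
    """Return rows that are less accurate than worst_acceptable or unlabelled."""
    limit = ACCURACY_ORDER.index(worst_acceptable)
    return [
        row for row in rows
        if row["location_type"] not in ACCURACY_ORDER
        or ACCURACY_ORDER.index(row["location_type"]) > limit
    ]


# Made-up sample rows.
rows = [
    {"id": 1, "location_type": "ROOFTOP"},
    {"id": 2, "location_type": "APPROXIMATE"},
    {"id": 3, "location_type": "APPROXIMATE"},
    {"id": 4, "location_type": "GEOMETRIC_CENTER"},
]
print(accuracy_summary(rows))
print([row["id"] for row in needs_review(rows)])
```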
They also provide an Excel template which contains some VBScript voodoo talking to the Google Maps API. So you can export your data into that file, adding an extra column with your primary key, geocode all your data and then import it back into your database. The limit here is 15k requests per day (that’s if you keep your IP address 🙂 ).
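The import step is the reason for the extra primary key column: it lets you match each geocoded row back to its record. Here is a rough sketch of that step, assuming you save the geocoded sheet as CSV with `id`, `lat` and `lng` columns and that the table is called `locations`. Those names, and the use of SQLite, are just for illustration.

```python
import csv
import io
import sqlite3


def import_geocodes(conn, csv_file):
    """Write lat/lng values from a geocoded CSV export back to the table.

    Rows without coordinates are skipped. Returns the number of rows
    that carried coordinates and were sent to the database.
    """
    reader = csv.DictReader(csv_file)
    updates = [
        (float(r["lat"]), float(r["lng"]), int(r["id"]))
        for r in reader
        if r["lat"] and r["lng"]
    ]
    with conn:  # commit on success, roll back on error
        conn.executemany(
            "UPDATE locations SET lat = ?, lng = ? WHERE id = ?", updates
        )
    return len(updates)


# Tiny in-memory demo with made-up data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE locations (id INTEGER PRIMARY KEY, city TEXT, lat REAL, lng REAL)"
)
conn.executemany(
    "INSERT INTO locations (id, city) VALUES (?, ?)",
    [(1, "Glasgow"), (2, "Unknown")],
)
exported = io.StringIO("id,lat,lng\n1,55.8609,-4.2514\n2,,\n")
print(import_geocodes(conn, exported))
```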
A very similar service comes from the guys at Juice Analytics, called “Excel Geocoding Tool v2”. Again, they provide an Excel template to which you can export your existing data. They use the Yahoo geocoder and, according to their site, the current request limit is 5k per day. batchgeocode.com was offline during the day, or at least I couldn’t connect to their site, so I used the Juice Analytics Excel file. I’ve used the batchgeocode service before, but I will definitely give it another go tomorrow and compare the accuracy and the ability to handle dodgy address data.