florianb.net

${just go with the flo(w)}


Month: January 2010

Batch geocode your address data

I recently faced the challenge of geocoding existing locations. The location table contains about 10k entries, around 75% of which have useful address data attached. The table contains mainly addresses from the UK, but also from the US, Canada and Germany.

I was looking for a tool which would take my 10k entries and attach lat/long values wherever possible. If the address data wasn’t accurate enough, I wanted at least the next possible level of accuracy for the lat/long values. So if an entry didn’t have a street and house number but did have a valid city/postcode, I was happy with just getting the lat/long values for, let’s say, the West End in Glasgow. If the city was the only thing available, then the lat/long of Glasgow city centre would do as well.
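To make the "next possible level of accuracy" idea concrete, here is a minimal Python sketch. It only builds the query strings, from most to least specific, that you could hand to whichever geocoder you end up using. The field names (`street`, `city`, `postcode`, `country`) and the three fallback levels are my own assumptions, not something any of the tools below prescribe.

```python
from typing import Dict, List

# Hypothetical column names; adjust to your own location table.
FALLBACK_LEVELS = [
    ("street", "city", "postcode", "country"),  # full address
    ("city", "postcode", "country"),            # district level, e.g. the West End
    ("city", "country"),                        # city centre only
]


def fallback_queries(row: Dict[str, str]) -> List[str]:
    """Return geocoder query strings, most specific first, without duplicates."""
    queries: List[str] = []
    for fields in FALLBACK_LEVELS:
        # Keep only the fields that actually contain something.
        parts = [row.get(field, "").strip() for field in fields]
        query = ", ".join(part for part in parts if part)
        if query and query not in queries:
            queries.append(query)
    return queries


if __name__ == "__main__":
    entry = {"street": "", "city": "Glasgow", "postcode": "G12", "country": "UK"}
    for query in fallback_queries(entry):
        print(query)
```

You would try the first query and only move on to the next one if the geocoder returns nothing usable.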

One reason why I am so desperate to get at least some degree of geocoding is that I will be working on a radius-based search. Basically something like “I am in Glasgow on Byres Road and I would like to see all entries within 5 miles”. Get the idea?
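Just to illustrate what such a radius search boils down to once the lat/long values exist, here is a rough Python sketch using the haversine (great-circle) formula. It is not what the project uses, only one common way to do it. The function names, the `(key, lat, lon)` tuple layout and the coordinates are assumptions for the example, and the coordinates are approximate.

```python
import math
from typing import Hashable, Iterable, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two lat/long points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(
    origin: Tuple[float, float],
    entries: Iterable[Tuple[Hashable, float, float]],
    radius_miles: float,
) -> List[Tuple[Hashable, float]]:
    """Return (key, distance) pairs for entries inside the radius, nearest first."""
    hits = []
    for key, lat, lon in entries:
        distance = haversine_miles(origin[0], origin[1], lat, lon)
        if distance <= radius_miles:
            hits.append((key, distance))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    byres_road = (55.8745, -4.2932)  # approximate
    entries = [
        ("glasgow-centre", 55.8642, -4.2518),  # approximate
        ("edinburgh", 55.9533, -3.1883),       # approximate
    ]
    for key, distance in within_radius(byres_road, entries, 5):
        print(f"{key}: {distance:.1f} miles")
```

Looping over 10k rows like this is fine for a sketch. In a real application you would probably narrow the candidates with a bounding box in the database first.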

There are a couple of commercial tools out there which at least look quite nice. I could also have developed my own little application to geocode the data, for example by using the Yahoo geocoder. However, I didn’t want to spend any money or too much time on writing a custom tool for the task.

After loads of research I came across two tools which I would like to recommend. One of them is batchgeocode.com. One way of geocoding your data is to use the website, where you can copy/paste your data into a text box and click a button. That’s it. Really simple, and it also gives you feedback about the accuracy of each result, which I think is really handy.

These accuracy levels indicate how accurate your geocode was. If you find that a large number of your geocodes end up as APPROXIMATE, check the formatting and completeness of your addresses.

  • ROOFTOP (most accurate) – indicates that the returned result reflects a precise geocode.
  • RANGE_INTERPOLATED – indicates that the returned result reflects an approximation (usually on a road) interpolated between two precise points (such as intersections). Interpolated results are generally returned when rooftop geocodes are unavailable for a street address.
  • GEOMETRIC_CENTER – indicates that the returned result is the geometric center of a result such as a polyline (for example, a street) or polygon (region).
  • APPROXIMATE (least accurate) – indicates that the returned result is approximate, usually the center of the zip code.
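Once the results are back, the four levels above make it easy to pick out the rows that deserve a second look. A small Python sketch follows. The `(primary_key, accuracy)` input shape and the default threshold are assumptions of mine, and anything with an unrecognised level is treated as worse than APPROXIMATE.

```python
from typing import Hashable, Iterable, List, Tuple

# Lower rank means more accurate; order taken from the list above.
ACCURACY_RANK = {
    "ROOFTOP": 0,
    "RANGE_INTERPOLATED": 1,
    "GEOMETRIC_CENTER": 2,
    "APPROXIMATE": 3,
}


def needs_review(
    results: Iterable[Tuple[Hashable, str]],
    worst_allowed: str = "RANGE_INTERPOLATED",
) -> List[Hashable]:
    """Return the keys of all results less accurate than `worst_allowed`."""
    limit = ACCURACY_RANK[worst_allowed]
    unknown_rank = len(ACCURACY_RANK)  # unknown levels rank below APPROXIMATE
    return [
        key
        for key, accuracy in results
        if ACCURACY_RANK.get(accuracy, unknown_rank) > limit
    ]


if __name__ == "__main__":
    sample = [(1, "ROOFTOP"), (2, "APPROXIMATE"), (3, "GEOMETRIC_CENTER")]
    print(needs_review(sample))
```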

They also provide an Excel template which contains some VBScript voodoo talking to the Google Maps API. So you can export your data into that file, add an extra column with your primary key, geocode all your data and then import it back into your database. The limit here is 15k requests per day (that’s if you keep your IP address 🙂 ).

A very similar service comes from the guys at Juice Analytics, called “Excel Geocoding Tool v2”. Again, they provide you with an Excel template to which you can export your existing data. They are using the Yahoo geocoder, and according to their site the current request limit is 5k per day. batchgeocode.com was offline during the day, or at least I couldn’t connect to their site, so I used the Juice Analytics Excel file. I’ve used the batchgeocode service before, but I will definitely give it another go tomorrow and compare the accuracy and the ability to handle dodgy address data.
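The round trip described above (export with the primary key, geocode, import again) is easy to script around whichever template you use. Below is a hedged Python sketch that splits the rows into per-day batches to stay under a request limit, and then merges a geocoded CSV export back onto the original rows by primary key. The column names `id`, `lat` and `lng` are assumptions, so check what your export actually contains.

```python
import csv
import io
from typing import Dict, Iterator, List, Sequence


def daily_batches(rows: Sequence, daily_limit: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `daily_limit` rows, one per day."""
    for start in range(0, len(rows), daily_limit):
        yield rows[start:start + daily_limit]


def merge_geocodes(locations: List[Dict], geocoded_csv: str, key: str = "id") -> List[Dict]:
    """Attach lat/lng from CSV text to the location dicts, matched on the primary key."""
    coords = {}
    for row in csv.DictReader(io.StringIO(geocoded_csv)):
        if row["lat"] and row["lng"]:  # skip rows the geocoder could not resolve
            coords[row[key]] = (float(row["lat"]), float(row["lng"]))
    merged = []
    for location in locations:
        # CSV keys are strings, so compare against the string form of the key.
        lat, lng = coords.get(str(location[key]), (None, None))
        merged.append({**location, "lat": lat, "lng": lng})
    return merged


if __name__ == "__main__":
    batches = list(daily_batches(list(range(7500)), 5000))
    print([len(batch) for batch in batches])

    exported = "id,lat,lng\n1,55.8745,-4.2932\n2,,\n"
    locations = [{"id": 1, "city": "Glasgow"}, {"id": 2, "city": "Unknown"}]
    print(merge_geocodes(locations, exported))
```

Rows that could not be geocoded keep `None` for both values, so they are easy to find and retry with a coarser address later.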

Where is Flo?

A while back, I posted about a fresh start project-wise and said that I was looking forward to blogging about my new project and all the development stuff that comes along with it. As you can see (or not), I haven’t posted an awful lot lately. Apart from being quite busy, one reason for not blogging too intensively is that most of the work so far was around scaffolding and fairly standard CRUD stuff on the frontend side.

Although I think we came up with a pretty cool, lightweight and flexible framework on top of ASP.NET MVC, NHibernate, Spark View Engine and jQuery, I just didn’t feel this was worth writing extensive blog posts about. Maybe I’ll change my mind and do some blogging in retrospect about all the scaffolding work done so far.

For the next couple of weeks, some of the work will include batch geocoding, legacy data import, Google Maps integration plus radius search based on the geocoded data, implementing a payment provider, and working on a little search engine. Some of this stuff is fairly new to me and not typically what you would see in a line-of-business application, which is what I’ve mostly done so far.

I won’t promise anything, but the chances are quite high that I’ll write something about one or the other task that I’m going to stumble across, as I find them much more exciting than implementing the repository pattern using NHibernate to get your User / Roles stuff materialized properly 🙂

© 2020 florianb.net. All rights reserved.
