florianb.net

Conversion of the Gauss Krueger notation into latitude/longitude

From the archives

A while ago I wanted to integrate Microsoft Virtual Earth (that’s now Bing Maps) into one of my projects (a web application). Unfortunately the existing geo-coordinates were formatted using the Gauss Krüger notation. The problem is that both Virtual Earth and Google Maps work with latitude/longitude, so I had to convert the Gauss Krüger coordinates.
After some research I found an article by Wolfgang Back. He wrote a little PDA application to convert Gauss Krüger into latitude/longitude. Perfect! Well, almost. Mr. Back likes his VB, so the code was in Visual Basic, which I had to translate into C#.

Convert Gauss Krueger into latitude/longitude

First we need to convert the given Gauss Krüger coordinates into lat/long on the Bessel ellipsoid:
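
The heart of the translated code is the standard inverse transverse-Mercator series on the Bessel ellipsoid. Here is a condensed C# sketch of that approach (not Mr. Back’s original code; the method name is mine, and the series is truncated after the y⁴/y⁵ terms, which is still accurate to well below a metre):

// Gauss-Krüger Rechtswert/Hochwert (metres) -> latitude/longitude
// (degrees) on the Bessel ellipsoid.
public static void GaussKruegerToBessel(double rechts, double hoch,
                                        out double latDeg, out double lonDeg)
{
    const double a = 6377397.155;          // Bessel 1841 semi-major axis
    const double f = 1.0 / 299.1528128;    // Bessel 1841 flattening
    double b  = a * (1 - f);
    double e2 = (a * a - b * b) / (a * a); // first eccentricity squared
    double ee = (a * a - b * b) / (b * b); // second eccentricity squared
    double n  = (a - b) / (a + b);

    // The leading digit(s) of the Rechtswert encode the zone; the
    // central meridian is zone * 3 degrees, the false easting 500 km.
    int zone = (int)(rechts / 1e6);
    double lon0 = zone * 3.0 * Math.PI / 180.0;
    double y = rechts - zone * 1e6 - 500000.0;

    // Footpoint latitude from the meridian arc length.
    double alpha = (a + b) / 2.0 * (1 + n * n / 4 + Math.Pow(n, 4) / 64);
    double phiF = hoch / alpha;
    phiF += (3.0 * n / 2 - 27.0 * Math.Pow(n, 3) / 32) * Math.Sin(2 * phiF)
          + (21.0 * n * n / 16) * Math.Sin(4 * phiF)
          + (151.0 * Math.Pow(n, 3) / 96) * Math.Sin(6 * phiF);

    double t    = Math.Tan(phiF);
    double eta2 = ee * Math.Cos(phiF) * Math.Cos(phiF);
    double N    = a / Math.Sqrt(1 - e2 * Math.Sin(phiF) * Math.Sin(phiF));

    double lat = phiF
        - t * (1 + eta2) / (2 * N * N) * y * y
        + t * (5 + 3 * t * t + 6 * eta2 - 6 * t * t * eta2)
            / (24 * Math.Pow(N, 4)) * Math.Pow(y, 4);
    double lon = lon0
        + y / (N * Math.Cos(phiF))
        - (1 + 2 * t * t + eta2) * Math.Pow(y, 3)
            / (6 * Math.Pow(N, 3) * Math.Cos(phiF));

    latDeg = lat * 180.0 / Math.PI;
    lonDeg = lon * 180.0 / Math.PI;
}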

7-Parameter-Helmert Transformation

After the conversion you have to apply the 7-Parameter Helmert Transformation to avoid the distortion which occurs when you convert coordinates from one 3-dimensional geodetic system to another. (To be honest I don’t understand it in every detail, but it works :-))
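
Sketched in the same hedged spirit (again not the original VB): convert the Bessel lat/long to geocentric X/Y/Z, apply the shift, small-angle rotation and scale, and convert back on the WGS84 ellipsoid. The parameter set below is one commonly published Germany-wide approximation for DHDN to WGS84, and the rotation signs follow the position-vector convention; other sources flip them, so verify against a known point:

// Bessel (DHDN) lat/long -> WGS84 lat/long via a 7-parameter Helmert
// transformation, height assumed 0. Parameters are a commonly published
// Germany-wide approximation; sign conventions vary by source.
public static void BesselToWgs84(double latDeg, double lonDeg,
                                 out double latWgs, out double lonWgs)
{
    const double aB = 6377397.155, fB = 1.0 / 299.1528128;   // Bessel 1841
    const double aW = 6378137.0,   fW = 1.0 / 298.257223563; // WGS84
    const double dx = 598.1, dy = 73.7, dz = 418.2;          // shift (m)
    double rx = Arc(0.202), ry = Arc(0.045), rz = Arc(-2.455);
    const double m = 1.0 + 6.7e-6;                           // scale factor

    // Geodetic -> geocentric cartesian on Bessel.
    double e2B = fB * (2 - fB);
    double lat = latDeg * Math.PI / 180, lon = lonDeg * Math.PI / 180;
    double N = aB / Math.Sqrt(1 - e2B * Math.Sin(lat) * Math.Sin(lat));
    double x = N * Math.Cos(lat) * Math.Cos(lon);
    double y = N * Math.Cos(lat) * Math.Sin(lon);
    double z = N * (1 - e2B) * Math.Sin(lat);

    // Shift, rotate (small angles, position-vector convention), scale.
    double x2 = dx + m * ( x      - rz * y + ry * z);
    double y2 = dy + m * ( rz * x + y      - rx * z);
    double z2 = dz + m * (-ry * x + rx * y + z);

    // Geocentric cartesian -> geodetic on WGS84 (fixed-point iteration).
    double e2W = fW * (2 - fW);
    double p = Math.Sqrt(x2 * x2 + y2 * y2);
    double latW = Math.Atan2(z2, p * (1 - e2W));
    for (int i = 0; i < 5; i++)
    {
        double Nw = aW / Math.Sqrt(1 - e2W * Math.Sin(latW) * Math.Sin(latW));
        latW = Math.Atan2(z2 + e2W * Nw * Math.Sin(latW), p);
    }
    latWgs = latW * 180 / Math.PI;
    lonWgs = Math.Atan2(y2, x2) * 180 / Math.PI;
}

// Arcseconds to radians.
static double Arc(double seconds) { return seconds / 3600 * Math.PI / 180; }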

Getting Started with Solr

I recently wrote that I am working on a new project using some exciting technologies like the Spark View Engine and nHibernate. I was planning to write about pitfalls along the development process, or generally about things I thought could be interesting.

Well, I didn’t!

That’s not because there wasn’t much exciting stuff going on; quite the opposite. I also didn’t feel like blogging about a lot of the stuff that was going on in the background. There are a million great blog posts out there on how to implement a repository pattern using nHibernate and ASP.NET MVC, on how to do nice stuff using jQuery, and all the rest of it.

What happened

That said, I recently started working on a search front end for the new site and ran into a few problems. Let me explain. The website basically consists of a backend part and a front page search facility. The domain model behind the application is quite complex. Because I am taking a DDD approach together with nHibernate, I was able to create a clean model with POCOs, abstract nHibernate away using the repository pattern, and end up with a reasonably consistent and clean “infrastructure”. I still love how easy it is to map inheritance with nHibernate using discriminators. That all works reasonably well, although I wish I had started the project using a NoSQL database rather than a relational one.

The Problem

As I said, I have quite a complex domain model and thus an even more complex database. I am using SQL Server 2005, which so far served me well for all the backend stuff. Doing simple searches against the complex data model worked fine as well, but as soon as the client came up with some more complex search requirements, it all fell down. In the end it was the “multiple word search against everything, plus an auto-complete function to assist the user while searching” requirement which brought SQL Server to its knees. I simply wasn’t able to produce really fast results using nHibernate. Another requirement combined the search features just described with a spatial search, whereby we take the location of the user (using Google Maps to get the user’s lat/long) and attach a radius search to the general search functionality.

Now, I was able to do most of that with “OK performance”, but that was on my local machine with me as a single user. As soon as I simulated more realistic user numbers, performance turned really bad.

The Solution

Now, the first obvious things to look at would be the SQL Server full-text catalogue or MS Search Server (Express), but to be honest neither of those really convinced me. The SQL Server full-text catalogue seems quite simple to implement and use at first, but as soon as you need something more “special” you run into a wall. Don’t even try.

A colleague of mine was using Lucene in another project and really liked it. He then pointed me to a product called Solr. What is Solr?

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, and rich document (e.g., Word, PDF) handling. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world’s largest internet sites.

So three days ago I started to have a look at Solr and also started implementing it. Mauricio Scheffer did an amazing job creating SolrNet, a Solr client for .NET which really simplifies querying Solr from within my MVC application.

I would really like to do some more detailed blog posts about how exactly I implemented Solr, what the difficult bits are, and how to solve some if not all of the problems described above. But let me start by giving you a brief overview of what I have done.

Part of my project is to migrate a legacy database and transform it into my new data model. That part is now done. What I needed to do was to get the legacy data into Solr in order to make it searchable. Now, Solr is supposed to be lightning fast when you query against it, but I have to say I am also surprised how bloody fast it is when you batch import data into it, particularly if you compare it with a relational database. So after some experimenting with the SolrNet client, and after I got reasonably familiar with the Solr query syntax and the way you model the Solr schema, I was able to put data into Solr and query against it as well. Overall it took me about two days to get all that up and running.
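
To give an idea of how little code the import takes, here is a sketch along the lines of the SolrNet documentation; the document type, field names and URL are made up for illustration (and note that older SolrNet versions spell AddRange as Add):

using System.Collections.Generic;
using Microsoft.Practices.ServiceLocation;
using SolrNet;
using SolrNet.Attributes;

// Illustrative document type mirroring the Solr schema.
public class SearchItem
{
    [SolrUniqueKey("id")] public string Id { get; set; }
    [SolrField("type")]   public string Type { get; set; }
    [SolrField("text")]   public string Text { get; set; }
}

public static class SolrIndexer
{
    public static void Rebuild(IEnumerable<SearchItem> items)
    {
        // Wire SolrNet up once, typically at application start.
        Startup.Init<SearchItem>("http://localhost:8983/solr");

        var solr = ServiceLocator.Current.GetInstance<ISolrOperations<SearchItem>>();
        solr.AddRange(items); // batch import; commit once at the end
        solr.Commit();

        // Querying is just as terse.
        var hits = solr.Query(new SolrQuery("text:pizza"));
    }
}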

The second challenge was to set up a second core against which I would do my auto-complete lookups. I kept the schema for the second core fairly simple: an ID field, so I can perform updates against the index using my relational database data; a type field, a simple text field describing the type of the entity within the index, which also comes in handy when I need to boost certain results depending on the type/query; and lastly a text field containing the actual text to look up. The only challenging thing was to find a good tokenizer to allow multi-word searches and handle stuff like “some-auto-complete-text-in-london”. Again, I hope I will find some time to blog about the details later on.
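
Just to sketch the idea, here is roughly what such an autocomplete setup can look like in schema.xml; the field names match the description above, but the filters and gram sizes are illustrative rather than my production settings:

<!-- Edge n-gram analyzed field type plus the three fields described above. -->
<fieldType name="autocomplete" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splits "some-auto-complete-text-in-london" into useful parts -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" catenateWords="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="20"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="id"   type="string"       indexed="true" stored="true" required="true"/>
<field name="type" type="string"       indexed="true" stored="true"/>
<field name="text" type="autocomplete" indexed="true" stored="true"/>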

So far the last challenge has been to implement a spatial search functionality. Unfortunately it is Solr 1.5 which will introduce a built-in spatial search. However, there is a free plug-in called Spatial Solr which addresses that issue. Now, I have to say I was really disappointed by the lack of documentation. Let me give you a quick example. In an example query for Solr which can be found in the plug-in zip file (PDF), they state that in order to query Solr you can do this:

q={!spatial lat=4.32 lng=54.32 radius=30 unit=km calc=arc threadCount=2}title:pizza

But the correct query should look more like this:

q={!spatial lat=4.32 long=54.32 radius=30 unit=km calc=arc threadCount=2}title:pizza
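
From SolrNet, by the way, the localparams syntax can simply be passed through as a raw query string, using the solr instance from the earlier sketch; the coordinates here are made up:

var nearby = solr.Query(new SolrQuery(
    "{!spatial lat=55.8721 long=-4.2882 radius=5 unit=km calc=arc}text:pizza"));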

Another challenge was to actually get the plug-in running. I am using Jetty on my Windows machine, and according to the documentation it is as simple as putting the plug-in into your lib folder and adding the update processor and query parser to your config file. Well, it was not. When I started the server, it failed with an error.

First I thought it had to do with my multi-core setup, but after some research and googling I came across a blog post by Phillip, who had exactly the same problem and luckily found the solution.

You need to put the plug-in into the example/work/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/solr/WEB-INF/lib folder. Obvious, isn’t it? :-)

There are a couple of little gotchas like this which made it really hard to integrate the plug-in and get some queries running without syntax exceptions, but in the end it all worked out fine. And again, since the plug-in is free, you can’t really complain.

Batch geocode your address data

I recently faced the challenge of geocoding existing locations. The location table contains about 10k entries, around 75% of which had useful address data attached. The table contains mainly addresses from the UK, but also from the US, Canada and Germany.

I was looking for a tool which would take my 10k entries and attach lat/long values wherever possible. If the address data wasn’t accurate enough, I wanted at least the next possible level of accuracy in terms of the lat/long values. So if an entry didn’t have a street and house number, but a valid city/postcode, I was happy with just getting the lat/long values for, let’s say, the West End in Glasgow. If the city were the only thing available, then the lat/long of Glasgow city centre would do as well.

One reason why I am so desperate to get at least some degree of geocoding is that I will be working on a radius-based search. Basically something like “I am in Glasgow on Byres Road and I would like to see all entries within 5 miles”. Get the idea?

There are a couple of commercial tools out there which at least look quite nice. I could also have developed my own little application to geocode the data, for example by using the Yahoo geocoder. However, I didn’t want to spend any money, or too much time writing a custom tool for the task.

After loads of research I came across two tools which I would like to recommend. One of them is batchgeocode.com. One way of geocoding your data is via the website, where you can copy/paste your data into a text box and click a button. That’s it. Really simple, and it also gives you information/feedback about the accuracy, which I think is really handy.

These indicate how accurate your geocode was. If you are finding a large number of your geocodes end up as APPROXIMATE, check the formatting and completeness of your addresses.

  • ROOFTOP (most accurate) – indicates that the returned result reflects a precise geocode.
  • RANGE_INTERPOLATED – indicates that the returned result reflects an approximation (usually on a road) interpolated between two precise points (such as intersections). Interpolated results are generally returned when rooftop geocodes are unavailable for a street address.
  • GEOMETRIC_CENTER – indicates that the returned result is the geometric center of a result such as a polyline (for example, a street) or polygon (region).
  • APPROXIMATE (least accurate) – indicates that the returned result is approximate, usually the center of the zip code.
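
Incidentally, these are exactly the location_type values that Google’s geocoding web service returns, so if you do roll your own tool, checking the accuracy level is straightforward. A minimal sketch using the XML output format (no error handling, no back-off when you hit the daily limit):

using System;
using System.Net;
using System.Xml.Linq;

class GeocodeSketch
{
    static void Main()
    {
        Geocode("Byres Road, Glasgow, UK");
    }

    static void Geocode(string address)
    {
        var url = "http://maps.googleapis.com/maps/api/geocode/xml"
                + "?sensor=false&address=" + Uri.EscapeDataString(address);
        var doc = XDocument.Parse(new WebClient().DownloadString(url));
        if (doc.Root.Element("status").Value != "OK") return;

        var geometry = doc.Root.Element("result").Element("geometry");
        var location = geometry.Element("location");
        Console.WriteLine("{0}, {1} ({2})",
            location.Element("lat").Value,
            location.Element("lng").Value,
            geometry.Element("location_type").Value); // ROOFTOP, APPROXIMATE, ...
    }
}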

They also provide an Excel template which contains some VBScript voodoo talking to the Google Maps API. So you can export your data into that file, add an extra column with your primary key, geocode all your data and then import it back into your database. The limit here is 15k requests per day (that’s if you keep your IP address :-) ).

A very similar service comes from the guys at Juice Analytics, called “Excel Geocoding Tool v2”. Again, they provide you with an Excel template to which you can export your existing data. They are using the Yahoo geocoder, and according to their site the current request limit is 5k per day. batchgeocode.com was offline during the day, or at least I couldn’t connect to their site, so I used the Juice Analytics Excel file. I’ve used the batchgeocode service before, but I will definitely give it another go tomorrow and compare the accuracy and the ability to handle dodgy address data.

Where is Flo?

A while back I posted about a fresh start project-wise, and that I was looking forward to blogging about my new project and all the development stuff that comes along with it. As you can see (or not), I haven’t posted an awful lot recently. Apart from being quite busy, one reason for not blogging too intensively is that most of the work so far was around scaffolding and kind of standard CRUD stuff on the frontend side.

Although I think we came up with a pretty cool, lightweight and flexible framework on top of ASP.NET MVC, nHibernate, the Spark View Engine and jQuery, I just didn’t feel this was worth writing extensive blog posts about. Maybe I’ll change my mind and do some blogging in retrospect about all the scaffolding work done so far.

For the next couple of weeks the work will include batch geocoding, legacy data import, Google Maps integration plus radius search based on the geocoded data, implementing a payment provider, as well as working on a little search engine. Some of this stuff is fairly new to me and not typical of the line-of-business applications I’ve mostly worked on so far.

I won’t promise anything, but the chances are quite high that I’ll write something about one or the other task that I’m going to stumble across, as I find them much more exciting than implementing the repository pattern using nHibernate to get your User/Roles stuff materialized properly :-)

More Padness: jQueryPad

Just a quick one:
Similar to, and maybe even more useful than, the LINQPad I mentioned in my previous post is Paul Stovell’s jQueryPad. Guess what it does :-)

jQueryPad is a fast JavaScript and HTML editor. Just start it, enter the HTML you want to work with, bash in your jQuery code, and hit F5 to see the results. Say goodbye to ALT+TAB.

jQueryPad

LINQPad and the magic samples

One of the tools I find myself using quite a lot recently is LINQPad. In my new project we are using NHibernate, which together with the Linq provider is a real joy. I can still remember when I used NHibernate for the first time in a project, and to be honest, the Criteria API for writing queries is just not nice. As soon as it comes to more complex search queries or the dynamic combination and aggregation of criteria (e.g. from a search form), it can get quite confusing. Or let me put it a different way: in the end the Criteria API works absolutely fine and you can write all the queries you want, simple and complex ones; the only problem I always had was that it just didn’t feel natural.
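
To illustrate, here is the same trivial lookup twice, once through the Criteria API and once through the (old contrib) Linq provider; the User entity and its Name property are made up:

using System.Linq;
using NHibernate;
using NHibernate.Criterion;
using NHibernate.Linq; // the old Linq-to-NHibernate contrib provider

public static class QueryComparison
{
    // 'session' is an open ISession; 'User' is an illustrative mapped entity.
    public static void Compare(ISession session)
    {
        var viaCriteria = session.CreateCriteria(typeof(User))
            .Add(Restrictions.InsensitiveLike("Name", "flo", MatchMode.Start))
            .AddOrder(Order.Asc("Name"))
            .List<User>();

        var viaLinq = session.Linq<User>()
            .Where(u => u.Name.StartsWith("flo"))
            .OrderBy(u => u.Name)
            .ToList();
    }
}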

One important point of using an ORM, for me, is to get away from my stored procedures and SQL code and create a certain level of abstraction. HQL or the Criteria API always felt like something in between. We are using the repository pattern, which implements IQueryable<T>, and thus querying against our repository feels natural and integrates really nicely. Nonetheless, I had a query today where I had to use HQL rather than Linq, simply because I couldn’t figure out a nice and fast way to write the query using Linq.

Anyway, back to LINQPad. Don’t know what LINQPad is? Here is what they say on their website:

LINQPad lets you interactively query SQL databases in a modern query language: LINQ.  Kiss goodbye to SQL Management Studio!

LINQPad supports everything in C# 3.0 and Framework 3.5:

  • LINQ to Objects
  • LINQ to SQL
  • Entity Framework
  • LINQ to XML
  • (Even old-fashioned SQL!)

Sadly, in order to get IntelliSense support you have to pay something (OK, not a lot, but…). But the best thing about LINQPad for me is the “Samples” section, packed with 200+ examples from the book C# in a Nutshell. I find myself looking up examples more often than actually using the tool itself.

You will find far more complex examples than the usual “get me all orders where the date is in the past”, and especially the examples with lambda expressions are very helpful. Although my head still hurts after too much lambda Quatsch today…

A Fresh Start

Well, one would hope so!

I haven’t really blogged an awful lot in the past. Well, at least not in English. I’ve been blogging a bit over on my German blog, but that’s mostly non-techie stuff. In fact most of it is about my experience as a German guy living and working in Glasgow, Scotland.

The main reason why I decided to start this brand new shiny blog is because I’m working on a new project!

Yes I know, how exciting. Well, we all know that chances to work on a greenfield project are quite rare. I am still working (as a freelancer) for the same company in Glasgow. After working for the past year or so on one of their flagship products, I now get the chance to work on something completely different. I can’t tell you too much about the new project, but I can tell you about the technologies we’re going to use.

After doing a couple of private projects using the asp.net mvc framework I think it’s now time for prime time.

Here are some of the key technologies we’re going to use:

  • asp.net mvc
  • spark view engine (Ja! No tag soup please!)
  • nHibernate (because database doesn’t matter)
  • StructureMap (cause I like to be independent)
  • jQuery

One of the reasons I am so excited about the new project is the fact that most of the “professional” projects I worked on in the past used ASP.NET WebForms, no proper ORM, no IoC, and Microsoft UpdatePanel Quatsch.

I am really looking forward to the next couple of weeks. I already know that I am going to bang my head on the table when it comes to jQuery function callbacks (actually I hate JavaScript), nested transactions, the unit of work pattern in nHibernate, and how to tie that all together using an IoC container. But in the end it’s all a great opportunity to dive into these new technologies, much deeper than I could in my spare time. Plus I’m working with a really smart guy on it. (Yes, he actually likes jQuery :-))

I will use this blog to write about all the new stuff I will learn alongside this project and about all the stuff which has absolutely nothing to do with it. Ja, ich weiss!

Loading projects in Visual Studio using the Shared Profile feature in Parallels

I recently came across a strange problem when setting up my Windows machines using Parallels on my MacBook. I recently bought a MacBook and do all my development stuff using Parallels to host virtual Windows machines. I have two virtual machines running: one Windows XP box and one Windows Server 2008 box, which is my main development machine. However, the problem occurred on both machines.

Scenario:

As I said, the set-up was fine, and installing all the necessary tools like Visual Studio 2008 and the rest of my development stack was no problem at all. The problem was to get my “dev” folder, with all my projects and source code libraries, onto the virtual machine. The first choice for me was to use the “Shared Profile” feature in Parallels. I hate having files in multiple locations, and I haven’t found a free Subversion client for Mac OS X so far.

If you enable Shared Profile in the Parallels settings, your existing profile on the Mac gets mapped to the virtual Windows machine (the same works for normal folders on the Mac via Shared Folders), allowing you to have just one folder where you store all your stuff like photos, documents and, yes, source code as well. I wanted to have my projects available on my Mac too, so I could use for example Espresso to do some of the HTML/CSS work on the Mac while all the hardcore C# coding happens on my virtual machine in Visual Studio. Now, the problem was that when I tried to open one of my projects, Visual Studio refused with an error.

Basically the problem is that the source folder is effectively on a network share, a.k.a. my MacBook. Now, this network share is by default not a trusted location, so Visual Studio won’t load my solution. Every folder on that share was marked as being in the “Internet” zone and thus not trusted.

Solution:

Well, there is already a great post about how to solve the trusted location problem, and after reading it, and also reading some in-depth stuff about Caspol.exe (Code Access Security Policy Tool) and how to fully trust a share, guess what: it didn’t work. Same error message, same problem. After hours and hours of fiddling around with caspol and the security settings, I finally found the solution.
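
For the record, the usual caspol advice boils down to an incantation along these lines (the share path is illustrative, adjust it to wherever Parallels mounts your profile); it made no difference in my case:

caspol -machine -addgroup 1.2 -url "file://\\psf\Home\dev\*" FullTrust -name "ParallelsShare"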

The trick is to simply add the share on my MacBook to the trusted sites in the “Local intranet” zone. This can be done in the Internet Explorer settings under Internet Options > Security > Local intranet > Sites.

After I added my share to the local intranet zone, the share and its content appeared as being in the “Local intranet” zone. Loading projects from my MacBook share in Visual Studio works like a charm! :-)

Copyright © 2016 florianb.net
