GeoCommunity Mailing List Archives
Subject: RE: GISList: Geocoder comparisons * Build your own
Date: 02/06/2003 07:55:52 PM
From: Neil Havermale
There is no better speed-up to geocoding than clean data. If you want simple geocoding, it's generally included in the core product at no charge! The real cost is accurate street information with accurate addressing, as well as CLEAN data that will match up. Data cleaning prior to geoprocessing is a big value-add in terms of "hits" and efficiency.
Andrew offers a smart and low-cost method, FYI...
MidNight Mapper Aka neil
-----Original Message-----
From: Canfield, Andrew [mailto:Andrew_Canfield@cable.comcast.com]
Sent: Thursday, February 06, 2003 12:28 AM
To: 'Alex Eshed': MapInfo-L (E-mail)
Subject: RE: MI-L Geocoding Scrubber!!!
I deal with this issue every day. Unfortunately, my solution will only work in the US and Canada. There is an AI company called SemaphoreCorp that makes a product called ZP4. It handles everything from common spelling mistakes to phonetic string building. A lot of great programming went into the Semaphores within the program.
The user interface side of it leaves a lot to be desired because it wasn't really meant to be used as a standalone product. You can use it that way, but it works best as a plugin. The great news for MapBasic people is that ZP4's native connection language is DDE, not OLE, so you can plug it directly into your MapBasic app. ZP4 has the complete USPS database compressed into about four or five database files. It scrubs, then corrects, and gives you a ZIP + 4, hence the name. It will also do DPV validation based on the postal records (DPV means delivery point validation: whether or not the address has ever received mail before). You can then run the addresses through MapMarker to do your geocode. What I did was can the whole thing so that my users click a button, browse to the text file of addresses, then click run, and they end up with a table of scrubbed and geocoded addresses with no further input. You also have to use the MapMarker API to do that.
The bare-bones easiest way, if you have to have it right now, is to scrub the text file with ZP4, make a table of it with MapInfo, and send that through MapMarker. You're done in three steps rather than one, but there is no programming involved for those who don't want to or can't do the programming. ZP4 is also really cheap and very fast: I clean a million bad addresses in ten to fifteen minutes on my machine. It takes longer on a slower machine, but it is still very fast. I hate sounding like an ad, but I know how frustrating it can be trying to find good cleaning software that doesn't cost 30K, so I will include the link to SemaphoreCorp's site: http://www.semaphorecorp.com/cgi/zp4.html.
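For anyone who would rather script the scrub-then-geocode flow than run it by hand, the batch idea described above could be sketched roughly as follows. This is only an illustration in Python, not Andrew's MapBasic app: `scrub`, `geocode`, and `run_batch` are hypothetical names, and the two stand-in functions do not call ZP4 or MapMarker. You would replace them with whatever interface your own scrubber and geocoder expose.

```python
import csv

def scrub(address):
    """Hypothetical stand-in for the scrubbing step (not ZP4).
    Here it only uppercases and collapses whitespace."""
    cleaned = " ".join(address.upper().split())
    return cleaned or None

def geocode(address):
    """Hypothetical stand-in for the geocoding step (not MapMarker).
    Should return a (lon, lat) tuple, or None when there is no match."""
    return None

def run_batch(in_path, out_path, scrub_fn=scrub, geocode_fn=geocode):
    """Read one address per line, scrub and geocode each one,
    and write a table of results. Returns (hits, total)."""
    hits = total = 0
    with open(in_path, newline="") as src, \
         open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow(["original", "scrubbed", "lon", "lat"])
        for line in src:
            raw = line.strip()
            if not raw:
                continue  # skip blank lines
            total += 1
            cleaned = scrub_fn(raw)
            point = geocode_fn(cleaned) if cleaned else None
            if point:
                hits += 1
            lon, lat = point if point else ("", "")
            writer.writerow([raw, cleaned or "", lon, lat])
    return hits, total
```

Keeping the original address next to the scrubbed one in the output makes it easy to review the records that still failed to match.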
The postal database updates come once every two months, if I remember correctly, so it runs about $950 a year. That's the total cost: there is no startup or app fee, and the app is on the first CD every time they send it to you, in case you need to reinstall it or whatever. It's a bit more expensive if you run it on a network and allow everyone access to it, but still not bad. The next best one I found with that much AI in it was 30K to start and 6K a year to maintain. With ZP4 it's always just $950, and if you don't want DPV it's only $475 a year. For a while, over on Jacques's site, he had an example of my first app, which canned the whole process. It is meant only as an example of how one might choose to do this, so you can use it as source code, but it's not meant for production. His site address is http://www.paris-pc-gis.com/index.html. In the left frame, about three quarters of the way down, is a link called "files from other origins". Click that and the center frame should change; about halfway down that page is a file called geomapbatch, which is meant as a source example only. Hope this helps.
-----Original Message-----
From: Alex Eshed [mailto:Alexe@opisoft.com]
Sent: Wednesday, February 05, 2003 12:07 AM
To: MapInfo-L (E-mail)
Subject: RE: MI-L Geocoding Scrubber!!!
Greetings, List.
I'm sure this topic has occupied the minds of everyone involved in geocoding by address.
My doubts coincide with Bill's. A very high level of artificial intelligence would be needed to identify the geocodable address behind the "telegraph mode" or "colloquial description."
So it seems that the most efficient way would be to employ a table of synonyms to be scanned before geocoding. This is far more complex than it sounds. It will work only in your language, and needs to address (no pun intended) the most frequent typos, blunders and shorthand encountered.
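A synonym scan of the kind described above might look something like the following. This is a minimal sketch in Python, not the PreGeocoder's actual method; the `SYNONYMS` entries and the `normalize_address` function are made up for illustration and cover English street terms only.

```python
import re

# Hypothetical synonym table: frequent shorthand and typos
# mapped to one canonical spelling. A real table would be much larger.
SYNONYMS = {
    "st": "STREET",
    "str": "STREET",
    "stret": "STREET",
    "ave": "AVENUE",
    "av": "AVENUE",
    "avnue": "AVENUE",
    "rd": "ROAD",
    "blvd": "BOULEVARD",
    "n": "NORTH",
    "s": "SOUTH",
}

def normalize_address(raw, synonyms=SYNONYMS):
    """Split the address into alphanumeric tokens, replace any token
    found in the synonym table, and uppercase the rest."""
    tokens = re.findall(r"[A-Za-z0-9]+", raw)
    return " ".join(synonyms.get(t.lower(), t.upper()) for t in tokens)

print(normalize_address("123 n. main stret"))  # 123 NORTH MAIN STREET
```

Even this toy version shows why the job is harder than it sounds: "st" can mean "street" or "saint" depending on where it sits in the address, so a blind token lookup will get some addresses wrong.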
We employ this method in our PreGeocoder. Several other factors are also considered. For instance, the program suggests synonyms to the user when none are found in the scan. Accepted suggestions are added automatically so that the program "learns" by frequent usage. Several soundex-like tests are also incorporated. Despite the fact that many "gibberish"
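To illustrate the "soundex-like" tests and the learning step mentioned in this message, here is a rough sketch using the standard American Soundex code. The PreGeocoder's real tests are not described here, so `soundex`, `suggest`, and `learn` are hypothetical helpers, written in Python purely as an example.

```python
# Letter-to-digit groups of the standard American Soundex code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]

def suggest(token, known_names):
    """Suggest known names that sound like an unrecognized token."""
    target = soundex(token)
    return [name for name in known_names if soundex(name) == target]

def learn(synonyms, token, accepted):
    """Record a suggestion the user accepted so it is found directly next time."""
    synonyms[token.lower()] = accepted

synonyms = {}
candidates = suggest("Washingtun", ["Washington", "Jefferson"])
if candidates:
    learn(synonyms, "Washingtun", candidates[0])
```

Soundex is a coarse test and will happily match unrelated names that share consonants, which is one reason to ask the user to confirm a suggestion before it is stored.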