I'm trying to compress all the lines in the US Census Bureau's TIGER data set into as small a space as possible for use with a geocoder, so I'm looking for ideas on how best to do this. I've managed about 60% compression by using coordinates clipped to the nearest 0.00001 degrees and applying a delta compression technique from there. (To do this I store the first coordinate as a pair of signed long integers, then I store the difference between that and the next coordinate as a pair of bytes, then the difference to the following point as a pair of bytes, and so on. If the difference between subsequent points is greater than 255 in either x or y, I just store the full coordinate and start again from there.) I trim the coordinates to 5 decimals because, for the purposes of locating an address, the sixth digit contributes only a few feet to the precision, and so it is not really necessary.
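To make the scheme concrete, here is a minimal Python sketch of it, not my actual code. It makes several assumptions that are not spelled out above: deltas are stored as signed bytes in the range -127..127 (rather than a 0..255 range), the value -128 in the x slot is a hypothetical escape marker meaning "a full coordinate follows", and integers are packed little-endian. The names SCALE, ESCAPE, encode and decode are illustrative only.

```python
import struct

SCALE = 100_000   # coordinates are stored in units of 0.00001 degrees
ESCAPE = -128     # assumed marker (not in the original scheme): full coordinate follows


def encode(points):
    """Pack (lon, lat) pairs: full points as two int32, steps as two int8 deltas."""
    out = bytearray()
    prev = None
    for lon, lat in points:
        x, y = round(lon * SCALE), round(lat * SCALE)
        if prev is not None:
            dx, dy = x - prev[0], y - prev[1]
            if -127 <= dx <= 127 and -127 <= dy <= 127:
                # Small step: two bytes instead of eight.
                out += struct.pack("<bb", dx, dy)
                prev = (x, y)
                continue
            # Step too large: flag a restart, then fall through to a full point.
            out += struct.pack("<b", ESCAPE)
        out += struct.pack("<ii", x, y)
        prev = (x, y)
    return bytes(out)


def decode(data):
    """Reverse encode(); returns a list of (lon, lat) floats."""
    if not data:
        return []
    x, y = struct.unpack_from("<ii", data, 0)
    pos = 8
    pts = [(x, y)]
    while pos < len(data):
        (dx,) = struct.unpack_from("<b", data, pos)
        if dx == ESCAPE:
            # Restart: read a full coordinate after the marker byte.
            x, y = struct.unpack_from("<ii", data, pos + 1)
            pos += 9
        else:
            (dy,) = struct.unpack_from("<b", data, pos + 1)
            x, y = x + dx, y + dy
            pos += 2
        pts.append((x, y))
    return [(px / SCALE, py / SCALE) for px, py in pts]
```

In this sketch a chain of nearby vertices costs 2 bytes per point after the first, while a restart costs 9 bytes (marker plus full coordinate), so the savings depend on how often consecutive vertices fall within the one-byte range.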
My question: is there a better way to approach this problem that gets significantly greater compression, yet still retains good retrieval performance?
- Bill Thoen