My original message appears at the bottom.
A big thank you to Bill Huber, who detailed the analytic solution (which is complex) and went on to say that, because the exact solution is so complex, the problem is most commonly solved numerically, as I suggested in my original question. I am including Bill's response in full.
Bryan
Here's Bill's answer:
> -----Original Message-----
> From: Quantitative Decisions [mailto:whuber@quantdec.com]
> Sent: Monday, March 22, 2004 13:13
> To: bryan@geomega.com
> Subject: Re: [gislist] average distance between polygons
>
> People have been doing things like this for over 300 years, but I do
> not know whether there's a formal name for it.
>
> Geostatisticians have grappled with a similar problem: the total
> covariance between the mean value of a random field and a datum
> obtained over a finite region is an integral of a function (the
> covariance) that depends explicitly on 'r'. The linear covariance
> model is equivalent to your problem. Tables of results, where the
> regions are rectangles, appear in Journel and Huijbregts ("Mining
> Geostatistics," Academic Press, 1978).
>
> Such expressions even appear in quantum mechanics, which must evaluate
> similar integrals involving differential operators that can depend on
> the separation 'r' between points.
>
> The exact, theoretical answer is much messier than you would think.
> For starters, it's a four-dimensional integral. For two polygonal
> regions 'A' and 'B', we can first reduce the problem to an integral
> over 'B' of the mean distance to 'A', so the 4D integral becomes a
> repeated double integral. The mean distance from a point 'x' (in 'B')
> to 'A' in turn can be expressed as a _finite_ weighted sum of mean
> distances to triangles forming 'A'. Each of those can be dissected
> into (at most) a pair of right triangles. Let's see what this looks
> like in the case of one right triangle, just to take the simplest
> instance. Remember, we will later have to integrate (over 'B')
> whatever we end up with.
>
> By translating, rotating, reflecting if necessary (which will not
> change mean distances) and by choosing our distance units
> appropriately, we may assume the right triangle has vertices at (0,0),
> (1,0), and (1,b), with b positive.
> To give you a sense of what's going on, let's compute the mean
> distance between this triangle and the origin: a very special and
> simple case of what needs to be done in general. That's the integral
> of sqrt(x^2 + y^2) over the triangle, divided by the triangle's area.
> (The area is easy to find, so let's forget about it for now.) This
> expression suggests we use polar coordinates (r, t), in which the
> triangle is bounded by the line t = 0, the line t = ATan(b), and the
> line r = sec(t). Now sqrt(x^2 + y^2) = r and the element of area is
> r*dt*dr, so we need to integrate r^2*dt*dr over this region.
> Integrating over r gives r^3/3*dt, which in turn must be evaluated
> from r = 0 to r = sec(t), leaving sec^3(t)*dt/3 to be integrated
> between t = 0 and t = ATan(b). Using routine tricks of integral
> calculus yields [sec(t)*tan(t) + ln|sec(t) + tan(t)|]/6 evaluated
> between 0 and ATan(b), which reduces to
> [b*sqrt(1+b^2) + ln(b + sqrt(1+b^2))]/6, because
> sec^2(ATan(b)) = 1 + b^2. Dividing by the triangle's area of b/2 gives
> a mean value of
>
> [sqrt(1+b^2) + ln(b + sqrt(1+b^2))/b]/3.
>
> I told you it was messy ... and this is the simplest case!
>
> As a check, notice that the limiting form of this answer as b goes to
> zero is the mean triangle width of 2/3, as one would suppose from
> elementary geometry, and likewise as b gets very large, this
> expression is close to the mean triangle height of b/3, again as
> expected.
>
> For something as simple as an isosceles right triangle, though, with
> b = 1, we get
>
> [sqrt(2) + ln(1+sqrt(2))]/3 = about 0.7652
>
> for the mean distance to an acute vertex. Given that distances within
> the triangle range from 0 to sqrt(2) = 1.4142, and most of the
> triangle lies far from the vertex, this result looks about right.
>
> We see now that there's a good reason neither your high school
> geometry book nor, for that matter, your calculus book entered into
> questions of mean distances between sets of points...
>
> F
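The one-triangle result can be checked numerically, which also illustrates the kind of numerical approach mentioned at the top. What follows is only a minimal sketch, not Bill's method: it assumes the triangle with vertices (0,0), (1,0), (1,b), uses the standard antiderivative of sec^3(t), namely [sec(t)*tan(t) + ln|sec(t) + tan(t)|]/2, which leads to the closed form [sqrt(1+b^2) + ln(b + sqrt(1+b^2))/b]/3, and compares that against a plain Monte Carlo average. The function names, sample size, and seed are illustrative choices of mine.

```python
import math
import random


def mean_distance_closed_form(b):
    """Mean distance from the origin to a uniformly random point in the
    right triangle with vertices (0,0), (1,0), (1,b), for b > 0.

    Comes from integrating sec^3(t)/3 over 0 <= t <= atan(b) and then
    dividing by the triangle's area b/2.
    """
    root = math.sqrt(1.0 + b * b)
    return (root + math.log(b + root) / b) / 3.0


def mean_distance_monte_carlo(b, n=200_000, seed=0):
    """Estimate the same mean by sampling points uniformly in the triangle.

    The triangle's width at abscissa x is b*x, so x has density 2x on
    [0, 1] (sampled as sqrt(U)), and y is then uniform on [0, b*x].
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = math.sqrt(rng.random())
        y = b * x * rng.random()
        total += math.hypot(x, y)  # distance from the origin to (x, y)
    return total / n


if __name__ == "__main__":
    for b in (0.001, 1.0, 1000.0):
        exact = mean_distance_closed_form(b)
        approx = mean_distance_monte_carlo(b)
        print(f"b = {b:g}: closed form {exact:.4f}, Monte Carlo {approx:.4f}")
```

Small values of b should give results near 2/3 and large values should give results near b/3, matching the limiting checks in the message. The same sampling idea extends to two polygons by drawing one random point in each and averaging the distances, at the cost of sampling noise that shrinks only as 1/sqrt(n).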