• Welcome to the Internet Infidels Discussion Board.

Average Distance between random points

SLD

Contributor
Joined
Feb 25, 2001
Messages
6,449
Location
Birmingham, Alabama
Basic Beliefs
Freethinker
Ok, take a disk of radius 10. Drop 10 random points throughout the disk. Just close your eyes and do it. No bias.

What’s the average distance between the 10 points?

And how do you do this?
 
Well, I have an integral:
10./(2*Pi)*integ(sqrt(r1^2 + r2^2 -2*r1*r2*cos(phi)))*d(r1^2)*d(r2^2)*d(phi))

r1=[0,1]
r2=[0.1]
phi=[0,2*Pi]

Number of point is irrelevant, 2 points is enough to have distance between 2 random points. The rest is just geometry and integration.
 
Ok, take a disk of radius 10. Drop 10 random points throughout the disk. Just close your eyes and do it. No bias.

What’s the average distance between the 10 points?

And how do you do this?

Good question. Assuming "No bias" means equal probability density at every point, the simplest way is to drop about 14 random points throughout the circumscribed square and discard the ones that land outside the circle.

r1=[0,1]
r2=[0.1]
phi=[0,2*Pi]

That's going to bias the distribution, with the center region of the circle more densely packed with points than near the periphery.
 
What if you did it through rectilinear coordinates or polar coordinates, will it be different?
 
I did this problem with Mathematica. For radius = 1, the solution is (128)/(45*Pi)

I did Integrate three times, one for each variable, then I checked what I did with NIntegrate. The numerical value of this quantity is 0.905415
 
Obviously this question is more complicated than I originally envisioned. If it’s easier on a square, that’s fine. My first thought was to simply divide the area by ten, but that’s not right. With a radius of 20, the average distance is greater than the diameter. Nope.

I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?

In our thread discussing this I came up with 29,059. But I don’t think that’s right.
 
Obviously this question is more complicated than I originally envisioned. If it’s easier on a square, that’s fine. My first thought was to simply divide the area by ten, but that’s not right. With a radius of 20, the average distance is greater than the diameter. Nope.
Oh, you need a number, not an expression
I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?

In our thread discussing this I came up with 29,059. But I don’t think that’s right.
That's a different problem. Average distance between civilizations is much higher than average distance to the closest one.
30,000 l.y. is about right
 
I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?
A naive way of doing such calculations: find the number of civs per unit disk area, then find the distance as (civs/area)^(-1/2). Likewise for volumes: (civs/volume)^(1/3)

All that a more precise calculation will do is multiply that number by some number near 1, likely in range 1/2 to 2.

This is a "back of the envelope" sort of calculation, and one can go remarkably far with such calculations.
 
Obviously this question is more complicated than I originally envisioned. If it’s easier on a square, that’s fine. My first thought was to simply divide the area by ten, but that’s not right. With a radius of 20, the average distance is greater than the diameter. Nope.
Oh, you need a number, not an expression
I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?

In our thread discussing this I came up with 29,059. But I don’t think that’s right.
That's a different problem. Average distance between civilizations is much higher than average distance to the closest one.
30,000 l.y. is about right

I am interested in the mathematics of it. But you’re right it’s two different problems.
 
I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?
A naive way of doing such calculations: find the number of civs per unit disk area, then find the distance as (civs/area)^(-1/2). Likewise for volumes: (civs/volume)^(1/3)

All that a more precise calculation will do is multiply that number by some number near 1, likely in range 1/2 to 2.

This is a "back of the envelope" sort of calculation, and one can go remarkably far with such calculations.
It's 2
average distance to the closest star is 2*sqrt(1./denisty)
this is for 2 dimensions.
for 3 it will be a different constant.
 
r1=[0,1]
r2=[0.1]
phi=[0,2*Pi]

That's going to bias the distribution, with the center region of the circle more densely packed with points than near the periphery.

Look again. Tricky Mr. Barbos wrote "d(r1^2)*d(r2^2)" where most of us would have formulated with the more prosaic "d(r1)*d(r2)."

Ok, take a disk of radius 10. Drop 10 random points throughout the disk. Just close your eyes and do it. No bias.

What’s the average distance between the 10 points?

And how do you do this?

Good question. Assuming "No bias" means equal probability density at every point, the simplest way is to drop about 14 random points throughout the circumscribed square and discard the ones that land outside the circle.
You'll only need 12.7 points on average to get 10 points on the disk this way. (Though you might get unlucky and need many more.) This 21.2% wastage of variates is trivial.

For the same task on a 3-ball, that method will waste 47.6%, 5-ball 83.6%, 9-ball 99.4%; for a 15-ball 99.999% of your variates will be discarded.  Volume_of_an_n-ball

For a 59-ball, which may arise in some applications — the dimensions need not be spatial — a whopping 99.9999999999999999999999999999999983% of points in a 59-D hypercube lie outside the inscribed 59-D hyperball. An alternative approach is needed! I do it with 59 Gaussian variates and a 60th random number (p^(1/59)) to normalize distance to origin.
 
I was doing this to figure out the average distance to our closest technological civilization in the galaxy. Assuming that there are 10 such stars in our galaxy of 185,000 light years diameter, where’s the closest?

In our thread discussing this I came up with 29,059. But I don’t think that’s right.
That's a different problem. Average distance between civilizations is much higher than average distance to the closest one.
30,000 l.y. is about right
The average distance from a civilization to the closest other one is 29K and change, yes; but that's not the same as the average distance to our closest technological civilization, because we aren't at a random unknown position in the galaxy. We're 27K l.y. from the center. Assuming uniform probability distribution of the other nine, my program gets a 26,250 l.y. average distance to the closest one to us. (Of course we shouldn't assume uniform probability distribution of the other nine; it would be better to plug in the actual density of stars and the actual thickness of the galactic disk as a function of radius. Then again, maybe we should assume everybody is at roughly 27,000 l.y. radius, since further out the stars are scattered thinly, and closer in there's so much dangerous astrophysics going on nearby that the mass extinctions come too often for anyone to develop technological civilization...)
 
Look again. Tricky Mr. Barbos wrote "d(r1^2)*d(r2^2)" where most of us would have formulated with the more prosaic "d(r1)*d(r2)."
So he did. Good catch.

Good question. Assuming "No bias" means equal probability density at every point, the simplest way is to drop about 14 random points throughout the circumscribed square and discard the ones that land outside the circle.
You'll only need 12.7 points on average to get 10 points on the disk this way.
12.7 is about 14, for sufficiently large values of "about". :)
 
It's 2
average distance to the closest star is 2*sqrt(1./denisty)
this is for 2 dimensions.
for 3 it will be a different constant.
How does that come about?

The average nearest-neighbor distance is the average of (NNDF)*(density)^(-1/dimension) where NNDF is the nearest-neighbor distribution function.

So to find NNDF, decided to do some random-number simulations. Mean and median:
  • 1: 0.50, 0.34
  • 2: 0.50, 0.46
  • 3: 0.55, 0.55
  • 4: 0.61, 0.61

In these simulations, I used periodic boundary conditions, to avoid boundary effects.
 
The nearest neighbor probability distribution and Lecture 19 - Nearest neighbor distribution

It does the 2D case, but it should be easy to generalize to different numbers of dimensions. Probability distribution w(r) for distance r with number density N:
\( w(r) \, dr = \left( 1 - \int_0^r w(r') \, dr' \right) N \cdot 2\pi r \, dr \)

Take d/dr
\( \frac{dw(r)}{dr} = - N \cdot 2\pi r w(r) + \left( 1 - \int_0^r w(r') \, dr' \right) N \cdot 2\pi = - N \cdot 2\pi r w(r) + \frac{1}{r} w(r) \)
This is easy to integrate:
\( w(r) = w_0 r e^{-N \pi r^2} \)
Making it add up to 1 over all space gives:
\( w(r) = N\cdot 2\pi r e^{-N\pi r^2} \)

For the 1D case,
\( w(r) \, dr = \left( 1 - \int_0^r w(r') \, dr' \right) N \, dr \)

With d/dr,
\( \frac{dw(r)}{dr} = - w(r) N \)

giving
\( w(r) = N e^{-N r} \)
 
Extending to n dimensions, the defining equation becomes
\( w(r) = \left( 1 - \int_0^r w(r') \, dr' \right) N \cdot n V_n r^{n-1} \)

where V(n) is the angular volume factor: V(1) = 1, V(2) = pi, V(3) = 4*pi/3, etc.
\( V_n = \frac{1}{n} \cdot \frac{2\pi^{n/2}}{\Gamma(n/2)} = \frac{\pi^{n/2}}{\Gamma(n/2+1)} \)

This expression gives V(1) = 2 instead of 1, but that's for extending in both directions instead of in only one direction from zero.

One can solve this equation in the same way as the others, and one finds
\( w(r) = N \cdot n V_n r^{n-1} e^{- N \cdot V_n r^n} \)

I will now try to calculate the mean and median.

The median is easy.
\( \int_r^{\infty} w(r) \, dr = e^{- N \cdot V_n r^n} \)

\( r_{median} = \left( \frac{\ln 2}{N \cdot V_n} \right)^{1/n} \)

The mean is more difficult. One must average r over the distribution:
\( r_{mean} = \int_0^{\infty} N \cdot n V_n r^n e^{- N \cdot V_n r^n} \, dr \)

Change variables from r to u:
\( r = \left( \frac{u}{N \cdot V_n} \right)^{1/n} \)

That gives us
\( r_{mean} = \left( \frac{1}{N \cdot V_n} \right)^{1/n} \int_0^{\infty} u^{1/n} e^{- u} \, du = \left( \frac{1}{N \cdot V_n} \right)^{1/n} \Gamma(1+1/n)\)
 
The numbers for my analytic expressions for the mean and median:
  • 1: 1, 0.693
  • 2: 0.5, 0.470
  • 3: 0.554, 0.549
  • 4: 0.608 0.612
I estimate an asymptotic limit for number of dimensions n:
\( \sqrt{ \frac{n}{2 \pi e} }\)

Evaluated for n = 1 to 4: 0.242, 0.342, 0.419, 0.483
 
It's 2
average distance to the closest star is 2*sqrt(1./denisty)
this is for 2 dimensions.
for 3 it will be a different constant.
How does that come about?



Code:
double min_dist(int nstar)
{
  nstar--;
  double mdist=10000;
  while(nstar--){
    double x=drand48()-0.5;
    double y=drand48()-0.5;
    double dist=sqrt(x*x+y*y);
    if(dist<mdist)mdist=dist;
  }
  return mdist;
}
double aver_min_dist(int tries, int nstars)
{
  double dt=(double)tries;
  double dist=0;
  while(tries--)dist+=min_dist(nstars);
  dist/=dt;
  double dd2=pow(1./(double)nstars,1./2);
  
  
  printf("Stars=%d aver_min=%g  sqrt(1./density)*%g\n",nstars,dist,dd2/dist);
  return dist;
}
int main()
{
  aver_min_dist(1000000,1000);
  for(int n=2;n <= 50;n++)aver_min_dist(10000000,n);
}
 
Back
Top Bottom