BioSPP: Statistical Methods

Distance and Nearest Neighbours

A fundamental concept in the analysis of point pattern data is distance. The (Euclidean) distance, d(i,j), between points i and j in 3D space with locations (or coordinates) (x_i, y_i, z_i) and (x_j, y_j, z_j) respectively is

d(i,j) = √{(x_i - x_j)² + (y_i - y_j)² + (z_i - z_j)²}

that is, the square root of the sum of the squared differences between the x, y and z coordinates.
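
As a minimal sketch of the formula above, the distance between two points can be computed directly from their coordinates. The function name euclidean_distance and the example coordinates are illustrative assumptions, not part of BioSPP.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two 3D points given as (x, y, z) tuples."""
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Hypothetical coordinates, chosen only for illustration.
point_i = (1.0, 2.0, 3.0)
point_j = (4.0, 6.0, 3.0)

print(euclidean_distance(point_i, point_j))  # 5.0
```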

For a set of n data points corresponding to object locations, we have a total of N = n(n-1)/2 distances between objects taken pairwise. We can arrange the distances into a symmetric n × n matrix, D, with (i,j)^th element d(i,j) and zeros on the diagonal.
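
The construction of D can be sketched as follows, assuming the points are held as a list of (x, y, z) tuples. The function name distance_matrix and the four example points are hypothetical; only the upper triangle is computed and then mirrored, which reflects the symmetry of D.

```python
import math
from itertools import combinations

def distance_matrix(points):
    """Symmetric n x n matrix of pairwise Euclidean distances."""
    n = len(points)
    D = [[0.0] * n for _ in range(n)]  # zeros on the diagonal
    for i, j in combinations(range(n), 2):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(points[i], points[j])))
        D[i][j] = d
        D[j][i] = d  # mirror, since d(i,j) = d(j,i)
    return D

# Hypothetical object locations, for illustration only.
points = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0), (3.0, 4.0, 12.0)]
n = len(points)
D = distance_matrix(points)

# The N = n(n-1)/2 distinct inter-point distances are the entries above the diagonal.
pairwise = [D[i][j] for i, j in combinations(range(n), 2)]
print(len(pairwise))  # 6
```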

The i^th row of D contains the distances from object i to all other objects; the nearest neighbour of i is the object, other than i itself, whose distance from i is smallest. That is, the nearest neighbour of i is denoted η_i and defined mathematically by

η_i = argmin_{j ≠ i} d(i,j)

and the nearest neighbour distance is denoted δ_i and defined by

δ_i = d(i,η_i) = min_{j ≠ i} d(i,j)

that is, the smallest observed distance from object i.
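
A minimal sketch of extracting η_i and δ_i from a distance matrix is given below. The function name nearest_neighbours and the example matrix are illustrative assumptions. The diagonal entry is skipped explicitly, since d(i,i) = 0 would otherwise always be the minimum, and indices are zero-based as is usual in Python.

```python
def nearest_neighbours(D):
    """Return (eta, delta): nearest neighbour index and distance for each object."""
    n = len(D)
    eta, delta = [], []
    for i in range(n):
        # Search row i over all j except i itself.
        j_best = min((j for j in range(n) if j != i), key=lambda j: D[i][j])
        eta.append(j_best)
        delta.append(D[i][j_best])
    return eta, delta

# Hypothetical symmetric distance matrix for four objects.
D = [
    [0.0, 5.0, 12.0, 13.0],
    [5.0, 0.0, 13.0, 12.0],
    [12.0, 13.0, 0.0, 5.0],
    [13.0, 12.0, 5.0, 0.0],
]

eta, delta = nearest_neighbours(D)
print(eta)    # [1, 0, 3, 2]
print(delta)  # [5.0, 5.0, 5.0, 5.0]
```

If two objects are equally close to i, this sketch returns the one with the smaller index; how ties should be handled in practice is a separate choice.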

Thus, for any data set of size n, we can extract a set of N inter-point distances, and a set of n nearest neighbour distances. These sets of distances can be used to test our hypotheses of interest.
