Subj : Re: Locating Nearest Neighbors in space (fast)
To   : comp.programming
From : Randy
Date : Fri Sep 09 2005 01:37 pm

KRK wrote:
> I like the idea about partitioning the space into several cubes.  You could 
> do that and move recursively to smaller and smaller cubes.  I think I will 
> try that.  As far as triangulation goes, I know what that is and use it with 
> matlab, but I don't have time to look into it and write code, test it, 
> etc... to do the triangulation for my particular problem.

Quadtrees are essentially the same idea.  Although the data structure is
fairly quick to use, you do spend time traversing pointers, as well as
building the tree.

> 
> I never heard of a 'Manhattan Distance'  it sounds reasonable to try.  It 
> would be interesting to see how fast one could make that method, and if you 
> could make it 100% accurate (I don't think so, but am not sure).  

It's not as accurate as sqrt(diffx^2 + diffy^2 + diffz^2).  But since
manhattan is always larger than euclidean distance, it's a useful upper
bound on excluding points from the search space.  And it's blazingly
fast on RISC CPUs.

> (By the 
> way, I would never dream of using sqrt() to compare distances in a large 
> problem, unless of course there was some reason it would be necessary for 
> accuracy -- i.e. the squares were very large numbers)

True.

> 
> One thing about this Manhattan distance.  Wouldn't it be better to use 
> abs(z2-z1) + abs(y2-y1) + abs(x2-x1).  Of course you would have to code your 
> own abs() function as a macro (can that be done efficiently??), or you would 
> be making to many calls.

Yes.  Without abs(), your differences too often will be incorrect.

Can you implement abs() efficiently?  Probably not more efficiently than
 the compiler or math library can.  A good compiler suite should use a
built-in implementation for abs rather than a function call.  I have no
idea if they do, however.

For the IEEE754 floating point number representation, the first bit is
the sign bit.  To convert a float to positive, you could just mangle the
value by ANDing the number with a binary mask that zeros out the first
bit, like this:

pos_number = (float) ((int) 0x8FFFFFFF & (int) (x2 - x1));  // abs

However, this code does not produce the desired result.  As C coerces
each variable's type, it also coerces its value.  Which is NOT what we
want.  I'd stick to using the native abs (fabs, actually).

I just checked GCC, and it uses a built-in for fabs -- no function call.
Here's what it looks like:

andb    $127, 3(%eax)

This ANDs 127 with the third byte of the float variable to zero out its
high bit.  For a double, it would be:

andb    $127, 7(%eax).

> 
> This is a nice newsgroup, lots of good responses, and quickly!

Maybe, but are the responses fast AND correct?  :-)

> 
> KRK 

    Randy

-- 
Randy Crawford   http://www.ruf.rice.edu/~rand   rand AT rice DOT edu

"Overstatement sucks." -- William of Ockham

.