
I currently have a series of points in $n$-dimensional space and a series of weights.

I wish to calculate the weighted distance between two points, using the following formula: $$d(a,b) = \sqrt{\sum_{i=0}^n w_i \left(X_i(a) - X_i(b)\right)^2}$$

where $w_i$ are my weights and $X_i(x)$ is the $i$th coordinate of $x$.

This works fine if my weights are all positive but it is possible for my weights to be negative, which means that sometimes the sum is negative and we cannot take the square root to get a real distance.

My first thought was to simply use $$d(a,b) = \sqrt{\left | \sum_{i=0}^n w_i \left(X_i(a)-X_i(b)\right)^2\right| }$$ but this intuitively feels wrong, although I can't work out why.

How can I fix this so I always find a real distance?


I couldn't find any relevant tags except radicals, feel free to add relevant ones!

lioness99a
  • If you want $d$ to be a metric, don't you need to worry about the triangle inequality? – quasi Sep 25 '18 at 10:08
  • @quasi What do you mean? Can you elaborate a bit more, please? – lioness99a Sep 25 '18 at 10:09
  • The triangle inequality is the condition $$d(a,b)+d(b,c)\ge d(a,c)$$ for all $a,b,c$. – quasi Sep 25 '18 at 10:10
  • I still can't see how that is relevant? I just want to find the distance between 2 points in $n$-dimensional space – lioness99a Sep 25 '18 at 10:12
  • The standard distance in $\mathbb{R}^n$ is a metric, i.e., for all $u,v,w$, we have $$d(u,u)=0$$ $$d(u,v)=d(v,u)$$ $$d(u,v)+d(v,w)\ge d(u,w)$$ Your weighted distance function clearly satisfies the first two properties. Are you saying you don't care about the third? – quasi Sep 25 '18 at 10:15
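
To make the triangle inequality concern concrete, here is a minimal numeric sketch in Python. The function name `abs_weighted_distance`, the weights, and the three points are all made up for illustration; the formula is the absolute-value variant proposed in the question.

```python
import math

def abs_weighted_distance(a, b, w):
    """sqrt(|sum_i w_i (a_i - b_i)^2|), the absolute-value variant."""
    s = sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))
    return math.sqrt(abs(s))

# Hypothetical 2D example with one negative weight.
w = (1.0, -1.0)
a, b, c = (0.0, 0.0), (1.0, 1.0), (2.0, 0.0)

d_ab = abs_weighted_distance(a, b, w)  # weighted sum is 1 - 1 = 0
d_bc = abs_weighted_distance(b, c, w)  # weighted sum is 1 - 1 = 0
d_ac = abs_weighted_distance(a, c, w)  # weighted sum is 4 - 0 = 4

print(d_ab, d_bc, d_ac)
print("triangle inequality holds:", d_ab + d_bc >= d_ac)
```

With these particular numbers, $a$ and $b$ are distinct points at "distance" zero, and going from $a$ to $c$ via $b$ costs less than going directly, so the third property fails for this choice of weights.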

1 Answer


What is your actual application? If you look at how you're going to actually use this "distance", the application might outright tell you what the right thing to do is.


As an example of something to do, this comes up in the theory of special relativity. In that application, what one does is to have two different kinds of distance:

  • Spacelike separation $\sqrt{x^2 + y^2 + z^2 - t^2}$, used when $t^2 < x^2 + y^2 + z^2$
  • Timelike separation $\sqrt{t^2 - x^2 - y^2 - z^2}$, used when $t^2 > x^2 + y^2 + z^2$

These two kinds of separation have qualitatively different interpretations in the actual application (the physical description of space and time).

Incidentally, whichever of the two formulas you use, the value is equal to $$ \sqrt{\left|t^2 - x^2 - y^2 - z^2 \right|}$$ However, this fact usually isn't used. If one wants to treat both kinds of separation in a single formula, one usually adopts the convention that "timelike separation is real and spacelike separation is purely imaginary" (or the other way around) and sticks to the formula $\sqrt{t^2 - x^2 - y^2 - z^2}$ for both kinds.
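
As a rough sketch of the "two kinds of distance" idea, one could keep the sign of the quantity under the square root as a label instead of discarding it. The function name `separation` is hypothetical, units are chosen so that the formulas above apply as written, and the boundary case $t^2 = x^2 + y^2 + z^2$ (not covered by the two bullets above) is labelled "lightlike" here.

```python
import math

def separation(t, x, y, z):
    """Return (kind, magnitude) for the interval t^2 - x^2 - y^2 - z^2."""
    s2 = t**2 - x**2 - y**2 - z**2
    if s2 > 0:
        return "timelike", math.sqrt(s2)
    if s2 < 0:
        return "spacelike", math.sqrt(-s2)
    # Boundary case: neither formula above applies strictly.
    return "lightlike", 0.0

print(separation(5, 3, 0, 0))
print(separation(3, 5, 0, 0))
print(separation(1, 1, 0, 0))
```

The first two calls give the same magnitude but different labels, which is exactly the information that taking an absolute value alone would throw away.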


Another option is to check whether, in your application, you can simply use the quantity

$$s(a,b) = \sum_{i=0}^n w_i \left(X_i(a) - X_i(b)\right)^2 $$

without taking a square root at all. This sidesteps your sign problem, and for many purposes this quantity is just as useful as the square-root version. Often it's even more useful.
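
For instance, if the goal is only to rank candidate points against a query point, a minimal sketch might compare $s$ directly. The names `weighted_sq_diff` and `nearest_index`, and the data, are made up for illustration. The example uses positive weights, where ranking by $s$ and ranking by $\sqrt{s}$ agree because the square root is increasing; with negative weights, whether the smallest (possibly negative) $s$ should count as "closest" is a decision the application has to make.

```python
def weighted_sq_diff(a, b, w):
    """s(a, b) = sum_i w_i (a_i - b_i)^2, with no square root taken."""
    return sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))

def nearest_index(query, cases, w):
    """Index of the case with the smallest s(query, case)."""
    return min(range(len(cases)),
               key=lambda i: weighted_sq_diff(query, cases[i], w))

# Hypothetical data: three stored cases and one query, two features.
w = (2.0, 0.5)
cases = [(0.0, 0.0), (3.0, 1.0), (1.0, 4.0)]
query = (1.0, 1.0)

scores = [weighted_sq_diff(query, c, w) for c in cases]
print(scores)
print(nearest_index(query, cases, w))
```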

  • I am making a classifier and need to know which case (point in $n$-dimensions) is closest, taking into account feature weights, to my test case (another point in $n$-dimensions) – lioness99a Sep 25 '18 at 10:19
  • @lioness99a: ... and what do you do with the result? Is it better to have $\sum_i w_i (X_i(a) - X_i(b))^2 = -1$ than it is to have $\sum_i w_i (X_i(a) - X_i(b))^2 = 0$? What about a value of $-1$ vs a value of $1$? –  Sep 25 '18 at 10:23
  • Surely distance between 2 points is just a positive integer? Thinking in 2D, say, are the weights not just scaling the axes and bringing points closer together/further away from each other, but in the end there will still be a positive distance between them (length of the line joining them). – lioness99a Sep 25 '18 at 10:39