I understand that I am using the terminology incorrectly, because a distance is defined between two objects while my question refers to $m$ objects. However, I don't know the correct name for this, so I use the word distance. Corrections are welcome.

Assume that there are $m$ sequences of $n$ real numbers: $$x_1, ..., x_n$$ $$y_1, ..., y_n$$ $$...$$ $$z_1, ..., z_n$$ How can I find constants $\lambda, \mu, ..., \nu$ such that, after the shifts $x_i \to x_i+\lambda,\; y_i \to y_i+\mu,\; ...,\; z_i \to z_i +\nu\;$ for $i = 1, ..., n$, all the sequences are as close as possible to each other? The distance between two sequences $x_1, ..., x_n$ and $y_1, ..., y_n$ is $s=\sum\limits_{i=1}^n(x_i-y_i)^2$.

Graphically, the original sequences may look like

[Figure: the original sequences]

and the shifted sequences are as follows:

[Figure: the shifted sequences]

I can calculate the shifting constants for two sequences, $x_1, ..., x_n$ and $y_1, ..., y_n$. To minimize $s=\sum\limits_{i=1}^n(x_i+\lambda-y_i-\mu)^2$, I differentiate $s$ with respect to either $\lambda$ or $\mu$ and set the derivative to zero, which gives $$\sum\limits_{i=1}^n x_i+n\lambda-\sum\limits_{i=1}^n y_i-n\mu=0$$ This equation is satisfied, for example, by the following $\lambda$ and $\mu$: $$\lambda=-\frac{1}{n}\sum\limits_{i=1}^n x_i,\;\;\;\mu=-\frac{1}{n}\sum\limits_{i=1}^n y_i$$ This means that I bring the two sequences as close as possible to each other if I subtract their averages from them. What should I do when there are more than two sequences?
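The two-sequence case can be checked numerically. The sketch below is only an illustration: the data are made up and the helper name `pair_distance` is hypothetical. It suggests that $s$ depends on the shifts only through the difference $\lambda-\mu$, and that subtracting the averages attains the minimum.

```python
def pair_distance(x, y, lam=0.0, mu=0.0):
    """Sum of squared differences after shifting x by lam and y by mu."""
    return sum((xi + lam - yi - mu) ** 2 for xi, yi in zip(x, y))

def mean(v):
    return sum(v) / len(v)

# Made-up example data (not from the question).
x = [1.0, 2.0, 4.0]
y = [5.0, 7.0, 6.0]

# Shifts proposed in the text: subtract each sequence's average.
lam, mu = -mean(x), -mean(y)
s_centered = pair_distance(x, y, lam, mu)

# Adding the same constant to both shifts leaves s unchanged,
# because only the difference lam - mu enters the formula.
s_common = pair_distance(x, y, lam + 10.0, mu + 10.0)

# Changing the difference lam - mu in either direction increases s.
s_perturbed = [pair_distance(x, y, lam + d, mu) for d in (-0.5, 0.5)]
```

Here `s_common` agrees with `s_centered` up to rounding, and both entries of `s_perturbed` are larger, so the mean-subtracting shifts are one of many equally good choices for two sequences.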

Looking at the pictures above, I need to shift the original sequences in such a way that the spread between the points $x_1,y_1,z_1$ + the spread between the points $x_2,y_2,z_2$ + ... + the spread between the points $x_n,y_n,z_n$ is minimized. The problem is that for each $i$, the order of $x_i,y_i,z_i$ changes. For example, $x_1<y_1<z_1$ but $y_2<z_2<x_2$ in the picture, so I don't know in advance the borders of the segments that I need to minimize.

For $m$ sequences $\{x_i\},\;\{y_i\},\;...,\{z_i\}$, I only need to know the distances between $m_i = \min(x_i,y_i,...,z_i)$ and $M_i = \max(x_i,y_i,...,z_i)$ at each $i$, and I don't care how the points are distributed between $m_i$ and $M_i$. So, this is similar to the calculation of the distance between two sequences.

Will the sequences be as close to each other as possible if I subtract their averages from them, that is

$$x_i \to x_i-\frac{1}{n}\sum\limits_{j=1}^n x_j,\;\;\;y_i \to y_i-\frac{1}{n}\sum\limits_{j=1}^n y_j,\;\;\;...,\;\;\;z_i \to z_i-\frac{1}{n}\sum\limits_{j=1}^n z_j$$

or are these shifts not optimal for $m$ sequences? How can I prove it?
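One way to test this before attempting a proof is to evaluate the objective on a small example. The sketch below assumes the quantity to minimize is $\sum_{i=1}^n (M_i-m_i)^2$ after shifting, where $M_i$ and $m_i$ are the largest and smallest shifted values at index $i$. The function name `spread_objective` and the three sequences are made up for illustration.

```python
def spread_objective(seqs, shifts):
    """Sum over i of (max - min)^2 of the shifted values at index i."""
    shifted = [[v + c for v in s] for s, c in zip(seqs, shifts)]
    return sum((max(col) - min(col)) ** 2 for col in zip(*shifted))

def mean(v):
    return sum(v) / len(v)

# Made-up sequences, m = 3 and n = 4 (not from the question).
x = [2.0, -2.0, 0.0, 0.0]
y = [-2.0, 2.0, 0.0, 0.0]
z = [0.0, 0.0, 0.0, 4.0]
seqs = [x, y, z]

# Candidate 1: subtract each sequence's average.
mean_shifts = [-mean(s) for s in seqs]
f_mean = spread_objective(seqs, mean_shifts)

# Candidate 2: a hand-picked alternative that shifts only z.
alt_shifts = [0.0, 0.0, -2.0]
f_alt = spread_objective(seqs, alt_shifts)

print(f_mean, f_alt)  # 42.0 40.0
```

On this data, subtracting the averages gives 42, while shifting only the third sequence by $-2$ gives 40. This does not identify the optimal shifts in general; it only indicates that, under the assumed objective, the averages need not be optimal once $m>2$.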

  • The notion of distance you are using is frequently called the "Euclidean norm" between the vectors $(x_i)_{1\leq i\leq n}$ and $(y_i)_{1\leq i\leq n}$ in $\mathbb{R}^n$. What are you trying to minimize in the general case when you have $\{x_i\}_{1\leq i\leq n}$, $\{y_i\}_{1\leq i\leq n}$, $\ldots$, $\{z_i\}_{1\leq i\leq n}$? Is it the sum of all pair-wise distances? – Zim Jun 18 '20 at 04:04
  • @Zim I am trying to minimize $\sum\limits_{i=1}^n(M_i-m_i)^2$, where $M_i=\max(x_i,y_i,...,z_i)$ and $m_i=\min(x_i,y_i,...,z_i)$ – Vladislav Gladkikh Jun 18 '20 at 05:18
