1

Say we have a signal s = [1,2,2,1,3,3,2,2,1,1] or similar (only 1,2,3 can occur). I want to look for a particular pattern, e.g. [1,3,3,2], which can occur anywhere (0 or many times) in the signal. I would like to perform the computation as a convolution with some filter f.

I've convinced myself that f = [100,1000,1000,10] will do the job, i.e. sending the pattern p = [1,3,3,2] to f = 10.^fliplr(p), in matlab syntax. That is, the convolution s*f will take the value 1*10 + 3*1000 + 3*1000 + 2*100 = 6210 if and only if we have the pattern p in the right place, at least as long as the pattern we are looking for is short enough.

For a long enough pattern, e.g. [1,1,1,1,1,1,1,1,1,1,2] (ten 1s followed by a 2), the corresponding filter f = [100,10,..,10] would give a "false positive" for the pattern [2,2,2,2,2,2,2,2,2,2,1], since the convolution would give the values 10*10 + 2*100 = 300 for the first, and 2*10*10 + 1*100 = 300 for the second. (That I needed 10 1s to "trick" the filter is obviously related to the fact that my filter "encodings" of the 1s and 2s differ by a factor 10. Changing the encoding to e.g. {1,2,3} -> {10,1000,100000} should let me handle correspondingly longer patterns).

I have a strong feeling that this is a very standard signal processing task. What is the standard terminology and methods concerning this problem? (I.e. what key words should I google.)

svangen
  • 299
  • 1
  • 7
  • "I would like to perform the computation as a convolution with some filter f" why? – msm Apr 18 '17 at 11:13
  • The reason is that I'm in matlab so I want to avoid for-loops. I thought of convolutions as an operation which is pretty fast and "vectorises" the computations, if I can just choose the appropriate filter. – svangen Apr 18 '17 at 11:25
  • You can use Matlab with other algorithms as well. Look for "pattern matching", "string searching", "sequence alignment", or similar things. – msm Apr 18 '17 at 11:30

2 Answers2

-1

The thing you are looking for is the matched filter, which is essentially the (optimal) linear approach to look for a known pattern in a signal (which may be impacted by additive zero mean noise).

The impulse response (i.e., the sequence you are convolving with) of the matched filter is just the pattern reversed, i.e. $f = [2, 3, 3, 1]$ will do the job. If the input signal is normalized in RMS amplitude, the matthed filter will give maximum output if and only if the pattern occurs.

Andreas H.
  • 1,525
  • Say that I'm looking for the pattern p = [1,1,1,3], so convolve with f = [3,1,1,1]. Then I get the value 12 both for the correct pattern, and the "false positive" [3,3,3,1], right? What am I missing here – svangen Apr 18 '17 at 11:03
  • Yes, this is true. But only because [3,3,3,1] is not normalized in amplitude. Its RMS (root mean square) value is higher than the one for [1,1,1,3]. This needs to be taken into account. So you need to determine the RMS value for each 4 element bin of the input and take this into account when calculating the threshold after matched filter. – Andreas H. Apr 18 '17 at 11:13
  • Actually, I find your approach of convolving with $[1, M, M^2, M^3, ...]$ quite smart. I would prefer it if it is known that the input data is between 0 and M and contains no noise. Then it gives the unique answers. However it does not work in presence of noise or if there is negative data. – Andreas H. Apr 18 '17 at 11:16
  • On the other hand, if you want to search for a pattern in data that contains no noise, a filter approach may not be the most efficient in the first place. – Andreas H. Apr 18 '17 at 11:17
  • yeah, point taken. Thanks! – svangen Apr 18 '17 at 11:26
-1

Assuming that the pattern $x$ you are looking for consists of $L$ samples (in your example, $x \in \{1,2,3\}^L$), the filtering operation with a filter of length $L$ essentially performs a weighted average of all length-$L$ sequences of the incoming signal with a weighting vector $w \in \mathbb{R}^L$. Therefore, if you want perfect detection, the following condition should hold $$ w^T x \neq w^T y, \text{ for all } y \in \{1,2,3\}^L, y \neq x. $$

The above set of equation can be compactly written as $$ A^Tw=b $$ where $A$ is an $L\times (3^L-1)$ matrix with columns equal to $x-y$, for all $y \in \{1,2,3\}^L, y \neq x$, and $b \in \mathbb{R}^{3^L-1}$ is a column vector with non-zero elements.

Your problem, therefore, boils down to finding such a $w$ (and $b$). Unfortunately, I am not aware of a systematic way to do this (someone else might). You could also try asking the signal processing SE.

Stelios
  • 3,077