
I have a sensor that can make thousands of measurements per second. It sends all these measurements (packed into a single message) over the internet to my service using a messaging protocol, and there I need to translate the data (which arrives as bytes) into human-readable values.

The measurements come from different kinds of sensors, but I know that a large number of consecutive measurements can sometimes differ only slightly from each other, with no spikes at all. It would be a waste of resources to store and analyze all of this data. I could reduce $~2000~$ measurements that look like this:

$$~2331.14 ,~ 2331.13 ,~ 2331.11,~ 2331.12 ,~ 2331.12,~ 2339.52 , ~2339.40, ~4216.51$$

to something like this: $~2331.12, ~2339.52 ,~ 4216.51$

The problem

I can't simply put all of this unreduced data into my database. I run pre- and post-analyses on the data, and storing everything would be a waste of resources.

The solution

Reducing the data set without losing too much accuracy is the biggest challenge. I thought of doing a "current and previous diff", like this:

$x_1~$ is the first measurement, save it anyway

$x_1~$ diff $~x_2 > 0.15~$ ? no, $~x_1~$ is already saved so just forget about $~x_2$

$x_1~$ diff $~x_3 > 0.15~$ ? no, $~x_1~$ is already saved so just forget about $~x_3$

$x_1~$ diff $~x_4 > 0.15~$ ? yes, $~x_1~$ is already saved so just save $~x_4$

$x_4~$ diff $~x_5 > 0.15~$ ? no, $~x_4~$ is already saved so just forget about $~x_5$
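The steps above can be sketched in Python as follows. This is only a minimal illustration, not a definitive implementation: the function name `deadband_filter`, the `threshold` parameter, and the choice to keep each sample's position (so it can still be matched to a timestamp) are assumptions made for the example. The threshold of 0.15 is the one used in the steps.

```python
from typing import Iterable, List, Tuple


def deadband_filter(samples: Iterable[float],
                    threshold: float = 0.15) -> List[Tuple[int, float]]:
    """Keep a sample only if it differs from the last SAVED sample
    by more than `threshold`. The first sample is always kept.

    Returns (index, value) pairs so the original order is preserved.
    """
    kept: List[Tuple[int, float]] = []
    last_saved = None  # value of the most recently saved sample
    for index, value in enumerate(samples):
        if last_saved is None or abs(value - last_saved) > threshold:
            kept.append((index, value))
            last_saved = value
    return kept


# The example readings from the question.
readings = [2331.14, 2331.13, 2331.11, 2331.12, 2331.12,
            2339.52, 2339.40, 4216.51]
print(deadband_filter(readings))
# → [(0, 2331.14), (5, 2339.52), (7, 4216.51)]
```

Note that in this sketch each sample is compared against the last saved value, not against its immediate predecessor, so a slow drift is still recorded once its accumulated change exceeds the threshold.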

But that would drastically reduce the accuracy of the data, as there is a chance of losing spikes made up of measurements that increase only slightly from one to the next.

Is there any algorithm or known method that I can use to achieve what I want?

  • So, each measurement gives you a single quantity, correct? Is the order important, or can we freely rearrange the results? – Berci Jul 02 '19 at 07:46
  • So the "small differences" in data you are talking about are slightly different real numbers and not, for example, numbers that only differ at a few bits in binary representation? – Dirk Jul 02 '19 at 07:55
  • Do you have a proper definition of "spike"? – denklo Jul 02 '19 at 07:59
  • @Berci, yes, the order is important; that's how I have to match them with the timestamps – Bruno Cerk Jul 02 '19 at 14:48
  • @Dirk well, that also fits. But I was really talking about the decimal representation: 1234.4, 1234.5, the diff is too small. Those are only examples, though; in the real world, my measurements are close to: 1.542 to 14.543 – Bruno Cerk Jul 02 '19 at 14:53
  • @denklo Just like in a graph: the measurements go linearly, but when the sensor suddenly detects big changes, I'll have a spike in the graph. The thing is that the increase may not be sudden. Think of temperature: it increases slowly in normal situations. I also need to know when this happens, not just when I get a sudden increase in the measurements – Bruno Cerk Jul 02 '19 at 14:56
