Median
We filtered our brightness graph and got something like this.
, pick the middle number. If there is no middle, take the average of the middle two.
Lets work out an example. Take a look at the following example
\[ 31, 41, 59, 26, 53, 58, 97, 93, 23, 84 \]
If we sort this sequence we get
\[ 23, 26, 31, 41, 53, 58, 59, 84, 93, 97 \]
Because there are an even number of values, we should take the average of the of the two middle values. The average of \(53\) and \(58\) is \(\frac{53 + 58}{2} = \frac{111}{2} \approx 55.5\).
Make a library.
Because we are going to use the median several times, we are going to create a
library. Let's start with our lib.js
.
In our median.js
we are module.exports
will be a function object.
module.exports = {
median_of: median_of
};
Our median_of
function will have an array as parameter and return the
median. Once we have a sorted copy of the data called copy
, getting the
median comes down to determining if the number of elements is even or odd, and
performing the right calculation.
# #![allow(unused_variables)] #fn main() { const n = data.length; const middle = Math.floor(n / 2); var median; if (n % 2 == 1) { return data[middle] } else { return (data[middle] + data[middle - 1]) / 2.0; } #}
But how do we sort the original data?
Sorting side-quest
There are a few interesting tidbits about sorting an array of floating point
values that we are going to make a side-quest out of it. While looking into
Array.prototype
documentation,
you can come across the method sort
. Let's see if we can use it.
const vs = [3.0, 2.0, 1.0, 11.0];
vs.sort();
Unfortunately this doesn't work as expected.
[ 1, 11, 2, 3 ]
Which could come as a surprise. The sort function first transforms the values to string and then sorts them as Unicode source point.
Luckily we can solve that. We just have to pass a function with which we will sort the values.
const vs = [3.0, 2.0, 1.0. 11.0];
vs.sort(function(a, b){ return b < a; });
This correctly sorts our array. But it also alters our original data.
Copying Data
We need a copy of our data
. Luckily the slice
provides an easy
method to copy. We use it as
const data = values.slice();
This is the final piece in the median puzzle. We are able to put everything
together and write our median_of
function.
Form Groups
We do not want to calculate the median of our entire sequence. Instead we want to move a sliding window over our data and calculate the median of that specific window.
For that we need to group our data. Let's create that an object.
const SlidingWindow = function(size){
this.size = size;
this.window = [];
};
We are creating a SlidingWindow
constructor. This object will keep track of
two things. The size of the sliding window, and the values that it will have
seen. Together with two prototype functions this will provide our intended
functionality.
SlidingWindow.prototype.push = function(value){
this.window.push(value);
if (this.window.length > this.size) {
this.window = this.window.slice(1);
}
};
The push
method will be used to add a new value to the window, maintaining the
invariant of maximum of size
values. Whenever the window grows to large, we
slice off a value.
SlidingWindow.prototype.result = function(){
if (this.window.length === this.size) {
return this.window.slice();
} else {
return null;
}
};
The result
method will return the current window, if it has grown enough to
have the correct size
. Otherwise it will return null
, which will signal the
csv
module that there is no data.
Make sure to add the SlidingWindow
to the module.exports
.
Processing
We are now ready to create a tranformation for our data. In our data we should keep track of two sliding windows. One for the time measurement and one for the brightness measurements.
We collect the median of both and combine them in an array to produce the output of our pipeline. Maybe something like this.
var size = 10;
var times = new SlidingWindow(size);
var values = new SlidingWindow(size);
var transformer = transform(function(data){
const time = parseFloat(data[0]);
const value = parseFloat(data[3]);
times.push(time);
values.push(value);
var tw = times.result();
var vw = value.result();
if (tw != null && vw != null) {
var tm = median_of(tw);
var vm = median_of(vw);
return [tm, vm];
} else {
return null;
}
});
Further Considerations
You have created a library that contains some functions. How do you know that they are implemented correctly? Try to add some tests that increases your confidence in your code.
The SlidingWindow
accepts an window_size
argument. What is a good value?
Why haven't we used same the method we used to detrend the data?