moving median with variable window

Is there any way how to effectively generalize movmedian function to work with variable window length or local variable k-point median values, where k is vector with the same length as length of input vector (lenght(x) = lenght(k))?
Example:
x = 1:6
k = 2,3,3,5,3,2
M = movmedian_vk(x,k)
M = 1, 2, 3, 4, 5, 5.5
My naive solution looks like:
function M = movmedian_vk(x,k)
if length(k) ~= length(x)
error('Incomaptible input data')
end
M = zeros(size(x));
[uk,~,ck] = unique(k);
for i = 1:length(uk)
M_i = movmedian(x,uk(i));
I_i = (ck == i);
M(I_i) = M_i(I_i);
end
end

2 comentarios

Image Analyst
Image Analyst el 23 de Sept. de 2024
Can you explain the use case? Why do you want to do this?
Michal
Michal el 24 de Sept. de 2024
Editada: Michal el 24 de Sept. de 2024
Robust and effective trend extraction in a case of a priori known 1-D signal parts with high slope changes (typicaly by active control of dynamic system). Median filter then used short window in a case of active control and long windows in opposite case.
But after some additional test I learned that this naive approch is not suitable for reliable trend estimation.
Anyway, I will be very happy for any hint how to apply robust median filter on my use case, where separate parts of signal shoud be filtered with different filter windows (something like weighting).

Iniciar sesión para comentar.

Respuestas (3)

Bruno Luong
Bruno Luong el 23 de Sept. de 2024
Editada: Bruno Luong el 23 de Sept. de 2024
One way (for k not very large)
x = 1:6
x = 1×6
1 2 3 4 5 6
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
k = [2,3,4,5,3,2]; % Note: I change k(3) to 4
winmedian(x,k)
ans = 1×6
1.0000 2.0000 2.5000 4.0000 5.0000 5.5000
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
function mx = winmedian(x,k)
x = reshape(x, 1, []);
k = reshape(k, 1, []);
K = max(k);
p = floor(K/2);
q = K-p;
qm1 = q-1;
r = [x(q:end), nan(1,qm1)];
c = [nan(1,p), x(1:q)];
X = hankel(c,r);
i = (-p:qm1).';
kb = floor(k/2);
kf = k-1-kb;
mask = i < -kb | i > kf;
X(mask) = NaN;
mx = median(X,1,'omitnan');
end

7 comentarios

John D'Errico
John D'Errico el 23 de Sept. de 2024
This was essentially the solution I would have proposed. Is it an improvement? That may depend on how much variety there is in k. Also, for larger values of k, and long vectors, the arrays generated could be large. So unless the simple looped code is a problem, it might not be worth doing. But as I said, @Bruno Luong wrote exactly what I was going to write.
Michal
Michal el 24 de Sept. de 2024
Editada: Michal el 24 de Sept. de 2024
x = rand(1,1e5);
k = 8000*ones(1,1e5);
k(20000:30000) =50;
k(18000:20000) =200;
k(30000:32000) =200;
tic;M1 = movmedian_vk(x,k);time1 = toc
tic;M2 = winmedian(x,k);time2 = toc
isequal(M1,M2)
with this result:
time1 = 0.0107
time2 = 9.5687
ans =
logical
1
So for longer input signals, even in a case of a few unique k values, is your code ~562x slower than my naive code.
Michal
Michal el 24 de Sept. de 2024
Editada: Michal el 24 de Sept. de 2024
@John D'Errico Yes, you are right!!! For longer input vectors and large widths k(i) is @Bruno Luong code significantly slower. And @Matt J code is out of memory (:
Matt J
Matt J el 24 de Sept. de 2024
Editada: Matt J el 24 de Sept. de 2024
So for longer input signals, even in a case of a few unique k values, is your code 900x slower than my naive code.
But will that always be the case? And how large can the window widths k(i) get? Bruno stipulated that his solution was for the case when the window widths were not too large, and mine is the same.
Michal
Michal el 24 de Sept. de 2024
Editada: Michal el 24 de Sept. de 2024
@Matt J OK, so k(i) must be limited up to some uknown small widths, but significantly smaller than were used in my test? I definitely need to work with large widths.
But, even for significanltly smaller widths k(i) are the test results for Bruno and Matt codes still slower:
k = 8*ones(1,1e5);
x = rand(1,1e5);
k(20000:30000) =2;
k(18000:20000) =5;
k(30000:32000) =5;
time1 = 0.0069 % My naive code
time2 = 0.0104 % Bruno's code
time3 = 0.0276 % Matt's code
So ... ???
Matt J
Matt J el 24 de Sept. de 2024
Editada: Matt J el 24 de Sept. de 2024
When there are a small number of unique k(i), yes, yours is best. However, more generally, Bruno's is faster:
k = randi(30,1,1e5);
x = rand(1,1e5);
timeit(@() winmedianMichal(x,k))
ans = 0.0579
timeit(@() winmedianBruno(x,k))
ans = 0.0371
function M = winmedianMichal(x,k)
if length(k) ~= length(x)
error('Incomaptible input data')
end
M = zeros(size(x));
[uk,~,ck] = unique(k);
for i = 1:length(uk)
M_i = movmedian(x,uk(i));
I_i = (ck == i);
M(I_i) = M_i(I_i);
end
end
function mx = winmedianBruno(x,k)
x = reshape(x, 1, []);
k = reshape(k, 1, []);
K = max(k);
p = floor(K/2);
q = K-p;
qm1 = q-1;
r = [x(q:end), nan(1,qm1)];
c = [nan(1,p), x(1:q)];
X = hankel(c,r);
i = (-p:qm1).';
kb = floor(k/2);
kf = k-1-kb;
mask = i < -kb | i > kf;
X(mask) = NaN;
mx = median(X,1,'omitnan');
end
Michal
Michal el 24 de Sept. de 2024
@Matt J Good point...

Iniciar sesión para comentar.

Matt J
Matt J el 23 de Sept. de 2024
Editada: Matt J el 24 de Sept. de 2024
x = rand(1,6)
x = 1×6
0.1034 0.9884 0.6244 0.0233 0.2999 0.6556
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
k = [2,3,3,5,3,2];
n=numel(x);
J=repelem(1:n,k);
I0=1:numel(J);
splitMean=@(vals,G) (accumarray(G(:),vals(:))./accumarray(G(:),ones(numel(vals),1)))';
cc=repelem( round(splitMean( I0,J )) ,k);
zz=min(max(I0-cc+J+1,1),n+2);
vals=[nan,x,nan];
vals=vals(zz);
I=I0-repelem( find(diff([0,J]))-1 ,k);
X=accumarray([I(:),J(:)], vals(:), [max(k),n],[],nan);
M = median(X,1,'omitnan')
M = 1×6
0.1034 0.6244 0.6244 0.6244 0.2999 0.4777
<mw-icon class=""></mw-icon>
<mw-icon class=""></mw-icon>
Matt J
Matt J el 24 de Sept. de 2024
Editada: Matt J el 24 de Sept. de 2024
Anyway, I will be very happy for any hint how to apply robust median filter on my use case, where separate parts of signal shoud be filtered with different filter windows (something like weighting).
If your movmedian windows are simply varying over a small sequence of consecutive intervals, then the code below shows a little bit of speed-up. It won't give the exact same output near the break points between intervals, but it should be fairly close.
x = rand(1,1e5);
k = 8000*ones(1,1e5);
k(20000:30000) =50;
k(18000:20000) =200;
k(30000:32000) =200;
timeit(@() winmedianMichal(x,k))
ans = 0.0134
timeit(@() winmedianMatt(x,k))
ans = 0.0097
function M = winmedianMichal(x,k)
if length(k) ~= length(x)
error('Incomaptible input data')
end
M = zeros(size(x));
[uk,~,ck] = unique(k);
for i = 1:length(uk)
M_i = movmedian(x,uk(i));
I_i = (ck == i);
M(I_i) = M_i(I_i);
end
end
%Requires download of groupConsec
%https://www.mathworks.com/matlabcentral/fileexchange/78008-tools-for-processing-consecutive-repetitions-in-vectors
function M = winmedianMatt(x,k)
M=splitapply(@(a,b){movmedian(a,b(1))}, x,k, groupConsec(k));
M=[M{:}];
end

5 comentarios

Bruno Luong
Bruno Luong el 24 de Sept. de 2024
Editada: Bruno Luong el 24 de Sept. de 2024
M=splitapply(@(a,b){movmedian(a,b(1))}, x,k, groupConsec(k));
Not sure if you have to extend subgroup of x by more or less k/2 on each sides to avoid boundary truncation of each group.
Test shows the results do not match, I strongly suspect thtis is the cause:
x = rand(1,1e5);
k = 8000*ones(1,1e5);
k(20000:30000) =50;
k(18000:20000) =200;
k(30000:32000) =200;
M_Matt = winmedianMatt(x,k);
M_Michael = winmedianMichal(x,k);
isequal(M_Michael,M_Matt)
ans =
logical
0
Matt J
Matt J el 24 de Sept. de 2024
Correct:
"It won't give the exact same output near the break points between intervals, but it should be fairly close."
Bruno Luong
Bruno Luong el 25 de Sept. de 2024
Editada: Bruno Luong el 25 de Sept. de 2024
It is not stated which algorithm movmedian uses in the official document. Or at least I can't see it.
I assume it is somewhat based on https://en.wikipedia.org/wiki/Quickselect to select the two median elements in the sliding windows. The average complexity is n * k; where n is the length of x for fixed sliding windows size k.
As your use case k is quite large, some other algorithms exist and are better such as algorithm based on histogram. The complexity is log(r), where r is the histogram resolution. It is not difficult to implement with constant resolution, the resolution being selected to be suitable for a specific goal. The idea is to kep track the histogram and update it by binary search when old element is removed and new element is inserted.
For more riguruous implementation, see Weiss paper. For k up to 8000, I would suggest to investigate to such algorithm rather than using MATLAB stock function it time is critical.
Michal
Michal el 25 de Sept. de 2024
@Bruno Luong Could you add the reference on the Weiss paper?
Bruno Luong
Bruno Luong el 25 de Sept. de 2024
Done; somehow this Answers forum and firefox have issue when I edit it, must use another browser.

Iniciar sesión para comentar.

Categorías

Más información sobre Creating and Concatenating Matrices en Centro de ayuda y File Exchange.

Productos

Versión

R2024b

Etiquetas

Preguntada:

el 23 de Sept. de 2024

Comentada:

el 25 de Sept. de 2024

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by