how to identify unique columns, based on the first two rows, make an average of the uniques columns and remove the duplicate columns?

Question

ehsan el 11 de Jun. de 2019

0
Enlazar

Enlace directo a esta pregunta

https://es.mathworks.com/matlabcentral/answers/466522-how-to-identify-unique-columns-based-on-the-first-two-rows-make-an-average-of-the-uniques-columns

Respondida: Peter Perkins el 12 de Jun. de 2019

I have a matrix with the size of 572 x 790. The first two rows of the matrixes are Longitude and Latitude for some stations, and the other rows are the measurments values. Now I would like to calculate the mean value of the two or more columns that have similar Lon and Lat (the first to rows). Then delete those columns' values and insert the derived mean value instead of them.

05	9.75	9.85	9.85	9.75	9.55    9.85
15	47.15	47.25	47.25	47.25	47.34   47.25
0	0	0	0	0        2.56
215	5.599	0	0	0	0        0
547	0.553	2.058	2.058	1.422	3.063    2.664
94	8.98	8.295	8.295	7.058	4.011    9.24
0	0	0	0	0        1
0	0	0	0	0        0
0	0	0	0	0        2.5
0	0	0	0	0        0
80	2.15	1.02	1.029	0.164	0.656    0.86
97	7.93	1.57	1.57	8.31	3.53    1.22
64	0.36	0.30	0.30	0.21	0.10    0
...   ...      ...     ...     ...    ...      ...

For example in this case the columns number 3, 4, and 7 are unique and I need to make an average .

I am confisung a little bit with unique and accumarray.

I appreciate any help.

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Iniciar sesión para responder a esta pregunta.

Answer 1

Star Strider el 11 de Jun. de 2019

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/466522-how-to-identify-unique-columns-based-on-the-first-two-rows-make-an-average-of-the-uniques-columns#answer_378725

Abrir en MATLAB Online

With ‘M’ as your matrix:

M = [10.0500    9.7500    9.8500    9.8500    9.7500    9.5500    9.8500
     47.1500   47.1500   47.2500   47.2500   47.2500   47.3400   47.2500
           0         0         0         0         0         0    2.5600
                                       . . .                             ];                                   

Use uniquetol (or unique), splitapply, cellfun and cell2mat to get:

[Uc,~,idx] = uniquetol(M([1 2],:)', 0.001, 'ByRows',1)              % Choose A Function
[Uc,~,idx] = unique(M([1 2],:)',  'rows')                           % Choose A Function
colcolct = splitapply(@(x){x}, M', idx);                            % Collect Columns (By Row Of Transposed Matrix)
colmeans = cellfun(@(x)mean(x,1), colcolct, 'Uni',0)                % Column Means (By Row Of Transposed Matrix)
Out = cell2mat(colmeans)'                                           % Convert To Matrix & Transpose

producing:

Out =
    9.5500    9.7500    9.7500    9.8500   10.0500
   47.3400   47.1500   47.2500   47.2500   47.1500
         0         0         0    0.8533         0
         0    5.5990         0         0    1.2150
    3.0630    0.5530    1.4220    2.2600    0.5470
    4.0110    8.9800    7.0580    8.6100   10.9400
         0         0         0    0.3333         0
         0         0         0         0         0
         0         0         0    0.8333         0
         0         0         0         0         0
    0.6560    2.1500    0.1640    0.9697    4.8000
    3.5300    7.9300    8.3100    1.4533    0.9700
    0.1000    0.3600    0.2100    0.2000    1.6400

I could not entice splitapply to both group the columns and calculate the means, explaining the cellfun call.

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Star Strider el 11 de Jun. de 2019

Our individual and collective pleasures!

Guillaume el 11 de Jun. de 2019

As I've shown in my answer, you certainly don't need to go through cell arrays with splitapply. Using cell arrays is going to significantly slow down the code.

Iniciar sesión para comentar.

Answer 2

Guillaume el 11 de Jun. de 2019

2
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/466522-how-to-identify-unique-columns-based-on-the-first-two-rows-make-an-average-of-the-uniques-columns#answer_378727

Abrir en MATLAB Online

using unique and accumarray:

[latlong, ~, group] = unique(yourmatrix(:, [1, 2]), 'rows');   %optionally use the 'stable' option.
[destrow, destcol] = ndgrid(group, 3:size(yourmatrix, 2));
newmatrix = accumarray([destrow(:), destcol(:)], reshape(yourmatrix(:, 3:end), 1, []), [], @mean)
newmatrix(:, [1, 2]) = latlong

using the newer findgroups and splitapply:

[group, lat, long] = findgroups(yourmatrix(:, 1), yourmatrix(:, 2));
newmatrix = splitapply(@(rows) mean(rows, 1), yourmatrix(:, 3:end), group);
newmatrix = [lat, long, newmatrix]

But you might be better off converting your matrix into a table, particularly if columns 3:end represent different things. Using a table, what you want is very straightforward with either varfun or groupsummary:

latlongdata = array2table(yourmatrix, 'VariableNames', [{'latitude', 'longitude'}, compose('var%d', 1:size(yourmatrix, 2)-2)]);  %obviously use better variable names than var1...varn for the remaining columns
newdata = varfun(@mean, latlongdata, 'GroupingVariables', {'latitude', 'longitude'})
%or
newdata = groupsummary(latlongdata, {'latitude', 'longitude'}, 'mean');

groupsummary lets you calculate several statistics at once (e.g. mean, std, max, min, sum) whereas varfun can only calculate one.

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

Answer 3

Peter Perkins el 12 de Jun. de 2019

0
Enlazar

Enlace directo a esta respuesta

https://es.mathworks.com/matlabcentral/answers/466522-how-to-identify-unique-columns-based-on-the-first-two-rows-make-an-average-of-the-uniques-columns#answer_378890

Abrir en MATLAB Online

You may have a reason for wanting row-oriented data, but I'm gonna suggest that you transpose your data, create a table, and use a grouped varfun. It's a one-liner:

>> T = array2table(X); T.Properties.VariableNames(1:2) = ["Lat" "Lon"]
T =
  7×13 table
     Lat      Lon      X3      X4       X5       X6      X7    X8    X9     X10     X11     X12     X13 
    _____    _____    ____    _____    _____    _____    __    __    ___    ___    _____    ____    ____
    10.05    47.15       0    1.215    0.547    10.94    0     0       0     0       4.8    0.97    1.64
     9.75    47.15       0    5.599    0.553     8.98    0     0       0     0      2.15    7.93    0.36
     9.85    47.25       0        0    2.058    8.295    0     0       0     0      1.02    1.57     0.3
     9.85    47.25       0        0    2.058    8.295    0     0       0     0     1.029    1.57     0.3
     9.75    47.25       0        0    1.422    7.058    0     0       0     0     0.164    8.31    0.21
     9.55    47.34       0        0    3.063    4.011    0     0       0     0     0.656    3.53     0.1
     9.85    47.25    2.56        0    2.664     9.24    1     0     2.5     0      0.86    1.22       0
>> T2 = varfun(@(x) repmat(mean(x),size(x)),T,'GroupingVariables',["Lat" "Lon"])
T2 =
  7×14 table
     Lat      Lon     GroupCount    Fun_X3     Fun_X4    Fun_X5    Fun_X6    Fun_X7     Fun_X8    Fun_X9     Fun_X10    Fun_X11    Fun_X12    Fun_X13
    _____    _____    __________    _______    ______    ______    ______    _______    ______    _______    _______    _______    _______    _______
     9.55    47.34        1               0        0     3.063     4.011           0      0             0       0         0.656      3.53       0.1  
     9.75    47.15        1               0    5.599     0.553      8.98           0      0             0       0          2.15      7.93      0.36  
     9.75    47.25        1               0        0     1.422     7.058           0      0             0       0         0.164      8.31      0.21  
     9.85    47.25        3         0.85333        0      2.26      8.61     0.33333      0       0.83333       0       0.96967    1.4533       0.2  
     9.85    47.25        3         0.85333        0      2.26      8.61     0.33333      0       0.83333       0       0.96967    1.4533       0.2  
     9.85    47.25        3         0.85333        0      2.26      8.61     0.33333      0       0.83333       0       0.96967    1.4533       0.2  
    10.05    47.15        1               0    1.215     0.547     10.94           0      0             0       0           4.8      0.97      1.64  

I think that does what you're looking for. It seems to match StarStrider's output.

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Iniciar sesión para comentar.

how to identify unique columns, based on the first two rows, make an average of the uniques columns and remove the duplicate columns?

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Más respuestas (2)

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Ver también

Categorías

Etiquetas

Community Treasure Hunt

how to identify unique columns, based on the first two rows, make an average of the uniques columns and remove the duplicate columns?

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Respuesta aceptada

6 comentarios Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

Más respuestas (2)

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

0 comentarios Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

Ver también

Categorías

Etiquetas

Community Treasure Hunt

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

6 comentarios
Mostrar 4 comentarios más antiguosOcultar 4 comentarios más antiguos

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos

0 comentarios
Mostrar -2 comentarios más antiguosOcultar -2 comentarios más antiguos