Deleting duplicates based on conditions of multiple columns

Question

Nick on 28 Dec 2020

Direct link to this question

https://es.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns

Answered: Akash kumar on 31 Jul 2022

Hi,

I have a large dataset (100m rows x 40 columns) and I would like to delete any row that duplicates another row in a few specific columns. See the example below:

A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; 2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24];

I would like to delete every row whose values in all three columns repeat those of another row. So in this example, rows 2, 4 and 9 (or 10) would be deleted. For example, rows 1 and 2 have the same value in each of the three columns, so I'd want to delete one of the two (it doesn't matter which one).

I suspect the answer involves unique and logical indexing, but I haven't managed to figure it out. Any help would be much appreciated. (I'm using MATLAB R2018b.)

Thanks

3 comments (the oldest comment is not shown)

Nick on 28 Dec 2020

Thanks for this, but unfortunately I think it would work for this sample only. The actual dataset has 40 columns and I'd like to remove rows based on duplicates in 3 of those columns only, rather than in all of them.

Nick on 28 Dec 2020

Just found the answer. This way you can find the unique rows amongst a number of columns (in this case, columns 1, 2 and 3) and then produce the original table without the duplicate values.

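% ia holds the row number of the first occurrence of each unique combination in columns 1 to 3
[C,ia] = unique(A(:,1:3),'rows')
% Keep those rows with all of their columns (the result is sorted by columns 1 to 3)
A_new = A(ia,:)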
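A minimal sketch of the same idea applied to the sample matrix from the question, with two assumptions that are not part of the original comment: the key columns are listed in a hypothetical variable keyCols so they do not have to be adjacent, and the 'stable' option of unique is used so that the surviving rows keep their original order instead of being sorted.

```matlab
% Sample matrix from the question
A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; ...
     2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24];

% Hypothetical name: the columns that define a duplicate.
% In a 40-column dataset this could be any three column numbers.
keyCols = [1 2 3];

% 'stable' returns the first occurrence of each key combination
% in the order in which it appears in A (no sorting).
[~, ia] = unique(A(:, keyCols), 'rows', 'stable');

% Keep the selected rows together with all of their columns.
A_new = A(ia, :);
```

On this sample the only visible difference from the default (sorted) call is the order of the last two rows: without 'stable', the row 4 23 24 comes before 4 25 23.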

Answer 1

Nick on 28 Dec 2020

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns#answer_586042

[C,ia] = unique(A(:,1:3),'rows')

A_new = A(ia,:)

Answer 2

Akash kumar on 31 Jul 2022

Direct link to this answer

https://es.mathworks.com/matlabcentral/answers/703957-deleting-duplicates-based-on-conditions-of-multiple-columns#answer_1018540

% With index numbers: 'index' shows which rows of the original matrix
% were kept. I think this can help you.
A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; 2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24]';  % transposed: one record per column
[B, index] = unique(A(1:3,:).','rows','stable')
B = 10×3
     1    10     4
     1    11     5
     1    12     6
     1    12     7
     1    13     8
     2     4    25
     2    10    28
     3     5    33
     4    25    23
     4    23    24
index = 10×1
     1
     3
     5
     6
     7
     8
     9
    11
    12
    13
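The index output can also be turned into a logical mask, which ties in with the logical indexing mentioned in the question. The following is only a sketch on the sample matrix (stored here with one record per row, as in the question); the names keep and removedRows are illustrative and do not come from the thread.

```matlab
% Sample matrix from the question, one record per row
A = [1 10 4; 1 10 4; 1 11 5; 1 11 5; 1 12 6; 1 12 7; 1 13 8; ...
     2 4 25; 2 10 28; 2 10 28; 3 5 33; 4 25 23; 4 23 24];

% Row numbers of the first occurrence of each combination in columns 1 to 3
[~, index] = unique(A(:, 1:3), 'rows', 'stable');

% Logical mask: true for rows that are kept, false for later duplicates
keep = false(size(A, 1), 1);
keep(index) = true;

% Row numbers of the duplicates that will be dropped
removedRows = find(~keep);   % [2; 4; 10] for this sample

% Delete the duplicate rows using logical indexing
A(~keep, :) = [];
```

Because unique keeps the first occurrence, row 10 is dropped here rather than row 9; the question states that either one is acceptable.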
