Vectorize a table row with mixed numeric values

Question

Captain Karnage on 12 Jun 2023

https://es.mathworks.com/matlabcentral/answers/1981999-vectorize-a-table-row-with-mixed-numeric-values

Commented: Captain Karnage on 31 Aug 2023

MATLAB has the ability to concatenate different numerical types and make an array of the type with the most precision, sort of.

If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:

[logical(1), uint8(1)]

will give a 1x2 uint8 vector

likewise

[logical(1), uint64(1)]

will give a 1x2 uint64 vector

and

[logical(1), 1.5]

will give a 1x2 double vector

However, if you concatenate multiple numeric types, they will all be converted to the first integer numeric type listed

[uint8(1), 1.5, uint64(256)]

will give a 1x3 uint8 vector, rounding 1.5 up to 2 and saturating 256 to 255.

[logical(1), uint16(256), 1.5, uint64(66000)]

will give a 1x4 uint16 vector, rounding 1.5 up to 2 and saturating 66000 to 65535.

[logical(1), 1.5, uint16(256), uint64(33000)]

will also give a 1x4 uint16 vector.

So while a logical value gets promoted to the first integer type listed, all wider integers and floating-point values get demoted to that integer type.
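The concatenation behavior described above can be checked directly. This is a minimal sketch; the variable names a and b are illustrative only. It also shows that casting each element explicitly before concatenating keeps the wider type, although fractional values are still rounded because the target is an integer class.

```matlab
% Leftmost integer class wins; out-of-range values saturate.
a = [uint8(1), 1.5, uint64(256)];
disp(class(a))   % uint8
disp(a)          % values 1, 2, 255

% Casting every element to uint64 first avoids the demotion.
% 1.5 still rounds to 2 because uint64 cannot hold fractions.
b = [uint64(uint8(1)), uint64(1.5), uint64(256)];
disp(class(b))   % uint64
disp(b)          % values 1, 2, 256
```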

Further, if you're using a user-defined class that is a subclass of an int or uint type, then regardless of its position, MATLAB will try to convert all the values to that user-defined class.

For example, if I create an enumeration class octal

classdef octal < uint8
    %OCTAL Test Class Definition
    %   First 8 integers
    
    enumeration 
        zero    (0);
        one     (1);
        two     (2);
        three   (3);
        four    (4);
        five    (5);
        six     (6);
        seven   (7);
    end
end

And I concatenate:

[1, 2, octal.six, 5]

it converts all of the numbers to the corresponding octal class object and I get the output

[one two six five]

But if I add a number that isn't a part of the enumeration, such as:

[9, 1, octal.six, 5]

I get the error:

Error using octal
Cannot find a member of the 'octal' enumeration class that corresponds to each element of the given input argument.

So, here's my dilemma. I have a very large table variable that I am saving to disk. All of the table variables are unsigned integers. Each variable has a different range of valid values. To save memory and disk space, each variable is set to the lowest precision that contains the range of values required (e.g. if no value can be > 255, the variable is uint8). Additionally, a couple of variables are restricted not just in their range, but only to specific (non-continuous) values in that range, and I'm using an enumeration to store them (for the enumerations - each integer value represents a code, and the enumeration names are the names corresponding to those codes).

One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?

Answer 1

Steven Lord on 12 Jun 2023

https://es.mathworks.com/matlabcentral/answers/1981999-vectorize-a-table-row-with-mixed-numeric-values#answer_1254464

"If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:"

That's correct, as shown by the table on this documentation page.

"However, if you concatenate multiple numeric types, they will all be converted to the first integer numeric type listed"

If one of the arrays you're concatenating together is of an integer type, yes, as per this documentation page.

"Further, if you're using a user-defined class that is a subclass of an int or uint type, then regardless of its position, MATLAB will try to convert all the values to that user-defined class."

True again, as per the last item in the "Behavior Categories" section on this documentation page.

"So, here's my dilemma. I have a very large table variable that I am saving to disk."

Saving as a MAT-file as a table array or writing to some type of file as a regular numeric array?

"One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?"

Variables in a table array must be of one type, but you can have data of different types in a row of a table (the Name variable may be a string array while Age a double or an integer and Smoker a logical true or false, as an example.)

Rather than computing checksums on each row separately, why not vectorize your checksum calculation?

T = array2table(magic(5));
T.Var3 = int8(T.Var3);
T.Var5 = uint64(T.Var5)
T = 5×5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____

     17      24       1       8      15 
     23       5       7      14      16 
      4       6      13      20      22 
     10      12      19      21       3 
     11      18      25       2       9 

See that each variable is of the expected class.

varfun(@class, T)
ans = 1×5 table
    class_Var1    class_Var2    class_Var3    class_Var4    class_Var5
    __________    __________    __________    __________    __________

      double        double         int8         double        uint64  

Now instead of computing using T{1, 'Var1'}, T{1, 'Var2'}, etc., just use T.Var1, T.Var2, etc. and perform the desired conversion on the variables as a whole.

y = single(T.Var3) + single(T.Var5);
class(y)
ans = 'single'
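Applied to the uint64 case from the question, the same whole-variable idea might look like the sketch below. The table contents, the variable names A, B, and C, and the row-sum "checksum" are all assumptions made for illustration; the actual checksum formula is not given in the thread. Note that extracting T{:,:} directly would concatenate the columns and apply the leftmost-integer rule, so each variable is converted first.

```matlab
% Small table with mixed unsigned integer classes (illustrative data).
T = table(uint8([200; 255]), uint16([60000; 65535]), uint32([7; 9]), ...
    'VariableNames', {'A', 'B', 'C'});

% Convert every variable to uint64, then extract one homogeneous matrix.
M = table2array(varfun(@uint64, T));

% Hypothetical checksum: row-wise sum for all rows at once.
% 'native' keeps the result in uint64; without it, sum of integer
% input returns double.
checksum = sum(M, 2, 'native');
disp(class(M))
disp(checksum)
```

Converting whole columns first means the widening happens once per variable rather than once per element, and every row is then processed by a single vectorized operation.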

2 comments

Captain Karnage on 13 Jun 2023

Thank you, this at least gives me a faster running loop. I still have to loop through each variable (could have different names depending on a loaded file). But I just use numbers to index the table instead of names and convert it column by column into a uint64 array, and then do the math on the array to get the checksums for each row. Works much faster.
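The column-by-column approach described in this comment could be sketched roughly as follows. The sample data and the row-sum checksum are assumptions for illustration only, since the thread shows neither the real table nor the checksum formula. Indexing by position means the code does not depend on the variable names in the loaded file.

```matlab
% Illustrative table; the default names Var1..Var3 are never used below.
T = table(uint8([1; 2]), uint16([300; 400]), uint32([70000; 80000]));

nVars = width(T);
M = zeros(height(T), nVars, 'uint64');   % preallocate the uint64 matrix

% Loop over variables (columns), not rows: one conversion per column.
for k = 1:nVars
    M(:, k) = uint64(T{:, k});
end

% Hypothetical checksum: row-wise sum kept in uint64 via 'native'.
checksum = sum(M, 2, 'native');
disp(checksum)
```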

Captain Karnage on 31 Aug 2023

Was looking back over this and just noticed I never answered a question: "Saving as a MAT-file as a table array or writing to some type of file as a regular numeric array?" I was saving a MATLAB table as a variable in a MAT-file. If I had been saving to any other file type, I would have just converted all the values to double, as there's no savings in memory otherwise.
