Vectorize a table row with mixed numeric values

Captain Karnage on 12 Jun 2023
Commented: Captain Karnage on 31 Aug 2023
MATLAB has the ability to concatenate different numerical types and make an array of the type with the most precision, sort of.
If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:
[logical(1), uint8(1)]
will give a 1x2 uint8 vector
likewise
[logical(1), uint64(1)]
will give a 1x2 uint64 vector
and
[logical(1), 1.5]
will give a 1x2 double vector
However, if you concatenate multiple numeric types, they will all be converted to the first integer numeric type listed
[uint8(1), 1.5, uint64(256)]
will give a 1x3 uint8 vector, rounding 1.5 up to 2 and saturating 256 to 255.
[logical(1), uint16(256), 1.5, uint64(66000)]
will give a 1x4 uint16 vector, rounding 1.5 up to 2 and saturating 66000 to 65535.
[logical(1), 1.5, uint16(256), uint64(33000)]
will also give a 1x4 uint16 vector.
So while a logical gets promoted to the first integer type listed, all wider integers and floating-point values get demoted to that integer type.
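The rules above can be checked with a short sketch. The values in `w` are hypothetical, and casting each piece to uint64 before concatenating is only one possible workaround for the demotion, not something this thread prescribes.

```matlab
% Mixed concatenation: the leftmost integer type (uint8) wins.
v = [uint8(1), 1.5, uint64(256)];
disp(class(v))   % uint8
disp(v)          % 1 2 255 (1.5 rounded, 256 saturated)

% Assumed workaround: cast every piece to uint64 first, so the
% concatenation has only one integer type to choose from.
w = [uint64(uint8(1)), uint64(uint16(300)), uint64(66000)];
disp(class(w))   % uint64
disp(w)          % 1 300 66000
```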
Further, if one of the values is of a user-defined class that is a subclass of an int or uint type, then regardless of its position, MATLAB will try to convert all the values to that user-defined class.
For example, if I create an enumeration class octal:
classdef octal < uint8
    %OCTAL Test Class Definition
    %   First 8 integers
    enumeration
        zero  (0);
        one   (1);
        two   (2);
        three (3);
        four  (4);
        five  (5);
        six   (6);
        seven (7);
    end
end
And I concatenate:
[1, 2, octal.six, 5]
it converts all of the numbers to the corresponding octal class object and I get the output
[one two six five]
But if I add a number that isn't a part of the enumeration, such as:
[9, 1, octal.six, 5]
I get the error:
Error using octal
Cannot find a member of the 'octal' enumeration class that corresponds to each element of the given input argument.
So, here's my dilemma. I have a very large table variable that I am saving to disk. All of the table variables are unsigned integers. Each variable has a different range of valid values. To save memory and disk space, each variable is set to the smallest integer type that contains the range of values required (e.g. if no value can be > 255, the variable is uint8). Additionally, a couple of variables are restricted not just in their range, but to specific (non-contiguous) values in that range, and I'm using an enumeration to store them (for the enumerations, each integer value represents a code, and the enumeration names are the names corresponding to those codes).
One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?

Answers (1)

Steven Lord on 12 Jun 2023
If I concatenate a logical type with any numerical type, it will promote the logical value to that numerical type, examples:
That's correct, as shown by the table on this documentation page.
However, if you concatenate multiple numeric types, they will all be converted to the first integer numeric type listed
If one of the arrays you're concatenating is of an integer type, yes, as per this documentation page.
Further, if one of the values is of a user-defined class that is a subclass of an int or uint type, then regardless of its position, MATLAB will try to convert all the values to that user-defined class.
True again, as per the last item in the "Behavior Categories" section on this documentation page.
So, here's my dilemma. I have a very large table variable that I am saving to disk.
Saving as a MAT-file as a table array or writing to some type of file as a regular numeric array?
One of the columns is also a checksum for each row. So, what I want to do is verify the checksum by doing the necessary math on the other values in the row. If I could make the table row a single vector all of type uint64, I could vectorize the math for the checksum. I can, of course, do a for loop through each element in the row I'm calculating the checksum for - but once my data populates and I have thousands of rows, this takes up considerable time. Is there any way to vectorize converting a table row like this to uint64 without losing precision?
Variables in a table array must be of one type, but you can have data of different types in a row of a table (the Name variable may be a string array while Age a double or an integer and Smoker a logical true or false, as an example.)
Rather than computing checksums on each row separately, why not vectorize your checksum calculation?
T = array2table(magic(5));
T.Var3 = int8(T.Var3);
T.Var5 = uint64(T.Var5)
T = 5×5 table
    Var1    Var2    Var3    Var4    Var5
    ____    ____    ____    ____    ____
     17      24       1       8      15
     23       5       7      14      16
      4       6      13      20      22
     10      12      19      21       3
     11      18      25       2       9
See that each variable is of the expected class.
varfun(@class, T)
ans = 1×5 table
    class_Var1    class_Var2    class_Var3    class_Var4    class_Var5
    __________    __________    __________    __________    __________
      double        double         int8         double        uint64
Now instead of computing using T{1, 'Var1'}, T{1, 'Var2'}, etc., just use T.Var1, T.Var2, etc. and perform the desired conversion on the variables as a whole.
y = single(T.Var3) + single(T.Var5);
class(y)
ans = 'single'
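The same whole-variable idea can be pushed one step further to get a single uint64 matrix. The sketch below reuses the example table from this answer and converts every variable with convertvars before extracting the data. The row sum at the end is only a stand-in for whatever arithmetic is actually needed.

```matlab
% Rebuild the example table with mixed numeric classes.
T = array2table(magic(5));
T.Var3 = int8(T.Var3);
T.Var5 = uint64(T.Var5);

% Convert every variable to uint64 in one call, then pull out the
% data as an ordinary matrix. All columns now share one class, so
% no value is demoted during the extraction.
T64 = convertvars(T, T.Properties.VariableNames, 'uint64');
M = T64{:, :};

% 'native' keeps the result in uint64 instead of returning double.
rowTotals = sum(M, 2, 'native');
```

Each row of magic(5) sums to 65, which makes the result easy to check by eye.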
2 Comments
Captain Karnage on 13 Jun 2023
Thank you, this at least gives me a faster running loop. I still have to loop through each variable (could have different names depending on a loaded file). But I just use numbers to index the table instead of names and convert it column by column into a uint64 array, and then do the math on the array to get the checksums for each row. Works much faster.
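A minimal sketch of the column-by-column approach described in this comment might look like the following. The table contents, the variable names, and the checksum rule (row sum modulo 65536) are all hypothetical, since the thread never states the real formula.

```matlab
% Hypothetical data: three payload columns of different widths.
T = table(uint8([1; 200; 255]), uint16([256; 65535; 1000]), ...
    uint32([70000; 5; 123456]), 'VariableNames', {'A', 'B', 'C'});
% Stored checksums; the last one is deliberately wrong.
T.Checksum = uint16([4721; 204; 0]);

dataIdx = 1:3;   % numeric indices of the payload columns
M = zeros(height(T), numel(dataIdx), 'uint64');
for k = 1:numel(dataIdx)
    % Convert one whole column at a time, so no row-wise
    % concatenation of mixed integer types ever happens.
    M(:, k) = uint64(T{:, dataIdx(k)});
end

% Assumed checksum rule: row sum modulo 65536, computed for all
% rows at once. 'native' keeps the sum in uint64.
computed = mod(sum(M, 2, 'native'), uint64(65536));
isValid = computed == uint64(T.Checksum);
disp(isValid')   % 1 1 0
```

The loop runs once per variable rather than once per element, so its cost stays small even when the table has thousands of rows.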
Captain Karnage on 31 Aug 2023
Was looking back over this and just noticed I never answered a question: "Saving as a MAT-file as a table array or writing to some type of file as a regular numeric array?" I was saving a MATLAB table as a variable in a MAT-file. If I were saving to any other file type, I would have just converted all the values to double, as there are no memory savings otherwise.

Release

R2022b
