UTF-16 encoding in writestruct/readstruct or other xml2struct functions

I have a large xml file that is used to store information and settings for variables used in another program. It holds the variable name, units, description, display name. That kind of thing.
Due to some weird error with another program, someone tried to import this variable set and instead of overwriting the existing variables on that system it merged them. So I now have this xml file with over 400 duplicate entries and over 2000 entries in total.
I've already worked out a way of finding and removing these, but the problem has come with encoding. The original XML file is in UTF-16, presumably because of the Japanese characters used for the variable descriptions etc.
Presumably, then, I need MATLAB not to convert to UTF-8 on reading, and also to save as UTF-16 on writing. Is this possible?
I've been using the community functions xml2struct and struct2xml, but I see there are also the native MATLAB options readstruct and writestruct. It's not clear whether they can handle UTF-16, or whether it's a selectable option.
  3 comments
Alex Mason on 22 Aug 2024
@Walter Roberson Do I apply this Unicode conversion before exporting with struct2xml, or once the new XML has been generated?
Walter Roberson on 22 Aug 2024
You would call struct2xml() so that it returns the generated XML, which would be generated as UTF-8. You would "fix up" the header that says encoding utf-8 to say utf-16 instead. You would uint8() that to convert from characters to bytes, and you would native2unicode() the bytes to convert them into Unicode code points. You would then unicode2native() that char stream, asking for UTF-16, which generates a byte stream. Finally, you would fwrite() the byte stream.
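The steps in this comment can be sketched roughly as follows. This is only an illustration under some assumptions: the XML text is a hard-coded stand-in for what struct2xml would return, the output file name is made up, and the text is assumed to already be an ordinary MATLAB char vector of Unicode code points, so the uint8()/native2unicode() step is skipped (it would only be needed if the returned text actually held raw UTF-8 bytes). Writing a byte order mark and choosing little-endian are also choices made here, not something stated in the thread.

```matlab
% Stand-in for the char vector returned by struct2xml (assumption: the
% community function returns the XML text when called with an output).
% char([28201 24230]) is a two-character Japanese sample description.
xmlText = ['<?xml version="1.0" encoding="utf-8"?>' newline ...
           '<variable><description>' char([28201 24230]) ...
           '</description></variable>'];

% Fix up the declared encoding in the XML header.
xmlText = regexprep(xmlText, 'encoding="utf-8"', 'encoding="UTF-16"', ...
                    'once', 'ignorecase');

% Convert the code points to UTF-16 little-endian bytes and prepend the
% matching byte order mark (FF FE).
bytes = [uint8([255 254]), unicode2native(xmlText, 'UTF-16LE')];

% Write the raw bytes; the file is opened without an encoding because the
% conversion has already been done. The file name is hypothetical.
fid = fopen('variables_utf16.xml', 'w');
assert(fid ~= -1, 'Failed to open file for writing.');
fwrite(fid, bytes, 'uint8');
fclose(fid);
```

Doing the conversion at the byte level keeps the encoding decision in one place, so the same text could be written as big-endian by swapping the encoding name and the byte order mark.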


Answers (1)

Harsh on 26 Aug 2024
Hi,
Based on my understanding, you've been utilizing community functions to handle the reading and writing of UTF-16 XML files. Now, you're seeking a MATLAB-native solution to achieve the same task.
Fortunately, MATLAB provides built-in functions that can seamlessly accomplish this. Here's how you can use MATLAB's native capabilities to read and write UTF-16 XML files:
% Define the file name
filename = 'example_utf16.xml';
% Open the file for writing with UTF-16 (little-endian) encoding
fileID = fopen(filename, 'w', 'n', 'UTF-16LE');
if fileID == -1
    error('Failed to open file for writing.');
end
% Write some text to the file; '%s' keeps any '%' or '\' in the text literal
fprintf(fileID, '%s\n', '<note>It can contain special characters like: ä, ö, ü, ñ, ç, 𤭢.</note>');
% Close the file
fclose(fileID);
disp(['File "', filename, '" has been created with UTF-16 encoding.'])
File "example_utf16.xml" has been created with UTF-16 encoding.
The encoding of the created file can also be confirmed in Notepad.
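The encoding can also be checked from within MATLAB by reading the raw bytes back and decoding them. The following is a minimal sketch, not a definitive method: the file name is hypothetical, and a small UTF-16LE sample with a byte order mark is written first only so that the example is self-contained.

```matlab
% Hypothetical file name; write a small UTF-16LE sample with a byte order
% mark so there is something to read back.
filename = 'roundtrip_utf16.xml';
original = ['<note>' char([228 246 252]) '</note>'];   % a-, o-, u-umlaut
fid = fopen(filename, 'w');
assert(fid ~= -1, 'Failed to open file for writing.');
fwrite(fid, [uint8([255 254]), unicode2native(original, 'UTF-16LE')], 'uint8');
fclose(fid);

% Read the file back as raw bytes, without any character conversion.
fid = fopen(filename, 'r');
assert(fid ~= -1, 'Failed to open file for reading.');
raw = fread(fid, [1 Inf], '*uint8');
fclose(fid);

% A UTF-16LE byte order mark is the byte pair FF FE; strip it if present.
hasBOM = numel(raw) >= 2 && isequal(raw(1:2), uint8([255 254]));
if hasBOM
    raw = raw(3:end);
end

% Decode the remaining bytes into a MATLAB char vector and compare.
decoded = native2unicode(raw, 'UTF-16LE');
disp(isequal(decoded, original))
```

If the comparison fails on a real file, inspecting the first few bytes of raw is usually the quickest way to see which encoding was actually written.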
I hope this helps, thanks!
  1 comment
Alex Mason on 2 Sep 2024
Hi @Harsh
Sorry for the late reply, I will give this a try. I've used MATLAB on and off for a long time and have gotten used to functions I need not being in MATLAB natively and just going straight to the community for solutions.
I will give it a try and report back.
Many thanks



Release

R2022b
