Passing data between two named pipes with a small Python module runs roughly 40 times faster than an equivalent MATLAB implementation.
Source of data: a binary program generates a data stream consisting of 4 float values of 8 bytes each. Each 32-byte block represents these 4 time-related measurements and is immediately followed by the next 4 measurements.
Destination of data: the same binary, or even another separate binary, should receive the data blocks after some processing. The output stream should therefore have the same structure, since it should be possible to chain individual data sources and destinations together.
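For illustration, the block layout can be sketched in Python with the struct module (the little-endian "<4d" format is an assumption; the question does not state the byte order of the producer):

```python
import struct

# One 32-byte block: four time-related measurements as 8-byte floats.
# "<4d" assumes little-endian doubles; the producer's actual byte
# order is not stated in the question and may differ.
block = struct.pack("<4d", 1.0, 2.0, 3.0, 4.0)
print(len(block))                     # 32 bytes per block

values = struct.unpack("<4d", block)  # recover the four measurements
print(values)
```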
A Python function that simply passes the data from one named pipe to the other reaches a certain baseline performance. Doing the same with a MATLAB module, even without any data processing in between, just passing input pipe data to the output pipe, takes roughly 40 times longer. Python just reads 32 bytes from the input pipe ( inpipe.read() ) and writes them to the output pipe ( outpipe.write() ).
The same is done in MATLAB, but at a roughly 40 times lower data rate.
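For reference, the Python pass-through described above is roughly equivalent to this sketch (the function name and the FIFO handling in the comment are assumptions; the original Python code is not shown in the question):

```python
BLOCK_SIZE = 32  # one block: four 8-byte floats

def pass_through(inpipe, outpipe):
    """Copy fixed-size blocks from the input stream to the output stream."""
    while True:
        block = inpipe.read(BLOCK_SIZE)
        if not block:          # empty read: producer closed the pipe
            break
        outpipe.write(block)

# Assumed usage, with FIFOs created beforehand via mkfifo:
#   with open("datain_fifo", "rb") as fin, open("dataout_fifo", "wb") as fout:
#       pass_through(fin, fout)
```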
Here is the example Matlab code:
pipeNameIn = "datain_fifo";
pipeNameOut = "dataout_fifo";
serverName = "localhost";
timeOut = 5000;      % connect timeout in ms
bufferSize = 32;     % one block: 4 doubles of 8 bytes each

NET.addAssembly('System.Core');
pipeStreamIn = System.IO.Pipes.NamedPipeClientStream(serverName, pipeNameIn, ...
    System.IO.Pipes.PipeDirection.In);
pipeStreamOut = System.IO.Pipes.NamedPipeClientStream(serverName, pipeNameOut, ...
    System.IO.Pipes.PipeDirection.Out);

pipeStreamIn.Connect(timeOut);
pipeStreamOut.Connect(timeOut);
if ~pipeStreamIn.IsConnected
    error('Pipe %s is not connected...', pipeNameIn);
end
if ~pipeStreamOut.IsConnected
    error('Pipe %s is not connected...', pipeNameOut);
end

read_buffer = NET.createArray('System.Byte', bufferSize);

while pipeStreamIn.IsConnected
    bytesRead = pipeStreamIn.Read(read_buffer, int32(0), int32(bufferSize));
    if bytesRead == 0
        break;   % producer closed the pipe
    end
    % Forward the block unchanged; Read may return fewer than
    % bufferSize bytes, so only bytesRead bytes are written.
    pipeStreamOut.Write(read_buffer, int32(0), int32(bytesRead));
end
Any idea how to increase the data throughput to a rate comparable to Python would be appreciated. Since currently no data processing is involved in this pass-through, it is assumed that the reduced data rate is the result of an incorrectly configured pipe or usage mode.
Passing e.g. 1M data rows takes ~10 s with Python, but ~400 s with MATLAB.
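As a sanity check on these numbers: 1M rows of 32 bytes amount to 32 MB of payload, which puts the two throughput rates roughly at:

```python
rows = 1_000_000
block_bytes = 32
total_mb = rows * block_bytes / 1e6   # 32 MB total payload

python_rate = total_mb / 10           # ~3.2 MB/s with Python
matlab_rate = total_mb / 400          # ~0.08 MB/s with MATLAB
print(python_rate, matlab_rate)
```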
The code above is more or less derived from an example found on MATLAB Answers (Link). Even though the binary producing the data can provide both an output and an input pipe, this is not mandatory; a separate input and output named pipe is therefore used, to stay more general. An InOut pipe could not be used, since the interface of the binary model cannot be modified. Given how much better Python performs with two pipes, I would not expect this to be the issue.
Asynchronous mode on both pipes did not improve anything. So I assume there must be a way to optimize the data throughput, but I do not see the current caveats.
Any comments and ideas would be really appreciated.