Passing data between two named pipes with a small Python module runs roughly 40 times faster than an equivalent MATLAB implementation.
Source of data: a binary program generates a data stream consisting of 4 float values of 8 bytes each. Each 32-byte block represents these 4 time-related measurements and is immediately followed by the next 4 measurements.
Destination of data: the same binary, or even another separate binary, should receive the data blocks after some processing. The output stream should therefore have the same structure, since it should be possible to chain individual data sources and destinations together.
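For illustration, the block layout can be sketched in Python with the struct module (the little-endian "<4d" format is an assumption; the question does not state the byte order of the producer):

```python
import struct

# One 32-byte block: four time-related measurements as 8-byte floats.
# "<4d" assumes little-endian doubles; the producer's actual byte
# order is not stated in the question and may differ.
block = struct.pack("<4d", 1.0, 2.0, 3.0, 4.0)
print(len(block))                     # 32 bytes per block

values = struct.unpack("<4d", block)  # recover the four measurements
print(values)
```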
A Python function that simply passes the data from one named pipe to the other reaches a certain baseline performance. Doing the same with a MATLAB module, even without any data processing in between, just passing input pipe data to the output pipe, takes roughly 40 times longer. Python just reads 32 bytes from the input pipe ( inpipe.read() ) and writes them to the output pipe ( outpipe.write() ).
The same is done in MATLAB, but at a roughly 40 times lower data rate.
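For reference, the Python pass-through described above is roughly equivalent to this sketch (the function name and the FIFO handling in the comment are assumptions; the original Python code is not shown in the question):

```python
BLOCK_SIZE = 32  # one block: four 8-byte floats

def pass_through(inpipe, outpipe):
    """Copy fixed-size blocks from the input stream to the output stream."""
    while True:
        block = inpipe.read(BLOCK_SIZE)
        if not block:          # empty read: producer closed the pipe
            break
        outpipe.write(block)

# Assumed usage, with FIFOs created beforehand via mkfifo:
#   with open("datain_fifo", "rb") as fin, open("dataout_fifo", "wb") as fout:
#       pass_through(fin, fout)
```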
Here is the example Matlab code:
pipeNameIn = "datain_fifo";
pipeNameOut = "dataout_fifo";
serverName = "localhost";
timeOut = 5000;      % connect timeout in ms
bufferSize = 32;     % one block: 4 doubles of 8 bytes each

NET.addAssembly('System.Core');
pipeStreamIn = System.IO.Pipes.NamedPipeClientStream(serverName, pipeNameIn, ...
    System.IO.Pipes.PipeDirection.In);
pipeStreamOut = System.IO.Pipes.NamedPipeClientStream(serverName, pipeNameOut, ...
    System.IO.Pipes.PipeDirection.Out);

pipeStreamIn.Connect(timeOut);
pipeStreamOut.Connect(timeOut);
if ~pipeStreamIn.IsConnected
    error('Pipe %s is not connected...', pipeNameIn);
end
if ~pipeStreamOut.IsConnected
    error('Pipe %s is not connected...', pipeNameOut);
end

read_buffer = NET.createArray('System.Byte', bufferSize);

while pipeStreamIn.IsConnected
    bytesRead = pipeStreamIn.Read(read_buffer, int32(0), int32(bufferSize));
    if bytesRead == 0
        break;   % producer closed the pipe
    end
    % Forward the block unchanged; Read may return fewer than
    % bufferSize bytes, so only bytesRead bytes are written.
    pipeStreamOut.Write(read_buffer, int32(0), int32(bytesRead));
end
Any idea how to increase the data throughput to a rate comparable to Python would be appreciated. Since currently no data processing is involved in this pass-through, it is assumed that the reduced data rate is the result of an incorrectly configured pipe or usage mode.
Passing e.g. 1M data rows takes ~10 s with Python, but ~400 s with MATLAB.
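As a sanity check on these numbers: 1M rows of 32 bytes amount to 32 MB of payload, which puts the two throughput rates roughly at:

```python
rows = 1_000_000
block_bytes = 32
total_mb = rows * block_bytes / 1e6   # 32 MB total payload

python_rate = total_mb / 10           # ~3.2 MB/s with Python
matlab_rate = total_mb / 400          # ~0.08 MB/s with MATLAB
print(python_rate, matlab_rate)
```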
The code above is more or less derived from an example found on MATLAB Answers (Link). Even though the binary producing the data can provide both an output and an input pipe, this is not mandatory; a separate input and output named pipe is therefore used, to stay more general. An InOut pipe could not be used, since the interface of the binary model cannot be modified. Given how much better Python performs with two pipes, I would not expect this to be the issue.
Asynchronous mode on both pipes did not improve anything. So I assume there must be a way to optimize the data throughput, but I do not see the current caveats.
Any comments and ideas would be really appreciated.