
Python Matlab engine: How to pass a pandas dataframe to Matlab function?

Stefan on 29 May 2024
Commented: Stefan on 29 May 2024
Hi everyone,
I'd like to pass a Python Pandas dataframe to a Matlab function. E.g.:
>>> DATAFILE = "2024-05-28_11-30-06.parquet"
>>> import matlab.engine
>>> import pandas as pd
>>> eng = matlab.engine.start_matlab()
>>> df = pd.read_parquet(DATAFILE)
>>> eng.table(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\someone\venv\matlab2\Lib\site-packages\matlab\engine\matlabengine.py", line 64, in __call__
    future = pythonengine.evaluateFunction(self._engine()._matlab,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: unsupported Python data type: pandas.core.frame.DataFrame
Am I missing a conversion step?
The same conversion works when done from within MATLAB:
>> pyenv('Version','C:\Users\someone\venv\matlab2\Scripts\python.exe','ExecutionMode','OutOfProcess')
ans =
  PythonEnvironment with properties:
          Version: "3.11"
       Executable: "C:\Users\someone\venv\matlab2\Scripts\python.exe"
          Library: "C:\Users\someone\AppData\Local\Programs\Python\Python311\python311.dll"
             Home: "C:\Users\someone\venv\matlab2"
           Status: NotLoaded
    ExecutionMode: OutOfProcess
>> df = py.pandas.read_parquet("2024-05-28_11-30-06.parquet");
>> t = table(df)
t =
300000×6 table
Thanks in advance!
Best regards,
Stefan

Answers (1)

Akshat on 29 May 2024
Hi Stefan,
I see you are trying to create a MATLAB table from a pandas DataFrame using the MATLAB Engine for Python.
Your approach is more or less right; it only leaves out one step, which I found on this documentation page:
That page uses the Python interface from within MATLAB, but I noticed that it first converts the DataFrame to a struct and then builds the table from it.
Based on that, I tried creating a struct first, but I got the same error. It turns out the DataFrame first has to be converted to a dictionary with "df.to_dict()" (documentation here: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_dict.html). I used the 'records' argument.
After that, it was successfully converted to a MATLAB object!
Please find the screenshot attached.
Hope this helps!
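For readers without the screenshot, here is a minimal sketch of the data shape that the 'records' orientation produces: one dict per row. It uses only the standard library, so it runs without pandas or MATLAB; the helper to_records and the column names time_s and value are hypothetical and do not come from the original data. With a real DataFrame, df.to_dict("records") returns the same list-of-dicts shape, and the MATLAB Engine maps a Python dict to a struct and a list to a cell array.

```python
def to_records(columns):
    """Turn {column_name: [values...]} into one dict per row.

    This mirrors the shape returned by pandas' df.to_dict("records").
    """
    names = list(columns)
    rows = zip(*(columns[name] for name in names))
    return [dict(zip(names, row)) for row in rows]


# Hypothetical stand-in for a small dataframe (names are made up).
columns = {"time_s": [0.0, 0.1, 0.2], "value": [1.5, 2.5, 3.5]}
records = to_records(columns)

for record in records:
    print(record)

# With pandas and a running engine, the equivalent input would be
#   records = df.to_dict("records")
# and each dict in the list is converted to a MATLAB struct one by one.
```

Because every row becomes its own dict, a dataframe with many rows means many small conversions, which is worth keeping in mind for large inputs.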
1 Comment
Stefan on 29 May 2024
Hi Akshat,
I just tested your solution. It works, but it is very slow: my example dataframe took 85 seconds. For comparison:
eng.eval("pyenv('Version','C:\\Users\\someone\\venv\\matlab2\\Scripts\\python.exe','ExecutionMode','OutOfProcess')", nargout=0)
fname = "C:\\temp\\file.parquet"
df.to_parquet(fname) # write the pandas dataframe to disk
dfm = eng.py.pandas.read_parquet(fname) # Python -> MATLAB -> Python: read the file back via MATLAB's Python interface
t = eng.table(dfm)
took 1.7 seconds. So writing the dataframe to disk, reading it back from disk and converting it to a MATLAB table was more than 40 times faster. Hopefully there is a shortcut that avoids writing to disk.
Best regards,
Stefan
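
One possible direction for such a shortcut, offered only as an untested sketch: send one numeric array per column instead of one struct per row. The helpers to_column_vectors and send_as_table and the column names are hypothetical, the approach assumes every column is numeric, and no timing was measured for it. The engine-side part relies on matlab.double, eng.workspace and eng.eval from the MATLAB Engine API for Python, and on struct2table in MATLAB; it is defined here but not called, so the block runs without MATLAB installed.

```python
def to_column_vectors(columns):
    """Reshape {name: [v0, v1, ...]} into {name: [[v0], [v1], ...]}.

    A nested list with one value per inner list describes an N-by-1 array,
    so each field can later become one column of a table.
    """
    return {name: [[float(v)] for v in values] for name, values in columns.items()}


def send_as_table(eng, columns, table_name="t"):
    """Assumption: numeric columns only; not timed or tested against MATLAB."""
    import matlab  # importable only where the MATLAB Engine API is installed

    payload = {
        name: matlab.double(rows)
        for name, rows in to_column_vectors(columns).items()
    }
    eng.workspace["cols"] = payload  # a dict arrives in MATLAB as a struct
    eng.eval(f"{table_name} = struct2table(cols);", nargout=0)


# Hypothetical stand-in for df.to_dict("list") on a numeric dataframe.
columns = {"time_s": [0, 1, 2], "value": [1.5, 2.5, 3.5]}
vectors = to_column_vectors(columns)
print(vectors["time_s"])
```

With a real DataFrame, df.to_dict("list") would supply the columns argument. Whether this beats the parquet round trip would need to be measured on the actual data.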

Release: R2024a
