Problem reading HDF5 on S3

Ben Dichter on 22 Feb 2022
Edited: Ben Dichter on 6 Nov 2022
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'.
The specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
This works when using the same URL with h5py in Python:
from h5py import File
file = File("https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9", "r", driver="ros3")
file.keys()
<KeysViewHDF5 ['acquisition', 'analysis', 'file_create_date', 'general', 'identifier', 'processing', 'session_description', 'session_start_time', 'specifications', 'stimulus', 'timestamps_reference_time', 'units']>

Accepted Answer

Ben Dichter on 6 Nov 2022
Edited: Ben Dichter on 6 Nov 2022
Two things were needed to solve this:
  1. Pass an S3 path (one that starts with s3://), not an http or https URL.
  2. Delete or rename ~/.aws/credentials (on Windows, something like C:/Users/username/.aws/credentials).
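
A minimal Python sketch of both steps, in the same language as the h5py comparison in the question. It assumes the URL has the virtual-hosted form https://<bucket>.s3.amazonaws.com/<key>, which corresponds to s3://<bucket>/<key>; regional endpoints and bucket names containing dots are not handled. The helper names https_to_s3 and set_aside_credentials and the backup name credentials.bak are hypothetical, chosen only for illustration.

```python
import re
from pathlib import Path

# Virtual-hosted-style URL: https://<bucket>.s3.amazonaws.com/<key>
# Assumption: no regional endpoint and no dots in the bucket name.
_VHOST_URL = re.compile(r"^https?://([^./]+)\.s3\.amazonaws\.com/(.+)$")


def https_to_s3(url: str) -> str:
    """Rewrite https://<bucket>.s3.amazonaws.com/<key> as s3://<bucket>/<key>."""
    match = _VHOST_URL.match(url)
    if match is None:
        raise ValueError(f"not a virtual-hosted-style S3 URL: {url}")
    bucket, key = match.groups()
    return f"s3://{bucket}/{key}"


def set_aside_credentials(aws_dir: Path) -> bool:
    """Rename <aws_dir>/credentials to credentials.bak (hypothetical name).

    Returns True if a file was renamed, False if there was nothing to rename.
    On Windows the rename fails if credentials.bak already exists.
    """
    creds = aws_dir / "credentials"
    if not creds.is_file():
        return False
    creds.rename(aws_dir / "credentials.bak")
    return True


if __name__ == "__main__":
    url = ("https://dandiarchive.s3.amazonaws.com/blobs/58c/537/"
           "58c53789-eec4-4080-ad3b-207cf2a1cac9")
    print(https_to_s3(url))
    # To set the credentials file aside for the current user, one could call:
    # set_aside_credentials(Path.home() / ".aws")
```

The printed s3:// string is the form that would then be passed to H5F.open in MATLAB.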

More Answers (1)

Yongjian Feng on 23 Feb 2022
In Python, you seem to use the read-only flag ("r"). Maybe you want to try:
H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
  1 Comment
Ben Dichter on 23 Feb 2022
That does not appear to be the issue. The documentation for H5F indicates that it should be possible to include only the s3 path:
file_id = H5F.open(URL) opens the hdf5 file at a remote location
for read-only access and returns the file identifier, file_id.
Also, this use-case is demonstrated in an example:
Example: Open a file in Amazon S3 in read-only mode with
default file access properties.
H5F.close(fid);
I tried some optional arguments, which did not help:
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY')
Not enough input arguments.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
>> H5F.open('https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9', 'H5F_ACC_RDONLY', 'H5P_DEFAULT')
Error using hdf5lib2
Unable to access
'https://dandiarchive.s3.amazonaws.com/blobs/58c/537/58c53789-eec4-4080-ad3b-207cf2a1cac9'. The
specified URL scheme is invalid.
Error in H5F.open (line 130)
file_id = H5ML.hdf5lib2('H5Fopen', filename, flags, fapl, is_remote);
It looks like MATLAB is doing validation on the input, ensuring that the path starts with s3://, which mine does not.
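
To see what such a check would look at, the scheme of a URL can be inspected with Python's standard urllib.parse. This only illustrates the idea of a scheme check; it is an assumption about the kind of validation involved, not MATLAB's actual implementation, and has_s3_scheme is a hypothetical name.

```python
from urllib.parse import urlparse


def has_s3_scheme(url: str) -> bool:
    """Return True when the URL's scheme is exactly 's3'."""
    return urlparse(url).scheme == "s3"


if __name__ == "__main__":
    https_url = ("https://dandiarchive.s3.amazonaws.com/blobs/58c/537/"
                 "58c53789-eec4-4080-ad3b-207cf2a1cac9")
    # The URL from the question has scheme "https", so a check like this rejects it.
    print(urlparse(https_url).scheme, has_s3_scheme(https_url))
```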
