Reading multiple datasets in hdf5

Amit Sinha on 24 May 2023
Answered: David on 1 Dec 2023
I have an HDF5 file in which the data is organized in subgroups of one group, /abc/xyz1 through /abc/xyznnn.
Each subgroup contains the same datasets: name, age, loc.
How do I loop through these n subgroups to read all the data from the HDF5 file?

Answers (2)

Animesh on 30 May 2023
Hello!
You can use MATLAB's h5info function to get information about the contents of the HDF5 file (including the names of the datasets and groups) and h5read to read the data from a dataset.
You can do something like this to read all the data from all subgroups of "/abc":
filename = 'yourfile.h5';
info = h5info(filename, '/abc');
% Each entry is the full path of a subgroup, e.g. '/abc/xyz1'
subgroup_names = {info.Groups.Name};
for i = 1:length(subgroup_names)
    groupname = subgroup_names{i};
    name = h5read(filename, strcat(groupname, '/name'));
    age = h5read(filename, strcat(groupname, '/age'));
    loc = h5read(filename, strcat(groupname, '/loc'));
    % Do something with the data
end
For more information, see the MATLAB documentation for h5info and h5read.

David on 1 Dec 2023
Hello, I think you can use the h5py library in Python. Here's an example of how you can achieve this:
import h5py

# Open the HDF5 file
file = h5py.File('your_file.hdf5', 'r')

# Get the group containing the subgroups
group = file['abc']

# Loop through the subgroups
for subgroup_name in group:
    subgroup = group[subgroup_name]

    # Access the datasets within the subgroup
    name = subgroup['name'][:]
    age = subgroup['age'][:]
    loc = subgroup['loc'][:]

    # Process the data as needed
    print(f"Subgroup: {subgroup_name}")
    print(f"Name: {name}")
    print(f"Age: {age}")
    print(f"Location: {loc}")

# Close the HDF5 file
file.close()
In this example, you first open the HDF5 file using h5py.File() and then access the group containing the subgroups using file['abc']. You can replace 'abc' with the actual name of your group.
Next, you iterate over the subgroups in the group using a for loop. For each subgroup, you access the datasets name, age, and loc by indexing into the subgroup object (subgroup['name'], subgroup['age'], subgroup['loc']). The [:] indexing is used to read the data from the datasets.
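One thing to watch with the loop: as far as I know, iterating over a group usually gives the names in alphanumeric order, so xyz10 would come before xyz2. If the numeric order matters, a small sort key along these lines might help. This is only a sketch; natural_key is a made-up helper name, and the list of names is a stand-in for what iterating over file['abc'] would give you.

```python
import re


def natural_key(name):
    """Split a name into text and number chunks so 'xyz10' sorts after 'xyz2'."""
    # re.split with a capturing group alternates text, digits, text, ...
    # so every odd-indexed chunk is a run of digits.
    chunks = re.split(r"(\d+)", name)
    return [int(chunk) if i % 2 else chunk for i, chunk in enumerate(chunks)]


# Stand-in for the subgroup names (assumed; yours come from the file)
names = ["xyz1", "xyz10", "xyz2"]
ordered = sorted(names, key=natural_key)
print(ordered)  # ['xyz1', 'xyz2', 'xyz10']
```

With a real file you would write something like for subgroup_name in sorted(group, key=natural_key) in place of the plain for loop.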
You can process the data within the loop as needed. In the example, I've simply printed the data, but you can perform any required operations or store the data in a suitable data structure.
Finally, don't forget to close the HDF5 file using file.close() to free up system resources.
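If you would rather keep the data than print it, and have the file closed for you even when something fails partway, the two steps can be combined roughly as below. Treat it as a sketch under some assumptions: collect_records and read_all are names I made up, the group is assumed to be called 'abc', and the demo dictionary with its values is invented purely to show the shape of the result without needing a file.

```python
def collect_records(parent, fields=("name", "age", "loc")):
    """Return {subgroup_name: {field: data}} for every subgroup of `parent`.

    `parent` can be an h5py group or any nested mapping with the same layout.
    """
    records = {}
    for subgroup_name in parent:
        subgroup = parent[subgroup_name]
        # Groups expose keys(); datasets stored directly under the parent do not.
        if not hasattr(subgroup, "keys"):
            continue
        # [:] reads a whole (non-scalar) dataset, as in the loop above.
        records[subgroup_name] = {
            field: subgroup[field][:] for field in fields if field in subgroup
        }
    return records


def read_all(path, parent_name="abc"):
    """Open `path` read-only, collect every subgroup, and close the file on exit."""
    import h5py  # imported here so collect_records can be tried without h5py

    with h5py.File(path, "r") as f:
        return collect_records(f[parent_name])


# Stand-in for file['abc'] with invented values, just to show the structure
demo = {
    "xyz1": {"name": ["Ann"], "age": [31], "loc": ["Oslo"]},
    "xyz2": {"name": ["Bob"], "age": [27], "loc": ["Lima"]},
}
records = collect_records(demo)
print(records["xyz2"]["age"])  # [27]
```

On a real file you would call records = read_all('your_file.hdf5'). The with block closes the file when it exits, so no explicit file.close() is needed in that version.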

Release

R2022b
