ILNumerics - Technical Computing

HDF5 Attributes

Attributes are a way of storing metadata with general objects in HDF5. Common examples of attributes: 

  • The date and time of a measurement, stored with the acquired data in a dataset.
  • The name of a measurement or of the author of some data.
  • Units of the data.
  • Additional information associated with the file, such as original storage location, the system used to create the file, etc.

General information about HDF5 attributes is found on the HDF5 site. This article shows how to work with HDF5 attributes in ILNumerics.IO.HDF5.

Attributes represent name/value pairs attached to HDF5 objects.

Attribute Names

The name of an attribute is specified at creation time and cannot be changed afterwards. Attribute names can consist of any Unicode character string. Names must be unique among the attributes of the same object. Because the name is used to identify the attribute later, it should be chosen carefully and relate to the value stored in the attribute.

Attribute Values

Attribute values are stored as an array with an arbitrary number of elements of the same type. In this respect, attributes are very similar to datasets and Array<T>. The array value is stored with the attribute in a data structure which is not optimized for huge data. Therefore, attribute values are expected to be rather small. However, the HDF5 specification gives neither a hard limit nor a recommended upper limit.

While technically each attribute stores an array, many attributes represent scalar values only.

Allowed types for attribute elements include all numeric formats supported by ILNumerics as well as arrays of strings.

Creating Attributes

HDF5 attributes in ILNumerics are created by adding a new H5Attribute object to an existing HDF5 object. Every object in ILNumerics.IO.HDF5 exposes the Attributes collection which is used to access and manage the list of attributes for the object:
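A minimal sketch of this step could look as follows. It assumes that H5File can be opened (and created) from a file name and that H5Attribute offers a constructor taking a name and a value; the file name and the attribute values are hypothetical.

```csharp
using ILNumerics;
using ILNumerics.IO.HDF5;
using static ILNumerics.ILMath;

// "attributes_create.h5" is a hypothetical file name.
using (var file = new H5File("attributes_create.h5")) {
    // Assumed constructor: H5Attribute(name, value).
    file.Attributes.Add(new H5Attribute("name", "Test series 1"));
    // Attribute values are arrays; here a 3 x 4 matrix of doubles.
    file.Attributes.Add(new H5Attribute("attr. #2", rand(3, 4)));
}
```

Here the attributes are attached to the root node of the file. Any other object exposing the Attributes collection should work in the same way.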

If you are lucky enough to use C# there is an even shorter syntax to set up new HDF5 attributes:
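The shorter form presumably relies on the indexer of the Attributes collection. The sketch below assumes that assigning to a name which does not exist yet creates the attribute; the file name and the value are hypothetical.

```csharp
using ILNumerics.IO.HDF5;

// "attributes_short.h5" is a hypothetical file name.
using (var file = new H5File("attributes_short.h5")) {
    // Assumption: the indexer creates the attribute when the name is not in use yet.
    file.Attributes["AttributeString"] = "Some descriptive text";
}
```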

Note that the order in which attributes are created does not necessarily correspond to the order in which they are stored in the file, which is why a unique name is required to address each attribute. Data tips in Visual Studio can be used to inspect the HDF5 file and to browse the list of attributes of the root node. In one such inspection, the attributes were created in the order name, attr. #2 and AttributeString, but were retrieved in reverse order.

Accessing Attributes

The Attributes collection property can be used to access the attributes of any HDF5 object. Accessing (reading) the value of an attribute consists of the following steps:

1) Locate the object the attribute is stored with.

2) Use the Attributes property to retrieve the attribute.

3) Use the Get<T> method of the H5Attribute class in order to retrieve the value of the attribute.
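The three steps can be sketched as follows. The first read goes through the hosting object; the second read uses First<T>() with includeAttributes enabled. The sketch assumes that First<T>() accepts the attribute name as its first argument and that an array can be assigned to the Attributes indexer; file name and values are hypothetical.

```csharp
using System;
using ILNumerics;
using ILNumerics.IO.HDF5;
using static ILNumerics.ILMath;

// "attributes_read.h5" is a hypothetical file name.
using (var file = new H5File("attributes_read.h5")) {
    // Set up a numeric attribute on the root node so there is something to read.
    file.Attributes["attr. #2"] = vector<double>(1.0, 2.0, 3.0);

    // Steps 1-3: hosting object -> Attributes -> Get<T>().
    Array<double> A = file.Attributes["attr. #2"].Get<double>();
    // Alternative: search the file directly (name as first argument is an assumption).
    Array<double> B = file.First<H5Attribute>("attr. #2", includeAttributes: true).Get<double>();

    Console.WriteLine(A);
    Console.WriteLine(B);
}
```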

As an alternative to accessing the attribute via the hosting object, the H5Group.First<T>() method can be used to find the attribute in the file directly. Note that First<T>() does not recognize attributes by default, so the search for attributes must be enabled by setting the includeAttributes argument to true. The other methods for navigating the objects in the HDF5 file work similarly.

Note that the Get<T>() method does not allow retrieving parts of the (array) value. The whole attribute value is retrieved at once. However, since Get<T>() returns an ILNumerics array, all options for retrieving subarrays or individual elements from Array<T> apply.
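As a rough illustration, the complete value is read first and subarrays are taken from the in-memory array afterwards. The attribute name "calibration" and the file name are hypothetical; the sketch also assumes that an array can be assigned to the Attributes indexer and that the full placeholder is available from ILNumerics.Globals.

```csharp
using System;
using ILNumerics;
using ILNumerics.IO.HDF5;
using static ILNumerics.ILMath;
using static ILNumerics.Globals;

// "attributes_sub.h5" is a hypothetical file name.
using (var file = new H5File("attributes_sub.h5")) {
    // Hypothetical 3 x 4 attribute value.
    file.Attributes["calibration"] = rand(3, 4);

    // Get<T>() always reads the complete value ...
    Array<double> all = file.Attributes["calibration"].Get<double>();
    // ... parts are then selected from the in-memory array.
    Array<double> firstColumn = all[full, 0];
    double element = all.GetValue(1, 2);

    Console.WriteLine(firstColumn);
    Console.WriteLine(element);
}
```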

Modifying Attributes

The value of an attribute can be changed after creation. This is easiest for scalar attributes. Just assign the new value to the indexer of the Attributes collection:
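A minimal sketch, assuming that the indexer both creates and overwrites an attribute; the attribute name "temperature", its values and the file name are hypothetical.

```csharp
using ILNumerics.IO.HDF5;

// "attributes_scalar.h5" is a hypothetical file name.
using (var file = new H5File("attributes_scalar.h5")) {
    // Create the scalar attribute (assumed to work via the indexer).
    file.Attributes["temperature"] = 21.5;
    // Overwrite the existing value with a new scalar.
    file.Attributes["temperature"] = 23.0;
}
```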

New array values can be assigned in the same manner:
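For array values the assignment might look like this. The sketch assumes that ILNumerics arrays can be assigned to the indexer directly; the attribute name "limits" and the file name are hypothetical.

```csharp
using ILNumerics;
using ILNumerics.IO.HDF5;
using static ILNumerics.ILMath;

// "attributes_array.h5" is a hypothetical file name.
using (var file = new H5File("attributes_array.h5")) {
    // Initial array value with two elements.
    file.Attributes["limits"] = vector<double>(0.0, 10.0);
    // Replace it by another array of the same size.
    file.Attributes["limits"] = vector<double>(-5.0, 5.0);
}
```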

Moreover, the length and number of dimensions of the attribute can be changed as well. In the next example the scalar attribute "owner" is extended by adding another owner's name, hence turning the attribute's value into a vector:

Note that attribute values can only be changed as a whole. Partial access, as is possible for datasets or Array<T>, is not supported.
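One way this could look is sketched below. It assumes that a string array converts to Array<string> and that such an array can be assigned to the indexer; the owner names and the file name are hypothetical.

```csharp
using ILNumerics;
using ILNumerics.IO.HDF5;

// "attributes_owner.h5" is a hypothetical file name.
using (var file = new H5File("attributes_owner.h5")) {
    // Scalar string attribute.
    file.Attributes["owner"] = "Jim";

    // Assumption: string[] converts implicitly to Array<string>.
    Array<string> owners = new string[] { "Jim", "Jack" };
    // The whole value is replaced; the attribute now holds a vector of two names.
    file.Attributes["owner"] = owners;
}
```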

Properties of Attributes

HDF5 attributes represent objects with properties. These properties can be inspected at runtime. Note that properties of attributes are read-only: they can only be set at the time the attribute is created.

 

Properties of H5Attribute, given as property name (type) and description:

  • Class (enum H5Class): The base type of the attribute's elements. The enum values defined here correspond to the base type values in HDF5. Potential values include 'Integer', 'Float', and 'String'.
  • File (H5File): Reference to the file object this attribute is stored in. The file ID will be useful when combining low level access via HDF.PInvoke with the high level API in ILNumerics.
  • H5Type (H5ObjectTypes): Derived from H5Object. The type of the object. Always returns "Attribute".
  • Name (string): The name of the attribute. Used to identify the attribute in the collection.
  • NameEncoding (StringEncoding): Either 'ASCII' or 'UTF8'. See: Strings in HDF5.
  • Path (string): The absolute path from the root of the file to the object hosting this attribute.
  • Size (RetArray<long>): The size of the attribute's value, including the number and lengths of the attribute's dimensions. This value may change when the attribute's value is changed using the Set() method.
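A short sketch of inspecting these properties at runtime. The property names are taken from the list above; the attribute name "units", its value and the file name are hypothetical, and creating the attribute via the indexer is an assumption.

```csharp
using System;
using ILNumerics;
using ILNumerics.IO.HDF5;

// "attributes_props.h5" is a hypothetical file name.
using (var file = new H5File("attributes_props.h5")) {
    file.Attributes["units"] = "mV";

    H5Attribute att = file.Attributes["units"];
    Console.WriteLine(att.Name);          // name used in the collection
    Console.WriteLine(att.Path);          // path of the hosting object
    Console.WriteLine(att.Class);         // base type of the elements
    Console.WriteLine(att.H5Type);        // type of the object
    Console.WriteLine(att.NameEncoding);  // encoding of the name
    Console.WriteLine(att.Size);          // dimensions of the value
}
```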

 
