ILNumerics - Technical Computing


HDF5 Multithreading Notes

Living in a parallel world increasingly requires us to use our CPU cores more efficiently, and multithreading is the most convenient way to do so. The HDF Group is actively working on comprehensive support in this regard. The following describes the state of multithreading support in the HDF5 library version 1.8.17 (as of 2016, ILNumerics version 4.12), quoted from the HDF5 FAQ:

 

Is HDF5 multi-threaded?

No, HDF5 is not multi-threaded.

Is HDF5 thread-safe?

The HDF5 library can be built in thread-safe mode. The thread-safe version of the HDF5 library effectively serializes the HDF5 library calls. It is thread-safe but not thread-efficient.

 

ILNumerics internally relies on the HDF.PInvoke signatures and on the prebuilt HDF5 binaries that come with that project. These binaries are built in thread-safe mode. Nevertheless, keep the following notes in mind when using HDF5 from multiple threads:

  • To use the HDF5 binaries from multiple threads, you must synchronize calls to the library yourself. Use the common locking mechanisms to prevent concurrent access to the underlying HDF5 library and to HDF5 file resources. This applies even when multiple threads access multiple, independent HDF5 files (a restriction which will be removed from ILNumerics v4.13 on).
  • Even with proper locking in place, reading data concurrently will not speed up the overall retrieval process, because HDF5 internally serializes accesses to the library. It is thread-safe in that no internal data gets corrupted, but it is not thread-efficient.
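
The locking advice above can be illustrated with a minimal sketch. It assumes that a single process-wide lock is acceptable, which matches the note that even independent files must not be accessed concurrently. `Hdf5Gate` and `FakeRead` are hypothetical names introduced for this illustration only; they are not part of ILNumerics or HDF.PInvoke, and `FakeRead` merely stands in for real HDF5 calls.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of ILNumerics or HDF.PInvoke):
// funnels every HDF5-related call through one process-wide lock.
public static class Hdf5Gate
{
    // One lock for the whole process, so that calls touching
    // different files are serialized as well.
    private static readonly object s_sync = new object();

    public static T Run<T>(Func<T> hdf5Call)
    {
        lock (s_sync)
        {
            return hdf5Call();
        }
    }
}

public static class Program
{
    private static int s_inside;     // threads currently inside the gate
    private static int s_maxInside;  // highest value s_inside ever reached

    // Placeholder for real HDF5 work, e.g. opening a file and reading a dataset.
    // The counters are only modified while the gate's lock is held.
    private static int FakeRead(int i)
    {
        s_inside++;
        s_maxInside = Math.Max(s_maxInside, s_inside);
        Thread.Sleep(1);             // simulate some I/O latency
        s_inside--;
        return i * 2;
    }

    public static void Main()
    {
        var results = new int[8];

        // Several worker threads request "reads"; the gate serializes them.
        Parallel.For(0, results.Length, i =>
        {
            results[i] = Hdf5Gate.Run(() => FakeRead(i));
        });

        Console.WriteLine(string.Join(",", results)); // prints 0,2,4,6,8,10,12,14
        Console.WriteLine(s_maxInside);               // prints 1: never more than one thread inside
    }
}
```

In real code, the delegate passed to `Hdf5Gate.Run` would contain the complete sequence of HDF5 calls that belong together (open, read, close), so that no other thread can interleave its own calls. As stated above, this protects correctness only; it does not make concurrent reads any faster.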

 

Single Writer Multiple Reader (SWMR)

The HDF Group has already identified the requirement for multithreading support. HDF5 version 1.10, released in 2016, introduces a great new feature: Single Writer Multiple Reader (SWMR). It allows a single process to write to a dataset while multiple other processes read from the same dataset concurrently. However, this major release also introduces a number of breaking changes that could break existing applications of ILNumerics.IO.HDF5 users. We have therefore not made this jump yet, and ILNumerics 4.12 still works with HDF5 version 1.8.17. Note also that HDF5 v1.10 still requires careful locking of library calls that access shared resources in a multithreaded setup.