Set two-phase I/O properties

ss_blob_set_2pio is a function defined in ssblob.c.

Synopsis:

herr_t ss_blob_set_2pio(ss_blob_t *blob, ss_prop_t *props)

Formal Arguments:

  • blob: Optional blob to whose file these settings apply. If no blob is specified, the settings apply to SSlib in general and can be overridden by individual blobs.
  • props: See *Aggregation Properties*.

Description: Sets properties for two-phase I/O for the HDF5 file associated with the specified blob. If no blob is supplied then the library-wide defaults are adjusted. Only the parameters present in the property list are adjusted. If called with null pointers for both blob and props then the library-wide defaults are initialized from the SSLIB_2PIO environment variable.

The SSLIB_2PIO environment variable should be the word “yes” (or “on”), which causes SSlib to use default values for two-phase I/O; the word “off” (or “no”), which turns off two-phase I/O; or a semicolon-separated list of terms of the form KEY=VALUE. The KEY names are listed below and are the same names that can appear in the property list.

  • minbufsize: The value should be an integer that specifies the minimum size in bytes to use for each aggregation buffer. The actual aggregation buffer size is approximated by dividing the dataset size by the number of aggregators, subject to the alignment specified below. The default is 512kB, which is the GPFS page size on LLNL’s AIX systems.
  • alignment: The aggregation tasks will each be responsible for a dataset part which is some multiple of the alignment size. However, SSlib will ignore the alignment under certain conditions when the alignment isn’t a multiple of the dataset element size. The default is 512kB, which causes aggregation buffers to align on GPFS page boundaries on LLNL systems.
  • maxaggtasks: The maximum number of aggregation tasks to use for each dataset. The actual number is approximated by dividing the dataset size by the minbufsize, and limiting that by the maxaggtasks value. The default is to use 1/32 of the total MPI tasks, rounded up.
  • sendqueue: This is the maximum number of buffers that can be held by any task during the data shipping phase of two-phase I/O. The default is four. If more than four buffers are requested then SSlib will block pending completion of one of the previous asynchronous MPI_Isend operations.
  • aggbuflimit: This is the maximum number of bytes that can be used for all aggregation buffers across all files on a particular MPI task. An operation that would result in more than this amount being allocated by SSlib will cause SSlib to block until some asynchronous MPI_Irecv and H5Dwrite calls complete. The default is 10MB.
  • asynchdf5: The value is a Boolean that specifies whether SSlib should attempt to use POSIX.1b asynchronous I/O (AIO) in HDF5’s mpiposix virtual file driver. Doing so currently requires a small patch to HDF5. The default is to attempt AIO.
  • aggbase: SSlib chooses aggregators by taking a base MPI task and adding multiples of some aggregator increment modulo the number of tasks. The aggbase term specifies how to choose the base aggregator. It can be the rank of a particular MPI task or the value -1, which indicates that the base aggregator is chosen by hashing the dataset’s HDF5 object header address. This is normally used only for debugging.
  • tpn: Tasks per node, used to determine which MPI tasks serve as aggregators for a particular dataset. If unspecified (or non-positive) then SSlib uses an algorithm that attempts to distribute aggregators for a particular MPP architecture assuming 4 or 16 tasks per node.
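
The arithmetic described for maxaggtasks and aggbase can be illustrated with a short sketch. This is not SSlib's code: the function names are hypothetical, the truncating division and the lower bound of one aggregator are assumptions (the text only says the count is "approximated"), and the aggregator increment is taken as a parameter because the text does not say how SSlib picks it. The hashing used when aggbase is -1 is not shown.

```c
#include <assert.h>

/* Default cap on aggregators: 1/32 of the MPI tasks, rounded up. */
int default_maxaggtasks(int ntasks)
{
    assert(ntasks > 0);
    return (ntasks + 31) / 32;
}

/* Approximate aggregator count: dataset size divided by minbufsize,
 * limited by maxaggtasks. Truncating division and the floor of one
 * aggregator are assumptions made for this sketch. */
int approx_naggs(unsigned long long dataset_bytes,
                 unsigned long long minbufsize, int maxaggtasks)
{
    unsigned long long n;

    assert(minbufsize > 0 && maxaggtasks > 0);
    n = dataset_bytes / minbufsize;
    if (n < 1)
        n = 1;
    if (n > (unsigned long long)maxaggtasks)
        n = (unsigned long long)maxaggtasks;
    return (int)n;
}

/* Rank of the i-th aggregator: base task plus i increments, modulo the
 * number of tasks. `incr` is a hypothetical parameter. */
int aggregator_rank(int base, int incr, int i, int ntasks)
{
    assert(ntasks > 0 && base >= 0 && incr >= 0 && i >= 0);
    return (int)(((long long)base + (long long)i * incr) % ntasks);
}
```

With 64 MPI tasks the default cap works out to 2 aggregators per dataset, and with 65 tasks it rises to 3 because of the rounding up.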

Values for keys that take a size argument consist of an integer optionally followed by a multiplier suffix, which may be any one of kB (or k, kb), MB (or M, m, mb), or GB (or G, g, gb) to indicate multiplication by 2^10, 2^20, or 2^30 respectively. Suffixes can only be used with the environment variable; property lists always specify values in terms of bytes.
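
As an illustration of the suffix rules, the following sketch converts a size string to bytes. It is a minimal example rather than SSlib's parser: parse_size is a hypothetical name, it accepts exactly the suffix spellings listed above and nothing else, and treating empty input, unknown suffixes, and overflow as errors is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Convert a string such as "512kB" to a byte count.
 * Returns the number of bytes, or -1 if the text is not a valid size. */
long long parse_size(const char *text)
{
    static const struct { const char *suffix; unsigned shift; } table[] = {
        {"", 0},
        {"kB", 10}, {"k", 10}, {"kb", 10},
        {"MB", 20}, {"M", 20}, {"m", 20}, {"mb", 20},
        {"GB", 30}, {"G", 30}, {"g", 30}, {"gb", 30},
    };
    char *end;
    long long n;
    size_t i;

    assert(text != NULL);
    if (!isdigit((unsigned char)text[0]))
        return -1;                      /* must start with a digit */
    errno = 0;
    n = strtoll(text, &end, 10);
    if (errno == ERANGE)
        return -1;                      /* integer part too large */
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(end, table[i].suffix) == 0) {
            if (n > (LLONG_MAX >> table[i].shift))
                return -1;              /* scaled value would overflow */
            return n * (1LL << table[i].shift);
        }
    }
    return -1;                          /* unrecognized suffix */
}
```

For example, parse_size("512kB") yields 524288 and parse_size("10MB") yields 10485760, matching the minbufsize and aggbuflimit defaults described above.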

Keys which take a Boolean value can be set to “yes”, “on”, “true”, “no”, “off”, or “false” in the environment variable; in a property list, Boolean properties always have an integer value.
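
The term syntax can be made concrete with a value such as minbufsize=1MB;maxaggtasks=8;asynchdf5=no. The sketch below walks such a semicolon-separated list and interprets one Boolean term. It is illustrative only: parse_bool and lookup_bool are hypothetical names, matching is case-sensitive with no whitespace trimming (both assumptions), and the whole-variable forms “yes”, “on”, “off”, and “no” are not handled.

```c
#include <assert.h>
#include <string.h>

/* Interpret the first `len` characters of `s` as a Boolean word.
 * Returns 1 for true, 0 for false, -1 if the word is not recognized. */
int parse_bool(const char *s, size_t len)
{
    static const char *yes[] = {"yes", "on", "true"};
    static const char *no[]  = {"no", "off", "false"};
    size_t i;

    for (i = 0; i < 3; i++) {
        if (strlen(yes[i]) == len && strncmp(s, yes[i], len) == 0)
            return 1;
        if (strlen(no[i]) == len && strncmp(s, no[i], len) == 0)
            return 0;
    }
    return -1;
}

/* Find the term KEY=VALUE for `key` in a semicolon-separated list and
 * return its Boolean value, or -1 if the key is absent or invalid. */
int lookup_bool(const char *spec, const char *key)
{
    size_t keylen;
    const char *term;

    assert(spec != NULL && key != NULL);
    keylen = strlen(key);
    term = spec;
    while (*term != '\0') {
        size_t termlen = strcspn(term, ";");    /* length up to ';' or end */
        if (termlen > keylen && strncmp(term, key, keylen) == 0 &&
            term[keylen] == '=')
            return parse_bool(term + keylen + 1, termlen - keylen - 1);
        term += termlen;
        if (*term == ';')
            term++;                             /* skip the separator */
    }
    return -1;
}
```

Returning -1 for both a missing key and an unrecognized word keeps the sketch short; a real parser would likely distinguish the two so that a missing key leaves the default in place.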

Return Value: Returns non-negative on success; negative on failure.

Parallel Notes: When setting values for a particular file, the call must be collective across the blob’s file communicator, with the non-blob-owning tasks passing the blob’s top-level scope instead. When setting library-wide defaults (blob is the null pointer), the call must be collective across the library communicator. In either case all tasks must pass identical properties.

See Also:

  • I/O: Introduction for current chapter