Synchronization Algorithm

Each persistent object belongs to exactly one scope, which is “owned” by some subset of the tasks in the file’s communicator. These tasks are allowed to append new objects to the scope’s tables and to modify existing objects in those tables. Since it is of paramount importance that all tasks owning a scope have the same information to write to the underlying dataset, we must ensure that objects are maintained consistently across the scope’s communicator. There are at least three approaches:

  • Ensure that any object update is collective and that all tasks make identical changes.
  • Ensure that all updates are collective but use communication to broadcast from one task to the others.
  • Allow objects to become out of sync across the tasks and synchronize them before writing to disk.

The problem with the first two approaches is that object updates are collective. The original VBT layer took a collective approach, made even worse by the fact that VBT updates used the file communicator. The second approach is worse than the first because it is not only collective but also requires communication. SSlib uses the third approach, which is substantially more complicated but results in an API where each task can independently create and/or modify objects, or where subsets of the scope communicator can cooperate to define a single object. Getting an object to disk becomes a two-part problem: (1) synchronize the object across the scope’s communicator, then (2) flush the object to the file if it is dirty. It is hoped that a synchronization approach can perform better because it will result in fewer, larger inter-process messages.
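The two-part process can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not SSlib code: the names sim_copy_t, sim_synchronize and sim_flush are hypothetical, the scope's tasks are modeled as array elements rather than MPI ranks, and the rule that the single unsynchronized copy supplies the agreed value is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NTASKS 3

/* One task's local copy of an object (hypothetical; not an SSlib type). */
typedef struct {
    int  value;   /* stand-in for the object's data */
    bool dirty;   /* local: changed since last written to the file */
    bool synced;  /* local: unchanged since the tasks last agreed */
} sim_copy_t;

/* Step 1: make all copies agree.
 * ASSUMPTION: at most one task holds an unsynchronized copy and its value
 * wins. A real implementation would use collective communication here. */
void sim_synchronize(sim_copy_t copies[NTASKS])
{
    int src = -1;
    for (int t = 0; t < NTASKS; t++) {
        if (!copies[t].synced) {
            assert(src < 0);      /* sketch handles a single modifier only */
            src = t;
        }
    }
    if (src < 0)
        return;                   /* every copy is already synchronized */

    for (int t = 0; t < NTASKS; t++) {
        if (t != src && copies[t].value != copies[src].value) {
            copies[t].value = copies[src].value;
            copies[t].dirty = true;   /* local data now differs from the file */
        }
        copies[t].synced = true;
    }
}

/* Step 2: the designated writer flushes its copy if it is dirty.
 * Only the writer clears its dirty bit. Returns true if a write occurred. */
bool sim_flush(sim_copy_t copies[NTASKS], int writer, int *file)
{
    assert(writer >= 0 && writer < NTASKS && file != NULL);
    if (!copies[writer].dirty)
        return false;
    *file = copies[writer].value;
    copies[writer].dirty = false;
    return true;
}
```

In this sketch one task changes its copy independently, sim_synchronize brings the other copies into agreement, and sim_flush then writes from a single task. The dirty bits on the non-writing tasks stay set, which mirrors the observation that dirtiness is a local characteristic.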

First, some definitions:

  • A dirty object is any object whose data has changed subsequent to being stored in a file. This is a local characteristic: each task may have its own idea of whether an object is dirty. Normally only one task ever writes a particular object to the file, so only that one task ever sets dirty bits back to zero (but that’s fine because the dirty bits are not used for any other purpose).
  • A clean object is any object that is not dirty.
  • Whereas dirty and clean refer to whether an object was changed after being committed to storage, synchronized and unsynchronized refer to whether an object was changed after some point in time when all tasks agreed on its value. These characteristics are also local to a task, although the act of marking an object synchronized (i.e., synchronizing the object) is a collective operation.
  • A new object is any object that was created without the collective cooperation of all tasks of the object’s scope. Such an object is born dirty and unsynchronized and is assigned a temporary slot in the table that holds the object. The act of synchronizing the object assigns the object to a permanent table slot. Objects can also be created with the SS_ALLSAME bit, but since those objects are immediately given a permanent slot assignment and marked as synchronized (but dirty), they are not referred to as new objects.
  • An unresolved link is a persistent object link that points to a new object. Since a new object must exist in memory, an unresolved link must be in the SS_PERS_LINK_MEMORY state and point to an object that uses only indirect indexing (i.e., the link itself contains an indirect index and the object’s mapidx field also contains an indirect index). Any link that points to a non-new object is a resolved link.
  • An unresolved object is any persistent object that contains at least one unresolved link. All other objects are said to be resolved.
  • To synchronize an unsynchronized object means to communicate among the scope’s tasks so that all tasks have the same information.
  • To clean an object means to write its data to the file.
  • To resolve a link means to examine the object to which it points and adjust the link contents if the object is found to be non-new. Otherwise the link remains unresolved.
  • To resolve an object means to resolve all links emanating from that object.
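
The definitions of new objects, unresolved links and resolving can be made concrete with a toy model. The following is a minimal sketch, not the SSlib data structures: toy_obj_t, toy_link_t, toy_resolve_link and toy_resolve_object are hypothetical names, and a plain pointer plus a slot number stands in for the direct and indirect indexing described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINKS 4

typedef struct toy_obj toy_obj_t;

/* Hypothetical link: points at an in-memory object until it is resolved. */
typedef struct {
    toy_obj_t *target;    /* object the link points to (must not be NULL) */
    bool       resolved;  /* true once the target is known to be non-new */
    size_t     slot;      /* target's permanent slot; valid if resolved */
} toy_link_t;

/* Hypothetical persistent object with the local flags defined above. */
struct toy_obj {
    bool       dirty;     /* changed since last stored in the file */
    bool       synced;    /* unchanged since the tasks last agreed */
    bool       is_new;    /* still occupies a temporary table slot */
    size_t     slot;      /* permanent slot; valid only if !is_new */
    toy_link_t links[MAX_LINKS];
    size_t     nlinks;
};

/* Resolve a link: examine its target and adjust the link if the target is
 * non-new; otherwise leave it unresolved. Returns the resulting state. */
bool toy_resolve_link(toy_link_t *link)
{
    assert(link != NULL && link->target != NULL);
    if (!link->resolved && !link->target->is_new) {
        link->slot = link->target->slot;
        link->resolved = true;
    }
    return link->resolved;
}

/* Resolve an object: resolve every link emanating from it.
 * Returns true only if no unresolved link remains. */
bool toy_resolve_object(toy_obj_t *obj)
{
    bool all_resolved = true;
    assert(obj != NULL && obj->nlinks <= MAX_LINKS);
    for (size_t i = 0; i < obj->nlinks; i++) {
        if (!toy_resolve_link(&obj->links[i]))
            all_resolved = false;
    }
    return all_resolved;
}
```

An object that links to a new object stays unresolved until the target receives a permanent slot; calling toy_resolve_object again after that point succeeds and records the slot in the link.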

The synchronization algorithm operates on one table at a time via ss_table_synchronize. It is usually invoked on the tables in a particular order to minimize table dependencies, since only resolved objects can be synchronized.
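
Why the order matters can be seen in a toy model with two tables, one of which holds links into the other. This is a sketch under assumptions and does not reproduce ss_table_synchronize: the ord_ names are hypothetical, and having a pass skip unresolved objects and report how many were left behind is a simplification chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_OBJS 4

/* Hypothetical object: optionally links to an object in another table. */
typedef struct ord_obj {
    bool            is_new;  /* still occupies a temporary slot */
    struct ord_obj *link;    /* NULL if the object has no link */
} ord_obj_t;

typedef struct {
    ord_obj_t objs[MAX_OBJS];
    size_t    nobjs;
} ord_table_t;

/* An object is resolved when it has no link to a new object. */
bool ord_is_resolved(const ord_obj_t *o)
{
    return o->link == NULL || !o->link->is_new;
}

/* Toy stand-in for one per-table synchronization pass (ASSUMPTION: not the
 * behavior of ss_table_synchronize). Every resolved new object becomes
 * non-new; unresolved ones are skipped. Returns the number skipped. */
size_t ord_sync_table(ord_table_t *t)
{
    size_t left = 0;
    assert(t != NULL && t->nobjs <= MAX_OBJS);
    for (size_t i = 0; i < t->nobjs; i++) {
        ord_obj_t *o = &t->objs[i];
        if (!o->is_new)
            continue;            /* already has a permanent slot */
        if (ord_is_resolved(o))
            o->is_new = false;   /* permanent slot assigned */
        else
            left++;              /* must wait for its link target */
    }
    return left;
}
```

If the referring table is processed first, its object is left behind and a second pass is needed; processing the referenced table first lets both tables finish in one pass each.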