Committing S Assignments

This is an extension of the discussion (e.g., on page 178) of assignment expressions, particularly at the top level. The model for evaluating these expressions is simple, but it can have some unexpected consequences. In addition, we discuss the related options(immediate="true") and its effect.

The assignment operation

    scaleRes = myscale(curfit$residuals)
always has the effect of associating the dataset name, "scaleRes", with the value of the evaluated expression in the local frame, that is, the frame in which the expression is evaluated. When that frame is frame 1, the task frame, the evaluator takes an additional action, depending on the value of the "immediate" option. For the moment, assume that the option has its default value, FALSE. In this case the evaluator in effect schedules an assignment operation to be added to the exit actions for frame 1:
    if(error.level() == 0)
      assign("scaleRes", scaleRes, where = 1)
The actual computation is done internally in the evaluator, but the call to assign is the way to think about what happens. The call will find the value of scaleRes on frame 1 and copy it to the working database. The condition that the error level be zero prevents the assignment from being committed if the task ends in an error.
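The exit action can be imitated outside S. The following is a toy model in Python, not S code and not the evaluator's actual implementation. The class Task, its method names, and the use of plain dictionaries for frame 1 and the working database are all assumptions made for illustration.

```python
class Task:
    """Toy stand-in for one top-level task (hypothetical, not an S API)."""

    def __init__(self, working_db):
        self.frame1 = {}              # objects local to the task frame
        self.working_db = working_db  # stands in for the working database
        self.pending = set()          # names with a commit scheduled on exit

    def assign_top_level(self, name, value):
        # Models `name = value` evaluated in frame 1 with immediate = FALSE.
        self.frame1[name] = value
        self.pending.add(name)

    def finish(self, error_level=0):
        # Models the exit action: copy from frame 1 only if no error occurred.
        if error_level == 0:
            for name in sorted(self.pending):
                self.working_db[name] = self.frame1[name]
        self.pending.clear()


db = {}

good = Task(db)
good.assign_top_level("scaleRes", [0.5, -1.2])
committed_mid_task = "scaleRes" in db   # nothing is copied during the task
good.finish(error_level=0)              # clean exit: the value is copied

bad = Task(db)
bad.assign_top_level("scaleRes", [9.9])
bad.finish(error_level=1)               # task ended in an error: no copy

print(committed_mid_task, db)
```

In this sketch the working database keeps the value from the task that ended cleanly; the task that ended in an error leaves it untouched.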

So what could be confusing here? Well, mainly that the commitment is an exit action at the end of the task. It's the value of the object in frame 1 at that time that gets copied. Intermediate assignments or replacements to the object in frame 1 will alter the contents of the object copied. This is obvious for further assignment expressions, but it also applies to calls to the assign function on frame 1:

    assign("scaleRes", newScaleRes, frame=1)
If this call takes place after the first assignment, newScaleRes will be the object saved. Conversely, a call to assign that writes directly to the working database during the task will, naturally, be overwritten by the scheduled call on exit:
    assign("scaleRes", newScaleRes, where=1)
You almost certainly didn't want to do this!

One related point: removing the object from frame 1 will, in effect, cause the evaluator to cancel the assign call on exit.
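The three cases just described (a later assignment on frame 1, a direct write to the working database, and removal from frame 1) can be replayed in a small Python simulation. This is a sketch of the model as described, not of the evaluator itself. The function run_task and the step names "top_level", "frame1", "where1" and "remove" are hypothetical labels invented for this example.

```python
def run_task(working_db, steps, error_level=0):
    """Replay steps against a toy frame 1, then apply the exit-time commit."""
    frame1, pending = {}, set()
    for op, name, *value in steps:
        if op == "top_level":    # models: name = value at the top level
            frame1[name] = value[0]
            pending.add(name)
        elif op == "frame1":     # models: assign(name, value, frame=1)
            frame1[name] = value[0]
        elif op == "where1":     # models: assign(name, value, where=1)
            working_db[name] = value[0]
        elif op == "remove":     # models: removing the object from frame 1
            frame1.pop(name, None)
    if error_level == 0:
        for name in sorted(pending):
            if name in frame1:   # a removed object has nothing left to commit
                working_db[name] = frame1[name]
    return working_db


later_wins = run_task({}, [("top_level", "scaleRes", "first"),
                           ("frame1", "scaleRes", "second")])
direct_lost = run_task({}, [("top_level", "scaleRes", "first"),
                            ("where1", "scaleRes", "direct")])
cancelled = run_task({}, [("top_level", "scaleRes", "first"),
                          ("remove", "scaleRes")])

print(later_wins, direct_lost, cancelled)
```

Under these assumptions the later frame 1 value is the one saved, the direct write is replaced by the frame 1 value on exit, and the removed object is never committed.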

Now, about the "immediate" option. When this is set to TRUE, the evaluator commits assignments to the working database immediately; in effect, the assign call is done right after assigning the object to frame 1. You lose the protection against committing an assignment from a task that ends in an error, but that may be just what you want: if there are problems, you can be sure that the latest version of the object is on the working database. The function synchronize does a related task: it commits the assignments when it's called, so you can have commitment on demand rather than immediately or at the end of the task.
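The difference between the three commit timings shows up most clearly when a task fails. Here is a minimal Python sketch of that comparison; it is a toy model, not S. The Evaluator class, its methods, and the helper task_that_fails are hypothetical. The model also assumes that an on-demand commit clears the list of pending assignments, which the text above does not state.

```python
class Evaluator:
    """Toy model of commit timing (hypothetical names, not an S API)."""

    def __init__(self, working_db, immediate=False):
        self.working_db = working_db
        self.immediate = immediate
        self.frame1 = {}
        self.pending = set()

    def assign_top_level(self, name, value):
        self.frame1[name] = value
        if self.immediate:
            self.working_db[name] = value   # committed right away
        else:
            self.pending.add(name)          # committed later

    def synchronize(self):
        # Commit on demand: copy whatever is pending right now.
        for name in sorted(self.pending):
            if name in self.frame1:
                self.working_db[name] = self.frame1[name]
        self.pending.clear()

    def end_task(self, error_level=0):
        if error_level == 0:
            self.synchronize()
        self.pending.clear()
        self.frame1.clear()


def task_that_fails(immediate, sync_after_first):
    """Assign twice, optionally commit on demand in between, then fail."""
    db = {}
    ev = Evaluator(db, immediate=immediate)
    ev.assign_top_level("fit", "version 1")
    if sync_after_first:
        ev.synchronize()
    ev.assign_top_level("fit", "version 2")
    ev.end_task(error_level=1)
    return db


deferred = task_that_fails(immediate=False, sync_after_first=False)
on_demand = task_that_fails(immediate=False, sync_after_first=True)
immediate = task_that_fails(immediate=True, sync_after_first=False)

print(deferred, on_demand, immediate)
```

In the sketch, the default setting saves nothing from the failed task, the on-demand commit saves the version current at the time of the call, and the immediate setting saves the latest version.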

More on synchronize: when it is called with no argument, its only effect is the one just mentioned, committing assignments to the working database. If you give it an argument identifying a database, however, it also "synchronizes" that database at the end of the task. In this case, synchronization means in effect detaching and re-attaching the database. If some other process has also been creating objects on the same directory database, for example, only by synchronizing the database in this way can the two processes expect to see the same objects. Having multiple processes access the same directory database when one or more of them can be modifying the database is a tricky business, and not usually a good idea. However, in a non-threaded language like S it may be the only general way to operate in parallel on the database.

Another warning: The action of synchronizing the database takes place at the end of the top-level task, and then only if the task did not end in an error. Don't expect to see another process's changes in the middle of a task. The reason for implementing synchronize(i) this way is that references to all or part of the objects in memory from the database may be hidden in other objects during the evaluation. To fully protect against corrupted pointers would require a lot of copying and put us in danger of memory leaks. The decision was to be safe and wait until the end of the task to re-attach the database.
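The timing of database synchronization can be pictured with another toy model in Python. It is only a sketch: treating an attached database as a cached snapshot of what is on disk is an assumption made for illustration, not a description of S internals. The class AttachedDatabase and its methods are hypothetical, and so is the choice to drop the request when the task ends in an error.

```python
class AttachedDatabase:
    """Toy model: an attached database seen through a cached snapshot."""

    def __init__(self, disk):
        self.disk = disk                # shared storage another process may change
        self.view = dict(disk)          # snapshot taken when "attached"
        self.resync_requested = False

    def synchronize(self):
        # Only records the request; nothing is re-read during the task.
        self.resync_requested = True

    def end_task(self, error_level=0):
        # Detach and re-attach, modelled as a fresh snapshot, after a clean task.
        if self.resync_requested and error_level == 0:
            self.view = dict(self.disk)
        self.resync_requested = False


disk = {"x": 1}
db = AttachedDatabase(disk)

disk["y"] = 2                       # another process creates an object
db.synchronize()
seen_mid_task = "y" in db.view      # still the old snapshot
db.end_task(error_level=0)
seen_after_task = "y" in db.view    # refreshed at the end of a clean task

disk["z"] = 3
db.synchronize()
db.end_task(error_level=1)          # task ended in an error: no re-attach
seen_after_error = "z" in db.view

print(seen_mid_task, seen_after_task, seen_after_error)
```

In this model the other process's object becomes visible only after a task that both requested synchronization and ended without an error.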


John Chambers <jmc@research.bell-labs.com>
Last modified: Wed Sep 1 14:59:07 EDT 1999