Example Source for User-Defined Database Class

Here is about the simplest definition possible for a user-defined database class and the corresponding methods. The class writes and reads objects in symbolic dump format, in a directory specified when an object from the class is created. For compatibility with the S function data.dump, and to decrease the chance of errors from trying to read arbitrary files as objects, the file name has ".Sdata" appended to the object name.

With this definition, every method is a one-line function definition, and the one utility function, to create the appropriate path name, is just a call to paste. Only the generator function manages to run a little longer, mainly because it tries to be friendly about creating the directory if it doesn't exist, and because it stores the full path of the directory.

Class Definition

The definition makes the class extend the virtual class "database" and gives it one slot, holding the path name of the corresponding directory as a character string.
    setClass("dumpDataBase",
        representation("database", path = "character"))

Generator Function for the Class

The generator takes as its argument a character string for the directory's path and, optionally, a flag to say whether to create the directory if it does not exist.
"dumpDataBase" =
  ## make an object of class `"dumpDataBase"' corresponding to the directory
  ## `path'.  The flag `create' allows or suppresses creating the directory if it
  ## does not exist (by default, the user is asked, if S is in interactive mode).
  function(path,
           create = identical(dialog(paste("Directory \"", path,
             "\" for database does not exist; create?", sep=""), c("y", "n"), 2),1))
{
  path = as(path, "character")
  if(length(path)!=1) stop("needed a single character string for path")
  if(!isDirectory(path)) {
    if(create) {
      shell(paste("mkdir", path), mustWork=T)
      file = paste(path, "objects", sep="/")
      cat("", file=file, sep="")
    }
    else
      warning("Directory \"", path,
              "\" for database does not exist; path validity not checked")
  }
  if(isDirectory(path))
    ## get a fully-qualified path
    path = shell(paste("cd", path, "; pwd"))
  new("dumpDataBase", path = path)
}
The reason for generating the fully-qualified path is that, otherwise, the behavior of the database object would depend on the directory in which the S process is running, very likely not what the user wants.
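
The shell side of the generator can be tried on its own. What follows is a minimal sketch, not part of the original source: it assumes a POSIX-style shell with mktemp available, and the function name make_db_dir and the directory name mydb are hypothetical, chosen only for illustration. It creates the directory with an empty "objects" file if needed, then prints the fully-qualified path in the same way as the cd and pwd command above.

```shell
# Hypothetical helper mirroring the shell commands issued by the generator.
make_db_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        mkdir "$dir" || return 1
        # empty "objects" file, like the one written by cat("", file=file)
        : > "$dir/objects"
    fi
    # run in a subshell so the caller's working directory is unchanged
    (cd "$dir" && pwd)
}

scratch=$(mktemp -d)        # throwaway parent directory for the demonstration
cd "$scratch" || exit 1
full_path=$(make_db_dir mydb)
echo "$full_path"           # an absolute path ending in /mydb
```

Because the stored path is absolute, a later change of working directory does not change which directory the database refers to.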

Method for dbobjects: Listing the Object Names

This method returns a character vector of the current object names in the directory. It is normally only called when the database is being attached. From then on, the S evaluation manager maintains its own table of the contents of the database. As with ordinary directory databases, this means that multiple processes using the same database simultaneously are not automatically co-ordinated.

The method is just a one-line shell command, to get all the file names ending in ".Sdata", since we decided to write and read files with that suffix. A small free extension comes from using the find shell command: objects can actually be in subdirectories of the current directory. Their names will have an embedded "/" and the simple-minded methods shown here don't make subdirectories automatically, so the feature is probably worth about what it costs.
setMethod("dbobjects", "dumpDataBase",
function(database)
shell(paste("find", database@path, "-name '*.Sdata' -print|
    sed -e 's:.*/::' -e 's/.Sdata$//'"))
)
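
To see what the pipeline produces, it can be run directly against a scratch directory. This is only an illustrative sketch: the directory and file names are made up, mktemp is assumed to be available, and the trailing sort is an addition that makes the order predictable.

```shell
db=$(mktemp -d)                 # stand-in for database@path
mkdir "$db/sub"
: > "$db/x.Sdata"
: > "$db/y.Sdata"
: > "$db/notes.txt"             # ignored: wrong suffix
: > "$db/sub/z.Sdata"           # found, because find descends into subdirectories

# The same find and sed commands the method passes to shell().
names=$(find "$db" -name '*.Sdata' -print |
    sed -e 's:.*/::' -e 's/.Sdata$//' | sort)
echo "$names"
# prints:
# x
# y
# z
```

Note that, with the sed expressions as written, everything through the last "/" is removed, so the file in the subdirectory is listed as z, without the sub/ prefix.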

Method for dbread: Read an Object

Given the database and the name of an object guaranteed to be in the database, this method reads it in. Since all objects are written in the standard S symbolic dump format by dataPut, this method just calls dataGet to do the reading.
setMethod("dbread", "dumpDataBase", 
function(database, name)
dataGet(dumpDBPath(database, name))
)

Method for dbwrite: Write an Object

This method takes the database, the name the object should have when assigned in the database, and the object to write out. It writes the object by calling the dataPut function with the path for the given name.
setMethod("dbwrite", "dumpDataBase", 
function(database, name, object)
dataPut(object, dumpDBPath(database, name))
)

Method for dbremove: Remove an Object

This function carries out the removal of the object, just by removing the corresponding file. Notice that it doesn't worry about whether the file exists: in principle, this method only gets called after the S evaluator determines that the object exists in the database's directory.
setMethod("dbremove", "dumpDataBase", 
function(database, name)
shell(paste("rm '", dumpDBPath(database, name), "'", sep=""), output=F, mustWork=T))
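
The single quotes pasted around the path keep the shell from splitting a file name that contains spaces or other special characters. A small sketch of the effect, assuming mktemp is available and using a hypothetical object name:

```shell
db=$(mktemp -d)                      # stand-in for database@path
name="my object"                     # hypothetical object name containing a space
: > "$db/$name.Sdata"

# Build the command string the way the method does, then hand it to a shell.
cmd="rm '$db/$name.Sdata'"
sh -c "$cmd"

[ -e "$db/$name.Sdata" ] || echo "removed"
```

The quoting is not a complete defense: a name that itself contains a single quote would end the quoted string early.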

Method for dbexists: Does an Object Exist?

This method is only called when there is no internal directory for the database in the S evaluation manager; that is, when the dbobjects method returns NULL; see the discussion of implementation without a directory.
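
No source for a dbexists method is given here. Purely as an illustration, and not as the author's code, an existence test consistent with the file layout used by the other methods could come down to checking for the corresponding ".Sdata" file. The function name db_exists is hypothetical, and mktemp is assumed to be available.

```shell
# Hypothetical check: does an object named $2 have a dump file in directory $1?
db_exists() {
    [ -f "$1/$2.Sdata" ]
}

db=$(mktemp -d)                 # stand-in for database@path
: > "$db/x.Sdata"
if db_exists "$db" x; then echo "x exists"; fi
if ! db_exists "$db" y; then echo "y does not exist"; fi
```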

The Path Utility

Here is the function to create the path string for an object name:
dumpDBPath = function(database, name)
  paste(database@path, "/", name, ".Sdata", sep="")

Implementation without Directory

The following class definition extends the previous class to allow individual databases to be attached without creating an internal directory in the S evaluation manager.
setClass("dumpFilesDB", representation("dumpDataBase", directory = "logical"))
The only method that needs changing is dbobjects, which now returns NULL if it finds that the directory slot is FALSE.
setMethod("dbobjects", "dumpFilesDB", 
function(database)
if(database@directory) shell(paste("find", database@path, 
         "-name '*.Sdata' -print| sed -e 's:.*/::' -e 's/.Sdata$//'"))
else NULL
)

John Chambers <jmc@research.bell-labs.com>
Last modified: Wed Oct 7 17:13:40 EDT 1998