- Newsgroups: comp.object
- Path: sparky!uunet!paladin.american.edu!darwin.sura.net!wupost!emory!sol.ctr.columbia.edu!eff!world!sss
- From: sss@world.std.com (Sergiu S Simmel)
- Subject: Object Persistence Infrastructure -- Join the Kala Forum
- Message-ID: <SSS.92Nov15210227@world.std.com>
- Sender: sss@world.std.com (Sergiu S Simmel)
- Organization: Penobscot Development Corporation, Arlington MA
- Date: Mon, 16 Nov 1992 02:02:27 GMT
- Lines: 421
-
-
-
-
-
- ---------------------------------------------------------------------
-
- PENOBSCOT DEVELOPMENT CORPORATION
-
- announces
-
- A NEW FORUM FOR INFORMATION EXCHANGE
-
- on
-
- THE KALA(tm) TECHNOLOGY AND PRODUCT
-
- ---------------------------------------------------------------------
- = third announcement =
-
-
- Welcome to Kala -- The Persistent Data Server. Now, you can ...
-
- o keep yourself up-to-date with Kala's new developments,
- o share your experience in using Kala with others,
- o hear what others have to say about it,
- o ask us questions and benefit from the answers we provide to
- others,
- o participate in discussions on Kala-related technical topics,
- such as data/object persistence, visibility management,
- databases and file systems, etc.,
- o learn more about the Kala technology, including as yet
- undocumented details,
- o give us important technical and business feedback on our
- products,
-
- o and more ...
-
- [for a sample posting, see attachment to this announcement]
-
-
- ... by subscribing to our new Kala Forum. This forum, organized as a
- mailing list, will be moderated, so we will try to keep its focus and
- orientation straight, and the junk mail out.
-
- If you'd like to see past issues, you can access them via anonymous
- FTP from world.std.com. The entire Kala Forum archive is located in
- the pub/kala/KalaForum directory. All files are ASCII text.
-
- To subscribe (and subsequently for any other requests regarding your
- subscription, such as change of address, unsubscription, etc.), direct
- your request to:
-
- --------------------------
- kala-request@world.std.com
- --------------------------
-
- To contribute to the on-going discussions address your messages to:
-
- ------------------
- kala@world.std.com
- ------------------
-
- Since we most value advertising by word of mouth and personal
- reference, please forward this message to whomever you believe could
- also benefit from subscribing.
-
- We are looking forward to your subscription request and your
- participation in
-
-
- ### ============== ###
- ### The Kala Forum ###
- ### ============== ###
-
-
-
-
-
-
-
- _ _ ____ _ ____ tm ____________________________________
- \\ / | \ \ | \ \\\\
- \\ /__ \ __ \ \ \ __ \ \\\\
- \\ \ \ \ \ \ \ \\\\
- \\ \ \ \ \ \ \ \\\\ No more than you need !!!
- \\' \' \' \' '----' \' \' \\\\ No less than you want !!!
- ........................................................................
- Penobscot Development Corp. 50 Princeton Road Arlington Mass. 02174-8253
- voice: +1-617-646-3951 fax: +1-617-646-5753 email: kala@world.std.com
-
-
-
-
-
-
- For BYTE Magazine, December 1992 Issue
-
- Persistent Data Servers
- --Maintain the structure of data
- --Perform like file systems
- --Offer database features
-
-
- =========================================================================
-
- Objects of Substance
-
- Persistent data servers provide a new way to store object-based data.
-
- Sergiu S. Simmel and Ivan Godard
-
- =========================================================================
-
-
- One knock often made against software objects is their transient
- nature. Traditionally, an object is ephemeral; it is defined,
- manipulated, and destroyed by the program that creates it. It has no
- existence beyond the program's execution. Unlike real-world objects,
- and unlike computer generated data that exists outside of a program in
- a file system or database system, software objects are usually not
- persistent. The only way one program can share an object it creates
- with another program is for the two programs to be executing at the
- same time. This requirement puts a crimp in any plans for developing
- distributed object systems.
-
- Object-oriented database management systems provide one means of
- giving objects the characteristic of persistence; file systems provide
- another. Neither solution, however, is ideal for all applications,
- situations, and implementations. That's the rationale behind a new
- class of storage software called Persistent Data Servers.
-
- ---------------
- Hobson's Choice
- -----------------------------------------------------------------------
- The simplest persistent data storage available to you is the file
- system on your disk drive. File systems have some attractive
- characteristics; their performance is good, they can hold any data,
- they're easy to use, and, of course, the price is right. On the other
- hand, files are unreliable. They provide no mechanism for maintaining
- data consistency and only primitive data-sharing facilities. Few file
- systems offer version control, and all require that you constantly
- transform data between "internal" and "external" forms.
-
- Unlike a file system, a true database management system provides
- mechanisms for sharing data and for ensuring the integrity of the
- data. It supports transactions and version control, although the
- specifics of these functions may not be exactly what your application
- needs. Finally, a database system is scalable, and much more robust
- than a file when your hardware or software fails.
-
- The downside to a database system is that, compared to a file
- system, it is slower by an order of magnitude or more. Also, a
- database system generally confines you to dealing only with the kind
- of data that it can handle. In addition, a database system is usually
- very complicated, difficult to learn and use, and expensive, both in
- its cost of operation and in the amount of system resources it
- consumes.
-
- Whether you choose a file system or a database manager, then, you
- have to sacrifice either economy or performance. Is there a happy
- medium? Something with the speed and flexibility of files, the
- reliability, shareability and robustness of databases, and at a cost
- that won't break your wallet or the available hardware? A new breed of
- products, persistent data servers, aims squarely at the yawning gap
- between DBMSs and file systems.
-
- --------------
- An Alternative
- -----------------------------------------------------------------------
- Kala is a persistent data server from Penobscot Development
- Corporation (Arlington, MA). It is a software subassembly, available
- to applications and database managers, that manages both the state and
- visibility of persistent data. It takes care of the how and the where
- (how data is stored and retrieved, and where it is stored), and also
- copes with the who, which and when of data management -- who can store
- and retrieve which data and when.
-
- Kala is similar to a file system in its simplicity, high
- performance, low semantic level (although it also supports pointers,
- not just bits), and low-cost operation. It is similar to a DBMS in its
- robustness, support for transactions, security features, access
- control, configuration ability, reliability, scalability, and so
- forth. At the same time, it is different from either of these
- environments.
-
- Kala combines the benefits of both these worlds while avoiding the
- drawbacks of each. This type of storage software can provide low level
- persistent data services. No more, no less.
-
- --------------
- Managing State
- -----------------------------------------------------------------------
- Like file systems, a persistent data server offers a get/put interface
- to the storage subsystem and can store any kind of data. Unlike file
- systems or the BLOBs (Binary Large OBjects) used by some database
- systems, a persistent data server lets the stored data retain its
- internal structure, no matter how complex. Suppose your application
- builds a linked list in memory and saves the list to the persistent
- store. When you retrieve that data it will still be a linked list --
- topologically the same as the original even though the memory
- addresses of the nodes are completely different (see the figure).
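-
- To make the linked-list example concrete, here is a minimal sketch in
- Python of saving and restoring a list so that its shape survives even
- though the restored nodes live at new addresses. The names Node, store,
- and load are hypothetical, and a Python list of records stands in for
- the persistent store; this is not Kala's interface.

```python
class Node:
    """A singly linked list node held in memory."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def store(head):
    """Flatten a list into (value, next_index) records.

    In-memory references are replaced by positions, so no memory
    address is ever written out.
    """
    index_of = {}   # id(node) -> position in the record list
    nodes = []
    node = head
    while node is not None and id(node) not in index_of:
        index_of[id(node)] = len(nodes)
        nodes.append(node)
        node = node.next
    return [
        (n.value, index_of[id(n.next)] if n.next is not None else None)
        for n in nodes
    ]

def load(records):
    """Rebuild the list from records; the new nodes are fresh objects."""
    nodes = [Node(value) for value, _ in records]
    for node, (_, next_index) in zip(nodes, records):
        if next_index is not None:
            node.next = nodes[next_index]
    return nodes[0] if nodes else None

original = Node("a", Node("b", Node("c")))
records = store(original)
restored = load(records)
```

- The restored list has the same values in the same order, but none of
- its nodes is the same object as a node in the original list.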
-
- Of course, object databases can also store references, but the links
- used by the persistent data server are regular machine pointers, not
- performance-costly object-oriented pointers. Your stored data can have
- any representation, including packed structures and executable code.
- You aren't restricted to a few primitive data types or the type of
- structures offered by a specific access language. Kala is as happy
- storing C++ or COBOL data as it is Lisp, assembler, or Smalltalk.
-
- -----------------
- Development Steps
- -----------------------------------------------------------------------
- The type of persistent data storage Kala provides lets you forget
- the distinction between in-memory and on-disk data or object
- "formats." You can program using Kala as if your code never had to
- remember anything across executions or applications. Write your
- application as a demo, with dummy data and no storage I/O. You can
- lay out your data or objects in memory in the way best suited for
- in-memory-only processing and fastest execution of your algorithms.
-
- Once you are satisfied with the execution of your new "demo"
- application, you can think about a production-level persistent store
- for your objects. You first decide what the "unit of transfer" is,
- that is, which data should go to store and come back together as a
- unit. The ability to choose the transfer unit improves performance
- because you can bring in at once all the different pieces of data your
- application requires. These pieces may be many different objects or
- parts of objects -- the application doesn't care.
-
- For example, if the data you're using is a linked graph structure,
- you can either transfer the entire graph at once or just each node as
- you need it. Or, you can load in the entire graph except the contents
- of a single large but rarely referenced field in each node. You can
- even bundle the graph with other data, or choose some other unit of
- transfer. The transfer unit you select consists of hunks of bits and
- pointers possibly spread all over memory.
-
- Using convenient calls to the API, you tell the software where the
- data is and where, within that data, the machine pointers are. The
- persistent data server takes care of the rest. It copies that data (no
- more and no less) onto the persistent store, and gives you a "claim
- check" in return. When you present that claim check, the server will
- promptly retrieve that same data and lay it out in the application
- memory.
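-
- The idea of choosing a unit of transfer and getting a "claim check"
- back can be sketched as follows. ToyStore, put, and get are
- hypothetical names for an in-memory stand-in, not Kala's actual API,
- and the graph with its bulky "thumbnail" field is made-up sample data.

```python
import copy
import itertools

class ToyStore:
    """Hypothetical in-memory stand-in for a persistent store."""
    def __init__(self):
        self._units = {}
        self._next_handle = itertools.count(1)

    def put(self, unit):
        """Copy exactly the given unit into the store; return a claim check."""
        handle = next(self._next_handle)
        self._units[handle] = copy.deepcopy(unit)
        return handle

    def get(self, handle):
        """Return a fresh copy of the unit named by the claim check."""
        return copy.deepcopy(self._units[handle])

# A graph whose nodes carry a large, rarely referenced field.
graph = {
    "a": {"edges": ["b", "c"], "thumbnail": b"\x00" * 10_000},
    "b": {"edges": ["c"], "thumbnail": b"\x00" * 10_000},
    "c": {"edges": [], "thumbnail": b"\x00" * 10_000},
}

store = ToyStore()

# One unit of transfer: the whole topology, without the bulky field.
topology = {name: {"edges": node["edges"]} for name, node in graph.items()}
topology_check = store.put(topology)

# Separate units: each bulky field on its own, fetched only when needed.
thumbnail_checks = {
    name: store.put(node["thumbnail"]) for name, node in graph.items()
}

restored = store.get(topology_check)
```

- Presenting topology_check brings back the graph structure in one
- transfer, while each large field is fetched only if the application
- presents its own claim check.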
-
- -------------------
- Types Without Limit
- -----------------------------------------------------------------------
- Persistent data servers can handle anything that's made out of bits
- and pointers including objects, source code, records, images,
- executable code, noise, video, and so forth. This "model neutrality"
- makes a persistent data server an ideal interoperability point in the
- storage domain. It can reside "below" all other subassemblies and
- components that support only one or at most a few data organizations.
- In this respect, the role of a persistent data server in the storage
- domain resembles the X Window System in the display domain, or
- Postscript in the printing domain.
-
- For example, an object management system can interpret data as
- object slots and methods. Because a persistent data server isn't bound
- to any particular notion of object, it can simultaneously support
- several types of objects. The access to and visibility of these
- objects is guaranteed to remain the same for different language
- systems, different hardware platforms, and different object management
- systems.
-
- -------------------
- Managing Visibility
- -----------------------------------------------------------------------
- Conventional DBMSs and file systems deal with transactions, access
- control, security, licensing, version control, and configuration
- control as separate services. This practice has led to a proliferation
- of transaction managers, security managers, configuration managers,
- etc. The net result is unnecessarily complex, large, and
- overhead-burdened products.
-
- A persistent data server works differently. It recognizes that all
- the services offered by traditional DBMSs are simply facets of the
- same basic problem: controlling the visibility of data.
-
- If you analyze the nature of a transaction commit in a
- conventional database, you find that it is a means of making new
- values visible to the rest of the world by replacing the old values.
- Look at security grants. A security grant is simply a means of making
- data accessible (visible) to qualified agents until the access is
- revoked. You can think of a license as a means of making a
- dataset available (visible) to someone on the basis of pre-paid
- rights. A configuration is simply the bundling of a collection of data
- so that the collection is always visible as a unit. Each DBMS has its
- own idea of how to implement the semantics of these services.
-
- Take transactions, for example. Many useful transaction models
- exist, because the needs of automated teller machines are different
- from those of CASE repositories, which, in turn, are different from
- those of Personal Information Managers. Several useful access control
- schemes also exist. Security is treated differently in each
- organization, while all information vendors have different needs for
- their licensing models. Mathematically, all models are equivalent
- because each can be used to implement any of the others. But, in
- practice, trying to do so leads to unwarranted complications,
- overhead, and bulkiness.
-
- Persistent data software should be different. An application like
- Kala doesn't provide a single model, or a "one-size-fits-all" solution
- for each service. Instead, it provides a handful of primitives that
- you can use to build the right model for the application. Simple
- models typical of conventional DBMSs can be supplied prebuilt for you
- to use.
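-
- As a rough illustration of how commits and grants can both be built
- from visibility primitives, consider the toy model below. The class
- VisibilityStore and its methods write, show, hide, read, and can_see
- are invented for this sketch; the article does not list Kala's actual
- primitives, so none of this should be read as its interface.

```python
class VisibilityStore:
    """Toy model: append-only values plus a per-agent visibility map."""
    def __init__(self):
        self._values = []    # a prior value is never overwritten
        self._visible = {}   # agent -> {name: index into _values}

    def write(self, agent, name, value):
        """Store a new value that only the writing agent can see."""
        self._values.append(value)
        self._visible.setdefault(agent, {})[name] = len(self._values) - 1

    def show(self, owner, name, other):
        """Make the owner's current version of name visible to another agent."""
        self._visible.setdefault(other, {})[name] = self._visible[owner][name]

    def hide(self, name, agent):
        """Remove name from what the agent can see."""
        self._visible.get(agent, {}).pop(name, None)

    def can_see(self, agent, name):
        return name in self._visible.get(agent, {})

    def read(self, agent, name):
        return self._values[self._visible[agent][name]]

store = VisibilityStore()
store.write("alice", "balance", 100)
store.show("alice", "balance", "bob")    # a grant: bob may now see it

store.write("alice", "balance", 80)      # uncommitted: bob still sees 100
before_commit = store.read("bob", "balance")

store.show("alice", "balance", "bob")    # a commit: the new value is published
after_commit = store.read("bob", "balance")

store.hide("balance", "bob")             # a revocation
```

- In this sketch a grant and a commit are the same operation applied
- at different moments, which is the point the section makes about
- visibility.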
-
- --------------------
- Managing Performance
- -----------------------------------------------------------------------
- The performance of a persistent data server for a single user is equal
- to the performance of a good file system when reading and writing the
- same data. Perhaps surprisingly, its relative performance
- actually improves with multiple users in a client/server
- configuration. This phenomenon occurs partly because of the seek
- optimization and shared buffering of common data used by Kala, and
- partly because it is no longer necessary for each application to
- individually open and close files.
-
- Kala is algorithmically faster than equivalent conventional
- technology exactly when you need it most: at peak server loads. It
- uses a non-write-in-place strategy, never overwriting a prior value.
- This feature gives it an effective 50 percent update performance
- advantage in transaction contexts such as OLTP (On Line Transaction
- Processing) applications. Kala requires only 1 + 1/n disk accesses per
- update (one to write the new data to free storage, and a fraction to
- record the commit where the commit record is shared with other
- transactions). A high-performance conventional DBMS needs 2 + 1/n disk
- accesses for the same task (one to write the former value in case of
- crash, one to write the new data back over the former value, and
- again, a fraction for the commit). This performance gain is not at the
- expense of data reliability and recoverability.
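-
- The disk-access arithmetic above is easy to check. The short
- Python sketch below uses the 1 + 1/n and 2 + 1/n figures from the
- text; the function names and the sample values of n are assumptions
- chosen for illustration.

```python
def accesses_per_update(data_writes, n):
    """Disk accesses per update: the data writes plus a 1/n share of a
    commit record that is shared by n transactions."""
    return data_writes + 1 / n

def saving(n):
    """Fraction of disk accesses saved by writing once instead of twice."""
    no_overwrite = accesses_per_update(1, n)    # write new data to free storage
    write_in_place = accesses_per_update(2, n)  # save old value, then overwrite
    return 1 - no_overwrite / write_in_place

for n in (1, 10, 100):
    print(f"n={n:3d}: {accesses_per_update(1, n):.2f} vs "
          f"{accesses_per_update(2, n):.2f} accesses, saving {saving(n):.1%}")
```

- The saving is one third when every transaction writes its own commit
- record (n = 1) and approaches 50 percent as more transactions share
- a commit record.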
-
- -----------------------------------------------
- Persistent Data Servers Versus Object Databases
- -----------------------------------------------------------------------
- Any quality ODBMS can recover all transactions that have been
- committed, even those committed only milliseconds before the crash.
- Persistent data servers can do the same, while performing as fast as
- less reliable systems such as file systems.
-
- Many conventional ODBMSs perform well as single-client
- applications with systematic access patterns, but degrade badly in
- multiple-client applications such as groupware, or when used
- concurrently by different applications that randomly access large
- pools of data. Many ODBMSs are tuned to display quick response to
- predictable access patterns. Thus they often achieve local
- (per-client) optimums at the expense of global (across clients)
- slow-down.
-
- For example, some ODBMSs improve object faulting performance by
- page-mapping databases using the file-mapping facilities of the OS.
- In this instance, the unit of transfer is the fixed-size virtual
- memory page (or a multiple of it). These ODBMSs show no sensitivity
- to the actual access patterns of the application.
-
- If the data is a payroll database, an application may need pay
- records scattered throughout the database. The result in a page-based
- ODBMS is that it may bring a 4- or 8-KB page into memory to get an
- object that may be a few hundred bytes at most. The remainder --
- perhaps 80 or 90 percent of the total space and access time -- is
- wasted. The ODBMS may be performing well but the application grinds to
- a halt due to thrashing in the operating system's page manager.
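-
- The waste estimate can be reproduced with a few lines of Python.
- The object sizes of 400 and 800 bytes are assumed stand-ins for "a
- few hundred bytes," and wasted_fraction is a hypothetical helper.

```python
import math

def wasted_fraction(page_size, object_size):
    """Share of transferred bytes that are not part of the requested
    object when storage is fetched in whole pages."""
    pages = math.ceil(object_size / page_size)
    return 1 - object_size / (pages * page_size)

for page_size in (4096, 8192):
    for object_size in (400, 800):
        waste = wasted_fraction(page_size, object_size)
        print(f"{page_size}-byte page, {object_size}-byte object: "
              f"{waste:.1%} wasted")
```

- With these assumed sizes the waste ranges from about 80 percent to
- about 95 percent of each transfer, in line with the estimate above.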
-
- By contrast, Kala's user-specified units of transfer eliminate
- internal fragmentation. You get only what you requested -- that is, as
- little as one byte and as much as the size of the virtual memory, or
- more. In a multi-user environment, this feature also takes care of the
- severe security loopholes introduced by page-mapping-based approaches
- -- another acute real-world problem.
-
- In conventional systems with single users, you can overcome
- thrashing and other performance problems by having the user manually
- cluster the data, relying on the programmer's ability to predict the
- access patterns of a single application and thus optimize the database
- for that application alone. However, this traditional technique breaks
- down badly when one application needs one selection of data from the
- database, and a second application, perhaps running concurrently for
- other users, needs different selections. The result is
- less-than-optimal global behavior.
-
- Kala doesn't employ such local optimizations. Instead, it uses
- actual access history to dynamically rearrange the store, so that
- a global optimum is reached. If there is only a single user application,
- this type of software should be able to achieve clustering as good as
- the best packing performed by hand. It also should give globally
- optimum performance in multiple applications, without requiring the
- services of an expensive database administrator to tune the
- clustering.
-
- --------------
- Moving Forward
- -----------------------------------------------------------------------
- Persistent data servers such as Kala provide a new and exciting middle
- ground between the performance and simplicity of file systems and the
- capabilities of database managers. They are particularly useful as the
- underpinnings of object stores because they maintain the structure of
- the data on the disk, making it independent of the application that
- created it.
-
- More and more, applications need access to complex data types. More
- and more, applications must support multiple users in a distributed
- environment. From flat files to objects, persistent data servers can
- handle them all.
-
-
- -----------------------------------------------------------------------
- Ivan Godard and Sergiu S. Simmel are co-founders of Penobscot
- Development Corporation of Arlington, Massachusetts. Simmel holds an
- MS in Computer and Information Sciences from the University of
- Minnesota. His areas of expertise include CASE, hypermedia, and
- object-oriented databases. Godard has contributed to the development
- of Algol68, Ada, and Mary, the first wide-spectrum language, and has
- taught post-graduate courses at Carnegie-Mellon and the University of
- Maine. You can reach them at +1-617-646-3951 or on the Internet as
- kala@world.std.com.
- -----------------------------------------------------------------------
-
-