RAQUEL the Language

The Third Manifesto

The Third Manifesto is a specification of what a fully-fledged, modern relational database system should be, written by Chris J. Date and Hugh Darwen. The relational model expressed by the RAQUEL notation is that defined in The Third Manifesto. When the RAQUEL Database Management System is complete, it will execute the specification in its entirety.

The Third Manifesto (= TTM) is an evolutionary development of Ted Codd's original relational model. Note that it has some significant differences from SQL !

In addition to specifying the relational model, TTM specifies the model of data subtyping and inheritance to be used with it, if subtyping and inheritance are used with the relational model (which is not mandatory). The relational model per se is defined by a series of relational and 'other orthogonal' prescriptions, proscriptions and 'very strong suggestions'. The subtyping and inheritance model (which includes multiple inheritance) is defined by a series of prescriptions.

As it has evolved, TTM has been published in a series of three books written by Chris Date and Hugh Darwen. The first book is entitled Foundation for Object/Relational Databases : The Third Manifesto, the second Foundation for Future Database Systems : The Third Manifesto, and the third Databases, Types and the Relational Model : The Third Manifesto. All were published by Addison-Wesley, in 1998, 2000 and 2006 respectively. The essential ideas are comprehensively described in the first edition, with the later editions revising and refining them and providing more descriptive material. Most recently, the book Database Explorations : Essays on The Third Manifesto and Related Topics contains a refinement of the relational model in chapter 1 and an updated definition of the subtyping and inheritance model in Part III; the book was published by Trafford in 2010.

The Third Manifesto and RAQUEL

TTM states that the relational model can be expressed in abstract terms via a language called D. However, abstract concepts need to be expressed in concrete terms so they can be read and recognised. So TTM devises a concrete language called Tutorial D to express the abstract D. In principle there can be many languages that qualify as a valid D, i.e. give concrete expression to it. Tutorial D is one candidate; RAQUEL is another, and is accepted as such by Chris Date and Hugh Darwen. RAQUEL implements D by generalising its concepts, together with their corresponding syntactic notation. This eliminates exceptions and minimises the number of concepts. It also makes them orthogonal, i.e. independent of each other, so that they may be combined in any possible way. Thereby the functionality of RAQUEL is maximised and at the same time its conceptual complexity is minimised.

To see how RAQUEL fulfills the requirements for D, see the document How RAQUEL Meets the Requirements for a Valid D.

Websites Supporting the Third Manifesto

There is an official Third Manifesto website managed by Hugh Darwen. It contains a wide variety of supporting information, including information about TTM related projects, reference material from the 3rd edition of the book, 'Questions and Answers', related documents, presentations and papers, etc.

There is also a Database Debunkings website - subtitled "Dispelling Persistent Prevalent Database Management Fallacies" - with a wealth of interesting and useful information on it that is generally supportive of TTM. Its editor, Fabian Pascal, focuses on education in relational concepts as opposed to training in commercial products.

"Crucial Logical Differences" Underpinning TTM

In order to specify its relational model, The Third Manifesto specifies 3 logical 'differences'. They are not explicit parts of the model but rather 3 parts of the logical foundations on which the relational logical model is based. In each case, TTM distinguishes 2 aspects that might otherwise be conflated or confused with each other. It compares and contrasts the 2 aspects, defining each of the 2. The 3 'differences' are :

Model versus Implementation
Values versus Variables
Values versus Representations

Model versus Implementation

It is essential to differentiate between a logical model (in this case the relational model) and its implementation.
A logical model is an abstract, logical definition of a set of entities and a set of means of handling the entities, which together constitute an abstract machine with which a user could interact.
An implementation of a logical model is a suite of software modules that run on a physical computer system and provide a realisation of the logical model with which a user can actually interact.

The distinction is particularly important in DB systems because of the need for a DBMS to be able to cope with non-trivial volumes of data, which in turn requires a portfolio of means of physically storing and manipulating data in order to achieve performance, recovery, concurrency, security and other goals. Therefore it is essential that the DB logical model makes no reference of any kind to its physical implementation; e.g. no reference to indexes or physical addresses used in data storage. Any requirements to reference physical storage (which do in practice arise, say to design or arrange the storage) must be kept completely separate from references to the logical model.

The distinction is what enables Physical Data Independence to be attained. RAQUEL provides complete Physical Data Independence. The menu option The ANSI-SPARC Database Architecture gives a full description of how Physical Data Independence, and indeed Logical Data Independence, is achieved. As a consequence, those parts of RAQUEL which represent the logical database model are completely separate from those that manage Physical (and Logical) Data Independence.
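The separation between model and implementation can be illustrated with a minimal Python sketch. It is not RAQUEL code and does not reflect how the RAQUEL DBMS is built; the names RelationStore, HeapStore, HashStore and restrict are hypothetical, invented only for this illustration. The logical interface mentions tuples and nothing else, while two interchangeable storage strategies sit behind it.

```python
from abc import ABC, abstractmethod


class RelationStore(ABC):
    """Logical view (hypothetical): a set of tuples, with no mention of
    indexes, buckets or physical addresses."""

    @abstractmethod
    def insert(self, row: tuple) -> None: ...

    @abstractmethod
    def rows(self) -> frozenset: ...


class HeapStore(RelationStore):
    """One possible implementation: an unordered list scanned linearly."""

    def __init__(self):
        self._rows = []

    def insert(self, row):
        if row not in self._rows:  # a relation holds no duplicate tuples
            self._rows.append(row)

    def rows(self):
        return frozenset(self._rows)


class HashStore(RelationStore):
    """Another possible implementation: tuples spread over hash buckets."""

    def __init__(self, buckets=8):
        self._buckets = {}
        self._n = buckets

    def insert(self, row):
        self._buckets.setdefault(hash(row) % self._n, set()).add(row)

    def rows(self):
        return frozenset(r for b in self._buckets.values() for r in b)


def restrict(store: RelationStore, predicate) -> frozenset:
    """A logical operator: it sees only the set of tuples."""
    return frozenset(r for r in store.rows() if predicate(r))


heap, hashed = HeapStore(), HashStore()
for row in [("S1", "London"), ("S2", "Paris"), ("S1", "London")]:
    heap.insert(row)
    hashed.insert(row)

in_paris_heap = restrict(heap, lambda r: r[1] == "Paris")
in_paris_hash = restrict(hashed, lambda r: r[1] == "Paris")
```

Because restrict is written purely against the logical interface, swapping HeapStore for HashStore changes nothing that a user of the model can observe, which is the essence of Physical Data Independence.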

In passing, it is worth noting that the logical model of TTM and RAQUEL is a formalism in the sense that mathematical differentiation and integration are formalisms. When RAQUEL is used to create a DB, that formalism is used to create a logical model of a specific DB. The DB is an application of the abstract relational model in the same way that a particular differential equation is an application of the use of the differentiation formalism. A specific DB is a specific logical model that uses the general-purpose abstract relational model to express and embody the semantics of that specific DB.

Values versus Variables

In mathematics, and therefore in TTM and RAQUEL, variables are distinguished from values.

A value is a single, coherent piece of data, such as the numeric value 1, the text value 'RAQUEL', a picture taken by a digital camera, etc. As long as the value is coherent and can be manipulated in its entirety (as opposed to always having to be manipulated a component at a time) then its internals can be of arbitrary size and complexity.
A variable is a holder of a value. The value of the variable can change. In principle it can be changed, or varied, at any arbitrary point in time. There is no limit to the number of times a variable's value can be changed.

Variables and values arise at scalar, relational, and DB schema levels of abstraction. A DB schema is a variable whose value at any point in time is given by the set of relations in it. These relations are themselves variables whose values at any point in time are given by the attribute values in them. Currently there are no scalar variables in DB schemas, although they are a future possibility. Scalar values do exist however, as attribute values in relations.
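As an illustration of the distinction at the relational level, the following Python sketch models a relation value as an immutable set of tuples and a relvar as a holder whose current value can be replaced. This is an assumption-laden analogy rather than RAQUEL notation; the names relation, Relvar and suppliers are hypothetical.

```python
def relation(*rows):
    """Build a relation VALUE: an immutable set of immutable tuples.
    Each tuple is modelled as a frozenset of (attribute, value) pairs."""
    return frozenset(frozenset(row.items()) for row in rows)


class Relvar:
    """A relation VARIABLE: a holder whose value may be replaced over time."""

    def __init__(self, value):
        self.value = value

    def assign(self, new_value):
        # Assignment replaces the held value; no value is ever modified.
        self.value = new_value


suppliers_v1 = relation({"sno": "S1", "city": "London"})
suppliers = Relvar(suppliers_v1)

# "Inserting" a tuple means computing a new value and assigning it.
suppliers_v2 = suppliers.value | relation({"sno": "S2", "city": "Paris"})
suppliers.assign(suppliers_v2)
```

After the assignment the variable holds a different value, but the original value suppliers_v1 is exactly what it was before, just as the number 1 is unaffected when a numeric variable that held 1 is set to 2.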

Values versus Representations

A value is an abstract concept, and is made visible by means of a representation of that value. For example, 468 and 4.68E2 are two different representations of the same scalar numeric value. Thus a value and a representation of a value are different things.
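The numeric example can be checked with a few lines of Python, used here only as a convenient illustration and not as RAQUEL notation. The two strings are distinct representations, yet both denote one and the same value.

```python
# Two textual representations of one numeric value.
rep_a = "468"
rep_b = "4.68E2"

value_a = float(rep_a)
value_b = float(rep_b)

print(rep_a == rep_b)      # False: the representations differ
print(value_a == value_b)  # True: they denote the same value
```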

For simplicity, the RAQUEL notation contains only one representation of relvalues as containers of attribute values (cf. the attribute values themselves) and of DB schemas as containers of relvars.

However, the RAQUEL notation permits multiple representations of scalar values if the relevant scalar types implement them. In fact RAQUEL permits a representation architecture that corresponds to the ANSI/SPARC 3-layer architecture for DBs. The central layer of the ANSI/SPARC architecture specifies all the real relvars in the whole DB. A DB schema in the upper layer specifies a particular 'presentation' of the DB that is made available to a specific user or set of users. The mapping between that DB schema and the central layer provides Logical Data Independence. The lower layer specifies the storage facilities of all the real relvars. The mapping between the central and lower layers provides Physical Data Independence. Scalar values have a corresponding 3-layer architecture that provides Logical and Physical Data Independence for scalar values. (This is a generalisation of what is proposed in The Third Manifesto.)

Scalar values of a given type can be construed as abstractions held in the central layer of the scalar 3-layer architecture. At least one logical representation of values of that type exists in the upper layer of the scalar 3-layer architecture, and there may be more. These logical representations are available in the RAQUEL logical model for users to exploit. Likewise, at least one physical representation of values of that type exists in the lower layer of the scalar 3-layer architecture, and there may be more. These physical representations are used by the RAQUEL DBMS, and are not part of the RAQUEL logical model. The DB Administrator may choose the most suitable of them for any given DB.

For example, a Point type (whose permissible values represent points on a surface) may have a Cartesian Co-ordinate logical representation and a Polar Co-ordinate logical representation available to users. The same scalar point value can be represented in both logical representations; the user chooses which to use, and may use a combination of the two. The implementer of the Point type may use several physical representations for storing point values, perhaps depending on the need to conserve physical storage space, or the need to provide efficient support for the different logical representations. If, say, a Distance operator were provided that determined the distance between 2 points, it would need to be able to handle points represented by either logical representation (including one point in each representation) as well as handling the different physical representations that might be used to store the 2 point values.
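The Point example can be sketched in Python as follows. This is a hypothetical illustration, not the RAQUEL implementation of any type: the class layout, the constructor names cartesian and polar, and the choice of a single Cartesian physical representation are all assumptions made to keep the sketch short.

```python
import math


class Point:
    """Hypothetical Point type with two logical representations
    (Cartesian and polar) over one hidden physical representation."""

    def __init__(self, x, y):
        # Physical representation (an assumption for this sketch):
        # a pair of Cartesian floats. Users never see it directly.
        self._x = float(x)
        self._y = float(y)

    # Logical representation 1: Cartesian co-ordinates.
    @classmethod
    def cartesian(cls, x, y):
        return cls(x, y)

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y

    # Logical representation 2: polar co-ordinates.
    @classmethod
    def polar(cls, r, theta):
        return cls(r * math.cos(theta), r * math.sin(theta))

    @property
    def r(self):
        return math.hypot(self._x, self._y)

    @property
    def theta(self):
        return math.atan2(self._y, self._x)


def distance(p, q):
    """Distance operator: works however each point was specified."""
    return math.hypot(p.x - q.x, p.y - q.y)


a = Point.cartesian(3.0, 4.0)
b = Point.polar(5.0, math.atan2(4.0, 3.0))  # the same point, given in polar form
origin = Point.cartesian(0.0, 0.0)
```

Here distance(a, b) is zero up to floating-point rounding, since a and b are the same point value specified through different logical representations, and distance(origin, a) agrees with a.r. A different implementation of the type could store polar co-ordinates, or both forms, without any change visible to the user.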

Note that it is up to the implementation of a scalar data type to exploit the potential of a 3-layer architecture. An implementation need only have one logical representation and one physical representation (which could even be the same representation) if that were sufficient.