This technology preview implements the core feature set of
the XML DCD submission while using the element names of XML Data. The two
notes are similar in structure but differ somewhat in keywords and the
like. Note that the inheritance and subclassing features described in an
appendix of the DCD note are not included in the preview implementation.
Generally speaking, the preview's XML Schema omits the relations, aliases,
inheritance, and complex types defined in XML Data.
At the time of writing, we are referencing the DCD note of
31 July 1998 and the XML Data Note of 5 January 1998.
The preview supports the following element types:
Schema
Root element of a schema definition
ElementType
Defines a class of elements
AttributeType
Defines a class of attributes
datatype
Specifies the type of an ElementType or AttributeType element
element
Names a declared element class whose instances may appear in instances of the element class defined by an ElementType element
attribute
Names a declared attribute class whose instances may appear in instances of an AttributeType declaration
description
Provides textual documentation for ElementType and AttributeType elements
group
Defines a collection class
We will now explore XML Schema by building a simple schema
definition for documents describing employees. The schema will be a very simple
one, equivalent to the following DTD:
<!ELEMENT EMPLOYEE (NAME, HIREDATE, MANAGER, DEPARTMENT)>
<!ATTLIST EMPLOYEE employment-category (full | part | contract) #REQUIRED>
<!ELEMENT NAME (#PCDATA)>
<!ELEMENT HIREDATE (#PCDATA)>
<!ELEMENT MANAGER (#PCDATA)>
<!ELEMENT DEPARTMENT (#PCDATA)>
The Schema element is the
root of any schema definition. It is used to declare the name of the schema and
any namespaces required in the schema. It will typically declare the namespaces
for schemas and datatypes. Thus, for our example, we have:
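A minimal sketch of such a root element follows. The schema name EmployeeSchema is made up for illustration, and the two namespace URNs are the ones used by Microsoft's XML-Data implementations; the preview build may expect different values, so treat both as assumptions.

```xml
<!-- "EmployeeSchema" is a hypothetical name; the namespace URNs are assumed. -->
<Schema name="EmployeeSchema"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <!-- ElementType and AttributeType declarations go here -->
</Schema>
```

The default namespace covers the schema vocabulary itself, while the dt prefix is reserved for the data type attributes used later.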
Element types are declared using the ElementType
element. This element has one required attribute and four optional attributes:
Attribute   Description
name        required; name of the element
content     optional; one of the values empty, textOnly (i.e., PCDATA), eltOnly (i.e., elements), or mixed (i.e., elements and PCDATA)
dt:type     optional; data type of the element
model       optional; one of the values open (can contain content not defined in the schema) or closed (can only include content defined in the schema)
order       optional; one of the values one (i.e., exactly one of a set of options, analogous to the | DTD operator), seq (i.e., the specified elements must appear in the specified order), or many (i.e., any of the named elements may appear zero or more times in any order)
Take the definition of the NAME
element:
<ElementType name="NAME"
content="textOnly"/>
We've defined an element NAME
that contains text, i.e., PCDATA. Let's look at a
more challenging example, the EMPLOYEE
element.
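Going by the description that follows, the declaration would look roughly like this sketch. Using a type attribute on the element and attribute children to refer to a declared class is an assumption about the preview's syntax.

```xml
<ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
  <!-- type="..." as the reference syntax is an assumption -->
  <attribute type="employment-category"/>
  <element type="NAME"/>
  <element type="HIREDATE"/>
  <element type="MANAGER"/>
  <element type="DEPARTMENT"/>
</ElementType>
```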
We've said EMPLOYEE
can only contain other elements (content="eltOnly")
and only those elements we've specified in our schema (model="closed").
Moreover, the elements must all appear in the order listed (order="seq"). From there, we go on
to provide a list of the attributes (in this case, employment-category)
and elements (NAME, HIREDATE, MANAGER,
and DEPARTMENT) that are contained in an EMPLOYEE element.
Attributes
The EMPLOYEE definition
included an attribute element. Clearly, we have to be able to declare
attributes in a manner similar to the way we define elements. Not surprisingly,
the AttributeType element exists to do just that.
This element has the following attributes:
Attribute   Description
name        required; name of the attribute being defined
dt:type     optional; data type of the attribute
dt:values   optional; when dt:type has the value enumeration, this attribute provides the permissible values
default     optional; default value for the attribute. If dt:type appears, the value of this attribute must be legal for that type.
required    optional; one of the values yes or no. Denotes whether the attribute is required to appear on an element.
The definition for the employment-category
attribute looks like this:
<AttributeType name="employment-category"
required="yes" dt:type="enumeration"
dt:values="full part contract"/>
This attribute must appear on any element, such as EMPLOYEE, that uses
it. It is an enumerated type with the permissible values full, part,
and contract.
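For illustration, an instance that satisfies this declaration might look like the following. The element content is invented sample data, and the date format is an assumption.

```xml
<!-- Sample values are invented for illustration. -->
<EMPLOYEE employment-category="part">
  <NAME>Jane Doe</NAME>
  <HIREDATE>1998-03-15</HIREDATE>
  <MANAGER>John Smith</MANAGER>
  <DEPARTMENT>Engineering</DEPARTMENT>
</EMPLOYEE>
```

Omitting employment-category, or giving it a value outside full, part, and contract, would make the instance invalid against the schema.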
The Complete EMPLOYEE
At this point, it is worthwhile to present the entire schema
for our simple document type. We've managed to do a few things here that we
couldn't do in the DTD version of this document model. We pulled in foreign
namespaces to allow us to talk about schemas and data types. We've strongly
typed our elements and attributes, making life somewhat simpler for our
application programmers. The content and model information was elevated to an
explicit statement through the use of attributes. In the DTD, you had to parse
the line:
<!ELEMENT EMPLOYEE (NAME, HIREDATE, MANAGER, DEPARTMENT)>
to realize the EMPLOYEE
element has sequential order and can contain only elements. Here, we've said so
explicitly. Moreover, the model attribute lets us open up our model if our
applications require us to do so, whereas DTDs are closed by definition.
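Pulling together the declarations from the preceding sections, the whole schema might be assembled as in this sketch. The schema name, the namespace URNs, the type="..." reference syntax, and the choice of a date type for HIREDATE are all assumptions rather than details confirmed by the text.

```xml
<!-- Sketch only: schema name, namespace URNs, type="..." references,
     and dt:type="date" on HIREDATE are assumptions. -->
<Schema name="EmployeeSchema"
        xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">

  <AttributeType name="employment-category" required="yes"
                 dt:type="enumeration" dt:values="full part contract"/>

  <ElementType name="NAME" content="textOnly"/>
  <ElementType name="HIREDATE" content="textOnly" dt:type="date"/>
  <ElementType name="MANAGER" content="textOnly"/>
  <ElementType name="DEPARTMENT" content="textOnly"/>

  <ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
    <attribute type="employment-category"/>
    <element type="NAME"/>
    <element type="HIREDATE"/>
    <element type="MANAGER"/>
    <element type="DEPARTMENT"/>
  </ElementType>
</Schema>
```

Changing model="closed" to model="open" on EMPLOYEE is the one-attribute switch that would let instances carry content the schema does not define.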
Our little example omitted two schema elements supported in
the technology preview: datatype and group. The datatype
element is an extension of the dt:type and
dt:values attributes for ElementType and AttributeType
elements. The datatype element allows
us to specify not only the type of an element or attribute, but also minimum
and maximum values. It has the following attributes:
Attribute        Description
dt:type          optional; specifies the type of the element or attribute
dt:values        optional; when dt:type has the value enumeration, this attribute allows us to specify the permissible values
dt:max           optional; maximum value inclusive of the given value
dt:maxExclusive  optional; maximum value exclusive of the given value
dt:min           optional; minimum value inclusive of the given value
dt:minExclusive  optional; minimum value exclusive of the given value
dt:maxlength     optional; allows us to limit the length of certain data types
Let's apply this to HIREDATE.
Suppose our company came into existence on July 20, 1969 and will be disbanded
when the founder retires on December 31, 1999 (no Y2K worries for us, then!). The
definition becomes:
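A sketch of such a definition, assuming the datatype element is written as a child of ElementType and that the date type takes ISO 8601 literals (neither detail is confirmed by the text):

```xml
<ElementType name="HIREDATE" content="textOnly">
  <!-- Child placement and ISO 8601 date literals are assumptions.
       dt:min and dt:max are both inclusive bounds. -->
  <datatype dt:type="date" dt:min="1969-07-20" dt:max="1999-12-31"/>
</ElementType>
```

Because dt:min and dt:max are inclusive, both the founding date and the closing date are themselves acceptable hire dates. The dt prefix must be bound on the enclosing Schema element.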
The group element organizes
content into a sequence. It specifies which elements appear, how often, and in
what sequence. The permissible attributes are:
Attribute   Description
maxoccurs   optional; one of the values 1 or * (at most one or many occurrences)
minoccurs   optional; one of the values 0 or 1 (a minimum of zero or one)
order       required; one of the values one, seq, or many
The attributes minoccurs and maxoccurs specify the minimum and maximum
number of times the group can occur. The order attribute specifies the sequence and content of the group.
The literal one means exactly one of the elements of the
group may occur. This is like the | (OR) operator in DTDs. The seq
attribute value means the elements in the group all appear, and appear in the
specified order. The value many means
that any of the elements may appear (or not) and in any order.
If we wanted to modify our EMPLOYEE
element so that an employee could belong to multiple departments with a
supervisor in each (for example, because the employee belongs to multiple teams), we would say:
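One way this could be written is sketched below. It assumes that group nests inside ElementType alongside the other children and that type="..." is the reference syntax.

```xml
<ElementType name="EMPLOYEE" content="eltOnly" model="closed" order="seq">
  <attribute type="employment-category"/>
  <element type="NAME"/>
  <element type="HIREDATE"/>
  <!-- One or more MANAGER/DEPARTMENT pairs, each pair in this order.
       Nesting group here is an assumption about the preview's syntax. -->
  <group order="seq" minoccurs="1" maxoccurs="*">
    <element type="MANAGER"/>
    <element type="DEPARTMENT"/>
  </group>
</ElementType>
```

With minoccurs="1" and maxoccurs="*", every employee still needs at least one manager and department, but the pair may repeat as often as needed.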