Step Two: Writing a Document Type Definition

The goal of this step is to build a DTD. DTD authoring can be difficult, but by following the instructions in this step, referring to the examples and keeping your DTD simple, you should be able to write your own.

DTDs are made up of markup declarations: element type, attribute list, entity and notation declarations. These declarations define the structural, descriptive and storage information for a document type. They let you describe the parts of your document type so that authors can easily produce conforming documents. As stated in step one, element type and attribute list declarations are the most important and most commonly used markup declarations. This tutorial concentrates on them, as they are the most useful and are relatively easy to learn.

Element type declarations define element types, which appear in document markup as elements. Each element type has a name, a content model and possibly a set of attributes, and can be used any number of times as conforming elements of that type. Element content can take four forms, of which the most common are mixed-content and element-content.

Element-content models list groups of elements, and groups of groups of elements, in certain relationships to define the particular content of an element type. Sequence groups describe a required and ordered occurrence of their members, which are separated by the , symbol. Choice groups describe a single occurrence of only one of their members, which are separated by the | symbol. The occurrence of each member of these groups, and of the groups themselves, can be modified by the following occurrence indicators: + (required and repeatable), * (optional and repeatable) and ? (optional). A member or group with no indicator is required and occurs exactly once.

Element type declarations take the following form:

<!ELEMENT NAME CONTENT>

The following example includes an element type name and, as content, a sequence group with a nested optional-repeatable choice group:

<!ELEMENT EXAMPLE (TITLE?,PARA,(PARA|NOTE|CODE)*)>

This declaration describes an element type that can begin with a TITLE, must include a PARA, and can then include any number of PARA, NOTE and CODE elements in any order.
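As an illustration, the sketch below places the EXAMPLE declaration in an internal DTD subset and follows it with one conforming instance. The declarations of TITLE, PARA, NOTE and CODE as text-only elements are assumptions made only so that the example is complete; the declaration above does not say what those elements contain.

```xml
<?xml version="1.0"?>
<!DOCTYPE EXAMPLE [
  <!-- The declaration discussed above -->
  <!ELEMENT EXAMPLE (TITLE?,PARA,(PARA|NOTE|CODE)*)>
  <!-- Assumed for illustration: each child element holds text only -->
  <!ELEMENT TITLE (#PCDATA)>
  <!ELEMENT PARA  (#PCDATA)>
  <!ELEMENT NOTE  (#PCDATA)>
  <!ELEMENT CODE  (#PCDATA)>
]>
<EXAMPLE>
  <TITLE>An optional title</TITLE>
  <PARA>The one required paragraph.</PARA>
  <NOTE>A note, taken from the choice group.</NOTE>
  <PARA>Another paragraph, also from the choice group.</PARA>
  <CODE>x = 1</CODE>
</EXAMPLE>
```

Dropping the TITLE element would leave this instance valid, while moving the NOTE ahead of the first PARA would not, because the sequence group requires a PARA before the choice group begins.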

Mixed-content element type declarations are quite similar to their element-content counterparts but differ in important respects. They must use an optional-repeatable choice group, and #PCDATA (parsed character data, that is, text) must be the first member of that group. Mixed-content element types that contain only text, and no child elements, do not need the choice group or the * indicator. The following two declarations illustrate the syntactic difference between a mixed-content element type that contains child elements and one that does not:

<!ELEMENT PARAGRAPH (#PCDATA|Keyword)*>
<!ELEMENT Keyword (#PCDATA)>
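A minimal sketch of how these two declarations might be used together follows. The sentence inside PARAGRAPH is invented for illustration; only the two declarations come from the text above.

```xml
<?xml version="1.0"?>
<!DOCTYPE PARAGRAPH [
  <!-- Text and Keyword elements may be interleaved freely -->
  <!ELEMENT PARAGRAPH (#PCDATA|Keyword)*>
  <!-- Text only: no choice group and no * indicator needed -->
  <!ELEMENT Keyword   (#PCDATA)>
]>
<PARAGRAPH>A <Keyword>DTD</Keyword> is made up of <Keyword>markup declarations</Keyword> and little else.</PARAGRAPH>
```

Because the choice group is optional and repeatable, a PARAGRAPH with no Keyword elements, or with no content at all, would also be valid.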

Attribute declarations are much more variable than element type declarations: attributes come in ten different types. The most common and simplest are string and enumeration attributes. String type attributes, commonly called CDATA attributes, allow you to capture virtually unconstrained text strings that describe element content or behaviour. Enumeration attributes are similar to string type attributes but require you to set a list of options for authors to pick from. Both types require an attribute default, which can be: #REQUIRED (the attribute must be supplied), #IMPLIED (the attribute is optional), #FIXED "value" (a fixed value) or "value" (a default value that authors can override).

String type attributes take the form:

<!ATTLIST ELEMENT-NAME NAME CDATA DEFAULT>

Enumeration attributes take the form:

<!ATTLIST ELEMENT-NAME NAME (choice1|choice2|choiceN) DEFAULT>

Both attribute types appear in markup in element start-tags, in the following form:

<ELEMENT-NAME NAME="value">
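The four kinds of attribute default can be sketched in a single attribute list declaration. The NOTE element and its attribute names (id, author, kind, version) are hypothetical and chosen only for this illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE NOTE [
  <!ELEMENT NOTE (#PCDATA)>
  <!-- id:      string attribute, must be supplied
       author:  string attribute, optional
       kind:    enumeration attribute, defaults to "tip" but can be overridden
       version: string attribute whose value is fixed by the DTD -->
  <!ATTLIST NOTE
            id      CDATA                 #REQUIRED
            author  CDATA                 #IMPLIED
            kind    (tip|warning|caution) "tip"
            version CDATA                 #FIXED "1.0">
]>
<NOTE id="n1" kind="warning" author="J. Smith">Back up your files first.</NOTE>
```

The start-tag leaves out version, which is acceptable because the DTD supplies the fixed value. Leaving out id would make the document invalid, and supplying a kind outside the enumerated list would do the same.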

Your Valid XML Instance

You should build your own XML document as you go through the tutorial. Refer to Your Valid XML Instance as you read, both for guidance and as an example. Your document may be quite different from this example but should be built in a similar way. At this point in valid XML document creation you don't have a document at all, only the DTD to which your document will link. Notice how this example uses element type declarations to capture the content of the document and attribute list declarations to describe certain portions of it.

<!ELEMENT MEMO     (TO,FROM,SUBJECT,BODY,SIGN)>
<!ATTLIST MEMO     importance   (HIGH|MEDIUM|LOW) "LOW">
<!ELEMENT TO       (#PCDATA)>
<!ELEMENT FROM     (#PCDATA)>
<!ELEMENT SUBJECT  (#PCDATA)>
<!ELEMENT BODY     (P+)>
<!ELEMENT P        (#PCDATA)>
<!ELEMENT SIGN     (#PCDATA)>
<!ATTLIST SIGN     signatureFile CDATA #IMPLIED
                   email         CDATA #REQUIRED>
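Looking ahead, an instance that conforms to this DTD might look like the sketch below. The DTD is repeated in an internal subset only to keep the example self-contained, and every piece of text, including the e-mail address, is invented for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE MEMO [
  <!ELEMENT MEMO     (TO,FROM,SUBJECT,BODY,SIGN)>
  <!ATTLIST MEMO     importance   (HIGH|MEDIUM|LOW) "LOW">
  <!ELEMENT TO       (#PCDATA)>
  <!ELEMENT FROM     (#PCDATA)>
  <!ELEMENT SUBJECT  (#PCDATA)>
  <!ELEMENT BODY     (P+)>
  <!ELEMENT P        (#PCDATA)>
  <!ELEMENT SIGN     (#PCDATA)>
  <!ATTLIST SIGN     signatureFile CDATA #IMPLIED
                     email         CDATA #REQUIRED>
]>
<!-- importance overrides the default of LOW; signatureFile is omitted because it is optional -->
<MEMO importance="HIGH">
  <TO>All staff</TO>
  <FROM>Facilities</FROM>
  <SUBJECT>Office closure</SUBJECT>
  <BODY>
    <P>The office will be closed on Friday.</P>
    <P>Please take your laptops home on Thursday.</P>
  </BODY>
  <SIGN email="facilities@example.com">Jane Doe</SIGN>
</MEMO>
```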

Sequence and Choice Operators and Occurrence Indicators

The following sequence and choice operators and occurrence indicators control how the members of an element type's content model may occur.

Term      Meaning
,         Sequence operators separate members of a sequence list, which require sequential use of all members
|         Choice operators separate members of a choice list, which require use of one and only one member
(none)    This non-symbol indicates a required occurrence
+         This symbol indicates a required and repeatable occurrence
*         This symbol indicates an optional and repeatable occurrence
?         This symbol indicates an optional occurrence
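The sketch below uses every entry of the table in one content model. The REPORT element type and its children are hypothetical names chosen for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE REPORT [
  <!-- TITLE               no indicator: exactly one
       AUTHOR+             one or more
       ABSTRACT?           zero or one
       (SECTION|APPENDIX)* zero or more, each occurrence picking one member
       The commas require the members to appear in this order. -->
  <!ELEMENT REPORT   (TITLE,AUTHOR+,ABSTRACT?,(SECTION|APPENDIX)*)>
  <!ELEMENT TITLE    (#PCDATA)>
  <!ELEMENT AUTHOR   (#PCDATA)>
  <!ELEMENT ABSTRACT (#PCDATA)>
  <!ELEMENT SECTION  (#PCDATA)>
  <!ELEMENT APPENDIX (#PCDATA)>
]>
<REPORT>
  <TITLE>Quarterly results</TITLE>
  <AUTHOR>First author</AUTHOR>
  <AUTHOR>Second author</AUTHOR>
  <SECTION>Sales rose in the third quarter.</SECTION>
  <APPENDIX>Raw figures.</APPENDIX>
  <SECTION>Outlook for next year.</SECTION>
</REPORT>
```

This instance omits the optional ABSTRACT and repeats AUTHOR, both of which the content model allows.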