In the last part of this tutorial, I talked about the basics of XML. In this part, I will talk about XML components. This is going to be a long part, but just hang in there:)
Words to Know
XML Parser (also called an XML processor) - A piece of software that reads XML documents. Examples of parsers are: XML for Java from IBM's alphaWorks, the Microsoft XML Parser used in Internet Explorer, and Expat, used in Netscape 6.
Application - The program that actually uses the content of an XML document; the parser does its reading on the application's behalf. Do not get this term confused with an "XML application", which usually means a markup language that has been defined using XML.
Fatal Error - An error in an XML document that the XML parser must detect and report. Once one is detected, the parser must stop normal processing, though it is allowed to keep looking for further errors.
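To see a parser and a fatal error in action, here is a minimal sketch in Python (my choice for illustration, not something this tutorial depends on). It uses the built-in xml.etree.ElementTree module, which sits on top of Expat. The helper name is_well_formed is just something I made up for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text, False on a fatal error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # The parser refuses to hand back a document after a fatal error.
        return False


good = "<message><saying>How are you?</saying></message>"
bad = "<message><saying>How are you?</message>"  # <saying> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second document is rejected outright: the parser does not try to guess what was meant, which is exactly the "no normal processing" rule described above.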
The Prolog
You may have heard of a prolog as a preface or the beginning to a book. The XML prolog is just the same.
A prolog can consist of five different components:
An XML declaration
Processing instruction(s)
A document type declaration (the DOCTYPE line, which brings in the DTD)
Comments
White space
In the last part of this tutorial, I gave you this example:
Code:
<?xml version="1.0"?>
<!-- Your first XML code -->
<message>
<saying>
<friendly>How are you?</friendly>
<improper>Hey! You!</improper>
</saying>
</message>
In that example, the prolog would be:
Code:
<?xml version="1.0"?>
<!-- Your first XML code -->
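If you want to check for yourself what ends up in the prolog, here is a small sketch using Python's built-in xml.dom.minidom module. The variable names are my own. It assumes, as minidom does, that a comment placed before the root element is kept as a child of the document itself.

```python
from xml.dom import minidom

xml_text = """<?xml version="1.0"?>
<!-- Your first XML code -->
<message>
<saying>
<friendly>How are you?</friendly>
<improper>Hey! You!</improper>
</saying>
</message>"""

doc = minidom.parseString(xml_text)

# Comments that sit in the prolog belong to the document, not to <message>.
prolog_comments = [node.data for node in doc.childNodes
                   if node.nodeType == node.COMMENT_NODE]

# The root element is where the prolog ends and the document body begins.
root_name = doc.documentElement.tagName

print(prolog_comments)
print(root_name)
```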
The XML Declaration
The XML declaration (also called the header) should always be on the first line of the XML document, and nothing should come before it, not even white space or a comment. The declaration is:
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
Notice that the basic syntax is <?xml ... ?>. This looks a lot like PHP's opening and closing tags. There are also three attributes defined in that declaration: the XML version number (the "1.0"), the document's language encoding designation, meaning its character encoding (the "UTF-8"), and the standalone specification (the "yes"). I'll explain each of them for better understanding.
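Here is a rough sketch of how a program can read those three values back out. It uses Python's xml.dom.minidom, which stores them on the parsed document as version, encoding, and standalone. The sample document is my own, and other parsers may expose these values differently or not at all.

```python
from xml.dom import minidom

# A tiny document with all three attributes in the declaration.
declared = b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?><message/>'

doc = minidom.parseString(declared)

print(doc.version)     # the version number from the declaration
print(doc.encoding)    # the encoding name exactly as it was written
print(doc.standalone)  # a boolean: True for "yes", False for "no"
```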
The XML Version Number
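The XML version number attribute really states what it is: the version of XML that the document is written in. XML 1.0 is what nearly every document uses. A later version, XML 1.1, also exists, but it is rarely needed. Whenever you include the declaration, the version attribute is mandatory, and for this tutorial you should always put 1.0 in it.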
The Language Encoding Designation
This attribute is optional. If you leave it out, the parser assumes "UTF-8" (or UTF-16, if the file starts with a byte order mark). There are other choices you can pick from as well, such as UTF-16, ISO-10646-UCS-2, ISO-10646-UCS-4, ISO-8859-1, and several others.
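To show why the encoding matters, here is a small Python sketch using xml.etree.ElementTree. The sample word and variable names are my own. The same bytes are parsed twice: once with an ISO-8859-1 encoding declared, and once with no encoding at all, so the parser falls back to UTF-8.

```python
import xml.etree.ElementTree as ET

# The byte 0xE9 is "é" in ISO-8859-1, but on its own it is not valid UTF-8.
with_encoding = b'<?xml version="1.0" encoding="ISO-8859-1"?><word>caf\xe9</word>'
without_encoding = b'<?xml version="1.0"?><word>caf\xe9</word>'

# With the right encoding declared, the byte is decoded correctly.
word = ET.fromstring(with_encoding).text

# Without it, the parser assumes UTF-8 and rejects the stray byte.
try:
    ET.fromstring(without_encoding)
    default_worked = True
except ET.ParseError:
    default_worked = False

print(word)
print(default_worked)
```

So if your file is saved in anything other than UTF-8 or UTF-16, it is a good idea to say so in the declaration.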
The Standalone Specification
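This attribute is also optional. It says whether the document depends on markup declarations that live outside the file, such as an external DTD or external entities (entities are talked about later in this part). If the document does not rely on anything external, it can be set to "yes". But if it does, it should be set to "no". If you leave the attribute out and the document does point to external declarations, the parser assumes "no".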
This part will be continued because it is so long.
computergeek67
Part 2 Continued
The Document Type Declaration (DTD)
XML does not actually need a DTD, but if you want to put one in, here is the code:
Code:
<!DOCTYPE rootname options>
I'll explain the code.
"DOCTYPE" tells XML that this statement is a document type declaration. The "rootname" is the name of the document's root element, and the two have to match. For example, if you have an XML document about gaming whose root element is <gaming>, the word "gaming" would replace "rootname" in the code above.
The "options" part tells XML where the DTD is located. It can point to an external DTD, i.e. a DTD in another file, or it can hold the declarations themselves inside square brackets. Those declarations describe the document's elements, attributes, and entities. As I said before, putting in a DTD is optional. But if the coder wants to use an external DTD, then the DOCTYPE line has to be included so the parser knows where to find it.
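Here is a minimal sketch of a document with a DOCTYPE line, parsed with Python's xml.dom.minidom. The gaming document and its one-line DTD are made up for illustration. Keep in mind that Python's built-in parsers read the DOCTYPE but do not validate the document against the DTD.

```python
from xml.dom import minidom

# The "options" part here is an internal DTD written inside square brackets.
xml_text = """<?xml version="1.0"?>
<!DOCTYPE gaming [
  <!ELEMENT gaming (#PCDATA)>
]>
<gaming>Chess</gaming>"""

doc = minidom.parseString(xml_text)

doctype_name = doc.doctype.name            # the "rootname" from the DOCTYPE line
root_name = doc.documentElement.tagName    # the actual root element

print(doctype_name)
print(root_name)
```

Both names come out as "gaming", which is the match between "rootname" and the root element described above.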
Comments
You might want to add comments to explain what some code means. For comments, XML uses the same syntax as HTML:
Code:
<!-- this is a comment-->
Things to remember about comments:
Comments can be put anywhere you want except before the XML declaration and inside a tag.
Do not use "--" anywhere inside a comment except as part of the closing "-->". XML does not allow it, and the parser will report an error.
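Both rules are easy to try out. Here is a small Python sketch using xml.etree.ElementTree; the helper name parses and the sample strings are my own.

```python
import xml.etree.ElementTree as ET


def parses(text):
    """Return True if the parser accepts the text without a fatal error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


ok_comment = "<note><!-- this is a comment --></note>"
double_hyphen = "<note><!-- this is -- not allowed --></note>"
before_declaration = '<!-- too early --><?xml version="1.0"?><note/>'

print(parses(ok_comment))
print(parses(double_hyphen))
print(parses(before_declaration))
```

Only the first string is accepted. The other two break the rules listed above.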
XML Document Structure
I gave you this example earlier:
Code:
<?xml version="1.0"?>
<!-- Your first XML code -->
<message>
<saying>
<friendly>How are you?</friendly>
<improper>Hey! You!</improper>
</saying>
</message>
I'll go over the parts of this document.
The "<message>" tag is called the root or parent element of the document, since everything else is contained within it. This concept is called "nesting".
The subelement is the "<saying>" tag. Also called a "child element", this tag is contained within the parent element. Because this tag also has elements within it, <friendly> and <improper>, the "<saying>" tag is a parent element too.
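The parent and child relationships can be walked in code. This is a small sketch with Python's xml.etree.ElementTree, using the same example document; the variable names are mine.

```python
import xml.etree.ElementTree as ET

xml_text = """<?xml version="1.0"?>
<!-- Your first XML code -->
<message>
<saying>
<friendly>How are you?</friendly>
<improper>Hey! You!</improper>
</saying>
</message>"""

root = ET.fromstring(xml_text)   # the root element, <message>
saying = root[0]                 # its first (and only) child, <saying>

# <saying> is a child of <message> and a parent of these two elements.
grandchildren = [child.tag for child in saying]

print(root.tag)
print(saying.tag)
print(grandchildren)
```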
Entities
Just as in HTML, there are entities for XML. Have you ever copied and pasted code with multiple tags into your HTML editor, only to find symbols and characters in place of the < and > signs? Those are entities. To use one, type an ampersand, then the entity's name, then a semicolon. Here are the five entities that XML defines for you:
The left angle bracket <: &lt;
The right angle bracket >: &gt;
The apostrophe ': &apos;
The quotation mark ": &quot;
The ampersand &: &amp;
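Here is a quick Python sketch of entities going in both directions. The escape function from xml.sax.saxutils turns the special characters into entities, and xml.etree.ElementTree turns them back when it parses. The sample text and the element names are made up.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "5 < 7 & 7 > 5"

# escape() replaces &, < and > with their entities.
escaped = escape(raw)

# The parser turns the entities back into the original characters.
element = ET.fromstring("<math>" + escaped + "</math>")

# The quotation mark and apostrophe entities work the same way.
quoted = ET.fromstring("<q>&quot;hi&quot; &apos;there&apos;</q>")

print(escaped)
print(element.text)
print(quoted.text)
```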
CDATA Sections
Sometimes in XML, you want to display the ampersand (otherwise known as the "and symbol") as it is, without typing the word "and" or an entity. Then you realize that XML recognizes the ampersand as a markup symbol, as it does with the left angle bracket. There is a way to tell XML not to treat these characters as markup. Here's how:
Code:
<colors>
<![CDATA[
Our product comes in red & white!
]]>
</colors>
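Notice the "<![CDATA[ ]]>" code. That is the key: everything between "<![CDATA[" and "]]>" is treated as plain text, so XML does not see the ampersand as a markup symbol. The only thing a CDATA section cannot contain is "]]>" itself.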
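To see the difference a CDATA section makes, here is a short Python sketch using xml.etree.ElementTree. It parses the example above, then tries the same sentence without the CDATA wrapper. The variable names are my own.

```python
import xml.etree.ElementTree as ET

with_cdata = """<colors>
<![CDATA[
Our product comes in red & white!
]]>
</colors>"""

without_cdata = "<colors>Our product comes in red & white!</colors>"

# Inside CDATA, the ampersand comes through as ordinary text.
text = ET.fromstring(with_cdata).text.strip()

# Outside CDATA, a bare ampersand is a fatal error.
try:
    ET.fromstring(without_cdata)
    bare_ampersand_ok = True
except ET.ParseError:
    bare_ampersand_ok = False

print(text)
print(bare_ampersand_ok)
```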