****************************************************************************
*****  DRAFT 0.1 - SANITY CHECK AND PROOF READ ONLY. NOT FOR RELEASE.  *****
****************************************************************************

                   DNS and Mail Exchange, a simple guide.

----------------------------------------------------------------------------
    This document is copyright Michael Lawrie, 1998. It may be used and
   distributed freely, provided it is not changed in any way. The author
          accepts no responsibility for anything at all, ever.

                This is a draft copy - Version 0.1 19980416
----------------------------------------------------------------------------

-- Introduction:

This document was written to help users of our systems understand how
the domain name system works, how it relates to the delivery of
electronic mail, and how to use both safely and properly.

I have made the document as generic as possible, and included some
background information so that it can be used elsewhere. It is not an
overly technical document because it is not aimed at technical people.

The aim of this document is to provide a brief but accurate description
of DNS (Domain Name System). The industry bible for DNS is a book called
"DNS and BIND", by Paul Albitz and Cricket Liu (pub: O'Reilly, ISBN:
1-56592-236-0). Generally speaking, this is an excellent book, but in
their introduction they proclaim "DNS is a big topic - big enough to
require two authors, anyway". This sets the tone for the book. In reality
DNS is amazingly simple but people tend to overcomplicate it all.

-- IP addresses.

Every host on the Internet has a number by which it is addressed. This
is called the IP (Internet Protocol) address. Effectively, a host's IP
address is pretty similar to a telephone number or street address. If
you want to deliver anything to the host, you must know its number.

An IP address is usually written as four numbers between 0 and 255,
separated by dots. Some combinations of numbers are not allowed, and
some whole address ranges are reserved for various reasons.

Whilst this next part shouldn't be taken as the complete truth, there
are various generalities you can make about IP addresses.

. An individual host is addressed by all four parts of the IP address.
An example of a "host address" is 193.61.112.61

. If only 3 parts of the IP address are written, or the last part is
zero, or the address has a "/" followed by a number after it, it is
probably a network address rather than a host address. A network address
refers to a group of hosts on the same network, rather than a single host.

. If the last part of a "host address" is 255, then it is not really a
host address. It is probably a broadcast address, which reaches every
host on a network at once.

. If your host address begins with 127, it is probably referring to the
host you are currently using.
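
The rules of thumb above can be tried out with Python's standard
"ipaddress" module. This is only a rough sketch: the "/24" network used
here is an assumption made for the example, not something known about
the real network that 193.61.112.61 lives on.

```python
import ipaddress

# A host address: all four parts are present.
host = ipaddress.ip_address("193.61.112.61")

# A network address: "/24" means the first three parts name the network.
# (The /24 is assumed for illustration.)
network = ipaddress.ip_network("193.61.112.0/24")

print(host in network)              # is the host part of this network?
print(network.network_address)      # the address whose last part is 0
print(network.broadcast_address)    # the address whose last part is 255

# Anything beginning with 127 refers to the machine you are using.
print(ipaddress.ip_address("127.0.0.1").is_loopback)
```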

-- Host tables.

Host addresses are not the easiest things in the world to remember. It
seemed logical at some point to create a lookup table so these things
could be given human-friendly names; just like a telephone book in fact.
These directories became known as host tables and basically contained an
IP address by which the host was really addressed and a list of names
by which the host could be addressed. What this meant was that instead of
typing: telnet 193.61.112.61 people could simply type: telnet baphomet
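
As a rough illustration, a host table can be read into a simple lookup
dictionary with a few lines of Python. The layout assumed here (an
address followed by its names, with "#" starting a comment) and the
second name "bap" are made up for the example.

```python
# A made-up host table: one address per line, followed by its names.
HOST_TABLE = """\
# address        names
193.61.112.61    baphomet bap
127.0.0.1        localhost
"""


def parse_host_table(text):
    """Return a dictionary mapping each host name to its IP address."""
    names = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue                           # skip blank lines
        address, *hostnames = line.split()
        for name in hostnames:
            names[name.lower()] = address
    return names


hosts = parse_host_table(HOST_TABLE)
print(hosts["baphomet"])
```

Every name on a line points at the same address, which is all that
"telnet baphomet" needs in order to find 193.61.112.61.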

Host tables were all very well on small networks, but as more and more
networks joined together, there was more and more duplication of host
names. A duplicate host name is like discovering that there are two
"John Smith"s in a telephone book. What a telephone book actually does
is include a location with each directory entry, and this is exactly
what happened with the larger host tables. Host names were extended to
include a location in an attempt to prevent name duplication.

As an example of this, imagine a machine called "teapot" in the
computer science department of Leicester University in the UK. A typical
name for this would be teapot.computer-science.leicester.university.uk. A
machine called "teapot" in Warwick University would be much the same, but
it would be teapot.computer-science.warwick.university.uk. In reality,
"university" is a bit specific, so "academic" was a better category and
people like to abbreviate so they would end up as teapot.cs.le.ac.uk and
teapot.cs.warwick.ac.uk.

-- Central host table.

As more and more small networks joined together to form other networks,
various people with an administrative bent decided that there should be
a single definitive list of all the hosts on that network. On the
Internet, this host table became known as "HOSTS.TXT" and was stored on
a machine at Stanford Research Institute. Once every week or so, systems
administrators from all over the Internet would transfer this file to
their own machines and convert it to a format that their systems could
understand. Stanford (SRI) would add hosts every now and then as they
received notification but they didn't do much in the way of checking
for validity or enforcement of 'proper naming'. It is fair to say that
there were quite a few cockups in the early days. In the UK there was
a system called the NRS (Name Registration Service) that was a lot
more heavily 'policed'. The NRS also insisted that host names were
written the opposite way around to the way the Americans did it, so for
a while people on the Internet would refer to a host as, say,
hicom.lut.ac.uk whereas people in the UK would refer to it as
uk.ac.lut.hicom.

The central host table was a nice idea whilst the number of hosts was
quite small but as the number grew massively, this file became larger,
more and more out of date and SRI started to suffer from all of the
hosts downloading it all of the time.

-- Nameservers.

A nameserver is a server that takes a host name from a client and gives
a host address back in return. In some respects, SRI was acting as a
'nameserver' except it gave back all of the names and addresses on the
network as its answer. To cut down on all of the traffic caused by this
huge file being shipped all over the place, the aim was to have servers
that could respond to 'host lookup' requests, all using the same database.

It was proposed (by a chappie called Paul Mockapetris) that instead of
having a single name server and a single large database, there should be
a number of nameservers all with their own bits of database. A client
wanting information on a particular name would go to the relevant 
nameserver to get the piece of information. Thus, the idea of distributed
name servers was born. It made sense that an organisational unit (for
example, a University Computer Science Dept.) would maintain its own
database and if people wanted to know about a host there, they would
ask a nameserver based there about it. All that was needed was a way of
sticking them all together and directing name lookups to the right server.

-- Internet Domain names.

The hierarchical structure of hostnames now became a standard. A set of
top level domains was created in order to put other organisational units
underneath. Most of the countries got a two letter country code (eg: "uk"
for the United Kingdom, "fr" for France and "no" for Norway). Because
the system was created in the United States and because the Americans are
strange about this sort of thing, they also created a set of top level
domains for their own use: "com" for commercial organisations, "mil" for
the military, "gov" for government, "net" for network providers, "edu" for
educational facilities and "org" for things that didn't really fit elsewhere.
They also created "us" for the USA, but very few people really bother
using it. Today, the top level "com", "net" and "org" domains have been
adopted as international names, which makes things seem a little bit more
sensible. If the same were done for "mil", "edu" and "gov" it would be more
logical again, but there you go.

These top level domains exist simply to put other domains underneath, in
the hierarchy. Some countries like the UK provide another administrative level.
"co.uk" is for UK companies, "nhs.uk" is for the National Health Service,
"ac.uk" is for UK academic sites, "net.uk" for UK network providers, "org.uk"
for most other things and there are a few more, for police and government.

If parts of this structure are drawn out as a map, they start to look something
like this (this will look silly with proportional fonts loaded):

                                 (.)
                                  |
                   ----------------------------------
                  /        |      |       |           \
                com       net    edu     org          uk
              /     \      |      |       |        /  |   \
             /       \                            /   |    \
           baa     uknet                        ac   co   org
                  /  |   \                      /     |    |
              mail  www  weevil               leeds 
                                             /     \
                                            cs     leva
                                           /  \
                                       vax1   vax2

The root of this hierarchy (or tree) is usually referred to as "dot". It is
useful if this is always considered as part of the domain. At the end of a
lot of these branches are actual host names, so some hosts from this example
are:

    baa.com.
    www.uknet.com.
    vax1.cs.leeds.ac.uk.
    leva.leeds.ac.uk.

Note the dot at the end of each domain name. It is often left off
"fully qualified domain names" but occasionally, especially when dealing
with DNS, it can be very important.

-- Distributed name servers.

As previously explained, the idea of distributed name servers is to give
a domain administrator control over what is in their domain. A domain in
this sense is anything that is 'above' a host in the tree structure.
Referring to the previous example, the domains are as follows:

The root: "dot"
The top level domains: com. net. edu. org. uk.
Normal domains: uknet.com. ac.uk. co.uk. org.uk. leeds.ac.uk. cs.leeds.ac.uk.
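
The list of domains above any one host can be worked out mechanically
by chopping parts off the front of its name. The function name
parent_domains is made up for this sketch.

```python
def parent_domains(fqdn):
    """List the domains above a host, nearest first, ending at the root."""
    if not fqdn.endswith("."):
        fqdn += "."                        # always treat "dot" as present
    labels = fqdn.rstrip(".").split(".")   # e.g. vax1, cs, leeds, ac, uk
    domains = [".".join(labels[i:]) + "." for i in range(1, len(labels))]
    return domains + ["."]                 # the root comes last


for domain in parent_domains("vax1.cs.leeds.ac.uk."):
    print(domain)
```

For "vax1.cs.leeds.ac.uk." this prints cs.leeds.ac.uk., leeds.ac.uk.,
ac.uk., uk. and finally the root itself.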

Each of these domains must have a nameserver that describes what hosts
or 'sub domains' are within it.

Now, imagine a web browser wants to find the IP address associated with the
name "www.uknet.com.". The name is passed onto part of the computer's
Internet software called the resolver. Most resolvers are configured to
first look in the local machine's host-table to see if the address is in
there and if it isn't, to use DNS to resolve the name. The resolver has a
list of DNS servers to send its query to. If it doesn't get a response from
the primary it will try its secondary, and so on. When the resolver finds a
working DNS server it will send the domain name to the server, and ask for
its IP address. What follows is a mostly accurate description of how a DNS
server gets that information.
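
The resolver's side of this can be sketched in a few lines of Python.
This is an illustration only: the two "servers" are made-up functions
standing in for real DNS servers so that the example runs without a
network, and a real resolver is rather more careful than this.

```python
def resolve(name, host_table, dns_servers):
    """Try the local host table first, then each DNS server in turn."""
    if name in host_table:
        return host_table[name]
    for ask_server in dns_servers:       # primary first, then secondary...
        try:
            return ask_server(name)
        except OSError:                  # no response: try the next one
            continue
    raise LookupError("no DNS server answered for " + name)


# Hypothetical stand-ins for real servers.
def dead_primary(name):
    raise TimeoutError("the primary did not respond")


def working_secondary(name):
    return {"www.uknet.com.": "193.61.112.61"}[name]


# Found in the host table, so no DNS server is asked at all.
print(resolve("baphomet", {"baphomet": "193.61.112.61"}, []))

# Not in the host table; the primary is dead, the secondary answers.
print(resolve("www.uknet.com.", {}, [dead_primary, working_secondary]))
```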

When a DNS server receives a request for a name the first thing it will
do is to check its caches to see if it has already had requests for this
domain before. If it finds the address in its cache, it will respond
with the required information. If not it knows that it will have to look
elsewhere (note: If the DNS information is actually held locally on this
server then it will be in the cache by default, so there is no need for an
extra check).

If it is not in the cache then it needs to find out where the information
for the host "www.uknet.com." is stored. It knows that the address record
for "www.uknet.com." will be stored in the nameserver for "uknet.com." so
what it has to now do is find out where this nameserver is.

Firstly it looks in its own caches to see if it has nameserver information
for the domain "uknet.com.". If it doesn't, it knows that the domain "com."
has this information in it. It then looks in its caches to see if it has
nameserver information for "com." and again, if not, it knows that this
information exists in the root domain, or "dot". All nameserver caches are
pre-loaded with the nameserver information for "dot" (in naming terms,
these are in the domain ROOT-SERVERS.NET.) so our nameserver now sends a
request to "dot" to ask what the addresses of the nameservers for "com." are.

When it gets a reply, it puts this information into its cache, and then asks
the nameservers for "com." what the address of the nameserver for
"uknet.com." is. Remember, the nameserver for "com." will have "uknet.com."
in its caches because "uknet.com." is a local domain to it so it will send
back a quick response. Our nameserver now knows the address of the nameserver
for "uknet.com." so it pops it into its cache in case it is needed later and
then asks that server what the address of "www.uknet.com." is. The nameserver
for "uknet.com." will have this in its caches (assuming it is a valid domain)
and so sends back an address, or an error. Assuming our DNS server gets a valid
address, it is popped into the cache and the DNS server finally tells the
resolver the address for "www.uknet.com.".
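
The walk described above can be imitated with a toy model in Python.
Everything here is a stand-in: the SERVERS table plays the part of the
real nameservers, and it assumes (as the description does) that each
domain has its own nameserver. It is a sketch of the idea, not of how
BIND or any real server is written.

```python
# Toy stand-ins for the real nameservers. Each domain lists the sub domains
# it hands off to another nameserver and the hosts it holds addresses for.
SERVERS = {
    ".":          {"delegates": {"com."},       "hosts": {}},
    "com.":       {"delegates": {"uknet.com."}, "hosts": {}},
    "uknet.com.": {"delegates": set(),
                   "hosts": {"www.uknet.com.": "193.61.112.61"}},
}


class Nameserver:
    def __init__(self):
        self.known_domains = {"."}   # every cache is pre-loaded with "dot"
        self.addresses = {}          # cached host addresses
        self.queries = []            # log of (domain asked, question)

    def lookup(self, name):
        if name in self.addresses:   # seen before: answer from the cache
            return self.addresses[name]

        labels = name.rstrip(".").split(".")
        # Domains above the host, nearest first: uknet.com., com., "dot".
        parents = [".".join(labels[i:]) + "." for i in range(1, len(labels))]
        parents.append(".")

        # Walk up the tree to the nearest domain whose nameserver we know.
        start = next(i for i, d in enumerate(parents)
                     if d in self.known_domains)

        # Walk back down, asking each nameserver where the next one is.
        for i in range(start, 0, -1):
            domain, child = parents[i], parents[i - 1]
            self.queries.append((domain, "nameserver for " + child))
            if child not in SERVERS[domain]["delegates"]:
                raise LookupError(child + " does not exist")
            self.known_domains.add(child)

        # Finally ask the host's own domain for the address.
        domain = parents[0]
        self.queries.append((domain, "address of " + name))
        if name not in SERVERS[domain]["hosts"]:
            raise LookupError(name + " does not exist")
        self.addresses[name] = SERVERS[domain]["hosts"][name]
        return self.addresses[name]


server = Nameserver()
print(server.lookup("www.uknet.com."))
for domain, question in server.queries:
    print("asked", domain, "for the", question)
print(server.lookup("www.uknet.com."), "(second time, from the cache)")
```

The first lookup asks three questions (of "dot", "com." and
"uknet.com."); the second asks none because the answer is already cached.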

As you can see, this simple example involved quite a lot of steps. With a
domain like "vax1.cs.leeds.ac.uk." there would be even more. In reality,
a lot of responses come straight from caches, and frequently used
nameservers can quickly acquire huge amounts of information in cache.

-- Caches.

One of the advantages of DNS over the original single host table was that
the information changes were dynamic. One of the problems with allowing
nameservers to accumulate large caches is that if a change is made, it
may not be seen because a resolver may be using previously cached data on
a nameserver. To get around this, each domain has certain caching
information in its data file. This information gives nameserver caches
the following time periods.

. Refresh: How often the nameserver should refresh this information from
  the original source.
. Retry: If the nameserver fails to refresh the information for some
  reason, it should retry every now and then. This is the period of time
  between retries.
. Expire: If the nameserver can't get new information within this time
  period, the data should be considered "dead" and thrown away.
. TTL (Time to live): The minimum length of time that other nameservers
  should keep an answer from this domain in their caches before asking
  for it again.
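
To show the general idea of cached data going stale, here is a small
sketch of a cache that forgets an answer once its time limit has passed.
The class name TimedCache and the fake clock are inventions for this
example; real nameservers keep rather more bookkeeping than this.

```python
import time


class TimedCache:
    """Keep answers only for a limited time, then forget them."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}      # name -> (address, time it goes stale)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, stale_at = entry
        if self._clock() >= stale_at:    # too old: throw it away
            del self._entries[name]
            return None
        return address


# A fake clock, so the example does not have to wait a whole day.
now = [0]
cache = TimedCache(clock=lambda: now[0])
cache.put("www.uknet.com.", "193.61.112.61", ttl_seconds=86400)

print(cache.get("www.uknet.com."))   # still fresh
now[0] = 86400                       # one day later...
print(cache.get("www.uknet.com."))   # ...the entry has gone
```

Once get() returns None the nameserver would have to go back and ask
for the information again, which is how changes eventually get noticed.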

-- Types of record.

The database format of DNS can hold a lot more information than the old
host files could. Each of the records in a domain file (or more accurately
a "zone file") has various tags to identify what sort of information it is.
Specific queries return specific types of information. In the "www.uknet.com."
example, the original information being requested was the host address.
In the same example, however, the servers were also asking for nameserver
addresses. These are examples of two types of query.

The Microsoft implementation of DNS tries to make things easy for people
to use, and lets them get away with what BIND fans would call mistakes.
Most DNS information is written in the BIND format so it is worth knowing
a bit about the file format. This explanation is very much simplified, if
you need to write proper BIND database files, then read "DNS and BIND".

Firstly, there are two types of zone file: the normal zone file and the
"reverse" zone file. The normal zone file contains a list of records
within a domain along with a data identifier and a value for that data.
Reverse zone files are the same but may only have certain data fields
within them. Both types of file must have an SOA (Start Of Authority)
record that says who is responsible for the domain and contains the
cache timeout values; they will also usually have one or more NS
(nameserver) records to identify the other nameservers that can be
considered to hold authoritative data for that domain.

The SOA record contains the following information:

. Primary name server DNS name: This is the name of the nameserver which
holds the 'authoritative' data for this zone.

. Responsible person mailbox: This is the mail address of the person who
is responsible for this domain. The "@" sign is replaced with a "." so
if the responsible person is "michael@uknet.com" it would be written as
"michael.uknet.com."

. Serial number: This is the serial number of the current data in the zone
file. Every time this data is changed the serial number should be changed
as well so that remote nameservers know there has been a change. It has
become convention that the serial number is the full date followed by a
two digit 'serial' number, for example, 1998041701 would be the first
change on the 17th of April, 1998 and 1998041903 would be the third change
on the 19th of April, 1998.

. Cache timeouts in seconds: Refresh, Retry, Expire and TTL.
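
Two of these fields follow simple rules that are easy to get wrong by
hand, so here is a small Python sketch of each. The function names are
made up for this example, and the mailbox helper does not try to cope
with a "." in the part before the "@".

```python
import datetime


def mailbox_to_soa(address):
    """Rewrite "user@domain" in the dotted form used in an SOA record."""
    user, domain = address.split("@", 1)
    return user + "." + domain + "."


def next_serial(current, today):
    """Return the next date-based serial number after `current`."""
    # The first change on a given day is numbered 01.
    first_of_today = int(today.strftime("%Y%m%d")) * 100 + 1
    # Never go backwards: if today has been used already, just add one.
    return max(current + 1, first_of_today)


print(mailbox_to_soa("michael@uknet.com"))
print(next_serial(1998040501, datetime.date(1998, 4, 17)))
print(next_serial(1998041902, datetime.date(1998, 4, 19)))
```

The last two lines give 1998041701 (the first change on the 17th of
April) and 1998041903 (the third change on the 19th). After 99 changes
in one day the number simply carries on counting upwards, so it stops
matching the date but still increases.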

An example of an SOA record is:

uknet.com.      SOA     elfish.noc.uknet.net.  hostmaster.uknet.net. (
                1998040501      ;serial
                10800   ;refresh
                7200    ;retry
                604800  ;expire
                86400 ) ;minimum TTL
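
The timeouts are easier to judge when converted out of seconds. The
following sketch picks the fields out of the example record; parse_soa
is a made-up helper that only understands a single record laid out like
the one above, not the full BIND file format.

```python
SOA_TEXT = """\
uknet.com.      SOA     elfish.noc.uknet.net.  hostmaster.uknet.net. (
                1998040501      ;serial
                10800   ;refresh
                7200    ;retry
                604800  ;expire
                86400 ) ;minimum
"""


def parse_soa(text):
    """Pick the fields out of a single SOA record written in BIND style."""
    words = []
    for line in text.splitlines():
        line = line.split(";", 1)[0]               # drop the comment
        words += line.replace("(", " ").replace(")", " ").split()
    domain, record_type, primary, mailbox, *numbers = words
    if record_type != "SOA" or len(numbers) != 5:
        raise ValueError("this does not look like an SOA record")
    serial, refresh, retry, expire, minimum = (int(n) for n in numbers)
    return {"domain": domain, "primary": primary, "mailbox": mailbox,
            "serial": serial, "refresh": refresh, "retry": retry,
            "expire": expire, "minimum": minimum}


soa = parse_soa(SOA_TEXT)
print(soa["primary"], soa["mailbox"])
print(soa["refresh"] // 3600, "hours between refreshes")
print(soa["retry"] // 3600, "hours between retries")
print(soa["expire"] // 86400, "days before the data expires")
print(soa["minimum"] // 86400, "day minimum time to live")
```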

Within a zone file, any hostname without a "." at the end is considered
to be a host within the domain. This tends to make zone files more
readable but can really screw things up when people forget to put the
dot when it is needed.

The simplest record is probably the "A" record. This record is an address
record and simply provides an address for the name. Referring to our
previous example (www.uknet.com.) the entry in uknet.com.'s zone file
could say either:

www              A    193.61.112.61                 or:
www.uknet.com.   A    193.61.112.61

To confuse things, the BIND database requires an extra tag to say that
this data refers to an Internet record so this is often written as:

www        IN    A    193.61.112.61

Basically, if there is an "IN" in the second column, ignore it. This
document will do so from now on.
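
Ignoring the "IN" is just as easy for a program as it is for a reader.
A minimal sketch, assuming one simple record per line (parse_record is
a made-up name, and real zone files allow a good deal more than this):

```python
def parse_record(line):
    """Split "name [IN] TYPE value" into (name, type, value)."""
    fields = line.split()
    if len(fields) > 1 and fields[1].upper() == "IN":
        del fields[1]                    # ignore the "IN" if it is there
    name, record_type = fields[0], fields[1].upper()
    value = " ".join(fields[2:])
    return name, record_type, value


print(parse_record("www        IN    A    193.61.112.61"))
print(parse_record("www              A    193.61.112.61"))
```

Both lines come out as the same name, record type and address.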

The following records are allowed in a zone file. It is not a complete
list because the complete list is silly.

CNAME: Stands for "Canonical Name" and means "alias". For example:

www               CNAME      fred
www.uknet.com.    CNAME      fred.uknet.com.

Incidentally, if the last dot had been missed off in "fred.uknet.com."
and the record had been:

www.uknet.com.    CNAME      fred.uknet.com

This would actually mean:

www.uknet.com.    CNAME      fred.uknet.com.uknet.com.

This is because, if the dot isn't there, the name of the zone is tagged
on the end.
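
The rule is simple enough to write down in a couple of lines of Python.
The function name qualify is made up for this sketch.

```python
def qualify(name, zone):
    """Expand a name from a zone file into its full form."""
    if name.endswith("."):
        return name                # already complete: leave it alone
    return name + "." + zone       # no dot: the zone name is tagged on


print(qualify("fred", "uknet.com."))
print(qualify("fred.uknet.com.", "uknet.com."))
print(qualify("fred.uknet.com", "uknet.com."))   # the forgotten dot
```

The first two lines both print fred.uknet.com. while the third prints
fred.uknet.com.uknet.com., which is almost certainly not what was meant.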

AAAA:
HINFO:
MX:
NS:
RP:
TXT:
WKS:

-- Bells and whistles.

-- Old time mail delivery.

-- Modern mail delivery.

-- POP3.

-- Other things that crop up
   -- Classes

-- Further reading.