****************************************************************************
***** DRAFT 0.1 - SANITY CHECK AND PROOF READ ONLY. NOT FOR RELEASE. *****
****************************************************************************

DNS and Mail Exchange, a simple guide.
----------------------------------------------------------------------------
This document is copyright Michael Lawrie, 1998. It may be used and
distributed freely, provided it is not changed in any way. The author
accepts no responsibility for anything at all, ever.

This is a draft copy - Version 0.1 19980416
----------------------------------------------------------------------------

-- Introduction:

This document was written to help users of our systems understand how the
domain name system works, how it relates to the delivery of electronic
mail, and how to use both safely and properly. I have made the document as
generic as possible, and included some background information, so that it
can be used elsewhere. It is not an overly technical document because it is
not aimed at technical people.

The aim of this document is to provide a brief but accurate description of
DNS (Domain Name System). The industry bible for DNS is a book called "DNS
and BIND", by Paul Albitz and Cricket Liu (pub: O'Reilly, ISBN:
1-56592-236-0). Generally speaking, this is an excellent book, but in their
introduction the authors proclaim "DNS is a big topic - big enough to
require two authors, anyway". This sets the tone for the book. In reality
DNS is amazingly simple, but people tend to overcomplicate it.

-- IP addresses.

Every host on the Internet has a number by which it is addressed. This is
called the IP (Internet Protocol) address. Effectively, a host's IP address
is much like a telephone number or street address: if you want to deliver
anything to the host, you must know its number.

An IP address is usually written as four numbers between 0 and 255,
separated by dots.
Some combinations of numbers are illegal, and some whole address ranges are
reserved for various reasons. Whilst this next part shouldn't be taken as
the complete truth, there are various generalities you can make about IP
addresses.

  . An individual host is addressed by all four parts of the IP address.
    An example of a "host address" is 193.61.112.61

  . If only three parts of the IP address are written, or the last number
    is zero, or the address is followed by a "/" and a number, it is
    probably a network address rather than a host address. A network
    address refers to a group of hosts on the same network, rather than a
    single host.

  . If the last number of a "host address" is 255, then it is not really a
    host address (it is usually a broadcast address for the whole
    network).

  . If a host address begins with 127, it is probably referring to the
    host you are currently using.

-- Host tables.

Host addresses are not the easiest things in the world to remember. It
seemed logical at some point to create a lookup table so that hosts could
be given human-friendly names, just like a telephone book in fact. These
directories became known as host tables. Each entry basically contained
the IP address by which the host was really addressed and a list of names
by which the host could be addressed. This meant that instead of typing:

    telnet 193.61.112.61

people could simply type:

    telnet baphomet

Host tables were all very well on small networks, but as more and more
networks joined together, there was more and more duplication of host
names. A duplicate host name is like discovering two "John Smith"s in a
telephone book. A telephone book deals with this by including a location
with the directory entry, and this is exactly what happened with the
larger host tables. Host names were extended to include a location in an
attempt to prevent name duplication.
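
As a rough illustration of what a host table lookup involves, here is a
minimal Python sketch. It assumes the common "address, then one or more
names" layout for each line. The table contents are made up for the
example, apart from the baphomet entry used above; the "loopback" alias
and the function names are hypothetical.

```python
# Hypothetical host table: one address per line, followed by its names.
HOST_TABLE = """\
# address        names
193.61.112.61    baphomet
127.0.0.1        localhost loopback
"""


def parse_host_table(text):
    """Return a dict mapping each host name to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue  # skip blank and comment-only lines
        address, *names = line.split()
        for name in names:
            # A later entry with the same name silently replaces an
            # earlier one; a flat table has no way to keep both.
            table[name.lower()] = address
    return table


def lookup(table, name):
    """Return the address for name, or None if the table does not list it."""
    return table.get(name.lower())


hosts = parse_host_table(HOST_TABLE)
print(lookup(hosts, "baphomet"))
```

A flat table like this has room for only one host called "baphomet";
extending names with a location is what lets two sites each keep a host of
the same name.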

As an example of this, imagine a machine called "teapot" in the computer
science department of Leicester University in the UK. A typical name for
this would be teapot.computer-science.leicester.university.uk. A machine
called "teapot" at Warwick University would be named in much the same way,
as teapot.computer-science.warwick.university.uk. In reality, "university"
is a bit specific, so "academic" was a better category, and people like to
abbreviate, so the names would end up as teapot.cs.le.ac.uk and
teapot.cs.warwick.ac.uk.

-- Central host table.

As more and more small networks joined together to form larger networks,
various people with an administrative bent decided that there should be a
single definitive list of all the hosts on each network. On the Internet,
this host table became known as "HOSTS.TXT" and was stored on a machine at
Stanford Research Institute. Once every week or so, systems administrators
from all over the Internet would transfer this file to their own machines
and convert it to a format that their systems could understand. Stanford
(SRI) would add hosts every now and then as they received notification,
but they didn't do much in the way of checking for validity or enforcing
'proper naming'. It is fair to say that there were quite a few cockups in
the early days.

In the UK there was a system called the NRS (Name Registration Service)
that was a lot more heavily 'policed'. The NRS also insisted that host
names were written the opposite way around to the way the Americans wrote
them so, for a while, people on the Internet would refer to a host as, say:

    hicom.lut.ac.uk

whereas people in the UK would refer to it as:

    uk.ac.lut.hicom

The central host table was a nice idea whilst the number of hosts was
quite small, but as the number grew massively the file became larger and
more and more out of date, and SRI started to suffer from all of the hosts
downloading it all of the time.

-- Nameservers.

A nameserver is a server that takes a host name from a client and gives a
host address back in return. In some respects, SRI was acting as a
'nameserver', except that it gave back all of the names and addresses on
the network as its answer. To cut down on all of the traffic caused by
this huge file being shipped all over the place, the aim was to have
servers that could respond to 'host lookup' requests, all using the same
database.

It was proposed (by a chappie called Paul Mockapetris) that instead of
having a single nameserver and a single large database, there should be a
number of nameservers, each with its own bit of the database. A client
wanting information on a particular name would go to the relevant
nameserver to get that piece of information. Thus, the idea of distributed
name servers was born.

It made sense that an organisational unit (for example, a University
Computer Science Dept.) would maintain its own database, and if people
wanted to know about a host there, they would ask a nameserver based there
about it. All that was needed was a way of sticking them all together and
directing name lookups to the right server.

-- Internet Domain names.

The hierarchical structure of host names now became a standard. A set of
top level domains was created in order to put other organisational units
underneath. Most countries got a two letter country code (eg: "uk" for the
United Kingdom, "fr" for France and "no" for Norway). Because the system
was created in the United States, and because the Americans are strange
about this sort of thing, they also created a set of top level domains for
their own use: "com" for commercial organisations, "mil" for the military,
"gov" for government, "net" for network providers, "edu" for educational
facilities and "org" for things that didn't really fit elsewhere. They
also created "us" for the USA, but very few people really bother using it.

Today, the top level "com", "net" and "org" domains have been adopted as
international names, which makes things seem a little more sensible. If
the same were done for "mil", "edu" and "gov" it would be more logical
again, but there you go.

These top level domains exist simply to put other domains underneath, in
the hierarchy. Some countries, like the UK, provide another administrative
level. "co.uk" is for UK companies, "nhs.uk" is for the National Health
Service, "ac.uk" is for UK academic sites, "net.uk" is for UK network
providers, "org.uk" is for most other things, and there are a few more for
police and government.

If part of this structure is drawn out as a map, it starts to look
something like this (this will look silly with proportional fonts loaded):

                               (.)
                                |
         ----------------------------------------------
        /           |        |        |                \
      com          net      edu      org                uk
     /   \          |        |        |              /  |  \
    /     \                                         /   |   \
  baa     uknet                                   ac    co   org
         /  |  \                                 /      |     |
     mail  www  weevil                        leeds
                                              /   \
                                            cs     leva
                                           /  \
                                        vax1   vax2

The root of this hierarchy (or tree) is usually referred to as "dot". It
is useful if this is always considered as part of the domain. At the end
of a lot of these branches are actual host names, so some hosts from this
example are:

    baa.com.
    www.uknet.com.
    vax1.cs.leeds.ac.uk.
    leva.leeds.ac.uk.

Note the dot at the end of each domain name. It is often left off "fully
qualified domain names" but occasionally, especially when dealing with
DNS, it can be very important.

-- Distributed name servers.

As previously explained, the idea of distributed name servers is to give a
domain administrator control over what is in their domain. A domain in
this sense is anything that is 'above' a host in the tree structure.
Referring to the previous example, the domains are as follows:

  The root:               "dot"
  The top level domains:  com.  net.  edu.  org.  uk.
  Normal domains:         uknet.com.  ac.uk.  co.uk.  org.uk.
                          leeds.ac.uk.  cs.leeds.ac.uk.

Each of these domains must have a nameserver that describes what hosts or
'sub domains' are within it.
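
To make "anything above a host in the tree" concrete, here is a small
Python sketch that lists the domains enclosing a fully qualified host
name, starting at the root. It is only an illustration: the function name
is made up, and it assumes the first label of the name is the host itself,
which is true of the example hosts above but not of every name.

```python
def enclosing_domains(fqdn):
    """List the domains above a host, from the root ("dot") downwards.

    Assumes fqdn ends with the trailing dot and that its first label is
    the host name itself.
    """
    if not fqdn.endswith("."):
        raise ValueError("expected a fully qualified name ending in a dot")
    labels = fqdn[:-1].split(".")
    parents = labels[1:]          # everything to the right of the host label
    domains = ["."]               # the root always comes first
    for i in range(len(parents) - 1, -1, -1):
        domains.append(".".join(parents[i:]) + ".")
    return domains


for name in ("www.uknet.com.", "vax1.cs.leeds.ac.uk."):
    print(name, "->", enclosing_domains(name))
```

Read from left to right, each list follows one branch of the tree from
"dot" down to the domain that holds the host.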

Now, imagine a web browser wants to find the IP address associated with
the name "www.uknet.com.". The name is passed on to part of the computer's
Internet software called the resolver. Most resolvers are configured to
look first in the local machine's host table to see if the address is in
there and, if it isn't, to use DNS to resolve the name. The resolver has a
list of DNS servers to send its query to; if it doesn't get a response
from the primary it will try its secondary, and so on. When the resolver
finds a working DNS server it will send the domain name to the server and
ask for its IP address. What follows is a mostly accurate description of
how a DNS server gets its information.

When a DNS server receives a request for a name, the first thing it will
do is check its caches to see if it has already had requests for this
domain before. If it finds the address in its cache, it will respond with
the required information. If not, it knows that it will have to look
elsewhere (note: if the DNS information is actually held locally on this
server then it will be in the cache by default, so there is no need for an
extra check).

If it is not in the cache then it needs to find out where the information
for the host "www.uknet.com." is stored. It knows that the address record
for "www.uknet.com." will be stored in the nameserver for "uknet.com.", so
what it now has to do is find out where this nameserver is. Firstly it
looks in its own caches to see if it has nameserver information for the
domain "uknet.com."; if it doesn't, it knows that the domain "com." has
this information in it. It then looks in its caches to see if it has
nameserver information for "com." and again, if not, it knows that this
information exists in the root domain, or "dot". All nameserver caches are
pre-loaded with the nameserver information for "dot" (in naming terms,
these are in the domain ROOT-SERVERS.NET.)
so our nameserver now sends a request to "dot" to ask what the addresses
of the nameservers for "com." are. When it gets a reply, it puts this
information into its cache, and then asks the nameservers for "com." what
the address of the nameserver for "uknet.com." is. Remember, the
nameserver for "com." will have "uknet.com." in its caches because
"uknet.com." is a local domain to it, so it will send back a quick
response. Our nameserver now knows the address of the nameserver for
"uknet.com.", so it pops it into its cache in case it is needed later and
then asks that server what the address of "www.uknet.com." is. The
nameserver for "uknet.com." will have this in its caches (assuming it is a
valid name) and so sends back an address, or an error. Assuming our DNS
server gets a valid address, it is popped into the cache and the DNS
server finally tells the resolver the address for "www.uknet.com.".

As you can see, this simple example involved quite a lot of steps. With a
name like "vax1.cs.leeds.ac.uk." there would be even more. In reality, a
lot of responses come straight from caches, and frequently used
nameservers can quickly acquire huge amounts of information in cache.

-- Caches.

One of the advantages of DNS over the original single host table is that
changes to the information are dynamic. One of the problems with allowing
nameservers to accumulate large caches is that if a change is made, it may
not be seen, because a resolver may be using previously cached data on a
nameserver. To get around this, each domain has certain caching
information in its data file. This information gives the nameserver caches
the following time periods.

  . Refresh: How often the nameserver should refresh this information
    from the original source.

  . Retry: If the nameserver fails to refresh the information for some
    reason, it should retry every now and then. This is the period of
    time between retries.

  . Expire: If the nameserver can't get new information within this time
    period, the data should be considered "dead" and thrown away.

  . TTL (Time to live): If the nameserver is set to throw away dead data
    before the expire time, it should keep it for at least this amount of
    time. This is used when the nameserver holding the host information
    is very unreliable.

-- Types of record.

The database format of DNS can hold a lot more information than the old
host files could. Each of the records in a domain file (or more accurately
a "zone file") has various tags to identify what sort of information it
is. Specific queries return specific types of information. In the
"www.uknet.com." example, the original information being requested was the
host address; in the same example, however, the servers were also asking
for nameserver addresses. These are examples of two types of query.

The Microsoft implementation of DNS tries to make things easy for people
to use, and lets them get away with what BIND fans would call mistakes.
Most DNS information is written in the BIND format, so it is worth knowing
a bit about the file format. This explanation is very much simplified; if
you need to write proper BIND database files, then read "DNS and BIND".

Firstly, there are two types of zone file: the normal zone file and the
"reverse" zone file. The normal zone file contains a list of records
within a domain, along with a data identifier and a value for that data.
Reverse zone files are the same but may only have certain data fields
within them. Both types of file must have an SOA (Start Of Authority)
record that says who is responsible for the domain and contains the cache
timeout values; they will also usually have one or more NS (nameserver)
records to identify the other nameservers that can be considered to hold
authoritative data for that domain.

The SOA record contains the following information:

  . Primary name server DNS name: This is the name of the nameserver
    which holds the 'authoritative' data for this zone.

  . Responsible person mailbox: This is the mail address of the person
    who is responsible for this domain. The "@" sign is replaced with a
    ".", so if the responsible person is "michael@uknet.com" it would be
    written as "michael.uknet.com."

  . Serial number: This is the serial number of the current data in the
    zone file. Every time this data is changed the serial number should
    be changed as well, so that remote nameservers know there has been a
    change. It has become convention that the serial number is the full
    date followed by a two digit 'serial' number; for example, 1998041701
    would be the first change on the 17th of April, 1998 and 1998041903
    would be the third change on the 19th of April, 1998.

  . Cache timeouts in seconds: Refresh, Retry, Expire and TTL.

An example of an SOA record is:

    uknet.com.  SOA  elfish.noc.uknet.net. hostmaster.uknet.net. (
                     1998040501  ;serial
                     10800       ;refresh
                     7200        ;retry
                     604800      ;expire
                     86400 )     ;minim

Within a zone file, any host name without a "." at the end is considered
to be a host within the domain. This tends to make zone files more
readable but can really screw things up when people forget to put the dot
where it is needed.

The simplest record is probably the "A" record. This is an address record
and simply provides an address for the name. Referring to our previous
example (www.uknet.com.), the entry in uknet.com.'s zone file could say
either:

    www             A  193.61.112.61

or:

    www.uknet.com.  A  193.61.112.61

To confuse things, the BIND database requires an extra tag to say that
this data refers to an Internet record, so this is often written as:

    www  IN  A  193.61.112.61

Basically, if there is an "IN" in the second column, ignore it; this
document will do so from now on.

The following records are allowed in a zone file. It is not a complete
list because the complete list is silly.

  CNAME: Stands for "Canonical Name" and means "alias".
         For example, either of:

             www             CNAME  fred
             www.uknet.com.  CNAME  fred.uknet.com.

         Incidentally, if the last dot had been missed off
         "fred.uknet.com." and the record had been:

             www.uknet.com.  CNAME  fred.uknet.com

         this would actually mean:

             www.uknet.com.  CNAME  fred.uknet.com.uknet.com.

         because when the dot isn't there, the zone's domain name is
         tagged on the end.

  AAAA:

  HINFO:

  MX:

  NS:

  RP:

  TXT:

  WKS:

-- Bells and whistles.

-- Old time mail delivery.

-- Modern mail delivery.

-- POP3.

-- Other things that crop up

-- Classes

-- Further reading.