Here is a collection of my study notes on the Domain Name System (DNS) protocol. I’ve split this article into four pages.
DNS Hierarchy and Database
Remember the format of a URL:
DNS is hierarchical. It’s not flat. If you’ve worked with UNIX and Linux file systems, you’ll notice there is a similarity.
The next thing you’ll notice is the “dot” symbol at the top of the pyramid. This is the root domain, which is served by the root name servers. Immediately below it are the extensions we all know, such as “com” and “org”. These are the Top Level Domains (TLDs):
Below the TLDs are the second-level domains, such as Cisco, Microsoft, etc. Every domain below them is simply called a subdomain:
The domain name is segmented into different domain levels, and each level is served by many servers. For example, “ru.wikipedia.org” has four levels:
- the root domain “.”
- the Top Level Domain (TLD) “org”
- the domain “wikipedia”
- the subdomain “ru”
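To make the levels concrete, here is a minimal Python sketch that splits a domain name into its levels, from the root down. The function name `domain_levels` is my own and is only for illustration.

```python
def domain_levels(name):
    """Return the levels of a domain name, from the root down to the leftmost label."""
    # A trailing dot (the explicit root) is optional in everyday notation.
    labels = [label for label in name.rstrip(".").split(".") if label]
    # Labels are written most-specific first, so reverse them to start at the root.
    return ["."] + labels[::-1]


print(domain_levels("ru.wikipedia.org"))
# ['.', 'org', 'wikipedia', 'ru']
```

The order of the output matches the list above: root, TLD, domain, subdomain.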
DNS is distributed, i.e. the DNS database is spread across many servers. No single Internet name server has a complete copy of the DNS database.
Also, the database is replicated among servers, and many name servers are reachable through anycast addressing (several physical servers share one IP address), which provides redundancy of the DNS information.
Each domain name is maintained by more than one server. For example, the information for the domain name Keyboardbanger.com exists on at least two name servers.
The DNS resolver is a piece of software that resides on the client machine (or computer) and performs DNS lookups.
If the client makes further DNS requests for the same domain name, the resolver retrieves the answer from its DNS cache. We’ll talk about the DNS cache further down.
The information that the resolver receives from a DNS server comes in the form of resource records.
Facts About The Protocol
DNS is an application-layer protocol that provides translation service to other protocols such as HTTP, email protocols and FTP.
In the past, clients used the HOSTS.txt file to maintain name-to-IP resolutions. The file was updated regularly through a file transfer protocol from a server on the Internet.
DNS queries have a default UDP destination port of 53 and a high-numbered (ephemeral) source port. DNS can also run on TCP port 53.
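Over UDP, a DNS message has traditionally been limited to 512 bytes (the EDNS0 extension allows larger UDP messages). If the response is larger than the limit, the server does not split it over multiple datagrams. Instead, it sends a truncated response with the TC (truncated) flag set in the DNS message header, which tells the client to retry the query over TCP.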
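Here is a small sketch of what this looks like on the wire. It builds a minimal DNS query by hand and checks the TC bit in a header. The function names and the fake response header are my own illustration, and nothing is sent over the network.

```python
import struct

TC_FLAG = 0x0200  # "truncated" bit in the 16-bit flags field of the DNS header
RD_FLAG = 0x0100  # "recursion desired" bit


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (QTYPE=1, QCLASS=IN)."""
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, RD_FLAG, 1, 0, 0, 0)
    # Each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)


def is_truncated(message):
    """Return True if the TC bit is set in the message header."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & TC_FLAG)


query = build_query("keyboardbanger.com")
print(len(query) <= 512, is_truncated(query))

# A made-up response header with QR, TC and RD set: the client should retry over TCP.
fake_response = struct.pack("!HHHHHH", 0x1234, 0x8000 | TC_FLAG | RD_FLAG, 1, 0, 0, 0)
print(is_truncated(fake_response))
```

A real client would send `query` to UDP port 53 and, on seeing the TC bit in the reply, repeat the same question over TCP port 53.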
It takes some time to propagate changes in name-to-IP resolutions across the DNS servers, anywhere from a couple of minutes to a couple of hours. But this is accepted in the DNS world.
DNS servers on the Internet are often Linux servers running BIND.
Remember the End-to-End design principle? DNS obeys it: it is implemented at the “edges” of the network.
DNS allows the following benefits:
- Host aliasing: one canonical hostname can be associated with one or more alias names. We usually tend to prefer alias names because they are more mnemonic
- Web and email server aliasing: instead of displaying websites or email addresses in canonical form, we can use aliases to display “fashionable” names. For example, having a website named “DC1-serv.contoso.com” is less desirable to type in a browser than “contoso.com”. And having an email address like “firstname.lastname@example.org” is not as cool as “email@example.com”
- Load distribution: DNS allows traffic load balancing among IP hosts associated with a domain name. Some busy web servers such as cnn.com run on more than one host. The domain name “cnn.com” is mapped to more than one IP address. Each time a DNS query is received, the DNS server replies with all the IP addresses mapped to the domain name, but rotates their order each time. Since clients usually try the first address in the list, the load is spread across the hosts.
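The rotation behind load distribution can be sketched in a few lines of Python. This is only a toy model: the class name is mine, and the addresses come from the documentation range 192.0.2.0/24, not from any real site.

```python
from collections import deque


class RoundRobinRecordSet:
    """Toy model of a name server rotating the addresses mapped to one name."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        """Return every address, then rotate so the next reply starts with another one."""
        current = list(self._addresses)
        self._addresses.rotate(-1)  # move the first address to the end
        return current


records = RoundRobinRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(3):
    print(records.answer())
# ['192.0.2.1', '192.0.2.2', '192.0.2.3']
# ['192.0.2.2', '192.0.2.3', '192.0.2.1']
# ['192.0.2.3', '192.0.2.1', '192.0.2.2']
```

Every reply still contains all the addresses. Only the first position changes, which is what spreads the clients across the hosts.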
In order to avoid querying name servers and to make the DNS process a bit faster, DNS clients (the resolvers) have a DNS cache. The DNS cache stores DNS resource records locally for future use, whenever a request for the same domain name is made again.
The entries in the DNS cache (the resource records) have different aging times. Each record is kept for the duration given by its TTL value in the DNS message received. An implication of this fact is that, even if a resource record is updated somewhere on an Internet name server, the change won’t be seen by the resolver until the TTL of the local resource record entry expires.
For example, let’s say a record for keyboardbanger.com is cached at t=0 with TTL=3600 sec (1 hour). At time t=2 min, you change the DNS resource record (on the Internet). The resolver won’t see this change until the cached entry expires at t=60 min, that is 60 - 2 = 58 min later. This means that, when your PC needs to resolve keyboardbanger.com in the meantime, the DNS resolver will return local information that is not up to date. It will take the resolver 58 minutes to clear the cache entry and learn about the resource record modification.
What about resolvers around the world that did not cache the old entry? Well, whenever they need to resolve the domain name keyboardbanger.com, after we made the manual changes, they will get the modified resource record.
So keep this in mind whenever you want to make DNS changes.
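The TTL example above can be replayed with a toy cache. This is a sketch under my own assumptions: the class `DnsCache` is hypothetical, time is passed in explicitly as seconds so the example is deterministic, and the address is from the documentation range.

```python
class DnsCache:
    """Toy resolver cache: each entry expires once its TTL has elapsed."""

    def __init__(self):
        self._entries = {}  # name -> (address, expiry time in seconds)

    def put(self, name, address, ttl, now):
        self._entries[name] = (address, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # TTL elapsed: drop the stale entry
            return None
        return address


cache = DnsCache()
cache.put("keyboardbanger.com", "192.0.2.10", ttl=3600, now=0)

# The record is changed on the Internet at t = 2 min, but the cache cannot know that.
print(cache.get("keyboardbanger.com", now=2 * 60))   # 192.0.2.10 (stale)
print(cache.get("keyboardbanger.com", now=59 * 60))  # 192.0.2.10 (still stale)
print(cache.get("keyboardbanger.com", now=60 * 60))  # None: the resolver must query again
```

Only after the entry expires does the resolver go back to a name server and learn the new record.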
Types of DNS Name Servers
Local Name Servers
Local name servers are geographically close to the querying host, such as corporate DNS servers or DNS servers for residential Internet users. In the latter case, the local name server sits a couple of routers away from the residential LAN.
A local name server can answer a query directly with the mapped IP address when the queried name belongs to its own domain.
On a computer, the addresses of the local name servers are usually configured automatically through DHCP, or manually.
Local name servers have a file called the root hints file, which stores the IP addresses of the root name servers.
Root Name Servers
Root name servers serve the root domain. It is maintained by 13 highly secured logical servers, named a.root-servers.net through m.root-servers.net. Note that I said “logical” and not “physical”, because there are hundreds of physical servers dedicated to the root domain.
When queried, a root name server normally does not reply with the IP address of the requested name, because root name servers should not be burdened with such a task. Instead, it replies with a referral: the names and IP addresses of name servers that are closer to the answer, typically the name servers of the TLD. Some simplified descriptions show the root name server querying an authoritative or intermediate name server itself, but real root name servers do not perform recursion. It is the querying name server that follows the referral.
Authoritative Name Servers
A name server is called “authoritative” for hostnameX if it has the mapping hostnameX-to-IP.
An authoritative domain is a domain name over which the name server has authority, i.e. it can map the name to an IP address:
- Root name servers are authoritative for the root zone.
- Local name servers are often authoritative name servers for their respective domain names.
How DNS Works
We will understand how DNS works by looking at an example. In this example, all the interactions, except the one with the root name server, are assumed to be recursive, although some may argue that all queries should be iterative.
- “Surf.eurocom.fr” needs to communicate with “gaia.cs.umass.edu”. We don’t know whether it is an HTTP communication, an FTP transfer or an email service. Either way, there must be a DNS name resolution. So the DNS client on “Surf.eurocom.fr”, the resolver, sends a DNS request to its local DNS name server, “dns.eurocom.fr”. The request literally contains the name “gaia.cs.umass.edu”.
- “dns.eurocom.fr” looks up “gaia.cs.umass.edu” in its DNS database. It does not find any entry, so it queries the root name server on behalf of “Surf.eurocom.fr”.
- The root name server does not know how to resolve this domain name; it responds with the IP address of an intermediate name server that can help with the request. The host name of the intermediate name server is “dns.umass.edu”.
- The local name server queries the intermediate name server “dns.umass.edu”.
- On behalf of the local name server, the intermediate name server queries an authoritative name server, “dns.cs.umass.edu”.
- The authoritative name server has the needed mapping and replies with the IP address of the host “gaia”.
- The intermediate name server transfers the reply back to the local name server.
- The local name server transfers the reply back to the requesting host.
- The requesting host installs the DNS response in its DNS cache.
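The steps above can be sketched as a toy simulation in Python. Nothing here touches the network: the classes are my own, the IP address 192.0.2.53 is a made-up documentation address, and the root server only hands out a referral, as in the example.

```python
class NameServer:
    """Toy recursive name server: answers from its records or asks the next server."""

    def __init__(self, name, records=None, next_server=None):
        self.name = name
        self.records = records or {}
        self.next_server = next_server

    def resolve(self, hostname, trace):
        trace.append(self.name)
        if hostname in self.records:
            return self.records[hostname]
        if self.next_server is None:
            raise LookupError(f"{self.name} cannot resolve {hostname}")
        # Recursive step: fetch the answer on behalf of whoever asked us.
        return self.next_server.resolve(hostname, trace)


class RootServer:
    """Toy root server: it never resolves, it only refers to another server."""

    def __init__(self, referrals):
        self.referrals = referrals  # domain suffix -> NameServer

    def refer(self, hostname, trace):
        trace.append("root")
        for suffix, server in self.referrals.items():
            if hostname.endswith(suffix):
                return server
        raise LookupError(f"no referral for {hostname}")


class LocalServer:
    """Toy local name server with an empty database of its own."""

    def __init__(self, name, root):
        self.name = name
        self.root = root

    def resolve(self, hostname, trace):
        trace.append(self.name)
        intermediate = self.root.refer(hostname, trace)  # root replies with a referral
        return intermediate.resolve(hostname, trace)     # the rest of the chain is recursive


authoritative = NameServer("dns.cs.umass.edu", {"gaia.cs.umass.edu": "192.0.2.53"})
intermediate = NameServer("dns.umass.edu", next_server=authoritative)
root = RootServer({".umass.edu": intermediate})
local = LocalServer("dns.eurocom.fr", root)

trace = []
print(local.resolve("gaia.cs.umass.edu", trace))
print(" -> ".join(trace))
# 192.0.2.53
# dns.eurocom.fr -> root -> dns.umass.edu -> dns.cs.umass.edu
```

The trace shows the order in which the servers are contacted. The reply then travels back along the same path to the requesting host.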
Recursive queries vs Iterative queries
A DNS query is said to be recursive when a host A (it could be a name server too) sends a DNS query to a name server B that will fetch the mapping on behalf of A.
A DNS query is said to be iterative (or non-recursive) when a host A (it could be a name server too) sends a DNS request to a name server B, which immediately replies with what it knows: either the mapping itself, or a referral to another name server that A must then query itself.
As we saw above, we assumed that a DNS request follows a chain of queries from one server to another. Let’s call it query chain.
In a query chain, often all DNS requests are recursive, except requests between local name servers and root name servers. Root name servers reply with NS records and A records which contain IP addresses of authoritative name servers (see glue records) and not the final mapped IP address. That’s because root name servers must not be burdened with such “basic” tasks.
Sometimes DNS requests in the query chain are iterative. Each time the DNS resolver queries name servers, whether they are root name servers or others, it receives responses and caches them in its DNS cache.
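For contrast, here is a toy sketch of a fully iterative lookup, where the resolver follows every referral itself and caches the final answer. The `SERVERS` table and the address are invented for illustration, and a real resolver would of course send DNS messages instead of reading a dictionary.

```python
# Each toy server maps a hostname to ("answer", ip) or ("referral", next server name).
SERVERS = {
    "root": {"gaia.cs.umass.edu": ("referral", "dns.umass.edu")},
    "dns.umass.edu": {"gaia.cs.umass.edu": ("referral", "dns.cs.umass.edu")},
    "dns.cs.umass.edu": {"gaia.cs.umass.edu": ("answer", "192.0.2.53")},
}


def resolve_iteratively(hostname, cache, start="root"):
    """Return (address, list of servers asked), following referrals one by one."""
    if hostname in cache:
        return cache[hostname], []  # answered locally, no server contacted
    asked = []
    server = start
    while True:
        asked.append(server)
        kind, value = SERVERS[server][hostname]
        if kind == "answer":
            cache[hostname] = value
            return value, asked
        server = value  # the resolver itself queries the referred server next


cache = {}
print(resolve_iteratively("gaia.cs.umass.edu", cache))
print(resolve_iteratively("gaia.cs.umass.edu", cache))
# ('192.0.2.53', ['root', 'dns.umass.edu', 'dns.cs.umass.edu'])
# ('192.0.2.53', [])
```

In the recursive case the work moves from server to server. Here every server answers at once and the resolver does all the walking.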
But how can the DNS resolver contact the root name server? Does it need a name resolution to get the IP address of the root name server? According to an article on the IANA website, client DNS applications store the names and IP addresses of the root name servers in a file called the root hints file.
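To give an idea of what such a file contains, here is a sketch that parses a short excerpt in the style of the published root hints file. The excerpt only covers a.root-servers.net, and the addresses are written as I know them, so check them against the current file before relying on them. The function name `parse_root_hints` is my own.

```python
ROOT_HINTS_EXCERPT = """\
; lines starting with a semicolon are comments
.                        3600000      NS    A.ROOT-SERVERS.NET.
A.ROOT-SERVERS.NET.      3600000      A     198.41.0.4
A.ROOT-SERVERS.NET.      3600000      AAAA  2001:503:ba3e::2:30
"""


def parse_root_hints(text):
    """Return {server name: [addresses]} from the A and AAAA lines of the file."""
    servers = {}
    for line in text.splitlines():
        line = line.split(";", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        # Assumption: every record line has exactly four fields.
        owner, _ttl, rtype, rdata = line.split()
        if rtype in ("A", "AAAA"):
            servers.setdefault(owner.rstrip(".").lower(), []).append(rdata)
    return servers


print(parse_root_hints(ROOT_HINTS_EXCERPT))
# {'a.root-servers.net': ['198.41.0.4', '2001:503:ba3e::2:30']}
```

With these addresses stored locally, the resolver can reach a root name server without resolving any name first.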