Here is a collection of my study notes on the Domain Name System (DNS) protocol.
DNS Hierarchy and Database
Remember the format of a URL:
DNS is hierarchical. It’s not flat. If you’ve worked with UNIX and Linux file systems, you’ll notice there is a similarity.
The next thing you’ll notice is the “dot” symbol at the top of the pyramid. This is the root domain, which is served by the root name servers. Immediately below it, there are extensions that we all know such as “com” and “org”. These are the Top Level Domains (TLDs):
Below the TLDs are the second-level domains, such as Cisco, Microsoft, etc. Every domain below them is simply called a subdomain:
The domain name is segmented into different domain levels. Each domain level contains many servers. For example, “ru.wikipedia.org” has four levels:
- the root domain “.”
- the Top Level Domain (TLD) “org”
- the domain “wikipedia”
- the subdomain “ru”
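The level breakdown above can be sketched in a few lines of Python. This is only an illustration: the function name `domain_levels` is made up for these notes, and it simply splits the name on dots and lists the levels from the root downwards.

```python
def domain_levels(name: str) -> list[str]:
    """Return the levels of a domain name, from the root down to the leftmost label."""
    # Drop an optional trailing dot, then split the name into its labels.
    labels = [label for label in name.rstrip(".").split(".") if label]
    # The root domain "." sits above every TLD, so it comes first.
    return ["."] + labels[::-1]


print(domain_levels("ru.wikipedia.org"))  # ['.', 'org', 'wikipedia', 'ru']
```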
DNS is distributed, i.e. the DNS database is distributed across many servers. So no one Internet name server has a complete version of the DNS database.
Also, the database is replicated across servers, and Anycast routing lets several physical servers answer on the same IP address, which provides redundancy for the DNS information.
Each domain name is maintained by more than one server. For example, the information for the domain name Keyboardbanger.com exists on at least two name servers.
The DNS resolver is software that resides on the client machine (or computer) and performs DNS lookups.
If the client makes further DNS requests for the same domain name, the resolver retrieves the answer from its DNS cache. We’ll talk about the DNS cache later in these notes.
The information that the resolver receives from a DNS server comes in the form of Resource Records.
Facts About The Protocol
DNS is an application-layer protocol that provides translation service to other protocols such as HTTP, email protocols and FTP.
In the past, clients used the HOSTS.txt file to maintain name-to-IP resolutions. The file was updated regularly through a file transfer protocol from a server on the Internet.
DNS queries have a default UDP destination port of 53 and a highly-numbered source port. DNS can also run on TCP port 53.
A DNS message carried over UDP can have a maximum size of 512B. If the DNS response is bigger, it is not split over multiple datagrams: the server truncates it and sets the TC flag in the DNS message header, and the client can then repeat the query over TCP to get the full response.
It takes some time to propagate changes in name-to-IP resolutions across the DNS servers, and it ranges from a couple of minutes to a couple of hours. But this is accepted in the DNS world.
DNS servers on the Internet are usually Linux servers running the BIND process.
Remember the End-to-End design principle? DNS obeys it: the service is implemented at the “edges” of the network.
DNS allows the following benefits:
- Host aliasing: one canonical hostname can be associated with one or more alias names. We usually tend to prefer alias names because they are more mnemonic
- Web and email server aliasing: instead of displaying websites or email addresses in canonical form, we can use aliases to display “fashionable” names. For example, having a website named “DC1-serv.contoso.com” is less desirable to type in a browser than “contoso.com”. And having an email address like “firstname.lastname@example.org” is not as cool as “email@example.com”
- Load distribution: DNS allows traffic load balancing among IP hosts associated with a domain name. Some busy websites such as cnn.com run on more than one host, so the domain name “cnn.com” is mapped to more than one IP address. Each time a DNS query is received, the DNS server replies with all the IP addresses mapped to the domain name, but rotates their order from one reply to the next.
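Here is a minimal sketch of the load distribution idea, assuming the server uses a simple round-robin policy (one common choice, not the only one). The class name `RoundRobinRecordSet` and the addresses are made up for illustration; the addresses come from the 192.0.2.0/24 documentation range.

```python
from collections import deque


class RoundRobinRecordSet:
    """Holds the addresses of one domain name and rotates them on every reply."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def answer(self):
        reply = list(self._addresses)
        # Move the first address to the end so the next reply starts elsewhere.
        self._addresses.rotate(-1)
        return reply


records = RoundRobinRecordSet(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
first = records.answer()
second = records.answer()
print(first)   # ['192.0.2.1', '192.0.2.2', '192.0.2.3']
print(second)  # ['192.0.2.2', '192.0.2.3', '192.0.2.1']
```

Since clients usually pick the first address in the reply, rotating the list spreads new connections over the hosts.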
In order to avoid querying name servers and to make the DNS process a bit faster, DNS clients (the resolvers) have a DNS cache. The DNS cache stores DNS resource records locally for future use whenever a request for the same domain name is made.
The entries in the DNS cache (the resource records) have different aging times. Each record is kept for the duration given by the TTL value of the DNS message received. An implication of this fact is that, even if a resource record is updated somewhere on an Internet name server, the changes won’t be seen by the resolver until the TTL of the local resource record entry expires.
For example, let’s say a record for keyboardbanger.com is cached at t=0 with TTL=3600 sec (1 hour). At time t=2 min, you make a change in the DNS resource record (on the Internet). The resolver won’t see this change until 60 - 2 = 58 min later. This means that, when your PC needs to resolve keyboardbanger.com in the meantime, the DNS resolver will return locally cached information, which is not up to date. Only after those 58 minutes does the cache entry expire, so that the resolver queries again and learns about the resource record modification.
What about resolvers around the world that did not cache the old entry? Well, whenever they need to resolve the domain name keyboardbanger.com, after we made the manual changes, they will get the modified resource record.
So keep this in mind whenever you want to make DNS changes.
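The TTL behaviour can be sketched with a tiny cache. This is a simplified model, not how any particular resolver is implemented: the class `DnsCache` is hypothetical, the address is a documentation-range placeholder, and time is passed in explicitly (in seconds) so the example stays deterministic.

```python
class DnsCache:
    """Toy resolver cache: each entry expires TTL seconds after it was stored."""

    def __init__(self):
        self._entries = {}  # name -> (value, expiry time in seconds)

    def put(self, name, value, ttl, now):
        self._entries[name] = (value, now + ttl)

    def get(self, name, now):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if now >= expires_at:
            # The TTL has run out: forget the entry so the next lookup asks a server.
            del self._entries[name]
            return None
        return value


cache = DnsCache()
cache.put("keyboardbanger.com", "192.0.2.10", ttl=3600, now=0)

# The record is changed on the Internet at t = 120 s, but the cache does not know.
print(cache.get("keyboardbanger.com", now=120))   # 192.0.2.10 (stale)
print(cache.get("keyboardbanger.com", now=3599))  # 192.0.2.10 (still stale)
print(cache.get("keyboardbanger.com", now=3600))  # None: expired, ask again

stale_minutes = (3600 - 120) // 60
print(stale_minutes)  # 58
```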
Types of DNS Name Servers
Local Name Servers
Local name servers are geographically close to the querying host, such as corporate DNS servers or DNS servers for residential Internet users. In the latter case, the local name server sits a couple of routers away from the residential LAN.
Local name servers can answer a query directly with the mapped IP address when the queried name belongs to their own domain.
On a computer, local name servers are usually configured automatically through DHCP or manually.
Local name servers have a file called root hints file that stores the IP addresses of root name servers.
Root Name Servers
Root name servers form the root domain. It is maintained by 13 highly secured logical servers. See that I said “logical” and not “physical”, because there are hundreds of physical servers dedicated to the root domain.
When solicited, if the root name server has the IP address of the requested name, it responds with it to the querying host. But this is not common because root name servers should not be burdened with such a task. Often, it does not reply with an IP address. Instead, it replies with a referral: the names (and, when it knows them, the IP addresses) of name servers that are closer to the answer, such as an intermediate or an authoritative name server. The querying name server then contacts those servers itself.
Authoritative Name Servers
A name server is called “authoritative” for hostnameX if it has the mapping hostnameX-to-IP.
An authoritative domain is a domain name over which the name server has authority, i.e. it can map the name to an IP address:
- Root name servers are authoritative for the root zone.
- Local name servers are often authoritative name servers for their respective domain names.
How DNS Works
We will understand how DNS works by looking at an example. In this example, all the interactions -except for the root name server- are assumed to be recursive, although some may argue that all queries should be iterative.
- “Surf.eurocom.fr” needs to communicate with “gaia.cs.umass.edu”. We don’t know whether it is an HTTP communication, an FTP transfer or an email service. Either way, there must be a DNS name resolution. So the DNS client of “Surf.eurocom.fr”, the resolver, sends a DNS request to its local DNS name server, “dns.eurocom.fr”. The request literally contains the name “gaia.cs.umass.edu”.
- “dns.eurocom.fr” looks up “gaia.cs.umass.edu” in its DNS database. It does not find any entry. So, on behalf of “Surf.eurocom.fr”, it sends the request to the root name server.
- the root name server does not know how to resolve this domain name; it responds with the IP address of an intermediate name server that can help with the request. The intermediate name server host name is “dns.umass.edu”
- the local name server queries the intermediate name server “dns.umass.edu”
- on behalf of the local name server, the intermediate name server queries an authoritative name server, “dns.cs.umass.edu”
- the authoritative name server has the needed mapping and replies with the IP address of the host “gaia”
- the intermediate name server transfers the reply back to the local name server,
- the local name server transfers the reply back to the requesting host.
- the requesting host installs the DNS response in its DNS cache
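The steps above can be mimicked with a toy model in which each server is a plain Python function. Nothing here touches the network: the function names are invented for these notes, and the address of “gaia” is a documentation-range placeholder, not its real address.

```python
# Toy model of the walkthrough above; the address is a placeholder.
GAIA_ADDRESS = "192.0.2.55"


def authoritative_server(name, trace):
    """dns.cs.umass.edu: holds the mapping and answers directly."""
    trace.append("dns.cs.umass.edu answers")
    return {"gaia.cs.umass.edu": GAIA_ADDRESS}.get(name)


def intermediate_server(name, trace):
    """dns.umass.edu: recursive, it asks the authoritative server for us."""
    trace.append("dns.umass.edu asks dns.cs.umass.edu")
    return authoritative_server(name, trace)


def root_server(name, trace):
    """Root: iterative, it only says which server to ask next."""
    trace.append("root refers to dns.umass.edu")
    return intermediate_server


def local_server(name, trace):
    """dns.eurocom.fr: follows the root referral, then relays the reply."""
    trace.append("dns.eurocom.fr asks the root")
    next_server = root_server(name, trace)
    trace.append("dns.eurocom.fr asks dns.umass.edu")
    return next_server(name, trace)


def resolve(name, cache, trace):
    """Resolver on the requesting host: cache first, local name server second."""
    if name in cache:
        trace.append("resolver answers from cache")
        return cache[name]
    trace.append("resolver asks dns.eurocom.fr")
    cache[name] = local_server(name, trace)
    return cache[name]


cache, trace = {}, []
address = resolve("gaia.cs.umass.edu", cache, trace)
print(address)
print("\n".join(trace))
```

Calling `resolve` a second time with the same `cache` adds a single “resolver answers from cache” line to the trace, which is the caching step at the end of the list.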
Recursive queries vs Iterative queries
A DNS query is said to be recursive when a host A (it could be a name server too) sends a DNS query to a name server B that will fetch the mapping on behalf of A.
A DNS query is said to be iterative (or non-recursive) when a host A (it could be a name server too) sends a DNS request to a name server B, which immediately replies with what it already knows: either the answer or a referral to another name server that A must then query itself.
As we saw above, we assumed that a DNS request follows a chain of queries from one server to another. Let’s call it query chain.
In a query chain, often all DNS requests are recursive, except requests between local name servers and root name servers. Root name servers reply with NS records and A records which contain IP addresses of authoritative name servers (see Glue records) and not the final mapped IP address. That’s because root name servers must not be burdened with such “basic” tasks.
Sometimes DNS requests in the query chain are iterative. Each time the DNS resolver requests name servers -whether they are root name servers or others-, it receives responses and caches them in its DNS cache.
But how can the DNS resolver contact the root name server? Does it need a name resolution to get the IP address of the root name server? No: an article on the IANA website indicates that client DNS applications store the names and IP addresses of root name servers in a file called the root hints file.
DNS Namespace and Zones
A DNS namespace is the set of domain names created under the original domain name. For example, “mail.google.com” and “drive.google.com” form the DNS namespace for “google.com”.
A DNS zone is a subset of the namespace that can be managed (administered) separately from the original domain name. In the figure, the domain “microsoft.com” is a domain name that is under the top level domain “com”. It forms a DNS zone. A subdomain named “example” is created. The domain “example.microsoft.com” can be in the same zone as “microsoft.com” or in its own zone and administered by a different entity.
In a particular zone, records are stored on more than one name server for redundancy. Otherwise, if a name server crashed, a number of domain names would no longer be resolvable.
The way records are replicated between name servers of a zone is called zone transfer, in a client-server communication over TCP. The client is a name server requesting its database to be updated, and the server is either a master server (primary server) or another secondary server.
DNS and UDP
DNS runs over UDP, on server port 53. The maximum size of a DNS message carried in a UDP datagram is 512B (not counting the UDP and IP header overheads). If there is more than 512B to send, the DNS message is truncated and the TC flag is set. (The TC flag is part of the DNS header.)
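As a small illustration of the TC flag, here is a sketch that inspects a raw DNS message and decides whether the client should retry over TCP. The helper names are made up; the only facts relied on are that the flags occupy bytes 2 and 3 of the header and that TC is the bit with value 0x0200 in that 16-bit field.

```python
TC_MASK = 0x0200  # position of the TC bit inside the 16-bit flags field


def is_truncated(message: bytes) -> bool:
    """Return True if the TC flag is set in the header of a raw DNS message."""
    if len(message) < 12:
        raise ValueError("a DNS message has at least a 12-byte header")
    flags = int.from_bytes(message[2:4], "big")
    return bool(flags & TC_MASK)


def next_step(response: bytes) -> str:
    """Decide what a client does after receiving a UDP response."""
    return "retry over TCP" if is_truncated(response) else "use the response"


# Header only: ID=0x1234, flags=0x8380 (QR, TC, RD and RA set), all counts zero.
truncated = bytes.fromhex("1234" "8380" "0000" "0000" "0000" "0000")
complete = bytes.fromhex("1234" "8180" "0000" "0000" "0000" "0000")
print(next_step(truncated))  # retry over TCP
print(next_step(complete))   # use the response
```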
The DNS message is composed of the following sections:
- Header
- Question (if this is a query or a response)
- Answer (if this is a response)
- Authority (optional, if this is a response)
- Additional information
Message format: the Header section
The DNS message header is 12B long (32*3/8). The DNS message header is colored in salmon in the figure.
- ID, aka Identification or Transaction ID. It identifies a query-response pair in a DNS communication
- from QR to RCODE, these are flags. We will learn them below,
- QDCOUNT: the number of questions in the message,
- ANCOUNT: the number of RRs in the answer in the message,
- NSCOUNT: the number of authority RRs in the message,
- ARCOUNT: the number of additional information RRs in the message.
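The six header fields listed above are 16 bits each, which is where the 12 bytes come from. A minimal sketch with Python’s `struct` module, assuming network (big-endian) byte order; the helper names and the sample values are chosen for illustration only:

```python
import struct

HEADER_FORMAT = "!HHHHHH"  # six unsigned 16-bit fields, network byte order


def pack_header(ident, flags, qdcount, ancount, nscount, arcount):
    """Build the 12-byte DNS header."""
    return struct.pack(HEADER_FORMAT, ident, flags, qdcount, ancount, nscount, arcount)


def unpack_header(data):
    """Split the first 12 bytes of a DNS message into named header fields."""
    names = ("ID", "FLAGS", "QDCOUNT", "ANCOUNT", "NSCOUNT", "ARCOUNT")
    return dict(zip(names, struct.unpack(HEADER_FORMAT, data[:12])))


# A query header: arbitrary ID, RD flag set (0x0100), one question.
header = pack_header(0x1A2B, 0x0100, 1, 0, 0, 0)
print(len(header))            # 12
print(unpack_header(header))
```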
Here are some of the flags that appear in the DNS message header (whether the message is a query or a response):
- QR: identifies whether this message is a query or a response.
- Opcode: if this value is 0 then the message is a standard query.
- Authoritative Answer: this bit is available only for DNS replies. It tells whether the server is authoritative for the requested domain name.
- AA = 1; the DNS server is authoritative on the domain name
- AA = 0; the DNS server is not authoritative on the domain name
- TrunCated: tells whether the message is truncated or not. A DNS message is truncated when it cannot fit in a single UDP datagram, i.e. when it exceeds the maximum size of 512 Bytes.
- Recursion Desired: expresses the querying host’s desire to make a recursive query or not.
- If set to 1, then it means “the querying host desires a recursive query”. This flag is copied in the response too.
- Recursion Available: appears only in DNS responses. It tells whether the name server that receives the query can do recursive queries or not
- If set to 1, this flag says “I can do recursive queries”,
- if set to 0, this flag says “Sorry dude I can not do recursive queries”.
- Zero: always set to 0. This field is reserved for future use.
- ReplyCODE: This flag is only valid in a DNS reply. It tells whether the response contains errors or not
- if RCODE = 0, then there are no errors,
- else: there is an error
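The flags above all live in one 16-bit field of the header. Here is a sketch that decodes them, assuming the original bit layout of the DNS specification (QR: 1 bit, Opcode: 4 bits, AA, TC, RD, RA: 1 bit each, Z: 3 bits, RCODE: 4 bits). The function name `decode_flags` is made up for these notes.

```python
def decode_flags(flags: int) -> dict:
    """Split the 16-bit flags field of the DNS header into its parts."""
    return {
        "QR": (flags >> 15) & 0x1,      # 0 = query, 1 = response
        "Opcode": (flags >> 11) & 0xF,  # 0 = standard query
        "AA": (flags >> 10) & 0x1,      # authoritative answer
        "TC": (flags >> 9) & 0x1,       # truncated
        "RD": (flags >> 8) & 0x1,       # recursion desired
        "RA": (flags >> 7) & 0x1,       # recursion available
        "Z": (flags >> 4) & 0x7,        # reserved, expected to be 0
        "RCODE": flags & 0xF,           # 0 = no error
    }


print(decode_flags(0x0100))  # a query that asks for recursion
print(decode_flags(0x8180))  # a response: QR=1, RD=1, RA=1, RCODE=0
```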
The remaining sections
We learned the parts that constitute the Header. This is just one part of a DNS message. In fact, the DNS message contains, in addition to the header, other sections :
- Question section, if it’s a DNS query or a reply. Strangely enough, the DNS reply contains the Question section too,
- Answer section, if it’s a DNS reply,
- Authority section (optional in a DNS reply),
- Additional section (optional in a DNS reply).
The Question section
When the DNS message is a DNS query, then it has a header and a Question section. The Question section is composed of three fields: QNAME, which has a variable length, followed by QTYPE and QCLASS, which are 16 bits each:
- QNAME: QNAME is the domain name encoded in labels (we will learn about labels later),
- QTYPE: determines the type of the query. You can find a complete list of types on Wikipedia,
- QCLASS: determines the class of the query. Usually QCLASS has the value “IN” to mean “INternet”.
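To make the three fields concrete, here is a sketch that builds the bytes of a Question section. The helper names are invented; the numeric codes are the standard ones (type A is 1, class IN is 1). QNAME is written as a series of length-prefixed labels ending with a zero byte.

```python
import struct

QTYPE_A = 1    # standard code for an A record query
QCLASS_IN = 1  # standard code for the Internet class


def encode_qname(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    encoded = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_question(name: str, qtype: int = QTYPE_A, qclass: int = QCLASS_IN) -> bytes:
    """QNAME (variable length) followed by QTYPE and QCLASS (16 bits each)."""
    return encode_qname(name) + struct.pack("!HH", qtype, qclass)


question = build_question("gaia.cs.umass.edu")
print(len(question))   # 23: 19 bytes of QNAME + 2 of QTYPE + 2 of QCLASS
print(question.hex())
```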
The Answer Section
A DNS response message has at least a header, a Question section and an Answer section. I said “at least” because there can be an Authority section and an Additional section.
The Answer section contains the following parts:
- NAME: has the same format as QNAME
- TYPE: has the same format as QTYPE
- CLASS: has the same format as QCLASS
- TTL: describes how long, in seconds, this record can be cached before it must be discarded
- RDLENGTH: describes the length of the RDATA field
- RDATA: contains the resource itself. For example, if the RR is of type A, then RDATA is an IPv4 address. If the RR is of type NS, then RDATA is a name server alias hostname.
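Here is a sketch that reads one answer record from raw bytes. To keep it short it makes a loud assumption: the NAME field is a 2-byte compression pointer (very common in real responses), so the fixed fields start at byte 2. On the wire the order is NAME, TYPE, CLASS, TTL (32 bits), RDLENGTH, RDATA. The function name and the sample record are made up; the address is a documentation-range placeholder.

```python
import ipaddress
import struct


def parse_answer(rr: bytes) -> dict:
    """Parse one answer RR whose NAME is assumed to be a 2-byte pointer."""
    rr_type, rr_class, ttl, rdlength = struct.unpack("!HHIH", rr[2:12])
    rdata = rr[12:12 + rdlength]
    record = {"TYPE": rr_type, "CLASS": rr_class, "TTL": ttl, "RDLENGTH": rdlength}
    if rr_type == 1 and rdlength == 4:
        # Type A: RDATA is a 4-byte IPv4 address.
        record["RDATA"] = str(ipaddress.IPv4Address(rdata))
    else:
        record["RDATA"] = rdata
    return record


# Hand-built sample: pointer to offset 12, type A, class IN, TTL 3600, 192.0.2.10.
sample = bytes([0xC0, 0x0C]) + struct.pack("!HHIH", 1, 1, 3600, 4) + bytes([192, 0, 2, 10])
print(parse_answer(sample))
```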
We saw that QNAME, NAME and RDATA (in case of NS record) contain an alias hostname. But how is the hostname written in the DNS packet?
DNS Resource Record RR
A name server contains one or more Resource Records (RR). When a DNS server is queried, it may answer with one or more resource records, depending on the type of the query.
A resource record is in the following format:
Name, TTL, Class, Type, rdata
- Name: the domain name
- TTL: Time to Live: the amount of time the RR can exist in a DNS server cache. The name is the same as the IP TTL field, but the two should not be confused: the DNS TTL counts seconds, while the IP TTL counts hops.
- Class: you will almost always find this field equal to “IN”, which means INternet
- Type: defines the type of the RR. The most common types are: A, NS, CNAME and MX. The Type field defines what “Name” and “rdata” will be.
- rdata: also known as Value in some other documentation. This is the record data.
Some DNS Resource Record Types
We said that resource records are in the format: Name, TTL, Class, Type, rdata.
- If Type = A, then the resource record is an “A record” and it simply provides the hostname-to-IP mapping. So:
- Name = alias hostname
- rdata = IP address mapped to the alias hostname
There can be multiple A records for the same domain name (many IP addresses are mapped to the same domain name). Similarly, many domain names can point to the same IP address, each through its own A record.
- If Type = NS, then the resource record is an “NS record”. This record does not provide a hostname-to-IP mapping. Instead, it provides the hostname of a name server that has authority on a domain that contains the hostname. So
- Name = alias hostname,
- rdata = hostname of a name server that has authority on the domain that contains Name
- If Type = CNAME, then the record is a “CNAME record”. This record provides the canonical hostname of an alias hostname:
- Name = alias hostname,
- rdata = canonical hostname
- If Type = MX, then the record is an “MX record”. It provides the hostname of a mail server that maps to the alias hostname.
- Name = alias hostname
- rdata = hostname of the email server
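The four record types can be put side by side in a small in-memory table. Everything in this sketch is hypothetical: the records, the placeholder addresses and the helper names are made up for illustration, and the MX record is simplified (a real MX rdata also carries a preference value).

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    name: str
    ttl: int
    rr_class: str
    rr_type: str
    rdata: str


# Hypothetical records, in the format Name, TTL, Class, Type, rdata.
ZONE = [
    ResourceRecord("contoso.com", 3600, "IN", "NS", "ns1.contoso.com"),
    ResourceRecord("contoso.com", 3600, "IN", "MX", "mail.contoso.com"),
    ResourceRecord("www.contoso.com", 3600, "IN", "CNAME", "dc1-serv.contoso.com"),
    ResourceRecord("dc1-serv.contoso.com", 3600, "IN", "A", "192.0.2.80"),
    ResourceRecord("mail.contoso.com", 3600, "IN", "A", "192.0.2.25"),
]


def lookup(name, rr_type):
    """Return every record that matches the name and the type."""
    return [rr for rr in ZONE if rr.name == name.lower() and rr.rr_type == rr_type]


def resolve_address(name, max_hops=8):
    """Follow CNAME records until an A record is found."""
    for _ in range(max_hops):
        a_records = lookup(name, "A")
        if a_records:
            return [rr.rdata for rr in a_records]
        cnames = lookup(name, "CNAME")
        if not cnames:
            return []
        name = cnames[0].rdata  # the canonical hostname becomes the new name
    return []


print(resolve_address("www.contoso.com"))  # ['192.0.2.80']
mail_host = lookup("contoso.com", "MX")[0].rdata
print(mail_host, resolve_address(mail_host))  # mail.contoso.com ['192.0.2.25']
```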
The complete list of DNS record types can be found here.
DNS Glue Records
Sometimes the name server we query is not an authoritative name server for the domain we want to resolve. In that case, the name server replies with a record that contains the domain name of an authoritative name server. However, this by itself may cause a loop because we don’t know how to reach the authoritative name server. For that reason, name servers often give additional information about the authoritative name server (such as an A record) in the Additional Record section.
TLD name servers can often reply with glue records because they have the IP addresses of authoritative name servers configured on them.
DNS Query Examples
Google provides a tool to do DNS queries, as part of their Google Apps series. I’m going to use it and query the default Google DNS server for the resource records I have mentioned above.
Example 1: DNS A record
When I query the default Google DNS server for an A record for the domain keyboardbanger.com, I get the IP address, which is 18.104.22.168:
Example 2: DNS NS record
When queried, the DNS server provides the NS record that shows the hostname of name servers that have authority on the domain Keyboardbanger.com: ns111.ovh.net and dns111.ovh.net
Example 3: DNS CNAME record
As described earlier, the CNAME record provides the canonical hostname of an alias hostname. The canonical hostnames of Keyboardbanger.com are dns111.ovh.net and tech.ovh.net
Example 4: DNS MX record
The hostname of the email server associated with the alias hostname of Keyboardbanger.com is redirect.ovh.net
How a domain name is represented inside the DNS packet
To understand how QNAME is represented (and also NAME and RDATA), we need to understand the way a domain name is represented in a DNS packet. It may seem a bit odd, but once you get the idea, it will become simple.
As I mentioned earlier, the domain name is represented in the form of labels separated by dots (.):
A label can be of two types:
- Data label
- Compression label
We’ll talk about compression labels in the DNS Name Compression paragraph. Let’s now define data labels.
A data label is composed of :
- Label length: a byte that describes the length of the current label (just before the dot). The value ranges from 0 to 63.
- the bytes of the current label. Their maximum size is 63 bytes.
For example, the domain name “www.microsoft.com” has three labels and is encoded in the DNS message in this format:
3 ‘w’ ‘w’ ‘w’ 9 ‘m’ ‘i’ ‘c’ ‘r’ ‘o’ ‘s’ ‘o’ ‘f’ ‘t’ 3 ‘c’ ‘o’ ‘m’ 0
Notice the trailing “0”. This is called the label of length 0 and means the root label. Remember that all DNS domain names end with a root domain; the dot.
The maximum size of a domain name is 255 Bytes (or 255 characters if we suppose that each character is encoded in 1 Byte), including the length bytes and the trailing label of length 0. The dots between labels are not counted.
Pay attention that this is not name compression. It’s just how DNS domain names are encoded in a DNS message.
This domain name encoding is found in the following fields: QNAME, NAME and RDATA.
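The data label encoding can be written in a few lines. This is a sketch with a made-up function name, `encode_name`; it enforces the two limits mentioned above (63 bytes per label, 255 bytes for the whole encoded name) and does not handle the bare root name.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as data labels followed by the root label (0)."""
    encoded = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label {label!r} must be 1 to 63 bytes long")
        encoded.append(len(raw))  # the length byte
        encoded += raw            # the bytes of the label itself
    encoded.append(0)             # the label of length 0, i.e. the root
    if len(encoded) > 255:
        raise ValueError("encoded name is longer than 255 bytes")
    return bytes(encoded)


wire = encode_name("www.microsoft.com")
print(list(wire[:4]))  # [3, 119, 119, 119] -> 3 'w' 'w' 'w'
print(len(wire))       # 19
```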
DNS name compression
A DNS reply message can contain the same domain name multiple times. This repetition is a waste of bits in the message. A compression technique can be used to reduce the number of bits used, by replacing the repeated domain name with a pointer.
Remember that we talked earlier about compression labels? A compression label is a pointer that takes the place of a name, for example in the NAME field of the Answer section. The pointer is written on 16 bits and has the following format:
Remember when I said that the label length of a data label is one byte long, and its value is between 0 and 63? 63 is 00111111 in binary. The compression label, however, has the first two bits set to 1, to differentiate it from data labels.
The compression label can only be used if the pointed domain name (called the compression target) has already been mentioned in the DNS message (you can’t point to something that does not exist yet).
The significance of the compression label is as follows: the first 2 bits are set to 1, the 14 remaining bits describe the offset, i.e. the position of the compression target from the beginning of the DNS message.
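In code, a compression label is just a 16-bit number with the two top bits set. A minimal sketch, with helper names invented for these notes:

```python
import struct


def make_pointer(offset: int) -> bytes:
    """Build a compression label: two bits set to 1, then a 14-bit offset."""
    if not 0 <= offset < 0x4000:
        raise ValueError("the offset must fit in 14 bits")
    return struct.pack("!H", 0xC000 | offset)


def is_pointer(first_byte: int) -> bool:
    """A data label length is at most 63 (00111111); a pointer starts with 11."""
    return (first_byte & 0xC0) == 0xC0


def pointer_offset(label: bytes) -> int:
    """Extract the 14-bit offset from a 2-byte compression label."""
    (value,) = struct.unpack("!H", label[:2])
    return value & 0x3FFF


pointer = make_pointer(20)
print(pointer.hex())            # c014
print(is_pointer(pointer[0]))   # True
print(pointer_offset(pointer))  # 20
```

An offset of 12 gives `c00c`, a value that shows up often in captures because the first name in a message (the QNAME) starts right after the 12-byte header.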
DNS name compression example
Here is an example to illustrate the use of name compression. Let’s assume a DNS UDP datagram needs to use the following domain names: F.ISI.ARPA, FOO.F.ISI.ARPA and ARPA. In this example we assume that bytes start at offset 0.
The first domain name F.ISI.ARPA will be encoded as a data label because it appeared for the first time in the DNS packet. Here is the coding of the domain name F.ISI.ARPA :
Now here is how FOO.F.ISI.ARPA is encoded: FOO appears for the first time so it is encoded as a data label. Then, since F.ISI.ARPA was already mentioned in the datagram, we can simply leverage compression and point to F.ISI.ARPA by indicating the offset (here F.ISI.ARPA appeared the first time at byte number 20).
Also, ARPA appeared before at byte number 26. So we simply point to it by its offset:
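The example can be replayed in code. The sketch below builds a buffer that stands in for the DNS message, with F.ISI.ARPA at offset 20 as in the text; the offsets 40 and 64 chosen for the other two names are assumptions made for this illustration, and the bytes in between are left as zero padding. The decoder `decode_name` is a made-up helper that follows pointers and caps the number of jumps to avoid looping forever on a malformed message.

```python
def decode_name(message: bytes, offset: int) -> str:
    """Decode a possibly compressed domain name starting at the given offset."""
    labels = []
    jumps = 0
    while True:
        length = message[offset]
        if (length & 0xC0) == 0xC0:
            # Compression label: the 14 low bits give the offset of the target.
            offset = ((length & 0x3F) << 8) | message[offset + 1]
            jumps += 1
            if jumps > 16:
                raise ValueError("too many compression pointers")
            continue
        if length == 0:
            break  # the root label ends the name
        offset += 1
        labels.append(message[offset:offset + length].decode("ascii"))
        offset += length
    return ".".join(labels)


message = bytearray(66)  # zero-filled stand-in for the rest of the message
# Offset 20: F.ISI.ARPA as data labels (ARPA itself starts at offset 26).
message[20:32] = bytes([1]) + b"F" + bytes([3]) + b"ISI" + bytes([4]) + b"ARPA" + bytes([0])
# Offset 40 (assumed): FOO as a data label, then a pointer to offset 20.
message[40:46] = bytes([3]) + b"FOO" + bytes([0xC0, 20])
# Offset 64 (assumed): a single pointer to offset 26.
message[64:66] = bytes([0xC0, 26])

print(decode_name(message, 20))  # F.ISI.ARPA
print(decode_name(message, 40))  # FOO.F.ISI.ARPA
print(decode_name(message, 64))  # ARPA
```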
DNS Analysis with Wireshark
Here is the setting:
- Client IP address = 192.168.170.8
- DNS server IP address = 192.168.170.20
The DNS header is the following section in Wireshark:
This is a DNS communication, since the transport protocol is UDP and the destination port is 53:
The identification field has the same value for both the DNS query and the DNS response:
The first packet sent is a DNS query:
DNS Query Message
A DNS query message does not contain an Answer field. It only contains a header and a Question field.
The query contains QNAME, QTYPE and QCLASS
DNS response message
We see that the DNS response contains both a “queries” field and an “answers” field. This is as if the DNS server says “hey your question is blablabla, and the answer to it is blablabla”. Also, in the header, the Questions count is non-zero although this is a DNS response. It’s just a way to tell that this message has answers to a previous question.
Here is an example that shows how ANCOUNT equals the number of answer RRs in the DNS message:
DNS response: additional records
Here is an example of the Additional information field of the DNS message. The querying host asked for an MX record of google.com. The DNS server replied with all the email servers that have the alias hostname of google.com, and as additional information, it gives the name-to-IP address mapping of these email servers in A records (pretty generous):
The Domain Name System, with its servers scattered all over the world, has grown and continues to grow each day. Newer DNS techniques, such as dynamic DNS, appeared later.