
Notes - DNS

Functionality

The DNS (Domain Name System) is an infrastructure that implements a name table. This table is used for identifying a piece of textual information, called a value, by means of a domain name. A domain name is a sequence of strings called labels and separated by the dot character (‘.’). In a nutshell, you give the DNS a domain name and you obtain from the DNS the corresponding value. In the following we will use the term “name” as a shorthand for “domain name”.

Each row of the name table is called a resource record (RR) and maps (or translates) a name to a value.

The infrastructure consists of thousands of servers distributed across the world that interact among themselves. These servers are called name servers. A process that needs to obtain the value associated with a given name sends a request containing that name to its default name server and receives back a response containing the RR with the specified name. Note that requests specify a name: they do not specify a value.

The IP address of the default name server is specified in the configuration of a host. Usually all hosts in the same organization have the same default name servers. The port number of a name server is 53.

The term DNS is used with several different meanings. It may denote either the infrastructure, or a server of the infrastructure, or the protocol used by name servers. One has to understand from the context what the intended meaning is.

The name table contains, in addition to the name and value columns, a type column. Each RR is thus composed of three fields: name, type, value. The type field may assume a small set of values defined by the DNS protocol:

  • A is for a RR that maps the name to an IP address (i.e., the value field is an IP address).
  • CNAME is for a RR that declares the name to be an alias of another name (i.e., the value of this RR will be the name of another RR).
  • MX is for a RR that maps the name of an email domain to the name of the mail server responsible for that email domain (i.e., the former is the name field while the latter is the value field).

A request contains a pair <name, type>, i.e., it does not contain only a name. A response contains all the RRs matching the request (or an error code in case no such RR exists). RRs of type CNAME match any type specified in the request.

There may be multiple RRs with the same name:

  • Such RRs may have different types (e.g., one RR of type A, another of type MX).
  • Or, they may have the same type (e.g., one RR of type A that maps a name to a certain IP address and another RR of type A that maps that name to a different IP address).
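
The matching rule for requests can be sketched in a few lines of Python. This is only an illustrative model, not how a real name server stores its data: the table contents (names under example.org and the addresses) and the function name `lookup` are assumptions made up for this example.

```python
from typing import List, NamedTuple


class RR(NamedTuple):
    name: str
    type: str
    value: str


# Hypothetical name table: every name and address here is invented.
NAME_TABLE = [
    RR("www.example.org.", "A", "192.0.2.10"),
    RR("www.example.org.", "A", "192.0.2.11"),      # same name, same type
    RR("example.org.", "MX", "mail.example.org."),
    RR("mail.example.org.", "A", "192.0.2.25"),
    RR("ftp.example.org.", "CNAME", "www.example.org."),
]


def lookup(name: str, rtype: str) -> List[RR]:
    """Return all RRs matching the pair <name, type>.

    RRs of type CNAME match any requested type. An empty list stands in
    for the error code returned when no matching RR exists.
    """
    return [
        rr for rr in NAME_TABLE
        if rr.name == name and (rr.type == rtype or rr.type == "CNAME")
    ]
```

For instance, `lookup("www.example.org.", "A")` returns both A records, while `lookup("ftp.example.org.", "A")` returns the single CNAME record even though type A was requested.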

The set of all RRs with the same name is called a domain.

The set of all domains that exist in the Internet may be conceptually organized as a tree called domain tree. Each node of the domain tree is associated with a domain and contains a label. The domain associated with a given node is the one whose name is obtained by concatenating all the labels along the path from that node to the root of the tree, left to right and separated by the dot character (‘.’).

The label of the root of the domain tree is the dot character ‘.’. It follows that all names terminate with the dot character. The user interface of programs usually hides the terminating dot character.
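
The relation between a name and the path from its node to the root can be illustrated with a small Python sketch. The function names `labels` and `path_to_root` are hypothetical helpers introduced only for this example.

```python
from typing import List


def labels(name: str) -> List[str]:
    """Split a name into its labels, leftmost first (the root label is dropped)."""
    return [label for label in name.split(".") if label]


def path_to_root(name: str) -> List[str]:
    """List the names of all domains on the path from the node of `name` to the root.

    The terminating dot is added if missing, since user interfaces usually hide it.
    """
    parts = labels(name)
    path = [".".join(parts[i:]) + "." for i in range(len(parts))]
    path.append(".")  # the root of the domain tree
    return path
```

For example, `path_to_root("inginf.units.it")` yields the names inginf.units.it., units.it., it. and finally the root.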

The domain tree is an abstract and idealized representation of all the existing domains. This representation does not include types and values of the RRs contained in each domain, nor does it include the number of RRs contained in each domain.

Implementation

Zones

The name table does not exist in its entirety anywhere. Each name server contains a set of RRs of the name table, i.e., the name table is distributed. The distribution of the name table is based on the domain tree, as follows.

The domain tree is partitioned into zones. All domains of the same zone are stored on the same name server, which is called the name server of the zone.

A zone is a partition of the domain tree rooted at a domain and that extends below that domain, either until leaf nodes of the domain tree or until the root of another zone. The root of the domain tree is a zone. All the children of the root (top level domains, TLD) are also zones.

Intuitively, one may think of a zone as a subtree of the domain tree. This definition is not fully precise because a subtree, in the mathematical sense, consists of all the descendants of its root, while in this case one or more of those descendants could be the root of another zone.
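
To make the partitioning concrete, the following Python sketch finds the zone that contains a given domain: it is the closest zone root met while walking from the node of the domain towards the root of the tree. The set of zone roots is an assumption chosen for illustration, and `zone_of` is a hypothetical helper.

```python
# Assumed set of zone roots, for illustration only.
ZONE_ROOTS = {".", "it.", "units.it.", "inginf.units.it."}


def zone_of(name: str, zone_roots=ZONE_ROOTS) -> str:
    """Return the root of the zone containing the domain `name`."""
    parts = [label for label in name.split(".") if label]
    # Try the domain itself, then its parent, and so on up to the root.
    for i in range(len(parts) + 1):
        candidate = ".".join(parts[i:]) + "."
        if candidate in zone_roots:
            return candidate
    raise ValueError("the root zone is missing from zone_roots")
```

With these assumed zone roots, www.units.it. belongs to zone units.it., while www.inginf.units.it. belongs to zone inginf.units.it. even though it is also a descendant of units.it.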

The domain that is the root of a zone must contain a RR of type NS. A RR of this type maps the name of a zone to the name of the name server of that zone. The name of the name server of a zone may or may not be in that zone.

In order to contact the name server of a zone Z1, it is necessary to have two RRs: the RR of type NS mapping the name of Z1 to the name of a name server; and, the RR of type A mapping the name of that name server to the corresponding IP address.

Notes:

  • The domain tree representation does not include its partitioning into zones (the fact that a domain is the root of a zone is described by the existence of a RR of type NS within that domain, but RRs are not part of a domain tree representation).
  • The domain tree representation does not include name servers.
  • A name server may be the name server of one or more different zones.

As already observed, all domains of a zone are stored on the name server of that zone. Certain RRs must be stored also on other name servers, as follows:

  1. The pair of RRs of types NS/A for the root zone must be stored on every name server.
  2. The pair of RRs of types NS/A for zone Z1 must be stored on the name server of the zone above Z1 (actually, in case the name of the name server is not in Z1, then this constraint applies only to the NS RR and it does not apply to the A RR; this special case is beyond the scope of the course: we may forget about this special case and assume that NS/A pairs for Z1 are always stored in the zone above Z1; more details in the appendix of this page).

RRs specified in 2 are called glue information. These requirements must be satisfied to guarantee that, for any possible name, it is always possible to contact the server that is certainly able to respond to a request for that name.

Iterative name resolution

A common configuration for organizations that have a name server N is as follows:

  1. N acts as the default name server for all the hosts internal to the organization.
  2. Any DNS outbound traffic not originated by N is blocked at the organization boundary.
  3. Any DNS inbound traffic not directed to N is blocked at the organization boundary.
  4. All hosts in the organization (i.e., all processes that have to translate a domain name) send their requests to N and ask it to perform a recursive resolution. This means that, in case N does not know the required RR, N locates the server where the required RR is stored, obtains the RR and responds with the required RR.
  5. N executes recursive resolutions only for DNS requests originated by hosts internal to its organization.

Configurations of this kind allow organizations to centralize the monitoring of all name resolutions originated within the organization, as well as the blocking of selected name resolutions. Rules 2 and 3 are implemented at the IP networking layer.

The procedure executed at step 4 above by N for obtaining a RR that is not locally available is called iterative name resolution. This procedure consists of navigating along name servers starting from the name server of the root zone, based on the domain name itself (one label at a time, right to left) and using the glue information appropriately (we omit the details). The name resolution procedure is thus guaranteed to complete after contacting a very small number of name servers.
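
The navigation can be simulated with a deliberately simplified Python sketch. Everything in it is an assumption made for illustration: the servers are in-memory dictionaries rather than hosts reachable on port 53, all IP addresses and the names ns.it. and ns.units.it. are invented, each zone has a single name server, and the NS/A glue pair is assumed to be always present in the zone above.

```python
# Hypothetical data: server IP -> RRs (name, type, value) stored on that server.
SERVERS = {
    "198.51.100.1": [  # name server of the root zone
        ("it.", "NS", "ns.it."),
        ("ns.it.", "A", "198.51.100.2"),
    ],
    "198.51.100.2": [  # name server of zone it.
        ("units.it.", "NS", "ns.units.it."),
        ("ns.units.it.", "A", "198.51.100.3"),
    ],
    "198.51.100.3": [  # name server of zone units.it.
        ("www.units.it.", "A", "192.0.2.80"),
    ],
}
ROOT_IP = "198.51.100.1"


def query(server_ip, name, rtype):
    """Simulate one request to one server.

    Returns (answer, next_ip): either the matching RRs, or the IP address of
    the name server of the zone below, taken from the glue information.
    """
    rrs = SERVERS[server_ip]
    answer = [rr for rr in rrs if rr[0] == name and rr[1] == rtype]
    if answer:
        return answer, None
    # NS RRs whose zone name is a suffix of the requested name.
    cuts = [rr for rr in rrs
            if rr[1] == "NS" and (name == rr[0] or name.endswith("." + rr[0]))]
    if not cuts:
        return [], None
    ns = max(cuts, key=lambda rr: len(rr[0]))  # the most specific zone
    glue = [rr for rr in rrs if rr[0] == ns[2] and rr[1] == "A"]
    return [], (glue[0][2] if glue else None)


def resolve(name, rtype, max_steps=10):
    """Navigate from the root zone downwards; return (answer, servers contacted)."""
    server_ip, contacted = ROOT_IP, []
    for _ in range(max_steps):
        contacted.append(server_ip)
        answer, next_ip = query(server_ip, name, rtype)
        if answer or next_ip is None:
            return answer, contacted
        server_ip = next_ip
    raise RuntimeError("too many steps")
```

In this toy setup, `resolve("www.units.it.", "A")` contacts three servers in sequence (root zone, zone it., zone units.it.) before obtaining the RR, i.e., one server per zone crossed along the name.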

Note that iterative name resolution is performed only by name servers. It is never performed by hosts that are not name servers: such hosts always send requests to their respective default name server, and that name server obtains the required RRs as needed. In case the default name server does not have the required RRs locally available, it navigates the domain tree starting from the root zone (i.e., the default name server executes an iterative resolution).

Replication

The name server of a zone must be replicated. That is, each zone must have multiple name servers, each storing a copy of all RRs of that zone. This requirement ensures that failure or unreachability of a single name server does not make a full zone, and all zones below that zone, inaccessible.

One of the name servers of a zone is called primary; the other name servers are called secondary. When a RR has to be modified, it is modified only at the primary by the zone administrator; secondary name servers contact the primary name server periodically to be notified of all changes to RRs of the zone (including creation and deletion).

At a given instant, thus, different replicas for a zone could have different RRs for that zone. This behavior is intrinsic to how the DNS works. When a response contains a RR, nothing in that RR makes it possible to tell whether it is the most up-to-date copy.

All name servers of a zone must be described by NS/A pairs, that is, there must be one such pair for each name server replica; the NS RR will have the same name in all pairs (because they all describe the same zone) but a different value (because each RR identifies a different name server for that zone). The requirements for the glue information apply to all these RRs.

Replicas of a zone are equivalent among themselves in the sense that RRs of type NS describing the name servers of that zone do not tell which name server is the primary.

Caching

Each name server maintains a local copy of all RRs that it has seen in the past. This local copy is called cache. The cache is used for constructing responses. When a name server receives a request, it first examines the cache; the name server sends a request to another name server only in case the cache does not allow answering the request.

A RR may not remain in a cache indefinitely but it must be deleted after a time interval whose length is contained in the RR itself, in a column called TimeToLive (TTL).

A response obtained from a cache is called non authoritative and is flagged as such. In this case, the TTL of the returned RR is decremented by the time interval already spent in the cache. A response obtained from one of the name servers of the corresponding zone is called authoritative.
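
The TTL handling described above can be sketched as follows. This is a minimal model under stated assumptions, not the data structure of any real name server: the class name `Cache` and its methods are hypothetical, the cache keeps a single value per <name, type> pair, and the clock is injectable so that the behavior can be checked without waiting.

```python
import time


class Cache:
    """Toy RR cache: entries expire when the TTL carried by the RR has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (name, type) -> (value, expiry instant)

    def put(self, name, rtype, value, ttl):
        """Store a RR received in a response, together with its TTL in seconds."""
        self._entries[(name, rtype)] = (value, self._clock() + ttl)

    def get(self, name, rtype):
        """Return (value, remaining TTL), or None if absent or expired.

        The remaining TTL is the original TTL decremented by the time
        already spent in the cache.
        """
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None
        value, expires_at = entry
        remaining = expires_at - self._clock()
        if remaining <= 0:
            del self._entries[(name, rtype)]  # the RR must not be used any more
            return None
        return value, remaining
```

A RR stored with a TTL of 300 seconds and read 100 seconds later is returned with a TTL of 200 seconds; once 300 seconds have elapsed, the lookup fails and the name server has to send a request to another name server.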

Notes:

  • The name server of a zone does not know how many copies of the RRs of that zone exist in the DNS caches distributed across the Internet.
  • When a RR is updated, the update is not immediately reflected to the copies of that RR possibly stored in DNS caches.
  • If the TTL value for a RR at the primary is T seconds, then it is guaranteed that all cached copies of that RR will disappear everywhere after at most T seconds.

Creating and managing a domain

Registrants and registrars

The entity (person, company or organization) that owns a domain name is called registrant. The registrant has the right to choose types and values of RRs in the domain, as well as the right to create subdomains of that domain.

The entity that actually creates domain names upon requests by registrants is called registrar. A registrar can only create domains in a set of predefined zones with which it has established technical and business relationships. A registrar may allow other organizations, called resellers, to sell domain names on its behalf. For the purpose of this discussion, resellers are equivalent to registrars.

The technical and business interactions between registrant and registrar are not part of the DNS protocol. The technical means for ensuring that a domain name can only be managed by its registrant are not part of the DNS protocol. Usually a registrant interacts with a registrar and operates on the RRs it owns by means of a web application.

The RRs of the newly created domain will be stored in name servers managed by the registrar itself. Of course, there will not be a new set of name servers for each created domain: the same set of name servers will store the RRs for all the created domains.

The RRs of the newly created domain usually include one of type NS. Creating a domain thus usually involves creating a zone with the name of the domain. The name servers of that zone will be managed by the registrar, as observed above.

As observed above, a registrar can only create domains in a set of predefined zones with which it has established technical and business relationships. Usually, those zones are top-level domains (TLD). A registrant may thus create only domains that are second level (i.e., children of a TLD).

A TLD often introduces some TLD-specific constraints on the domains that can be created. For example, there could be certain domain names that cannot be bought by any registrar (e.g., internet.it) or domain names that can be bought only by certain registrars (e.g., parlamento.it). These constraints are of normative nature and have no underlying technical reason.

In general, there can be no zones “below” a domain created by a registrant. Creation of new zones in a subtree rooted at a domain created by a registrant requires specific business arrangements with registrars. The University of Trieste (registrant of domain units.it) has one of those arrangements with the .it registrar (i.e., it is possible to create zones “below” units.it, such as inginf.units.it).

Real identity of registrants and WHOIS

A registrar must maintain the association between each domain name it has created and the corresponding registrants. This information is stored outside of the DNS in a distributed infrastructure called WHOIS. This infrastructure can be queried programmatically in a variety of ways. Many freely accessible web applications, for example, take a domain name as input and return the corresponding WHOIS information that describes the registrant, the date of domain name creation and so on.

A registrant is identified by means of a string. In practice, there is a very weak connection between that string and the entity which actually created the domain name. In other words, it is very easy for an entity to obfuscate or falsify its real identity. The underlying reason is that creation of a domain name occurs through a web application, thus an entity may insert any string for describing itself as a registrant. The only connection with the real identity will be through the payment transaction, but this can be obfuscated or falsified easily, for example by using an anonymous payment method or a stolen credit card number.

The internals of the WHOIS infrastructure (e.g., links among WHOIS servers, procedure for finding the IP address of the server which knows the registrant for a given domain name, WHOIS protocol) are not part of this course.

Appendix: Glue information

This Appendix elaborates on the following sentence from the "Zones" section:

The pair of RRs of types NS/A for zone Z1 must be stored on the name server of the zone above Z1 (actually, in case the name of the name server is not in Z1, then this constraint applies only to the NS RR and it does not apply to the A RR; this special case is beyond the scope of the course: we may forget about this special case and assume that NS/A pairs for Z1 are always stored in the zone above Z1).

I reiterate that the content of this appendix is beyond the scope of the course.

Let us consider the following example. The name server of zone acm.org is named ns.yale.edu (i.e., the name server of a zone Z1 is not in Z1). According to our lectures, the name server of zone org must contain both of the following RRs (glue information):

  • acm.org NS ns.yale.edu
  • ns.yale.edu A IP-ns-y

This requirement guarantees that every iterative name resolution for a name whose suffix is acm.org, once it "passes through" the name server of zone org, may continue with the next step, i.e., with the name server of zone acm.org.

Now suppose that only the first RR is stored in the name server of zone org. Would the above name resolution fail?

Let us consider a domain tree navigation for resolving a name whose suffix is acm.org, e.g., pippo.acm.org A ?. When the name server of zone org receives the request, it will now respond only with the first RR (as that name server does not have the second RR locally available in its glue information). The receiving name server, i.e., the name server that is executing the domain tree navigation, would thus reason this way:

  1. The name server of zone org does not have the requested RR;
  2. however, it does know that acm.org is a zone and that the name server of this zone is ns.yale.edu;
  3. I thus have to continue the domain tree navigation by sending pippo.acm.org A ? to ns.yale.edu;
  4. I do not have the IP address of ns.yale.edu thus I have to execute another domain tree navigation for resolving ns.yale.edu A ?; having found this RR I will continue the navigation for resolving pippo.acm.org A ?

That is, the name resolution would not fail: the second domain tree navigation will pass through the name servers of the root zone, of zone edu, of zone yale.edu and will succeed. It follows that, in this case, the presence of ns.yale.edu A IP-ns-y is not needed in the glue information of zone org.

Now suppose that the name server of zone acm.org is named instead ns.acm.org (i.e., the name server of a zone Z1 is in Z1). According to our lectures, the name server of zone org must contain both of the following RRs (glue information):

  • acm.org NS ns.acm.org
  • ns.acm.org A IP-ns-o

Now suppose that only the first RR is stored in the name server of zone org. In this case, as explained in the following, the above name resolution will fail:

  1. The name server of zone org does not have the requested RR;
  2. however, it does know that acm.org is a zone and that the name server of this zone is ns.acm.org;
  3. I thus have to continue the domain tree navigation by sending pippo.acm.org A ? to ns.acm.org;
  4. I do not have the IP address of ns.acm.org thus I have to execute another domain tree navigation for resolving ns.acm.org A ?; having found this RR I will continue the navigation for resolving pippo.acm.org A ?

It is simple to realize that the domain tree navigation for ns.acm.org A ? will fail (or execute an endless loop):

  1. The name server of zone org does not have the requested RR;
  2. however, it does know that acm.org is a zone and that the name server of this zone is ns.acm.org;
  3. I thus have to continue the domain tree navigation by sending ns.acm.org A ? to ns.acm.org (endless loop!);

In summary:

  1. If the name server of a zone Z1 has a name in Z1 then the glue information for Z1 must contain two RRs (NS/A).
  2. If the name server of a zone Z1 has a name that is not in Z1 then the glue information for Z1 may contain only one RR (NS).
  3. To make things easier to understand and to remember, we assume that the glue information for Z1 always contains two RRs (NS/A).
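
Rules 1 and 2 of the summary can be expressed as a short Python check. The function names `is_under` and `glue_types_needed` are hypothetical; "in Z1" is approximated here as "the name is Z1 itself or ends with the labels of Z1", which ignores the possibility of further zones below Z1.

```python
from typing import List


def is_under(name: str, zone: str) -> bool:
    """True if `name` is the name of `zone` or ends with the labels of `zone`."""
    n = [label for label in name.lower().split(".") if label]
    z = [label for label in zone.lower().split(".") if label]
    if not z:  # the root zone contains every name
        return True
    return len(n) >= len(z) and n[-len(z):] == z


def glue_types_needed(zone: str, ns_name: str) -> List[str]:
    """RR types that the zone above `zone` must store as glue information."""
    # Without the A RR, resolving ns_name would pass through the zone itself,
    # whose name server can only be reached by knowing that very address.
    return ["NS", "A"] if is_under(ns_name, zone) else ["NS"]
```

With the names used in this appendix, `glue_types_needed("acm.org", "ns.acm.org")` gives both types, while `glue_types_needed("acm.org", "ns.yale.edu")` gives only NS.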