Development of the Domain Name System. Paul V. Mockapetris, Kevin J. Dunlap. Presenter: Gigis Petros. ACM SIGCOMM '88.
Introduction: What came before DNS? Before DNS, the HOSTS.TXT system was used to publish the mapping between host names and addresses. HOSTS.TXT was headed for problems.
What is HOSTS.TXT? HOSTS.TXT is a simple text file, centrally maintained on a host at the SRI Network Information Center (SRI-NIC) and distributed to all hosts in the Internet via file transfers.
Problems with HOSTS.TXT: The file size and the cost of distribution were becoming too large. Centralized control of updates went against the trend toward more distributed management of the Internet. The number of hosts, organizations, and file transfers was growing much faster than linearly.
Introduction: Organizations were being forced into managing local network addresses, gateways, etc. There was a need to partition the database and allow local control of name and address spaces. A distributed name system was needed. The existing distributed name systems were not suitable for the DARPA Internet.
And Eureka! Domain Name System (DNS)
Initial Design of DNS: Initial design started in 1983. Base design assumptions for the DNS: 1. Provide at least all the same information as HOSTS.TXT. 2. Allow the database to be maintained in a distributed manner. 3. No obvious size limits for names, types, etc. 4. Interoperate across the DARPA Internet and other non-TCP/IP projects. 5. Provide tolerable performance.
Initial Design of DNS: Independence of network topology. Capable of encapsulating other name spaces. OS independent. Hierarchical design. Distribution and size requirements. Main goal: lean but distributed. Lean: more implementations and quick availability; this led to the removal of dynamic update of the database, with its related atomicity, voting, and backup considerations. General design: more applications, increased functionality, and a larger number of environments in which DNS is deployed.
Architecture of DNS: Two major active components. Name servers: repositories of information that answer queries using whatever information they possess. Resolvers: the interface to client programs, containing the algorithms to contact name servers.
The Name Space 1/2: The DNS internal name space is a variable-depth tree. Each node has an associated case-insensitive label. Labels are variable-length strings of 8-bit octets (maximum 63 octets). Names are restricted to 256 octets. The tree structure allows names to be overloaded across different subtrees. The domain name of a node is the concatenation of all labels on the path from the node to the root of the tree. The zero-length label is reserved for the root. Top-level domains are country codes or broad organizational codes (uk, com, edu, etc.).
The Name Space 2/2
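The naming rules above (case-insensitive labels, a 63-octet label limit, a 256-octet name limit, and a zero-length root label) can be sketched in a few lines of Python. This is only an illustration: the function names `split_name` and `same_name` are hypothetical, and measuring the 256-octet limit on the dotted text form is an assumption, not something the slides specify.

```python
MAX_LABEL_OCTETS = 63   # per-label limit from the slide
MAX_NAME_OCTETS = 256   # whole-name limit from the slide; applying it to the dotted text is an assumption


def split_name(name: str) -> list[str]:
    """Return the labels of a dotted name, leaf first; the root yields []."""
    if name.endswith("."):
        name = name[:-1]          # drop the trailing dot that stands for the root
    if not name:
        return []                 # the root carries only the zero-length label
    if len(name.encode("utf-8")) > MAX_NAME_OCTETS:
        raise ValueError("name longer than 256 octets")
    labels = name.split(".")
    for label in labels:
        size = len(label.encode("utf-8"))
        if size == 0:
            raise ValueError("zero-length label is reserved for the root")
        if size > MAX_LABEL_OCTETS:
            raise ValueError(f"label {label!r} longer than 63 octets")
    return labels


def same_name(a: str, b: str) -> bool:
    """Compare two names label by label, ignoring case."""
    return [l.lower() for l in split_name(a)] == [l.lower() for l in split_name(b)]
```

For example, `same_name("HOST.Example.EDU", "host.example.edu.")` treats the two spellings as one node, since the labels differ only in case and in the optional trailing dot.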
Data attached to names: Data for each name is organized as a set of resource records (RRs). Multiple values of the same type are represented as separate RRs. Limited-size datagrams (UDP): no need to deal with breaking up a maximum-sized packet. Each RR carries a well-known type and class field, followed by application data. Types represent abstract resources or functions (for example, host addresses and mailboxes). The class field specifies the protocol family or instance (Internet, DARPA, CHAOS, ISO, etc.).
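A minimal sketch of how resource records might be modeled, assuming a simple in-memory list. The `ResourceRecord` class, the `lookup` helper, the host names under `example.edu`, and the `192.0.2.x` addresses are all hypothetical and chosen only for illustration. It shows the point made above that several values of one type are stored as separate RRs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceRecord:
    name: str    # owner name, e.g. "host.example.edu"
    rtype: str   # type field, e.g. "A" (host address) or "MX" (mail exchange)
    rclass: str  # class field, e.g. "IN" for Internet
    ttl: int     # time to live, in seconds
    rdata: str   # application data, kept as text in this sketch


def lookup(records, name, rtype, rclass="IN"):
    """Return the data of every RR matching name (case-insensitive), type, and class."""
    wanted = name.lower().rstrip(".")
    return [
        rr.rdata
        for rr in records
        if rr.name.lower().rstrip(".") == wanted
        and rr.rtype == rtype
        and rr.rclass == rclass
    ]


# Two addresses for one host are two separate RRs of the same type.
RECORDS = [
    ResourceRecord("host.example.edu", "A", "IN", 172800, "192.0.2.1"),
    ResourceRecord("host.example.edu", "A", "IN", 172800, "192.0.2.2"),
    ResourceRecord("host.example.edu", "MX", "IN", 172800, "10 mail.example.edu"),
]
```

Calling `lookup(RECORDS, "HOST.example.edu.", "A")` returns both addresses, while the same call with a different class returns nothing.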
Data Distribution: Two major mechanisms for transferring data. Zones: a contiguous portion of the domain name space for which administrative responsibility has been delegated to an organization. Caching: a mechanism whereby data acquired in response to a client's request can be stored locally against future requests. Both mechanisms are invisible to the user, who should see a single database.
Zones: A zone is a description of a contiguous section of the tree, plus pointer information to other relevant contiguous zones. It can be an entire subtree or just a single node. Several copies are distributed to several servers to handle client requests. Creation: begins with a request to the parent zone to delegate a subnode; the zone then grows as a tree beneath that node. The parent records, in RRs, that a zone division exists at that node and points to the new zone's holder. The parent should not be involved from then on.
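One way to picture delegation is as a longest-suffix match: a name belongs to the deepest zone whose origin is a suffix of the name's labels. The sketch below assumes that zone origins are known as a plain list of names; the `enclosing_zone` function and the `example.edu` zone are hypothetical, and a real server follows delegation RRs rather than consulting a global list.

```python
def labels(name: str) -> list[str]:
    """Lower-cased labels of a dotted name; the root ('.') gives an empty list."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def enclosing_zone(name: str, zone_origins: list[str]):
    """Return the origin of the deepest zone that contains name, or None."""
    target = labels(name)
    best = None
    best_depth = -1
    for origin in zone_origins:
        zone = labels(origin)
        depth = len(zone)
        # The root zone (depth 0) contains everything; otherwise compare suffixes.
        contains = depth == 0 or target[-depth:] == zone
        if contains and depth > best_depth:
            best, best_depth = origin, depth
    return best


ZONES = [".", "edu.", "example.edu."]
```

With these zones, `enclosing_zone("host.cs.example.edu", ZONES)` selects `example.edu.`, while a name under `com` falls back to the root zone.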
Zones: The organization maintains its zones and distributes them appropriately. The master file is distributed to the zone's servers manually or by a DNS-based algorithm (zone refresh), which is based on serial numbers. Zone transfer requires TCP. A single name server can handle multiple zones (as master or slave). Data is distributed geographically. Data retrieved from a zone is authoritative.
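The serial-number-based refresh can be sketched as a comparison followed by a copy. This is a simplified illustration, assuming both copies are in-memory objects: the `ZoneCopy` class and `refresh` function are hypothetical, the copy stands in for a TCP zone transfer, and the plain integer comparison ignores serial-number wraparound.

```python
from dataclasses import dataclass, field


@dataclass
class ZoneCopy:
    origin: str
    serial: int                                   # increased each time the master file changes
    records: list = field(default_factory=list)


def refresh(secondary: ZoneCopy, primary: ZoneCopy) -> bool:
    """Copy the zone from primary if its serial is newer; return True on transfer."""
    if primary.serial > secondary.serial:
        secondary.records = list(primary.records)  # stands in for the zone transfer
        secondary.serial = primary.serial
        return True
    return False                                   # already up to date, nothing transferred
```

A secondary would call `refresh` periodically; when the serial has not changed the check is cheap and no zone data moves.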
Caching: DNS resolver programs are responsible for caching. A TTL field (in seconds) is attached to each RR. A low TTL is desirable because it minimizes inconsistency. A high TTL minimizes traffic and allows caching to mask periods of server unavailability. The recommended TTL value for host names is 2 days.
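The TTL trade-off described above can be illustrated with a small cache that forgets entries once their TTL has elapsed. This is a sketch under assumptions: the `TTLCache` class is hypothetical, and the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time


class TTLCache:
    """Keeps each answer until its TTL (in seconds) has elapsed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so expiry can be tested deterministically
        self._entries = {}       # key -> (value, expiry time)

    def put(self, key, value, ttl):
        self._entries[key] = (value, self._clock() + ttl)

    def get(self, key):
        """Return the cached value, or None if it is absent or has expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires = entry
        if self._clock() >= expires:
            del self._entries[key]   # stale data is dropped, forcing a fresh query
            return None
        return value
```

With the recommended 2-day TTL (172800 seconds), an address cached at time 0 is served locally until just before second 172800 and must be fetched again afterwards.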
Current Implementation Status: In 1987, HOSTS.TXT was still used by older hosts, but DNS had become the recommended mechanism. 5,500 host names were in HOSTS.TXT. Over 20,000 host names were available via DNS. The domain space was partitioned into roughly 30 top-level domains. The delegation of subdomains by the SRI-NIC has grown steadily: roughly 300 domains were delegated in 1987 and roughly 650 in 1988.
Implementation Status 1988: Two good examples of contemporary DNS use: the root servers and the Berkeley subdomain.
Root Servers in 1988: The root domain is supported by 7 redundant name servers. Typical traffic rate: on the order of one query per second. Four types of queries: all information (25-40%), address mappings (30-40%), address-to-host mappings (10-15%), MX (less than 10%).
Berkeley in 1988: Because of growth in the campus network, Berkeley developed the BIND (Berkeley Internet Name Domain) server. It was the first organization on the DARPA Internet to bring up machines with all their network applications solely dependent on DNS. Adoption by users was difficult.
Surprises: Operation of the DNS revealed several issues that came as surprises to the developers: refinement of semantics, performance, and negative caching.
Surprises: Refinement of semantics: The DNS is meant to act as a repository for information, and the initial assumption was that the form and content of that information were well understood; in practice they were not (e.g., the order of, and metric for, multiple addresses of a host). Performance: The performance of the underlying network was much worse than the original design expected. The reasons for the longer delays were: 1. Growth in the number of networks. 2. Growth in load. 3. The addition of many lower-speed links. At the root servers, clients typically see response times of 500 ms to 5 s. At delegated domains, 3-10 s and sometimes 30-60 s.
Negative Caching: The DNS provides two negative responses to queries: 1. The name in the query does not exist. 2. The name in the query exists but the requested data does not (e.g., the query may be for a mailbox). Initial monitoring of the root servers showed a very high percentage (20-60%) of these responses. The developers expected negative responses to go down, but they stayed in the 10-50% range. They decided that caching of negative results was needed.
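A minimal sketch of what caching the two negative responses could look like. Everything here is an assumption made for illustration: the `Negative` enum, the `NegativeCache` class, and the `ttl` parameter are hypothetical, since the slides do not say how long a negative answer should be kept. The sketch does reflect one consequence of the two response kinds: a nonexistent name rules out every type, while a "no data" answer only rules out the type that was asked for.

```python
import time
from enum import Enum


class Negative(Enum):
    NAME_ERROR = 1   # the name in the query does not exist
    NO_DATA = 2      # the name exists but the requested data does not


class NegativeCache:
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}   # (name, rtype or None) -> (kind, expiry time)

    def remember(self, name, rtype, kind, ttl):
        # A missing name rules out every type, so it is stored under rtype=None.
        key = (name.lower(), None if kind is Negative.NAME_ERROR else rtype)
        self._entries[key] = (kind, self._clock() + ttl)

    def check(self, name, rtype):
        """Return the cached negative answer, or None if the query must be sent."""
        for key in ((name.lower(), None), (name.lower(), rtype)):
            entry = self._entries.get(key)
            if entry is None:
                continue
            kind, expires = entry
            if self._clock() < expires:
                return kind
            del self._entries[key]   # expired: forget it and ask the server again
        return None
```

A resolver would call `check` before sending a query and `remember` after receiving a negative response, so that repeated lookups of a mistyped name stop reaching the root servers.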
Successes: Variable-depth hierarchy. Additional section processing. Organizational structuring of names. Datagram access. Caching. Mail address cooperation.
Successes: Variable-depth hierarchy: The huge number of workstations led organizations to organize themselves better than a single flat file allowed. Organizations are of different sizes, so the depth of the hierarchy should vary. Additional section processing: The server can add any additional information as long as the data fits in a single datagram. It can answer a request before it is asked. This cuts query traffic in half.
Successes: Organizational structuring of names: Making names independent of network topology, etc., was popular. Datagram access: Using datagrams to access name servers was successful because of the bad performance of the DARPA Internet. Drawback: retransmission strategies need to be developed and refined.
Successes: Caching: The caching discipline of the DNS works well and was essential to the success of the system, but caching can make the system less reliable or useful. Mail address cooperation: Different Internet communities agreed to use organizationally structured domain names for mail addressing and routing.
Shortcomings: Type and class growth. Easy upgrading of applications. Distribution of control vs. distribution of expertise or responsibility.
Shortcomings: Type and class growth: It is difficult to make new definitions; their semantics must be clearly designed and published, and applications must be created to use them. Easy upgrading of applications: It is not easy to convert network applications to use the DNS; the DNS resolver must be part of the OS. Distribution of control vs. distribution of expertise or responsibility: Organizations should have been required to have redundant servers with real data before they were given a domain. Documentation should always be well written and simple.
Was the DNS a good idea? Modifications to the HOSTS.TXT scheme could have postponed the need for a new system, but the need to distribute functionality was crucial. Given the new functionality and the opportunities for future services, it was a good idea!
Conclusion: Things they wished they had known earlier: Caching works well, but caching for negative responses is also needed. It is more difficult to remove a function than to add a new one. Optimizations are not considered if the system performs at the expected level. Allowing variations in the provided service causes problems.