DNS didn’t succeed by accident. Its longevity comes from three core design principles that directly addressed the failures of HOSTS.TXT: it is distributed, hierarchical, and cacheable. These principles work together as an interlocking system; remove any one, and DNS wouldn’t scale.
Principle 1: Distributed
The first and most fundamental shift was from a centralized file to a distributed database.
What “Distributed” Means
In the HOSTS.TXT world:
- One organization (SRI-NIC) maintained all data
- Every computer downloaded the complete dataset
- Changes flowed through a single point
In DNS:
- Thousands of organizations maintain their own portions
- Computers query only for what they need
- Changes happen locally without central coordination
Why Distribution Works
Consider what happens when MIT wants to add a new server:
HOSTS.TXT approach:
- MIT emails SRI-NIC
- SRI-NIC staff process the request (1-3 days)
- SRI-NIC updates HOSTS.TXT
- Every computer on the internet downloads the new file
- Propagation takes days to weeks
DNS approach:
- MIT updates their own nameserver
- Done
MIT’s administrator makes the change in minutes. Once cached copies of the old record expire (at most one TTL later), everyone sees the new address. No central authority is involved.
The Autonomy Benefit
Distribution means autonomy. Each organization controls its own namespace without asking permission (after the initial delegation). This enabled the internet to scale in ways a centralized system never could.
Imagine if every website change required approval from a central authority. The modern internet — with millions of sites updating constantly — would be impossible.
Failure Isolation
When part of a distributed system fails, only that part is affected. If MIT’s nameservers go down, mit.edu becomes unreachable, but berkeley.edu and stanford.edu work fine.
With HOSTS.TXT, if SRI-NIC went down, nobody could get updates. The single point of failure was also a single point of control — and a bottleneck.
Principle 2: Hierarchical
DNS names have structure: www.example.com isn’t just a string, it’s a path through a tree.
The Tree Structure
                        . (root)
                            |
        +---------+---------+---------+---------+
        |         |         |         |         |
       com       edu       gov       net       org
        |         |
    +---+---+  +--+------+
    |       |  |         |
 example   foo mit    berkeley
    |          |
   www         ai
Reading from bottom to top:
- www.example.com = www, under example, under com, under the root
- ai.mit.edu = ai, under mit, under edu, under the root
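That bottom-to-top reading can be sketched in a few lines of Python (a toy helper, not part of any DNS library): splitting a name on dots and reversing the labels gives the route from the root down the tree.

```python
def tree_path(name: str) -> list[str]:
    """Return the labels of `name` ordered from the root downward."""
    labels = name.rstrip(".").split(".")  # drop any trailing root dot
    return list(reversed(labels))

print(tree_path("www.example.com"))  # ['com', 'example', 'www']
print(tree_path("ai.mit.edu"))       # ['edu', 'mit', 'ai']
```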
Why Hierarchy Matters
Hierarchy solves the naming collision problem. In a flat namespace:
- Who gets MAIL?
- Who gets SERVER1?
- How do you avoid conflicts?
With hierarchy:
- MIT gets mail.mit.edu
- Stanford gets mail.stanford.edu
- No collision possible
Each organization has unlimited naming freedom within their subdomain. MIT can create anything.mit.edu without consulting anyone else.
Delegation = Scalability
The key innovation is delegation. Root servers don’t need to know about www.example.com. They only need to know “.com is handled by these nameservers.”
The .com servers don’t need to know about www.example.com either. They only need to know “example.com is handled by these nameservers.”
This delegation creates a chain:
Query: www.example.com?
1. Ask root server → "Try .com at these addresses"
2. Ask .com server → "Try example.com at these addresses"
3. Ask example.com server → "www.example.com is 93.184.216.34"
Each level handles one step of delegation. No single server needs to know everything.
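The three-step chain above can be mimicked with a toy lookup table in Python. The server names and mappings below are invented for illustration; a real resolver follows referral records in actual DNS responses, not dictionaries.

```python
# Toy model of delegation: each "server" knows either a referral
# to the next level down or the final answer.
ROOT = {"com": "com-server"}
SERVERS = {
    "com-server": {"example.com": "example-server"},
    "example-server": {"www.example.com": "93.184.216.34"},
}

def resolve(name: str) -> str:
    """Follow referrals from the root down to an address."""
    tld = name.rsplit(".", 1)[-1]           # step 1: ask root for the TLD
    server = ROOT[tld]                      # "Try .com at these addresses"
    domain = ".".join(name.split(".")[-2:]) # step 2: ask the TLD server
    server = SERVERS[server][domain]        # "Try example.com at these addresses"
    return SERVERS[server][name]            # step 3: the authoritative answer

print(resolve("www.example.com"))  # 93.184.216.34
```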
Trust Flows Downward
Hierarchy also establishes trust. You trust root servers (because they’re hardcoded in your software). Root servers vouch for TLD servers. TLD servers vouch for domain servers.
This chain of trust would later become formalized with DNSSEC, but the hierarchical structure made it possible.
Principle 3: Cacheable
Every DNS response includes a TTL — Time To Live. This number says “you can remember this answer for X seconds.”
Why Caching Is Crucial
Let’s do some math. Imagine a popular website like google.com:
- Billions of queries per day globally
- Without caching: Every query hits Google’s nameservers
- With caching: Resolvers remember the answer
If the TTL is 300 seconds (5 minutes), then during any 5-minute window:
- First query: Goes all the way to Google’s nameserver
- Every later query in that window: Served from the resolver’s cache (if that resolver sees 10,000 queries in those 5 minutes, only one leaves it)
Caching reduces query traffic by orders of magnitude.
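The arithmetic behind that claim, assuming one resolver receives 10,000 queries within a single 5-minute TTL window:

```python
# Back-of-envelope: share of queries answered from cache in one TTL window.
queries_per_window = 10_000  # queries one resolver sees in 5 minutes (assumed)
authoritative_hits = 1       # only the first query misses the cache
cache_hit_rate = 1 - authoritative_hits / queries_per_window
print(f"{cache_hit_rate:.2%} served from cache")  # 99.99% served from cache
```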
How TTL Works
When an authoritative server responds, it includes TTL values:
www.example.com.    300    IN    A    93.184.216.34
                    ^^^
                    TTL: 300 seconds
The resolver stores this answer. For the next 300 seconds, anyone asking that resolver for www.example.com gets the cached answer immediately.
After 300 seconds, the cached entry expires. The next query goes back to the authoritative server.
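A resolver’s TTL bookkeeping can be sketched as a small cache keyed by name. This is a minimal illustration only; production resolvers also track record types, enforce TTL ceilings, and handle much more.

```python
import time

class DnsCache:
    """Minimal TTL cache sketch: name -> (answer, expiry timestamp)."""

    def __init__(self):
        self._store = {}

    def put(self, name, answer, ttl):
        self._store[name] = (answer, time.monotonic() + ttl)

    def get(self, name):
        entry = self._store.get(name)
        if entry is None:
            return None            # never seen: must query upstream
        answer, expires = entry
        if time.monotonic() >= expires:
            del self._store[name]  # TTL elapsed: entry is stale
            return None
        return answer              # still fresh: serve from cache

cache = DnsCache()
cache.put("www.example.com", "93.184.216.34", ttl=300)
print(cache.get("www.example.com"))  # 93.184.216.34 (within the 300 s window)
```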
The Trade-off
Caching creates a trade-off between freshness and load:
- Short TTL (60 seconds): Changes propagate quickly, but more queries hit your servers
- Long TTL (86400 seconds = 1 day): Lower server load, but changes take longer to propagate
Most domains use TTLs between 300 seconds (5 minutes) and 86400 seconds (1 day). The right choice depends on how often records change and how much server load you can handle.
Negative Caching
DNS also caches negative responses. If you query for nonexistent.example.com and get “NXDOMAIN” (no such domain), that answer gets cached too.
This prevents repeated queries for domains that don’t exist — which could otherwise be used for denial-of-service attacks.
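Negative caching fits the same pattern: the resolver remembers that a name does not exist, for a TTL derived from the zone’s SOA record (per RFC 2308). A minimal sketch, with the 900-second TTL assumed for illustration:

```python
import time

negative_cache = {}  # name -> expiry timestamp for a cached NXDOMAIN

def remember_nxdomain(name, ttl):
    """Record that `name` was answered NXDOMAIN, valid for `ttl` seconds."""
    negative_cache[name] = time.monotonic() + ttl

def known_missing(name):
    """True if a still-fresh NXDOMAIN is cached for `name`."""
    expires = negative_cache.get(name)
    return expires is not None and time.monotonic() < expires

remember_nxdomain("nonexistent.example.com", ttl=900)
print(known_missing("nonexistent.example.com"))  # True: answer from cache
```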
How the Principles Work Together
These three principles form an interlocking system:
Distributed + Hierarchical: Hierarchy enables distribution. Because names have structure, we can delegate authority at each level. Without hierarchy, there’s no natural way to divide the namespace.
Hierarchical + Cacheable: Caching works at each level. Not only do you cache final answers, but you cache the referrals too. Once your resolver knows the .com nameservers, it doesn’t need to ask root again for any .com query.
Cacheable + Distributed: Caching makes distribution work at scale. If every query had to traverse the full hierarchy every time, the root servers would be overwhelmed. Caching shields upper levels from the query volume of lower levels.
What If We Removed One?
Without Distribution
A hierarchical but centralized system would still have the single-point-of-failure problem. You’d need to query a central authority for everything, even if names were structured.
Without Hierarchy
A distributed but flat namespace would require some way to avoid collisions. You’d need a global registry of names — which brings back central coordination.
Without Caching
A distributed, hierarchical system without caching would work, but root servers would be overwhelmed. The current root servers handle about 100,000 queries per second. Without caching, they’d need to handle every DNS query on earth — billions per second.
Real-World Numbers
How well do these principles work in practice?
Root Server Load: Despite 300+ million domain names and billions of daily queries, root servers handle only ~100,000 queries/second. That’s caching in action.
Delegation Scale: Over 1,500 TLDs exist today. Each manages its own namespace with full autonomy. That’s hierarchy and distribution.
Change Propagation: When you update a DNS record, worldwide propagation typically happens within hours (often minutes for short TTLs). That’s the distributed system working.
Key Takeaways
- DNS’s success stems from three interlocking design principles
- Distributed: No single point of control or failure; each organization manages its own data
- Hierarchical: Structured naming prevents collisions and enables delegation
- Cacheable: TTL-based caching reduces load by orders of magnitude
- These principles directly addressed HOSTS.TXT failures: centralization, flat namespace, and traffic explosion
- Remove any one principle, and DNS couldn’t scale to the modern internet
Next
With the design understood, we need an implementation. Enter BIND — the Berkeley Internet Name Domain software that became the de facto DNS server for decades.