# DHT & Peer Discovery

KayakNet uses a Kademlia-based Distributed Hash Table (DHT) for peer discovery and decentralized data storage.

## What is a DHT?

A DHT is a decentralized key-value store distributed across all nodes:

* No central server
* Data is replicated
* Lookups are efficient (O(log n))
* Survives node failures

## Kademlia Protocol

KayakNet uses Kademlia, the same DHT protocol used by BitTorrent.

### XOR Distance

Nodes are assigned IDs, and "distance" is calculated using XOR:

```
Distance(A, B) = A XOR B
```

This creates a metric space where:

* Each node knows about nodes "close" to it
* Lookups get exponentially closer with each hop

### Routing Table

Each node maintains a routing table of k-buckets:

```
┌─────────────────────────────────┐
│ Bucket 0: Distance 2^0 - 2^1    │  ← Very close nodes
├─────────────────────────────────┤
│ Bucket 1: Distance 2^1 - 2^2    │
├─────────────────────────────────┤
│ Bucket 2: Distance 2^2 - 2^3    │
├─────────────────────────────────┤
│ ...                             │
├─────────────────────────────────┤
│ Bucket 255: Distance 2^255+     │  ← Very far nodes
└─────────────────────────────────┘
```

Each bucket holds up to K nodes (K=20 in KayakNet).

## Operations

### PING

Check if a node is alive:

```
Node A → PING → Node B
Node A ← PONG ← Node B
```

### STORE

Store a key-value pair:

```
Node A → STORE(key, value) → Closest nodes
```

### FIND\_NODE

Find nodes close to a target ID:

```
Node A → FIND_NODE(target_id) → Node B
Node A ← [list of close nodes] ← Node B
```

### FIND\_VALUE

Retrieve a stored value:

```
Node A → FIND_VALUE(key) → Node B
Node A ← value OR [closer nodes] ← Node B
```

## Peer Discovery Process

### Joining the Network

```
1. Connect to bootstrap
   │
2. Get initial peers
   │
3. Perform FIND_NODE on own ID
   │
4. Discover close neighbors
   │
5. Populate routing table
   │
6. Announce presence
```

### Iterative Lookup

To find a node or value:

```
Start with α closest known nodes
    │
For each node, send query
    │
Receive responses with closer nodes
    │
Query the newly discovered closer nodes
    │
Repeat until no closer nodes found
    │
Return result
```

Parameters:

* α (alpha) = 3 (parallel queries)
* K = 20 (nodes per bucket)

## Data Storage

### What's Stored in DHT

| Key Type        | Value                   | TTL       |
| --------------- | ----------------------- | --------- |
| `peer:<id>`     | Node address            | 1 hour    |
| `room:<name>`   | Room metadata           | 24 hours  |
| `domain:<name>` | Domain record           | Permanent |
| `listing:<id>`  | Marketplace listing     | 7 days    |
| `msg:<hash>`    | Message (store-forward) | 1 hour    |

### Replication

Data is stored on the K closest nodes:

* Provides redundancy
* Survives node failures
* Automatic re-replication

### TTL and Refresh

* Records have TTL
* Publishers refresh before expiry
* Expired records are deleted
* Popular content stays available

## Configuration

```json
{
  "dht": {
    "k": 20,              // Bucket size
    "alpha": 3,           // Parallel queries
    "record_ttl": "24h",  // Default TTL
    "refresh_interval": "1h"
  }
}
```

## Performance

### Lookup Efficiency

For N nodes in network:

* Average hops: O(log N)
* With 1M nodes: \~20 hops
* With caching: Often 1-3 hops

### Bandwidth Usage

* PING: \~100 bytes
* FIND\_NODE: \~500 bytes
* STORE: Variable (depends on value)

## Security Considerations

### Sybil Protection

Without protection, attacker could:

* Create many nodes
* Control routing
* Censor lookups

KayakNet mitigation:

* Proof-of-Work for new nodes
* Peer scoring
* Diverse peer selection

### Eclipse Protection

Without protection, attacker could:

* Surround victim node
* Control all their peers
* Censor or manipulate

KayakNet mitigation:

* Require peers from different subnets
* Maintain bootstrap connections
* Periodic peer refresh

### Data Integrity

All DHT records are signed:

```
Record = {
  key: "domain:example.kyk",
  value: { ... },
  publisher: <public_key>,
  signature: <Ed25519_sig>,
  timestamp: 1234567890
}
```

Unsigned or invalid records are rejected.

## Comparison with Other DHTs

| DHT      | Used By              | Differences             |
| -------- | -------------------- | ----------------------- |
| Kademlia | BitTorrent, KayakNet | XOR distance, k-buckets |
| Chord    | Academic             | Ring topology           |
| Pastry   | Academic             | Prefix routing          |
| Coral    | CDN                  | Hierarchical clusters   |

## Troubleshooting

### Few Peers

```bash
# Check peer count
curl http://127.0.0.1:8080/api/network/peers

# If < 5, wait or restart
```

### Slow Lookups

* Check network connectivity
* Verify bootstrap is reachable
* Increase alpha for more parallelism

### Data Not Found

* Publisher may be offline
* TTL may have expired
* Try again after network refresh


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.kayaknet.io/architecture/dht.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
