# DHT & Peer Discovery

KayakNet uses a Kademlia-based Distributed Hash Table (DHT) for peer discovery and decentralized data storage.

## What is a DHT?

A DHT is a decentralized key-value store distributed across all nodes:

* No central server
* Data is replicated
* Lookups are efficient (O(log n))
* Survives node failures

## Kademlia Protocol

KayakNet uses Kademlia, the same DHT protocol used by BitTorrent.

### XOR Distance

Nodes are assigned IDs, and "distance" is calculated using XOR:

```
Distance(A, B) = A XOR B
```

This creates a metric space where:

* Each node knows about nodes "close" to it
* Lookups get exponentially closer with each hop

### Routing Table

Each node maintains a routing table of k-buckets:

```
┌─────────────────────────────────┐
│ Bucket 0: Distance 2^0 - 2^1    │  ← Very close nodes
├─────────────────────────────────┤
│ Bucket 1: Distance 2^1 - 2^2    │
├─────────────────────────────────┤
│ Bucket 2: Distance 2^2 - 2^3    │
├─────────────────────────────────┤
│ ...                             │
├─────────────────────────────────┤
│ Bucket 255: Distance 2^255+     │  ← Very far nodes
└─────────────────────────────────┘
```

Each bucket holds up to K nodes (K=20 in KayakNet).

## Operations

### PING

Check if a node is alive:

```
Node A → PING → Node B
Node A ← PONG ← Node B
```

### STORE

Store a key-value pair:

```
Node A → STORE(key, value) → Closest nodes
```

### FIND\_NODE

Find nodes close to a target ID:

```
Node A → FIND_NODE(target_id) → Node B
Node A ← [list of close nodes] ← Node B
```

### FIND\_VALUE

Retrieve a stored value:

```
Node A → FIND_VALUE(key) → Node B
Node A ← value OR [closer nodes] ← Node B
```

## Peer Discovery Process

### Joining the Network

```
1. Connect to bootstrap
   │
2. Get initial peers
   │
3. Perform FIND_NODE on own ID
   │
4. Discover close neighbors
   │
5. Populate routing table
   │
6. Announce presence
```

### Iterative Lookup

To find a node or value:

```
Start with α closest known nodes
    │
For each node, send query
    │
Receive responses with closer nodes
    │
Query the newly discovered closer nodes
    │
Repeat until no closer nodes found
    │
Return result
```

Parameters:

* α (alpha) = 3 (parallel queries)
* K = 20 (nodes per bucket)

## Data Storage

### What's Stored in DHT

| Key Type        | Value                   | TTL       |
| --------------- | ----------------------- | --------- |
| `peer:<id>`     | Node address            | 1 hour    |
| `room:<name>`   | Room metadata           | 24 hours  |
| `domain:<name>` | Domain record           | Permanent |
| `listing:<id>`  | Marketplace listing     | 7 days    |
| `msg:<hash>`    | Message (store-forward) | 1 hour    |

### Replication

Data is stored on the K closest nodes:

* Provides redundancy
* Survives node failures
* Automatic re-replication

### TTL and Refresh

* Records have TTL
* Publishers refresh before expiry
* Expired records are deleted
* Popular content stays available

## Configuration

```json
{
  "dht": {
    "k": 20,              // Bucket size
    "alpha": 3,           // Parallel queries
    "record_ttl": "24h",  // Default TTL
    "refresh_interval": "1h"
  }
}
```

## Performance

### Lookup Efficiency

For N nodes in network:

* Average hops: O(log N)
* With 1M nodes: \~20 hops
* With caching: Often 1-3 hops

### Bandwidth Usage

* PING: \~100 bytes
* FIND\_NODE: \~500 bytes
* STORE: Variable (depends on value)

## Security Considerations

### Sybil Protection

Without protection, attacker could:

* Create many nodes
* Control routing
* Censor lookups

KayakNet mitigation:

* Proof-of-Work for new nodes
* Peer scoring
* Diverse peer selection

### Eclipse Protection

Without protection, attacker could:

* Surround victim node
* Control all their peers
* Censor or manipulate

KayakNet mitigation:

* Require peers from different subnets
* Maintain bootstrap connections
* Periodic peer refresh

### Data Integrity

All DHT records are signed:

```
Record = {
  key: "domain:example.kyk",
  value: { ... },
  publisher: <public_key>,
  signature: <Ed25519_sig>,
  timestamp: 1234567890
}
```

Unsigned or invalid records are rejected.

## Comparison with Other DHTs

| DHT      | Used By              | Differences             |
| -------- | -------------------- | ----------------------- |
| Kademlia | BitTorrent, KayakNet | XOR distance, k-buckets |
| Chord    | Academic             | Ring topology           |
| Pastry   | Academic             | Prefix routing          |
| Coral    | CDN                  | Hierarchical clusters   |

## Troubleshooting

### Few Peers

```bash
# Check peer count
curl http://127.0.0.1:8080/api/network/peers

# If < 5, wait or restart
```

### Slow Lookups

* Check network connectivity
* Verify bootstrap is reachable
* Increase alpha for more parallelism

### Data Not Found

* Publisher may be offline
* TTL may have expired
* Try again after network refresh
