The problem with Gnutella's distributed layout is one of scalability. Consider this:
1) The Gnutella network has U users
2) Each user adds (on average) B bandwidth to the network
3) Each user makes S searches per time unit
The total bandwidth available can be calculated with:
Total Bandwidth = U * B
Since Gnutella doesn't have a central server indexing songs, each of your requests has to be forwarded to the other users so that they can check whether they have the file you want; in the worst case, a search reaches all U of them. Therefore, when you make S searches:
Search Bandwidth = S * U
But you are not the only person making searches. There are U users, hence:
Total Search Bandwidth = S * U * U = S * U^2
As you can see, the Total Bandwidth rises only linearly as users join, but the Total Search Bandwidth rises quadratically, with the square of the number of users.
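To make the comparison concrete, here is a minimal Python sketch of the two formulas. The function names are mine, and it assumes each search costs one bandwidth unit at every user it reaches:

    def total_bandwidth(users, bandwidth_per_user):
        # Supply: every user adds B bandwidth, so it grows linearly (U * B).
        return users * bandwidth_per_user

    def total_search_bandwidth(users, searches_per_user):
        # Demand: each of the U users makes S searches, and each search
        # has to reach roughly all U users, so it grows as S * U^2.
        return searches_per_user * users ** 2

Because supply grows linearly while demand grows quadratically, the search traffic must eventually overtake the available bandwidth, no matter how much each user brings.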
A graph would be nice here, but I can't put one in. Instead, here are some figures:
I will assume that each user brings, on average, 10 units of bandwidth (throughput per time unit) to the network. I will also assume that each user performs, on average, 0.01 searches per time unit, and that each search consumes 1 bandwidth unit at every user it reaches.
Users   Total network throughput   Searches per user   Search bandwidth
        per time unit              per time unit       per time unit
        (users * 10)               (1/100)             ((1/100) * users^2)
-----   ------------------------   -----------------   -------------------
   10        100                        0.01                     1
   20        200                        0.01                     4
   30        300                        0.01                     9
   50        500                        0.01                    25
  100       1000                        0.01                   100
  150       1500                        0.01                   225
  200       2000                        0.01                   400
  300       3000                        0.01                   900
  500       5000                        0.01                  2500
 1000      10000                        0.01                 10000
 1500      15000                        0.01                 22500
 2000      20000                        0.01                 40000
 4000      40000                        0.01                160000
 6000      60000                        0.01                360000
10000     100000                        0.01               1000000
15000     150000                        0.01               2250000
20000     200000                        0.01               4000000
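For completeness, here is a short Python sketch (again my own, using the constants 10 and 0.01 assumed above) that reproduces these figures and computes the break-even point:

    BANDWIDTH_PER_USER = 10   # throughput units each user contributes per time unit
    SEARCH_RATE = 0.01        # searches per user per time unit

    for users in (10, 100, 1000, 10000, 20000):
        supply = users * BANDWIDTH_PER_USER
        demand = SEARCH_RATE * users ** 2
        print(users, supply, demand)

    # Break-even: supply == demand when users * 10 == 0.01 * users^2,
    # i.e. users == 10 / 0.01 == 1000.

Past that break-even point of 1,000 users, every extra user adds more search traffic than bandwidth.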
As you can see, once the network grows past 1,000 users, searches alone consume all of the bandwidth the users provide, and our imaginary network can't support itself. The real Gnutella is more complex than this, but that's basically the maths behind it, as far as I know.