From release v.42.1 on, eMule features two different networks: the classic
server-based eD2k network and a completely new serverless topology based on
Kademlia.
In essence, both networks serve the same purpose. Each provides a separate
means of finding other users and the files you want to download.
Basics
File identification
All files are given a hash value. This hash is a combination of numbers and
letters that uniquely identifies the file. Numerous filenames may be associated
with a file, but this does not change the file’s hash value.
This allows each user to find all sources for a particular file, no matter what
file name each user has given the file.
In addition, files are split into parts of 9.28 MB each, and each part is
given its own hash value. For example, a 600 MB file consists of 65 parts.
The file hash used in the networks is then created from these part hashes.
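The part arithmetic and the two-level hashing described above can be sketched
as follows. This is only an illustration under stated assumptions, not eMule's
own code: the exact part size of 9,728,000 bytes (about 9.28 MB) and the
function names are assumptions, and SHA-1 is used purely as a stand-in because
the MD4 algorithm generally documented for eD2k is often unavailable in current
Python builds.

```python
import hashlib

# Assumption: one part is 9,728,000 bytes (about 9.28 MB).
PART_SIZE = 9_728_000


def part_count(file_size: int) -> int:
    """Number of parts needed to cover a file of file_size bytes."""
    return max(1, (file_size + PART_SIZE - 1) // PART_SIZE)


def file_hash(data: bytes, part_size: int = PART_SIZE, algo: str = "sha1") -> str:
    """Hash every part, then hash the concatenation of the part hashes.

    'algo' is a stand-in hash, not necessarily the one eMule uses.
    """
    part_hashes = [
        hashlib.new(algo, data[offset:offset + part_size]).digest()
        for offset in range(0, max(len(data), 1), part_size)
    ]
    return hashlib.new(algo, b"".join(part_hashes)).hexdigest()


# The 600 MB example from the text (MB taken as 1024 * 1024 bytes).
print(part_count(600 * 1024 * 1024))
```

Because the hash depends only on the file's content, two users who store the
same data under different file names still end up with the same value.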
Identifying other clients
Like files, each user in the network gets a unique and permanent user hash.
This user identification is protected against misuse by a public/private key
handshake.
Downloading Data
It is important to understand that the actual downloading in eMule is not affected
by the choice of the network. The network topology is only related to searching
for files and finding clients that are sources to a file.
Once a source has been found, your client contacts it. The source then reserves
a queue place for that specific download. Once you reach the front of the queue
after a certain waiting time, you are entitled to receive data.
Classic server based eD2k
Connecting to the network
The key to this network is the eD2k server. Each client must be connected
to a server to enter the network.
When connecting your client to a server, the server checks to see if other clients
can freely connect to your client. If yes, the server assigns your client a
so-called high ID. If communication is blocked, the server assigns your client
a low ID.
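The reachability check can be pictured as the server trying to open a
connection back to the client. The sketch below is an assumption about how such
a check might look, not the actual server logic; `tcp_probe` and
`classify_client` are hypothetical names, and the probe is passed in as a
parameter so the decision can be tried out without any network access.

```python
import socket
from typing import Callable


def tcp_probe(ip: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to ip:port can be opened."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def classify_client(ip: str, port: int,
                    probe: Callable[[str, int], bool] = tcp_probe) -> str:
    """Return 'high' when the callback connection succeeds, else 'low'."""
    return "high" if probe(ip, port) else "low"
```

A client behind a firewall or a router without port forwarding would fail the
probe and therefore end up with a low ID.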
After the ID is assigned, eMule will send a list of all shared files to the
server. The server adds the filenames and hash values you sent to its database.
Searching for files
Once connected to the network, the client can search for keywords in filenames.
A search can be either local or global. A local search (which queries only
the server you are connected to) is quicker but returns fewer results. A global
search (which queries all the servers in the network) takes longer but returns
more results. Each server looks up the keyword in its local database and
returns any file names (with their hash values) that match the keyword.
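The difference between a local and a global search can be illustrated with a
toy in-memory index. Everything here is an assumption made for illustration:
the `ServerIndex` class, the way file names are split into keywords, and
`global_search` are hypothetical and are not taken from a real server
implementation.

```python
import re
from collections import defaultdict
from typing import Iterable, List, Set, Tuple


class ServerIndex:
    """Toy stand-in for one server's database of published files."""

    def __init__(self) -> None:
        # keyword -> set of (filename, file_hash) pairs
        self._by_keyword = defaultdict(set)

    def publish(self, filename: str, file_hash: str) -> None:
        """Store a shared file under every keyword in its name."""
        for keyword in self._keywords(filename):
            self._by_keyword[keyword].add((filename, file_hash))

    def search(self, keyword: str) -> List[Tuple[str, str]]:
        """Local search: look the keyword up in this server only."""
        return sorted(self._by_keyword.get(keyword.lower(), set()))

    @staticmethod
    def _keywords(filename: str) -> Set[str]:
        # Assumption: keywords are lower-cased runs of letters and digits.
        return {word for word in re.split(r"[^0-9a-z]+", filename.lower()) if word}


def global_search(servers: Iterable[ServerIndex], keyword: str) -> List[Tuple[str, str]]:
    """Global search: ask every server and merge the answers."""
    results = set()
    for server in servers:
        results.update(server.search(keyword))
    return sorted(results)
```

A local search touches a single index, while `global_search` has to visit
every server, which is why it takes longer but finds more.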
Finding sources for files
Downloads can be added through eMule’s search function or through special eD2k
links offered on many websites.
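An eD2k file link usually has the shape `ed2k://|file|<name>|<size>|<hash>|/`,
which carries everything needed to start a download without searching first.
The parser below is a minimal sketch under that assumption; `Ed2kFile` and
`parse_ed2k_link` are hypothetical names, and the hash in the usage example is
made up.

```python
import string
from typing import NamedTuple
from urllib.parse import unquote


class Ed2kFile(NamedTuple):
    name: str
    size: int
    file_hash: str


def parse_ed2k_link(link: str) -> Ed2kFile:
    """Split an eD2k file link into name, size in bytes and file hash."""
    fields = link.strip().split("|")
    # Expected fields: 'ed2k://', 'file', name, size, hash, ..., '/'
    if len(fields) < 6 or fields[0].lower() != "ed2k://" or fields[1] != "file":
        raise ValueError("not an eD2k file link")
    name, size, file_hash = fields[2], fields[3], fields[4]
    if not size.isdigit():
        raise ValueError("file size must be a non-negative integer")
    # Assumption: the hash is written as 32 hexadecimal characters.
    if len(file_hash) != 32 or any(c not in string.hexdigits for c in file_hash):
        raise ValueError("file hash must be 32 hexadecimal characters")
    return Ed2kFile(unquote(name), int(size), file_hash.lower())


# Made-up example link; the hash does not belong to any real file.
example = "ed2k://|file|example%20file.iso|629145600|0123456789ABCDEF0123456789ABCDEF|/"
print(parse_ed2k_link(example))
```

Since the link already contains the hash, the file name in it is only a label;
the sources are found by the hash alone.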
Once they are in the Download list, eMule first queries the local (connected)
server, then all other servers in the network, for sources for that particular
download. Each server looks up the file’s hash value in its database and
returns the clients it knows to have the file.
Sources are other clients that have downloaded at least one entire part (9.28
MB) of the file matching the hash.
Kademlia serverless network
Connecting to the network
The only thing needed to connect to this network is the IP and port of any eMule
client that is already connected. This is called a bootstrap.
Once a client is in the network, it asks other clients to determine whether
it can be contacted freely. This process is very similar to the HighID/LowID
check on the servers. If you can be contacted freely, you are assigned an ID
(similar to a HighID) and given an “open” status.
If you cannot be contacted freely, you are given a “firewalled” status.
Currently, firewalled users are not supported, and you are then required to
connect to a server. Firewall support will be added later.
Searching in Kademlia
In this network it does not matter what you search for. Whether it is a search
for filenames, for sources of a download or for other users, all searches work
in much the same way.
There are no servers to keep track of clients and the files they share, so this
has to be done by each participating client in the network. In essence, every
client is also a small server.
Since every client is identified by a unique hash value, the idea of Kademlia
is to assign each client a certain “responsibility” based on this hash. Each
client in the Kademlia network works as a server for certain keywords or sources.
The client’s hash determines which keywords or sources it is responsible for.
So the goal of any kind of search is to find the clients that are responsible
for the current search topic. This is accomplished by calculating the distance
to the target client and asking other clients for the shortest route to it.
Summary
The two networks use totally different concepts to achieve the same goal:
searching for files and finding sources for a file. The main goal of the
Kademlia network is to be independent of servers and to improve scalability.
Servers can only handle a certain number of users, and should a large server go
down, the network is severely handicapped.
Kademlia is self-organising and tunes itself for the best possible performance
depending on the number of users and their connection qualities. Therefore, it
is more resistant to large-scale network loss.