Freenet

From Wikipedia

Freenet is a peer-to-peer file sharing network which pools the contributed bandwidth and storage space of member computers to allow users to anonymously publish or retrieve information of all kinds. The term also has an older meaning, referring to text-based community networks offering limited Internet services at little or no cost.

The Freenet file sharing network is designed to be highly survivable, with all internal processes completely anonymized and decentralized across the network. The system has no central servers and is not subject to the control of any one individual or organization. Even the designers of Freenet will not have any control over the overall system. The system is designed so that information stored in the system is encrypted and replicated across a large number of continuously-changing anonymized computers around the world. It is extremely difficult for an attacker to find out which participants are hosting a given file -- since the contents of each file are encrypted, and can also be broken into sections that are distributed over many different computers, even the participants themselves don't know what they are storing.

The end goal of a Freenet network is to store Documents and allow them to be retrieved later by Key, much the same as is now possible with protocols such as HTTP. The network is implemented as a number of Nodes that pass Messages among themselves peer-to-peer. Typically, a host computer on the network will run the software that acts as a node, and it will connect to other hosts running that same software to form a large distributed network of peer nodes. Certain nodes will be end user nodes, from which documents will be requested and presented to the human user. But these nodes communicate with each other and with intermediate routing nodes identically -- there are no dedicated "clients" or "servers" on the network.

The Freenet protocol is intended to be implemented on a network with a complex topology, much like IP (Internet Protocol). Each node knows only about some number of other nodes that it can reach directly (its conceptual "neighbors"), but any node can be a neighbor to any other; there is no hierarchy or other structure. Each document (or other message, such as a document request) in Freenet is routed through the network by passing from neighbor to neighbor until it reaches its destination. As each node passes a document to its neighbor, it does not know or care whether that neighbor is just another routing node forwarding information on behalf of another, whether it is the source of the document being passed, or whether it is a user node that will present the document to an end user. This is intentional, so that the anonymity of both users and publishers can be protected.


Each node maintains a data store containing documents associated with keys. With each document it also stores the address of another node where that document came from (and possibly some limited metadata about the document). In addition it may have some keys for documents that have been deleted from the local node data store (due to lack of use, memory limits, etc.), but in that case it also retains a pointer to another node that may still have the data.
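The data store described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not Freenet's actual implementation: the class name DataStore, the least-recently-used eviction rule, and the string node addresses are all hypothetical.

```python
from collections import OrderedDict


class DataStore:
    """Toy node data store mapping key -> (document or None, source address).

    Hypothetical sketch: when the store is full, the least recently used
    document body is dropped, but its key and the pointer to the node it
    came from are kept.
    """

    def __init__(self, max_documents):
        self.max_documents = max_documents
        self.entries = OrderedDict()  # least recently used entries first

    def put(self, key, document, source):
        self.entries[key] = (document, source)
        self.entries.move_to_end(key)
        self._evict()

    def get(self, key):
        """Return the stored document, or None if only a pointer remains."""
        entry = self.entries.get(key)
        if entry is None or entry[0] is None:
            return None
        self.entries.move_to_end(key)  # mark as recently used
        return entry[0]

    def source_of(self, key):
        """Return the address of the node this key's document came from."""
        entry = self.entries.get(key)
        return entry[1] if entry else None

    def _evict(self):
        # Keys that still have a document body, least recently used first.
        stored = [k for k, (doc, _) in self.entries.items() if doc is not None]
        excess = max(0, len(stored) - self.max_documents)
        for key in stored[:excess]:
            _, source = self.entries[key]
            self.entries[key] = (None, source)  # keep the pointer only


store = DataStore(max_documents=2)
store.put("key-a", b"first document", "node-1")
store.put("key-b", b"second document", "node-2")
store.put("key-c", b"third document", "node-3")
```

After the third insertion in this sketch, the body stored under "key-a" has been dropped, but the store still remembers that "node-1" may hold it.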

To find a document in the network given a key, a user sends a message to a node (probably one running on the same machine as the client program) requesting the document, providing it with the key. If no matching key is present in the local data store, the node then finds the "closest" key it does have and forwards the request to the node associated with that key, remembering that it has done so.

The node to which the request was forwarded repeats the process until either the key is found or the hop limit is reached, which indicates failure. If a node is visited more than once along the route (it will know this because it remembered forwarding the request the first time), that node cuts off the loop by sending a message to the node that sent it the second request, telling it to try the node associated with the next-closest data item, then the next-next-closest, and so on.

Eventually either the document is found or the hop limit is exceeded, at which point the node sends back a reply that works its way back to the originator along the route specified by the intermediate nodes' records of pending requests. The intermediate nodes may choose to cache the document being delivered to optimize later requests for it.
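The request process described above can be sketched in Python as follows. This is a simplified illustration, not the real protocol: the Node class and its field names are hypothetical, keys are plain integers with closeness taken as absolute difference, messages are modeled as direct method calls, and the hop limit bounds the depth of a single path rather than the total number of messages sent.

```python
class Node:
    """Toy Freenet-style node (hypothetical names, integer keys)."""

    def __init__(self, name):
        self.name = name
        self.documents = {}   # key -> document held locally
        self.routes = {}      # known key -> neighbor associated with that key
        self.pending = set()  # ids of requests this node is forwarding

    def request(self, request_id, key, hops_to_live):
        """Return the document for `key`, or None on failure."""
        if request_id in self.pending:
            # Loop detected: this node already forwarded the request,
            # so the caller should move on to its next-closest key.
            return None
        if key in self.documents:
            return self.documents[key]
        if hops_to_live <= 0:
            return None  # hop limit reached

        self.pending.add(request_id)
        try:
            tried = set()
            # Try neighbors in order of how close their known key is.
            for known_key in sorted(self.routes, key=lambda k: abs(k - key)):
                neighbor = self.routes[known_key]
                if neighbor.name in tried:
                    continue
                tried.add(neighbor.name)
                document = neighbor.request(request_id, key, hops_to_live - 1)
                if document is not None:
                    # Cache on the way back and remember where it came from.
                    self.documents[key] = document
                    self.routes[key] = neighbor
                    return document
            return None
        finally:
            self.pending.discard(request_id)


a, b, c = Node("a"), Node("b"), Node("c")
a.routes = {10: b, 90: c}   # a first tries b, whose known key is closest to 20
b.routes = {15: a}          # b only knows a, which creates a loop
c.documents = {20: "hello"}

result = a.request("request-1", 20, hops_to_live=5)
```

In this small network, the request from a first goes to b, loops back to a and is cut off, and then succeeds through c; a caches the document on the return path while b does not.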

Essentially the same path-finding process is used to insert a document into the network, the document being stored at each node along the path.

Initially, each node has a purely random set of keys for every other node that it knows about. This means that the nodes to which it sends a given data item will depend entirely on what these random keys are. But since different nodes use different random keys, each node will initially disagree about where to look for or send data, given a key. The data in a newly-started Freenet will therefore be distributed somewhat randomly.

As more documents are inserted by the same node, they will begin to cluster with data items whose keys are similar, because the same routing rules are used for all of them. More importantly, as data items and requests from different nodes "cross paths", they will begin to share clustering information as well.

The result is that the network should self-organize into a distributed, clustered structure where nodes tend to hold data items that are close together in key space. There will probably be multiple such clusters throughout the network, any given document being replicated numerous times, depending on how much it is used. This is a kind of "spontaneous symmetry breaking", in which an initially symmetric state (all nodes being the same, with random initial keys for each other) leads to a highly asymmetric situation, with nodes coming to specialize in data that has certain closely related keys.

There are forces which tend to cause clustering (shared closeness data spreads throughout the network), and forces that tend to break up clusters (local caching of commonly used data). The balance between these forces depends on how often data is used, so that seldom-used data will tend to be on just a few nodes which specialize in providing that data, and frequently used items will be spread widely throughout the network. This automatic mirroring counteracts the Slashdot effect. Because of the intelligent routing that develops in a mature network, a network of n nodes should require time only on the order of log(n) to retrieve any given document. Freenet does not employ broadcast searches as used by Gnutella and other similar file sharing protocols.

One thing to keep in mind is that keys are hashes, hence there is no notion of semantic closeness when speaking of key closeness. Therefore there will be no correlation between key closeness and similar popularity of data as there might be if keys did exhibit some semantic meaning, thus avoiding bottlenecks caused by popular subjects.
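The lack of semantic closeness can be illustrated with a short Python sketch. It assumes, purely for illustration, that a key is the SHA-1 digest of a short string; the helper names and the example strings are hypothetical.

```python
import hashlib


def key_for(text: str) -> int:
    """Interpret the 160-bit SHA-1 digest of `text` as an integer key."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def differing_bits(key_a: int, key_b: int) -> int:
    """Count the bit positions in which two keys differ."""
    return bin(key_a ^ key_b).count("1")


# Two names that a human would consider closely related.
first = key_for("freenet-paper-part-1")
second = key_for("freenet-paper-part-2")

print(f"{first:040x}")
print(f"{second:040x}")
print(differing_bits(first, second), "of 160 bits differ")
```

Although the two inputs differ in a single character, a cryptographic hash is designed so that the resulting keys are unrelated, which is why related documents do not end up near each other in key space.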

There are two main varieties of keys in use on Freenet, the Content Hash Key (CHK) and the Signature Verification Key (SVK).

A CHK is a SHA-1 hash of a document, so a node can check that the document returned is correct by hashing it and comparing the digest against the key. Most of the data on Freenet is stored under CHKs: they carry the binary building blocks of the content that is delivered to the client for reassembly and decryption. A CHK is unique to its content and makes tampering evident. A malicious ("cancer") node that alters the data under a CHK will be detected immediately by the next node or by the client. CHKs also reduce redundancy, since identical data always has the same CHK.
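The verification step can be sketched in a few lines of Python. This is a minimal illustration that assumes the key is simply the hexadecimal SHA-1 digest of the document bytes; it ignores encryption and Freenet's actual key encoding, and the function names are hypothetical.

```python
import hashlib
import hmac


def content_hash_key(document: bytes) -> str:
    """Derive a toy content hash key: the SHA-1 digest of the document."""
    return hashlib.sha1(document).hexdigest()


def verify(key: str, document: bytes) -> bool:
    """Check that `document` really is the data named by `key`."""
    return hmac.compare_digest(content_hash_key(document), key)


original = b"An example document."
key = content_hash_key(original)

tampered = b"An example document!"
print(verify(key, original))   # the untouched document matches its key
print(verify(key, tampered))   # any change to the bytes breaks the match
```

Because the key is derived only from the content, two nodes inserting the same bytes arrive at the same key, which is the deduplication property mentioned above.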

SVKs are based on public-key cryptography. Currently Freenet uses the DSA system for public-key cryptography. Documents inserted under SVKs are signed by the inserter, and this signature can be verified by every node to ensure that the data has not been tampered with. SVKs can be used to establish a verifiable anonymous identity on Freenet, and allow documents to be updated securely by the person who inserted them. A subtype of the SVK is the Keyword Signed Key, or KSK, in which the key pair is generated in a standard way from a simple human-readable string. Inserting a document using a KSK allows the document to be retrieved and decrypted if and only if the requester knows the human-readable string; this allows for more convenient (but less secure) URIs for users to refer to.

History

Freenet is an enhanced Open Source implementation of the system described by Ian Clarke's 1999 paper "A distributed decentralized information storage and retrieval system." Work started on Freenet shortly after the publication of this paper in July 1999 by Clarke and a small number of volunteers. By March 2000 version 0.1 of Freenet was ready for release. Since March 2000 Freenet has been extensively reported on in the press, albeit primarily due to its implications for copyright rather than for its wider aim of freedom of communication.

At the time of writing the 0.4 release is nearing completion.


from The Freenet Project under the GNU documentation license
