YaCy

Developer(s): Michael Christen
Stable release: 0.9 (June 23, 2009)
Written in: Java
Operating system: Platform independent
Type: Search engine
License: GPL
Website: http://yacy.net

YaCy (pronounced "ya see") is a free distributed search engine built on the principles of peer-to-peer (P2P) networks. Its core is a computer program written in Java, which as of September 2006 was running on several hundred computers, the so-called YaCy-peers. Each YaCy-peer independently crawls the Internet, analyzes and indexes the web pages it finds, and stores the indexing results in a common database (the so-called index), which is shared with other YaCy-peers using the principles of P2P networks.

Unlike semi-distributed search engines, the YaCy network has a decentralised architecture: all YaCy-peers are equal and no central server exists. A peer can be run either in crawling mode or as a local proxy server that indexes the web pages visited by the person running YaCy on his or her computer. (Several mechanisms are provided to protect the user's privacy.)

The search functions are accessed through a locally running web server, which provides a search box for entering the query and returns the results as a web page, as is usual on other search portals and engines.

The program is released under the GPL.

Architecture

The YaCy search engine is based on four elements:[1]

Crawler
A search robot that traverses from web page to web page and analyzes their content.
Indexer
Creates a Reverse Word Index (RWI), in which each word has its own list of relevant URLs and ranking information. Words are stored in the form of word hashes.
Search and Administration interface
A web interface provided by a local HTTP servlet with a servlet engine.
Data Storage
Stores the Reverse Word Index database using a Distributed Hash Table.
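
The Indexer and Data Storage elements above can be illustrated with a minimal Java sketch. It is not YaCy's actual code: the class name, the hash function (a truncated SHA-256) and the ring-style peer assignment are all assumptions made for illustration, and ranking information is left out. The sketch only shows the general idea of keying a list of URLs by a word hash and of using that same hash to decide which peer would store the entry.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch only: not YaCy's real classes, hash function or DHT scheme.
final class ReverseWordIndexSketch {

    // word hash -> sorted set of URLs of the pages that contain the word
    private final Map<String, Set<String>> index = new HashMap<>();

    // Hypothetical word hash: first 6 bytes of SHA-256 over the lower-cased word, as hex.
    static String wordHash(String word) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(
                    word.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 6; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    // Indexer step: split the page text into words and file the URL under each word hash.
    void addPage(String url, String text) {
        for (String word : text.split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token if the text starts with punctuation
            }
            index.computeIfAbsent(wordHash(word), k -> new TreeSet<>()).add(url);
        }
    }

    // Search step: hash the query word and return the URLs stored under that hash.
    Set<String> lookup(String word) {
        return index.getOrDefault(wordHash(word), Collections.emptySet());
    }

    // Assumed DHT-style assignment: peers are placed on a ring by the hash of their
    // name, and a word hash belongs to the first peer at or after it (wrapping around).
    static String responsiblePeer(String wordHash, List<String> peerNames) {
        TreeMap<String, String> ring = new TreeMap<>();
        for (String peer : peerNames) {
            ring.put(wordHash(peer), peer);
        }
        Map.Entry<String, String> entry = ring.ceilingEntry(wordHash);
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        ReverseWordIndexSketch rwi = new ReverseWordIndexSketch();
        rwi.addPage("http://example.org/a", "Free distributed search engine");
        rwi.addPage("http://example.org/b", "Search the hidden web");

        // Both pages contain "search", so both URLs are listed under its hash.
        System.out.println(rwi.lookup("search"));

        // Which of three hypothetical peers would hold the entry for "search".
        String hash = wordHash("search");
        System.out.println(hash + " -> "
                + responsiblePeer(hash, List.of("peer-1", "peer-2", "peer-3")));
    }
}
```

Because only the hash is used as the key, a peer that stores an entry does not need to see the word in plain text, and every peer that applies the same hash function arrives at the same storage location for a given word.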

Advantages

  • As there is no central server, the results cannot be censored, and the reliability is (at least theoretically) higher.
  • Because the engine is not owned by a company, there is no centralised advertising.
  • Because of the design of YaCy, it can be used to index the 'hidden web', like Tor, I2P or Freenet.

Disadvantages

  • As there is no central server and the YaCy network is open to anyone, malicious peers are (theoretically) able to insert inaccurate or commercially biased search results.
  • The YaCy protocol uses HTTP requests, which are much slower than protocols based on UDP or on TCP with persistent connections.

See also

  • Sciencenet, a search engine for scientific knowledge, based on YaCy
