Nutch
From Seo Wiki - Search Engine Optimization and Programming Languages
| Property | Value |
|---|---|
| Developer(s) | Apache Software Foundation |
| Stable release | 1.0.0 / March 23, 2009 |
| Written in | Java |
| Operating system | Cross-platform |
| Development status | Active |
| Type | Search Engine |
| License | Apache License 2.0 |
| Website | http://lucene.apache.org/nutch/ |

Nutch is an effort to build an open source search engine based on Lucene Java for the search and index component.
Features

Nutch is written entirely in the Java programming language, but its data is stored in language-independent formats. It has a highly modular architecture that lets developers create plugins for media-type parsing, data retrieval, querying and clustering. The fetcher (the "robot" or "web crawler") was written from scratch solely for this project.

History

Nutch originated with Doug Cutting (creator of both Lucene and Hadoop) and Mike Cafarella. In June 2003, a 100-million-page demo system was run successfully. To meet the multi-machine processing needs of the crawl and index tasks, the Nutch project also implemented a MapReduce facility and a distributed file system. These two facilities were spun out into their own subproject, called Hadoop. As of June 2005, Nutch had graduated from the Apache Incubator and become a subproject of Lucene.

Scalability

IBM Research studied the performance[1] of Nutch/Lucene as part of its Commercial Scale Out (CSO) project[2]. They found that a scale-out system such as Nutch/Lucene could reach a performance level on a cluster of blades that was not achievable on any scale-up computer such as the Power5.
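As a rough illustration of the plugin idea described under Features, the sketch below shows one way a media-type parsing extension point could be modelled in Java. It is not Nutch's actual API: the names `PluginSketch`, `ContentParser` and `ParserRegistry` are hypothetical and chosen only for this example, and the tag-stripping HTML "parser" is a deliberately naive stand-in for a real one.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of a media-type parsing plugin registry.
// None of these names are taken from Nutch itself.
public class PluginSketch {

    /** Hypothetical extension point: turns raw fetched bytes into plain text. */
    interface ContentParser {
        String parse(byte[] raw);
    }

    /** Maps a media type such as "text/html" to the plugin registered for it. */
    static class ParserRegistry {
        private final Map<String, ContentParser> parsers = new HashMap<>();

        void register(String mediaType, ContentParser parser) {
            // Media types are compared case-insensitively.
            parsers.put(mediaType.toLowerCase(Locale.ROOT), parser);
        }

        Optional<ContentParser> lookup(String mediaType) {
            return Optional.ofNullable(parsers.get(mediaType.toLowerCase(Locale.ROOT)));
        }
    }

    public static void main(String[] args) {
        ParserRegistry registry = new ParserRegistry();

        // Each "plugin" is just an implementation of the extension point.
        registry.register("text/plain",
                raw -> new String(raw, StandardCharsets.UTF_8));
        registry.register("text/html",
                raw -> new String(raw, StandardCharsets.UTF_8)
                        .replaceAll("<[^>]*>", " ")   // naive tag stripping, for illustration only
                        .trim());

        byte[] page = "<p>Hello</p>".getBytes(StandardCharsets.UTF_8);

        // The caller selects a parser by media type and never names a concrete class.
        String text = registry.lookup("text/html")
                .map(parser -> parser.parse(page))
                .orElse("");
        System.out.println(text);

        // Unknown media types simply have no plugin registered.
        System.out.println(registry.lookup("application/pdf").isPresent());
    }
}
```

The point of the design is that support for a new media type is added by registering another `ContentParser`, without changing the code that calls `lookup`.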