biology daily - the biology and biochemistry encyclopedia

Nutch

Nutch is an effort to build an open source search engine. It uses Lucene for its search and indexing components. The fetcher (robot) was written from scratch for this project.
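To make the "search and index" role concrete, here is a minimal sketch of an inverted index, the general data structure that full-text search libraries build on. It is a toy illustration only: the function names `build_index` and `search` are hypothetical, and nothing here reflects Lucene's actual API or file formats.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased term to the set of document ids containing it.

    `docs` is a dict of {doc_id: text}. Tokenization is a plain
    whitespace split, which is an assumption made for brevity.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

docs = {
    1: "open source search engine",
    2: "search index component",
    3: "open source crawler",
}
idx = build_index(docs)
```

With this toy corpus, `search(idx, "open source")` returns the documents that contain both words, and a query for a term that never appears returns an empty set.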

Nutch has a highly modular architecture that allows developers to create plugins for the following activities: media-type parsing, data retrieval, querying, and clustering.
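The plugin idea for media-type parsing can be sketched as a registry that maps a media type to a parser function. This is an assumption-laden illustration, not Nutch's plugin system: Nutch itself is written in Java, and the names `PARSERS`, `register_parser`, and `parse` are hypothetical.

```python
import re
from typing import Callable, Dict

# Hypothetical registry: media type -> function turning raw bytes into text.
PARSERS: Dict[str, Callable[[bytes], str]] = {}

def register_parser(media_type: str):
    """Decorator that registers a parser plugin for one media type."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        PARSERS[media_type] = func
        return func
    return decorator

@register_parser("text/plain")
def parse_plain(raw: bytes) -> str:
    return raw.decode("utf-8", errors="replace")

@register_parser("text/html")
def parse_html(raw: bytes) -> str:
    # Crude tag stripping for illustration; a real parser would use an HTML library.
    text = re.sub(r"<[^>]+>", " ", raw.decode("utf-8", errors="replace"))
    return " ".join(text.split())

def parse(media_type: str, raw: bytes) -> str:
    """Dispatch to whichever plugin is registered for the media type."""
    try:
        parser = PARSERS[media_type]
    except KeyError:
        raise ValueError(f"no parser registered for {media_type!r}")
    return parser(raw)
```

Supporting a new format then means registering one more function; the dispatching code in `parse` does not change, which is the main appeal of a plugin design.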

Tim O'Reilly has a seat on Nutch's board of directors.

Doug Cutting is the lead developer.

Nutch is an Apache Incubator project.

It is written entirely in Java, but its data is stored in language-independent formats. In June 2003, a 100-million-page demo system was run successfully.

07-14-2008 23:18:10
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.