Methabot is a scriptable web crawler designed for flexibility and speed. It is free software written in C, distributed under the terms of the ISC licence.
- Support for the Robots Exclusion Standard
- User-defined filetype filtering and sorting, according to custom rules
- Heavy multi-threading
- Chaining of custom parsers
- Conversion of HTML to well-formed XML for E4X compatibility
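
To illustrate what support for the Robots Exclusion Standard involves, here is a minimal sketch in Python using the standard library's `urllib.robotparser`. It is not Methabot's own implementation (Methabot is written in C), and the robots.txt content, the `examplebot` user agent, and the helper names `build_parser` and `allowed` are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download this
# file from the root of the site before requesting any other URL.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for url in ("http://example.com/public/index.html",
            "http://example.com/private/data.html"):
    verdict = "fetch" if allowed(parser, "examplebot", url) else "skip"
    print(verdict, url)
```

A crawler that honours the standard would run a check of this kind before queueing each URL, and skip any URL the site's rules disallow.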