MWC Milestone 0.3 Has Been Released

Today is a good day :-) I have released Modular Web Crawler v0.3!

This milestone is a significant one: it takes MWC from a proof of concept to a working web crawler. You can now easily start a web crawl and it will work as it should. Advanced features are still missing, but a crawl can be initiated and the expected results will be returned to the user.

Beyond basic crawling, this release also implements some optimizations, mainly in the area of binary data downloading.

What are the main new things introduced in this release?

  • Redirect support
  • Binary file handling
  • HEAD fetches
  • Fetch-by-size limiting
  • Many, many bugs crushed
  • Much better SSL handling
  • Upgraded code compatibility to Java 8
  • EditorConfig compatibility
  • Singleton services
  • Lombok usage for data boilerplate and logs
  • .gitignore file
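
To give a feel for how head fetches and fetch-by-size limiting can work together for binary files, here is a minimal sketch. It is not MWC's actual code: the class `HeadFetchPolicy`, its methods, and the rules it applies (always skip anything known to be over the limit, and skip binaries of unknown size) are assumptions made up for illustration. The idea is that the crawler sends a cheap HEAD request first, then uses the returned `Content-Type` and `Content-Length` headers to decide whether the full download is worth it.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of MWC's API.
final class HeadFetchPolicy {

    private final long maxBodyBytes;

    HeadFetchPolicy(long maxBodyBytes) {
        this.maxBodyBytes = maxBodyBytes;
    }

    /** Returns true when the Content-Type header names a text-like media type. */
    static boolean isTextual(String contentType) {
        if (contentType == null) {
            return false;
        }
        String type = contentType.toLowerCase(Locale.ROOT).trim();
        return type.startsWith("text/")
                || type.startsWith("application/xhtml+xml")
                || type.startsWith("application/xml");
    }

    /**
     * Decides, from HEAD response headers alone, whether a follow-up GET is worthwhile.
     *
     * @param contentType   value of the Content-Type header, or null if absent
     * @param contentLength parsed Content-Length, or -1 if the header is absent
     */
    boolean shouldDownload(String contentType, long contentLength) {
        if (contentLength > maxBodyBytes) {
            return false; // known to be too large, whatever the type
        }
        if (isTextual(contentType)) {
            return true; // pages are needed for link extraction
        }
        // Binary content: download only when the size is known (and within the limit).
        return contentLength >= 0;
    }
}
```

With a limit of, say, 1 MB, a 5 MB PDF would be skipped after the HEAD request alone, while an HTML page with no declared length would still be fetched.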

Special care was given to the unit tests in this release:

  • Tabulated results
  • Added an internal web server to host testing pages
  • Made all tests pass on every build
  • Lots of work on organizing the tests

A lot of attention was also given to the project itself.

For the full changelog, see the closed bugs grouped under the v0.3 milestone.