VFP Web Crawler


Namespace: WIN_COM_API
A free, open-source, multi-threaded Visual FoxPro application that crawls, retrieves, and stores blog pages from http://blogs.msdn.com/(blog name). Screenshot: http://www.codeplex.com/vfpwebcrawler#screenshot. The original application was written by Calvin Hsia and posted on his blog. The project is hosted at CodePlex.
This is a specialized web crawler (also known as a web spider or web robot), but it can most likely be modified for other kinds of web crawling.

New: Version 2
Version 2 features include:
-ability to specify number of threads
-better switching between blogs
-debug option to make crawling visible
This web crawler will crawl and store any MSDN blog.
Try changing the URL in the Options to one of the following:
blogs.msdn.com/calvin_hsia
blogs.msdn.com/yag
blogs.msdn.com/oldnewthing
blogs.msdn.com/mthree

http://www.codeplex.com/vfpwebcrawler

All source code is included. This is a great example of how to use multi-threading in desktop VFP. No configuration is required; it works right after installation.
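For readers who do not have VFP at hand, the general idea of a multi-threaded crawl with a configurable thread count can be sketched in Python. This is not the project's code and not Calvin Hsia's method; it is a minimal illustration under stated assumptions. The names crawl, fetch_page, num_threads, and max_pages are hypothetical, links are found with a simple regular expression, and only pages whose URL starts with the starting blog URL are followed.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin
from urllib.request import urlopen

# Naive link extraction: double-quoted href values, ignoring pure fragments.
LINK_RE = re.compile(r'href="([^"#]+)"', re.IGNORECASE)


def fetch_page(url):
    """Download one page and return its HTML as text (uses the network)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=fetch_page, num_threads=4, max_pages=50):
    """Crawl pages whose URL starts with start_url; return {url: html}.

    fetch is injectable so the crawl logic can be tried without a network.
    """

    def safe_fetch(url):
        # A failed download should not stop the whole crawl.
        try:
            return fetch(url)
        except OSError:
            return None

    pages = {}
    seen = {start_url}
    frontier = [start_url]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        while frontier and len(pages) < max_pages:
            batch = frontier[: max_pages - len(pages)]
            frontier = []
            # Fetch the batch in parallel; map() yields results in input order.
            for url, html in zip(batch, pool.map(safe_fetch, batch)):
                if html is None:
                    continue
                pages[url] = html
                for href in LINK_RE.findall(html):
                    link = urljoin(url, href)
                    # Stay inside the blog and visit each URL only once.
                    if link.startswith(start_url) and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages
```

A call such as crawl("http://blogs.msdn.com/calvin_hsia", num_threads=8) would mirror the "number of threads" option described above, assuming the site is reachable. The sketch keeps pages in memory; storing them in a table, as the VFP application does, is left out.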

The goal of this project is to keep the VFP version better, faster, and richer in features than the VB.NET version (also posted on Calvin's blog).

Links:

The post that started it all with the VFP version:
http://blogs.msdn.com/calvin_hsia/archive/2006/05/25/607588.aspx

The post with the VB.NET version:
http://blogs.msdn.com/calvin_hsia/archive/2006/06/12/628051.aspx
( Topic last updated: 2008.04.10 12:36:24 PM )