Archive for August, 2008

OutWit Hub and OutWit Kernel 0.8.1.83 Released

Sunday, August 31st, 2008

This update includes a complete refactoring of the Kernel and a new user interface where tabs are replaced by a hierarchical list of views.

New side panel

The improvements in this version are listed here.

Create your First Web Scraper to Extract Data from a Web Page

Friday, August 22nd, 2008

NOTE: This tutorial was created using version 0.8.2. The Scraper Editor interface changed in version 0.8.9: features were added and some controls were renamed. We will update the tutorials as soon as the interface for the Pro version is completely stabilized. We are sorry for the inconvenience. In the meantime, the following should still be a good way to get acquainted with scrapers. The Scraper Editor can now be found in the ‘Scrapers’ view instead of ‘Source’, but the principle remains fundamentally the same.

In many cases the automatic data extraction functions (tables, lists, guess) will be enough, and you will be able to extract and export the data in just a few clicks.

If, however, the page is too complex, or if your needs are more specific, there is a way to extract data manually: create your own scraper.

Scrapers will be saved to your personal database and you will be able to re-apply them on the same URL or on other URLs starting, for instance, with the same domain name.

A scraper can even be applied to whole lists of URLs.
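
To make the idea concrete, here is a minimal Python sketch of a saved scraper that is re-applied to every page whose URL starts with a given prefix. It is only an illustration of the concept, not OutWit Hub's actual scraper format or API: the `Scraper` class, its `url_prefix` and `fields` attributes, the `apply_to_pages` helper, and the example URLs are all hypothetical, and the use of regular expressions as extraction rules is an assumption.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Scraper:
    """Hypothetical stand-in for a saved scraper (not OutWit's real format)."""
    name: str
    url_prefix: str  # the scraper applies to URLs starting with this string
    fields: dict = field(default_factory=dict)  # column name -> regex with one capture group

    def applies_to(self, url: str) -> bool:
        # True when the URL starts with the prefix the scraper was saved for.
        return url.startswith(self.url_prefix)

    def extract(self, html: str) -> dict:
        # Build one row: for each column, keep the first match or None.
        row = {}
        for column, pattern in self.fields.items():
            match = re.search(pattern, html, re.DOTALL)
            row[column] = match.group(1).strip() if match else None
        return row


def apply_to_pages(scraper: Scraper, pages: dict) -> list:
    """Apply the scraper to a mapping of URL -> page source, skipping other sites."""
    return [
        {"url": url, **scraper.extract(html)}
        for url, html in pages.items()
        if scraper.applies_to(url)
    ]


# Example: one scraper, a list of pages, only the matching domain is processed.
books = Scraper(
    name="books",
    url_prefix="http://example.com/",
    fields={"title": r"<h1>(.*?)</h1>", "price": r'class="price">(.*?)<'},
)
pages = {
    "http://example.com/a": '<h1> A </h1><span class="price">10</span>',
    "http://other.org/b": "<h1>B</h1>",
}
rows = apply_to_pages(books, pages)
```

Keeping the page sources in a plain dictionary keeps the sketch free of network calls; in practice the pages would be loaded by the browser before the scraper is applied.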

You can also export your scrapers and share them with other users.

Let’s get acquainted with this feature by creating a simple one.


OutWit Hub Version 0.8.0.34 for Firefox 3 is online

Friday, August 1st, 2008

The Firefox 3 version was released yesterday. It can be downloaded here.

It includes a number of fixes and enhancements: drag and drop of text, links and images to the catch, autocompletion in the address bar, enhanced application of scrapers, better tree behavior, an enhanced image extraction process (more high-resolution images are found by the ‘images’ widget and the slideshow), enhanced navigation link recognition…