by jcc - March 29th, 2009
We released the first public version of OutWit Docs over the weekend, along with updated versions of OutWit Images and OutWit Hub.
OutWit Docs is a simple WebTop Document Finder based on our Kernel. It lets you search Websites and search engines for documents and presents the results as an operating system would, either in icon view or as a list of files.
oW Docs looks for text files, spreadsheets, and presentations in various formats (including PDF, MS Office, OpenOffice documents, RTF, CSV…).
In this version, the filtering & automatic selection options are somewhat basic (name, file type…), but we are going to improve these along the way. As we cannot download all the result files to explore their contents, we are working on a multi-layered filtering process to refine the query, refine the selection and search the content of the most pertinent files only.
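The post gives no implementation details, so the following Python sketch is only one possible reading of such a layered process: filter by file type, shortlist by name, and download only the shortlisted files to search their contents. Every name in it (FoundDoc, filter_by_type, rank_by_name, search_contents, the sample data) is hypothetical and not part of OutWit Docs.

```python
from dataclasses import dataclass


@dataclass
class FoundDoc:
    """A search result known only by its file name and URL (hypothetical type)."""
    name: str
    url: str


def filter_by_type(docs, extensions):
    """Layer 1: keep documents whose file name ends with a wanted extension."""
    wanted = tuple(ext.lower() for ext in extensions)
    return [d for d in docs if d.name.lower().endswith(wanted)]


def rank_by_name(docs, keywords, limit):
    """Layer 2: score names by keyword hits and keep the `limit` best matches."""
    def score(doc):
        name = doc.name.lower()
        return sum(1 for kw in keywords if kw.lower() in name)

    scored = [(score(d), d) for d in docs]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:limit]]


def search_contents(docs, phrase, fetch_text):
    """Layer 3: download only the shortlisted documents and search their text."""
    return [d for d in docs if phrase.lower() in fetch_text(d.url).lower()]


# Made-up results and a fake downloader, so the sketch runs without a network.
DOCS = [
    FoundDoc("budget-2009.xls", "u1"),
    FoundDoc("budget-notes.pdf", "u2"),
    FoundDoc("holiday.jpg", "u3"),
    FoundDoc("minutes.pdf", "u4"),
]
FAKE_TEXT = {
    "u1": "Annual totals",
    "u2": "Draft budget, quarterly review",
    "u3": "",
    "u4": "Quarterly meeting minutes",
}
fetched = []


def fake_fetch(url):
    fetched.append(url)  # record which files were actually "downloaded"
    return FAKE_TEXT[url]


typed = filter_by_type(DOCS, [".pdf", ".xls"])
shortlist = rank_by_name(typed, ["budget"], limit=2)
hits = search_contents(shortlist, "quarterly", fake_fetch)
```

The point of the ordering is that the cheap checks on names and types run on every result, while the expensive download step only touches the shortlist.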
As with all our products, your suggestions will be extremely welcome. In the meantime, we hope that you’ll enjoy this program.
Posted in Uncategorized | 4 Comments »
by kl - November 25th, 2008
The newest update of the Hub contains some exciting new features. This tutorial explains them; for a more detailed explanation of how to use the Hub’s basic features, please refer to the existing list of tutorials.
Tags: Execute and Catch, Select Different, Select Identical, Select Inversion, Select Similar
Posted in Tutorials | 9 Comments »
by kl - November 19th, 2008
In this example we’ll redo the scraper from the previous lesson using Regular Expressions. This will allow us to create a more precise scraper, which we can then apply to many URLs. When working with RegExps, you can always consult a list of basic expressions and a tutorial by selecting ‘Help’ in the menu bar.
Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide you with exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.
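To make the idea concrete outside the Hub, here is a minimal Python sketch of a regular-expression scraper applied to several pages. It is not the Hub’s scraper format: the field names, patterns, and sample pages are hypothetical, and the pages are inline strings standing in for downloaded URLs so that the example runs on its own.

```python
import re

# Hypothetical scraper: one regular expression per field, each with a single
# capture group around the value to extract.
SCRAPER = {
    "title": re.compile(r"<h1[^>]*>(.*?)</h1>", re.S),
    "price": re.compile(r'class="price">\s*\$?([\d.,]+)'),
}


def apply_scraper(html, scraper=SCRAPER):
    """Return one row of extracted values; None where a pattern finds nothing."""
    row = {}
    for field, pattern in scraper.items():
        match = pattern.search(html)
        row[field] = match.group(1).strip() if match else None
    return row


def scrape_pages(pages, scraper=SCRAPER):
    """Apply the same scraper to every (url, html) pair."""
    return [{"url": url, **apply_scraper(html, scraper)} for url, html in pages.items()]


# Made-up sample pages standing in for the content of a list of URLs.
PAGES = {
    "http://example.com/a": '<h1>Blue Mug</h1><span class="price">$4.50</span>',
    "http://example.com/b": '<h1 id="t"> Red Cup </h1><span class="price"> 3,20</span>',
    "http://example.com/c": "<p>No product here</p>",
}

rows = scrape_pages(PAGES)
for row in rows:
    print(row)
```

Because the patterns tolerate extra attributes and whitespace, the same scraper handles pages that differ slightly in markup, and a page with no match yields empty fields instead of an error.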
Tags: extract data, Mult URLs, RegExp, scraper, Web Harvester
Posted in Tutorials | 25 Comments »
by kl - November 18th, 2008
Now that we’ve learned how to create a scraper for a single URL, let’s try something a little more advanced. In this lesson we’ll learn how to create a scraper that can be applied to a whole list of URLs, using a simple method suited for beginners. The next lesson demonstrates a more complex scraper using regular expressions for our tech-savvy users. Geeks, feel free to skip to: Creating a Scraper for Multiple URLs using Regular Expressions.
Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide you with exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.
Tags: Mult URLs, scraper, Web Harvester
Posted in Tutorials | 8 Comments »
by jcc - November 17th, 2008

The first beta version of Images was posted on Firefox Add-ons and came out of the experimental zone this weekend. This new outfit is an online image browser that allows you not only to view Web images as a slideshow or as a wall of thumbnails but also to grab the pictures and save them to your hard disk.
The feedback we are already receiving for this extension is extremely encouraging: with Images, as with the Hub, people are actually managing to do things they simply couldn’t do before. And this really makes us happy…
Posted in Uncategorized | No Comments »
by jcc - October 24th, 2008
Version 0.8.1.126 was released yesterday. This update adds several features to the Kernel for the forthcoming release of OutWit Images, improving in particular the image extraction process and the slideshow. The version also includes, among other new features, enhanced bottom panels, with a series of additional criteria to refine your selections and filter the extracted data.
Posted in Uncategorized | No Comments »
by jcc - September 22nd, 2008
The Hub finally came out of the Experimental section of Mozilla Add-ons, following a review kindly done by Brian King.
Posted in Uncategorized | No Comments »
by jcc - August 31st, 2008
This update includes a complete refactoring of the Kernel and a new user interface where tabs are replaced by a hierarchical list of views.

New side panel
The improvements in this version are listed here.
Tags: New releases, Updates
Posted in New releases | 1 Comment »
by kl - August 22nd, 2008
In many cases the automatic data extraction functions (tables, lists, guess) will be enough, and you will manage to extract and export the data in just a few clicks.
If, however, the page is too complex, or if your needs are more specific, there is a way to extract data manually: create your own scraper.
Scrapers will be saved to your personal database and you will be able to re-apply them on the same URL or on other URLs starting, for instance, with the same domain name.
A scraper can even be applied to whole lists of URLs.
You can also export your scrapers and share them with other users.
Let’s get acquainted with this feature by creating a simple one.
Tags: extract data, harvest web, Outwit Hub, scraper, tutorial, Web scraper
Posted in Tutorials | 18 Comments »