by jcc - June 23rd, 2009
In our work for the coming versions of the kernel, one of our main challenges is OutWit’s ability to explore the hidden Web. We are working on some very exciting features in this area (partly autonomous, partly user-driven explorations). The Deep Web is composed of Web pages and resources that are not indexed by search engines, simply because there are no links to them. One of the interesting functions we are working on is the generation of URLs and queries to the dark side of the Web.
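To give a rough idea of what generating URLs from a pattern can look like, here is a minimal Python sketch. It is only an illustration and not how OutWit does it: the `generate_urls` helper, the URL template and the parameter values are all made up for the example.

```python
from itertools import product


def generate_urls(template, **ranges):
    """Yield one URL for every combination of the given parameter values.

    `template` is a str.format-style pattern; each keyword argument is an
    iterable of values for the placeholder with the same name.
    """
    names = list(ranges)
    for combo in product(*(ranges[name] for name in names)):
        yield template.format(**dict(zip(names, combo)))


# Hypothetical site layout: images numbered 001, 002, ... inside yearly folders.
urls = list(generate_urls(
    "http://example.com/gallery/{year}/img{num:03d}.jpg",
    year=[2008, 2009],
    num=range(1, 4),
))

for url in urls:
    print(url)
```

Each generated URL can then be requested to check whether a resource actually exists behind it, which is one way to reach pages that nothing links to.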
(More…)
Tags: deep web, grab images, image extraction, Outwit Hub, OutWit Images, query generation matrices
Posted in New releases | 1 Comment »
by jcc - March 29th, 2009
We released the first public version of OutWit Docs over the weekend, as well as updated versions of OutWit Images and OutWit Hub.
OutWit Docs is a simple WebTop Document Finder, based on our Kernel. It allows you to search through Websites and search engines for documents and it will present the results as an operating system would, either in icon view or as a list of files.
oW Docs looks for text files, spreadsheets and presentations in various formats (including PDF, MS Office, OpenOffice documents, RTF, CSV…).
In this version, the filtering & automatic selection options are somewhat basic (name, file type…), but we are going to improve these along the way. As we cannot download all the result files to explore their contents, we are working on a multi-layered filtering process to refine the query, refine the selection and search the content of the most pertinent files only.
As for all our products, your suggestions will be extremely welcome. In the meantime, we hope that you’ll enjoy this program.
Posted in Uncategorized | 6 Comments »
by kl - November 25th, 2008
The newest update of the Hub contains some exciting new features. This tutorial will explain these new functionalities. For a more detailed explanation of how to use the Hub’s basic features, please refer to the existing list of tutorials.
(More…)
Tags: Execute and Catch, Select Different, Select Identical, Select Inversion, Select Similar
Posted in Tutorials (Web Scraper) | 11 Comments »
by kl - November 19th, 2008
NOTE: This tutorial was created using version 0.8.2. The Scraper Editor interface has changed in version 0.8.9. More features were included and some controls now have a new name. We will update the tutorials as soon as the interface for the Pro version is completely stabilized. We are sorry for the inconvenience. In the meantime, the following should still be a good way to get acquainted with scrapers. The Scraper Editor can now be found in the ‘Scrapers’ view instead of ‘Source’ but the principle remains fundamentally the same.
In this example we’ll redo the scraper from the previous lesson using Regular Expressions. This will allow us to create a more precise scraper, which we can then apply to many URLs. When working with RegExps you can always reference a list of basic expressions and a tutorial by selecting ‘Help’ in the menu bar.
Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide you with exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.
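For readers who want to see the general idea outside the Hub, here is a small Python sketch of a regular-expression scraper applied to several pages. It does not reproduce the Scraper Editor: the markup, the pattern, the `scrape` helper and the URLs are hypothetical, and the page sources are hard-coded instead of being downloaded.

```python
import re

# Hypothetical pattern: captures a name and a price from lines such as
# <li class="item"><b>Blue mug</b> - $12.50</li>
ITEM = re.compile(
    r'<li class="item"><b>(?P<name>[^<]+)</b>\s*-\s*'
    r'\$(?P<price>\d+(?:\.\d+)?)</li>'
)


def scrape(html):
    """Return a list of (name, price) tuples found in one page source."""
    return [
        (match.group("name").strip(), float(match.group("price")))
        for match in ITEM.finditer(html)
    ]


# Stand-ins for the sources of several pages that share the same layout.
pages = {
    "http://example.com/shop?page=1":
        '<ul><li class="item"><b>Blue mug</b> - $12.50</li></ul>',
    "http://example.com/shop?page=2":
        '<ul><li class="item"><b>Red mug</b> - $9</li>'
        '<li class="item"><b>Teapot</b> - $24.99</li></ul>',
}

# The same pattern is reused for every URL in the list.
results = {url: scrape(html) for url, html in pages.items()}

for url, rows in results.items():
    print(url, rows)
```

The point of the pattern is that it describes the markers around the data rather than the data itself, so one scraper keeps working on every page that follows the same layout.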
(More…)
Tags: extract data, Mult URLs, Outwit Hub, RegExp, scraper, Web Harvester
Posted in Tutorials (Web Scraper) | 44 Comments »
by kl - November 18th, 2008
NOTE: This tutorial was created using version 0.8.2. The Scraper Editor interface has changed in version 0.8.9. More features were included and some controls now have a new name. We will update the tutorials as soon as the interface for the Pro version is completely stabilized. We are sorry for the inconvenience. In the meantime, the following should still be a good way to get acquainted with scrapers. The Scraper Editor can now be found in the ‘Scrapers’ view instead of ‘Source’ but the principle remains fundamentally the same.
Now that we’ve learned how to create a scraper for a single URL, let’s try something a little more advanced. In this lesson we’ll learn how to create a scraper that can be applied to a whole list of URLs, using a simple method suited for beginners. In the next lesson, a more complex scraper using regular expressions will be demonstrated for our tech-savvy users. Geeks, feel free to skip to: Creating a Scraper for Multiple URLs using Regular Expressions.
Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide you with exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.
(More…)
Tags: Mult URLs, scraper, Web Harvester
Posted in Tutorials (Web Scraper) | 22 Comments »
by jcc - November 17th, 2008

The first beta version of OutWit Images was posted on Firefox Add-ons and came out of the experimental zone this weekend. This new outfit is an online image browser that not only allows you to view Web images as a slideshow or as a wall of thumbnails but also to grab the pictures and save them to your hard disk.
The feedback we are already receiving for this extension is extremely encouraging: with Images, as with the Hub, people are actually managing to do things they simply couldn’t do before. And this really makes us happy…
Download OutWit Images
Tags: OutWit Images
Posted in New releases | No Comments »
by jcc - October 24th, 2008
Version 0.8.1.126 was released yesterday. This update adds several features to the Kernel for the forthcoming release of OutWit Images, improving in particular the image extraction process and the slideshow. The version also includes, among other new features, enhanced bottom panels, with a series of additional criteria to refine your selections and filter the extracted data.
Posted in Uncategorized | No Comments »
by jcc - September 22nd, 2008
The Hub finally came out of the Experimental section of Mozilla Addons, after the review was kindly done by Brian King.
Posted in Uncategorized | No Comments »
by jcc - August 31st, 2008
This update includes a complete refactoring of the Kernel and a new user interface where tabs are replaced by a hierarchical list of views.

New side panel
The new improvements of this version are listed here.
Tags: New releases, Updates
Posted in New releases | 1 Comment »