Author Archive

Please download Version 1.0.1.11 – FF4 & Other Important fixes

Monday, January 24th, 2011

Version 1.0.1.11 of OutWit Hub solves a number of issues (see history), in particular in the recognition of Next Page links in series of pages. Do not hesitate to download this version from our site if the update was not automatically proposed to you.

OutWit Hub 1.0

Monday, November 1st, 2010

We have finally released version 1.0 of OutWit Hub, with all the features originally planned plus many others, imagined by our users.

Your help testing the program, reporting bugs and suggesting features was invaluable. We thank our tens of thousands of users for their interest and support and we are especially grateful to the hundreds of beta testers of the Pro features who so actively contributed (some with rather creative test cases). What we particularly appreciated is the diversity of usage we read about in our users’ feedback: from human resources to SEO, from e-commerce to personal collection, from education to research… And some of you are pushing the application to unexplored limits, which, in turn, pushes us to optimize our code as much as we can. We hope you will enjoy the Hub Pro and that it will help you save time and clicks, leave the repetitive, tedious tasks to OutWit and focus on the interesting parts of your job.

Of course, version 1.0 is only the beginning and there are so many items in our to-do list that we don’t dare look at the whole thing anymore. We have split it into milestones and, version by version, we will go from a rather powerful extraction tool to an autonomous web explorer, collecting, organizing and sharing data and media for you.

The next step is an internal tutorial/wizard system, which will allow us and anybody else to produce walkthroughs for the most common extraction tasks. OutWit Hub’s Help system is pretty good. It covers all the program’s features in a rather detailed way but, to be honest, users don’t read it. What is needed is a series of step-by-step tutorials to replace the few outdated ones that can be found on this blog. They will reside in the application itself. Then, this wizard system will be extended into a scripting engine and finally a mashup generation system… No release calendar yet, but these are clearly our focus for the coming months.

Thanks again to all those who helped and I hope you will enjoy the program.

JC

Versions for Firefox 3.6 are online

Saturday, January 23rd, 2010

Thank you for your abundant feedback. We have put version 0.8.9.132 online with a few new functions as well as a list of features and fixes recently requested. We didn’t manage to synchronize this update exactly with the release of FF 3.6, but we have corrected most glitches in the last 48 hours. We may have to release updates a little more frequently this month, as we will try to work through the beta-test feedback and wish list as rapidly as possible in the coming weeks.

Thank you in advance for your patience.

JC

We Wish You a Beautiful and Happy Year 2010

Sunday, January 3rd, 2010

… We will try to help with our programs and make your life easier.

The new Kernel has been online for a few weeks now and we seem to have fixed all the regressions (not so many, in fact, after such fundamental architectural changes). I believe we can now advise those who haven’t done so yet to install the updates, as the general feedback is pretty positive.

You will, however, find very few features of the upcoming OutWit Hub Pro in the 0.8.9.x versions. These features will only be included progressively in the 0.9.x updates, as the very last beta versions before we release v 1.0. Your feedback will of course be very much appreciated.

We will be glad to offer the Pro version at half price to all those (officially beta testers or not) who have helped us identify or fix bugs or who have suggested new features which have been implemented or included in our to-do list. So, please, do not hesitate to register on outwit.com and share your comments on the program.

Note about the use of your email address: We have never sent our registered users a single e-mail or newsletter up to now. We really dislike invasive mass e-mailings and the least we can do is respect our own principles. So you can be assured that your privacy is safe with us. A few weeks before the release of the Hub 1.0, we will nevertheless propose the update to our beta testers and users, as well as a feedback form for those who have a little time. We promise that this will be very exceptional.

Cheers to all.

A New Kernel

Monday, November 16th, 2009

We have been extremely busy in the last weeks with the complete refactoring of the OutWit Kernel, preparing the way for the advanced automation functionalities of OutWit Hub Pro. The coming version 0.8.9 will be the first using our new core library. You will not see very radical changes yet, except for the scraper editor, which should make many of you happy. Here are the changes that you will find:

The brand new scraper manager and editor

The big red Stop button that many have been asking for (which, by the way, also lets you abort ‘Apply Scraper to URLs’ processes)

A few changes in the interface, to prepare the integration of new automators in the following versions.

(more…)

How to Extract Links from a Web Page

Thursday, October 8th, 2009

In this tutorial we are going to learn how to extract links from a webpage with OutWit Hub.

Important Note: The tutorials you will find on this blog may become outdated with new versions of the program. We have now added a series of built-in tutorials in the application which are accessible from the Help menu.
You should run these to discover the Hub.

Sometimes it can be useful to extract all links from a given web page. OutWit Hub is the easiest way to achieve this goal.
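For readers who want to see the general idea outside the application, here is a minimal sketch of link extraction using only the Python standard library. It is an illustration, not how OutWit Hub works internally; the names LinkCollector and extract_links, and the sample HTML and URLs, are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Only anchor tags carry the links we are interested in here.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in an HTML string, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical sample page, for illustration only.
sample = (
    '<p><a href="/docs/report.pdf">Report</a> '
    '<a href="http://example.org/">Home</a></p>'
)
print(extract_links(sample, "http://example.com/page.html"))
```

Relative links are resolved against the page address so that every result can be opened or downloaded directly.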

(more…)

Grab Documents With OutWit Hub

Thursday, September 17th, 2009

In this tutorial we are going to learn how to download all the documents (.pdf, .doc, .xls,…) from a webpage with OutWit Hub.

On some webpages, you can find links to different kinds of documents. Looking for each link by hand would be really tiring: with OutWit Hub, you can automatically see all the links to documents, along with their names and extensions, and download them to your hard disk (also see OutWit Docs).
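As a rough illustration of the principle (not OutWit Hub's own code), the sketch below keeps only the links whose file extension looks like a document and reports each one's name and extension. The function name document_links, the extension list and the sample URLs are assumptions chosen for the example.

```python
import posixpath
from urllib.parse import urlparse

# Extensions treated as "documents" in this sketch; extend as needed.
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".xls"}


def document_links(urls, extensions=DOCUMENT_EXTENSIONS):
    """Return (url, file name, extension) for each link to a document."""
    found = []
    for url in urls:
        # Look only at the path, so query strings do not hide the extension.
        path = urlparse(url).path
        name = posixpath.basename(path)
        ext = posixpath.splitext(name)[1].lower()
        if ext in extensions:
            found.append((url, name, ext))
    return found


# Hypothetical links, for illustration only.
links = [
    "http://example.com/a/report.PDF?x=1",
    "http://example.com/index.html",
    "http://example.com/data.xls",
]
for url, name, ext in document_links(links):
    print(name, ext, url)
```

Each matching URL could then be passed to a downloader such as urllib.request.urlretrieve to save the file locally.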

(more…)

How to scrape Search Engine Result Pages with OutWit Hub for SEO Audit (Video)

Thursday, September 17th, 2009

In this tutorial, Dale Stokdyk explains how to scrape Search Engine Result Pages (SERPs) with OutWit Hub Data Extractor for Firefox. OutWit Hub is very useful when you are performing an SEO audit.

Step 1: Create a web scraper for Search Engine Result Pages

Step 2: Scrape Search Engine Result Pages

Step 3: Export data to Excel or SQL

Read the full tutorial on Marketing2OH

Semantic analysis

Tuesday, August 25th, 2009

I understand it is mean to talk about features that are not implemented in the downloadable versions, but I would like to share my ideas on the purpose behind our experimental semantic features.

The “mechanical” recognition and extraction algorithms used in most views of the Hub are mostly based on a combination of DOM analysis (when dealing with HTML pages) and morphological recognition of objects and strings. These techniques are very efficient for simple scraping of data, but they are not sufficient when we need to discriminately extract data about certain themes or topics. We are currently adding semantic capacities to our extractors (in professional applications only, for now).

At the moment, we are only focusing on statistical analysis of the words and phrases, without performing any syntactic analysis of the texts. However, the results are very promising and seem to confirm our original ideas.
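To give a feel for what a purely statistical pass over words and phrases can look like, here is a small sketch that counts single words and two-word phrases with no syntactic analysis at all. It only illustrates the general approach and is not the algorithm used in our extractors; the function name term_frequencies, the tokenizing rule and the sample text are assumptions.

```python
import re
from collections import Counter


def term_frequencies(text, phrase_length=2):
    """Count single words and fixed-length word sequences in a text."""
    # Very crude tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window over the word list to build candidate phrases.
    phrases = [
        " ".join(words[i:i + phrase_length])
        for i in range(len(words) - phrase_length + 1)
    ]
    return Counter(words), Counter(phrases)


# Made-up sample text, for illustration only.
sample_text = (
    "Web data extraction turns web pages into data. "
    "Web data is everywhere."
)
word_counts, phrase_counts = term_frequencies(sample_text)
print(word_counts.most_common(3))
print(phrase_counts.most_common(3))
```

Terms and phrases that occur unusually often in a page, compared with a reference set of texts, are one simple signal of what the page is about.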

(more…)

Our mission

Tuesday, August 11th, 2009

At OutWit, we are working on adding intelligence to the Web browser.

The free beta applications that you have been downloading from our site are only parts of what we are developing. They are implementations of some of the recognition and extraction capacities that we are including in the OutWit Kernel. We have been talking about a public API for more than a year now and, although it is definitely still in the pipeline, we have been delaying it (as with the complete help and documentation) until we reach a stable enough version of the kernel and feel comfortable with people starting to write code around it.

We are convinced that the future will prove it was a good idea to add semantic intelligence to the browser itself instead of exclusively focusing on the server side.