Archive for the ‘Tutorials (Web Scraper)’ Category

How to Extract Links from a Web Page

Thursday, October 8th, 2009

In this tutorial we are going to learn how to extract links from a webpage with OutWit Hub.

Important Note: The tutorials you will find on this blog may become outdated with new versions of the program. We have now added a series of built-in tutorials in the application which are accessible from the Help menu.
You should run these to discover the Hub.

Sometimes it is useful to extract all the links from a given web page. OutWit Hub makes this easy.
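
For readers who want to see the idea outside the Hub, here is a minimal Python sketch using only the standard library. It is not how OutWit Hub works internally; the names `LinkCollector` and `extract_links` and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return all link targets found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical sample page, used only to show the call.
sample = '<p><a href="/docs/a.html">A</a> <a href="https://example.org/b">B</a></p>'
print(extract_links(sample, "https://example.com/index.html"))
```

The sketch works on an HTML string you already have; fetching the page is left out on purpose.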

Grab Documents With OutWit Hub

Thursday, September 17th, 2009

In this tutorial we are going to learn how to download all the documents (.pdf, .doc, .xls,…) from a webpage with OutWit Hub.

Some webpages contain links to different kinds of documents. Hunting for each link by hand is tedious. With OutWit Hub, you can automatically list all the links to documents, with their names and extensions, and download them to your hard disk (see also OutWit Docs).
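
As a rough illustration of the filtering step (again, a sketch in Python and not the Hub's own code), the function below keeps only the links whose path ends with a document extension and reports the file name and extension. `DOCUMENT_EXTENSIONS`, `document_links` and the sample URLs are assumptions chosen for the example.

```python
import posixpath
from urllib.parse import urlparse

# Extensions treated as "documents" in this sketch; extend as needed.
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".xls"}


def document_links(urls, extensions=DOCUMENT_EXTENSIONS):
    """Return (url, file name, extension) for links that point to documents."""
    found = []
    for url in urls:
        # Use only the path, so query strings do not hide the extension.
        name = posixpath.basename(urlparse(url).path)
        ext = posixpath.splitext(name)[1].lower()
        if ext in extensions:
            found.append((url, name, ext))
    return found


# Hypothetical links, e.g. as collected from a page.
links = [
    "https://example.com/reports/annual.PDF",
    "https://example.com/about.html",
    "https://example.com/data/prices.xls?download=1",
]
for url, name, ext in document_links(links):
    print(name, ext)
```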

How to scrape Search Engine Result Pages with OutWit Hub for SEO Audit (Video)

Thursday, September 17th, 2009

In this tutorial, Dale Stokdyk explains how to scrape Search Engine Result Pages (SERPs) with OutWit Hub Data Extractor for Firefox. OutWit Hub is very useful when you are performing an SEO audit.

Step 1: Create a web scraper for Search Engine Result Pages

Step 2: Scrape Search Engine Result Pages

Step 3: Export data to Excel or SQL

Read the full tutorial on Marketing2OH

OutWit Hub’s New Features

Tuesday, November 25th, 2008

The newest update of the Hub contains some exciting new features. This tutorial explains them; for a more detailed explanation of how to use the Hub’s basic features, please refer to the existing list of tutorials.

Creating a Scraper for Multiple URLs Using Regular Expressions

Wednesday, November 19th, 2008

NOTE: This tutorial was created using version 0.8.2. The Scraper Editor interface has changed since then: more features have been added and some controls have been renamed. The following can still be a useful introduction to scrapers. The Scraper Editor is now found in the ‘Scrapers’ view instead of ‘Source’, but the principle remains fundamentally the same.

In this example we’ll redo the scraper from the previous lesson using regular expressions. This will allow us to create a more precise scraper, which we can then apply to many URLs. When working with RegExps, you can always consult a list of basic expressions and a tutorial by selecting ‘Help’ in the menu bar.

Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.
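
To give a feel for what a regular-expression scraper does, here is a small Python sketch. It does not reproduce the Scraper Editor; the markup, the pattern `ITEM_PATTERN`, and the functions `scrape` and `scrape_many` are all hypothetical. The point is only that one pattern, written once, can be applied to many pages that share the same structure.

```python
import re

# Hypothetical markup: each record looks like
#   <li class="item"><b>NAME</b> - <span>PRICE</span></li>
ITEM_PATTERN = re.compile(
    r'<li class="item"><b>(?P<name>[^<]+)</b>\s*-\s*'
    r'<span>(?P<price>\d+(?:\.\d+)?)</span></li>'
)


def scrape(html):
    """Return a list of (name, price) tuples found in one page."""
    return [
        (m.group("name").strip(), float(m.group("price")))
        for m in ITEM_PATTERN.finditer(html)
    ]


def scrape_many(pages):
    """Apply the same scraper to several pages, given as {url: html}."""
    return {url: scrape(html) for url, html in pages.items()}


# Made-up pages standing in for a list of URLs with the same layout.
pages = {
    "https://example.com/page1":
        '<ul><li class="item"><b>Pen</b> - <span>1.50</span></li></ul>',
    "https://example.com/page2":
        '<ul><li class="item"><b>Ink</b> - <span>4</span></li>'
        '<li class="item"><b>Pad</b> - <span>2.25</span></li></ul>',
}
print(scrape_many(pages))
```

Regular expressions like this are precise but brittle: a small change in the page markup means the pattern no longer matches.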

Creating a Scraper for Multiple URLs, Simple Method

Tuesday, November 18th, 2008

This tutorial was created using version 0.8.2. The Scraper Editor interface has changed since then: many more features have been added and some controls have been renamed. The following can still be a useful introduction to scrapers. The Scraper Editor is now found in the ‘Scrapers’ view instead of ‘Source’, but the principle remains fundamentally the same.

Now that we’ve learned how to create a scraper for a single URL, let’s try something a little more advanced. In this lesson we’ll learn how to create a scraper that can be applied to a whole list of URLs, using a simple method suited for beginners. The next lesson demonstrates a more complex scraper using regular expressions for our tech-savvy users. Geeks, feel free to skip to: Creating a Scraper for Multiple URLs Using Regular Expressions.

Recap: For complex web pages or specific needs, when the automatic data extraction functions (table, list, guess) don’t provide exactly what you are looking for, you can extract data manually by creating your own scraper. Scrapers are saved on your computer and can then be reapplied or shared with other users, as desired.

Create your First Web Scraper to Extract Data from a Web Page

Friday, August 22nd, 2008

Find a simple but more up-to-date version of this tutorial here

This tutorial was created using version 0.8.2. The Scraper Editor interface has changed since then: many more features have been added and some controls have been renamed. The following can still be a useful introduction to scrapers. The Scraper Editor is now found in the ‘Scrapers’ view instead of ‘Source’, but the principle remains fundamentally the same.

In many cases the automatic data extraction functions (tables, lists, guess) will be enough, and you will manage to extract and export the data in just a few clicks.

If, however, the page is too complex or your needs are more specific, there is a way to extract data manually: create your own scraper.

Scrapers will be saved to your personal database and you will be able to re-apply them on the same URL or on other URLs starting, for instance, with the same domain name.

A scraper can even be applied to whole lists of URLs.

You can also export your scrapers and share them with other users.

Let’s get acquainted with this feature by creating a simple one.

Grab HTML Tables to Excel Spreadsheets

Saturday, July 19th, 2008

While surfing the Web, you may have come across interesting data that you want to use offline. You then faced the tiresome task of copying and pasting all the information row by row, column by column. OutWit Hub’s “Data” views can automatically do this for you.

In this tutorial, we are going to learn how to grab structured data from a Web page with the “Table” view and export it to an Excel spreadsheet.
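
For comparison, the same kind of extraction can be sketched by hand in Python with the standard library. This is only an illustration of the principle, not the Hub's “Table” view: `TableParser` and `table_to_csv` are hypothetical names, the sample table is invented, nested tables are not handled, and the output is CSV, which Excel can open.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def table_to_csv(html):
    """Convert the table rows found in an HTML string to CSV text."""
    parser = TableParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Invented sample table.
sample = (
    "<table>"
    "<tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Pens</td><td>1,200</td></tr>"
    "</table>"
)
print(table_to_csv(sample))
```

Cells that contain commas are quoted by the `csv` module, so they stay in a single column when the file is opened in a spreadsheet.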

Auto-Browsing Through a Series of Pages

Wednesday, July 9th, 2008

Have you ever wanted to download all the photos of your favorite star while surfing the Web, only to give up when faced with the tedious task?

If you have, OutWit Hub is the solution for you.

It downloads pictures automatically from a series of pages with two buttons: the “Next in Series” button and the “Browse” button.

Download Pictures from a Web Page

Tuesday, July 1st, 2008

In this tutorial, you are going to learn how to download whole collections of pictures with just a few clicks using OutWit Hub.

If you have ever surfed the Web in search of images to illustrate a presentation, photos of your favorite stars, or simply cool desktop backgrounds, you know that collecting large series of images and saving them to your computer can be tedious, especially if you have to download them one by one. (See also OutWit Images.)

With OutWit Hub, you can do this in just a few clicks.
