Icy Bison

Web Development

Yellow Pages web scraper
Icy Bison can provide custom back-end programming in PHP and database applications for your website.

Web programming in PHP is used for many things, including processing web forms such as a contact form. PHP is also used for accessing databases, uploading and downloading files, tracking users, building secure login systems, and much more. Entire websites can be built from data contained in a database, such as our gps basecamp website.

In this example we have used PHP to build a Yellow Pages web scraper (ripper). Scrapers are used to read web pages and extract the desired information from a website to either the computer screen, a file or a database. This script was created to extract information on businesses and export the data as an Excel file.
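The export step can be illustrated with a minimal sketch. It assumes the listings have already been extracted into plain objects, and it writes CSV text, which Excel can open, rather than a native Excel file. The field names (`name`, `phone`, `city`) and the functions `csvField` and `toCsv` are hypothetical; the original script is written in PHP and is not shown here.

```javascript
// Hypothetical column names; the real script's fields are not shown in this article.
const COLUMNS = ["name", "phone", "city"];

// Quote a value for CSV: wrap it in double quotes when it contains a comma,
// a double quote, or a line break, and double any embedded double quotes.
function csvField(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Turn an array of listing objects into CSV text with a header row.
function toCsv(listings) {
  const header = COLUMNS.join(",");
  const rows = listings.map((listing) =>
    COLUMNS.map((column) => csvField(listing[column])).join(",")
  );
  return [header, ...rows].join("\r\n");
}
```

Saving the returned text to a file with a .csv extension gives a spreadsheet with one business per row.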

Searches can cover an entire category, such as hotels or restaurants, or be limited to specific keywords or a specific city within a category. The program also ignores listings that are paid advertisements.

As an example, we can search for restaurants in Chicago. This search of the Yellow Pages website would also include nearby towns or suburbs. We can specify a list of towns that we would like the search restricted to, in order to limit the results to only those we need.
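The filtering described above can be sketched as follows. This is only an illustration under stated assumptions: each listing is a plain object, and the field names `isAd` and `city`, along with the function `filterListings`, are hypothetical and not taken from the original PHP script.

```javascript
// Keep only unpaid listings located in one of the allowed towns.
// `isAd` and `city` are hypothetical field names used for illustration.
function filterListings(listings, allowedTowns) {
  // Normalize town names so "Chicago" and " chicago " compare as equal.
  const allowed = new Set(allowedTowns.map((town) => town.trim().toLowerCase()));
  return listings.filter(
    (listing) =>
      !listing.isAd && allowed.has(String(listing.city).trim().toLowerCase())
  );
}

// Made-up sample data: one regular listing, one paid ad, one suburb.
const sampleListings = [
  { name: "Deep Dish House", city: "Chicago", isAd: false },
  { name: "Sponsored Grill", city: "Chicago", isAd: true },
  { name: "Suburb Diner", city: "Evanston", isAd: false },
];

const keptListings = filterListings(sampleListings, ["Chicago"]);
```

With this sample data, only the unpaid Chicago listing remains; the paid ad and the Evanston listing are dropped.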

The scraper is not publicly available because it can use an extremely large amount of bandwidth.

 
FREE - Prevent email harvesting on your website.

Spammers commonly use bots, also called spiders, which are programs that read web pages and scrape email addresses from them. A database is then created using these addresses to generate lists for mass spam emailing. Every time you put an email link on your website you are risking your email address becoming the target of these spam bots.

There are several things you can do to prevent this. One way is to use a contact form. This not only keeps your email address off your web page, it also looks more professional and makes it easier for people using public computers to contact you.

A simpler solution is to use a little JavaScript (available below) to hide your email address from web bots. Bots don't read the information that is displayed in a user's browser, but rather the HTML code that builds the web page. This lets you display your email address on your website while hiding it from the email harvesting bots that continuously crawl the internet.

Below is the code required to implement this, along with instructions. If you have any questions about it, feel free to contact me.

Download the file jsemail.js. Open the file in your text or web editor and edit lines 11, 12 and 13. On line 11, change yourname to the name part of your email address. On line 12, change yourdomain to your domain, gmail for example. On line 13, change yourcom to com, org, etc., depending on your email address. If your email address were president@whitehouse.gov, line 11 would be president, line 12 would be whitehouse, and line 13 would be gov. Save the file and upload it to your web server in the same directory as your HTML files.

In every HTML file where you want your email address to appear, include the following line of code in the head section.

<script type="text/javascript" src="jsemail.js"></script>

Where you want the email link, insert the following line of code. Whatever you put between the quote marks, "email me" in this case, will show up on your website, linking to your real email address. If you don't put anything between the quotes, your real email address will show up on your website, but will still be hidden from the bots.

<script type="text/javascript">stealthmail("email me"); </script>
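For readers curious about how a script of this kind can work, here is a minimal sketch. It is not the contents of the downloadable file: the variable names and the helper `buildMailLink` are hypothetical, and only the function name `stealthmail` and the placeholders yourname, yourdomain and yourcom come from the instructions above. The idea is to store the address in pieces and assemble it in the browser, so the complete address never appears in the HTML source.

```javascript
// Hypothetical sketch of an email-hiding script; NOT the actual downloadable file.
// The address is stored in three pieces so it never appears whole in the page source.
var emailName = "yourname";
var emailDomain = "yourdomain";
var emailTld = "yourcom";

// Build the link HTML. With an empty label, the address itself is the link text.
function buildMailLink(label) {
  var address = emailName + "@" + emailDomain + "." + emailTld;
  var text = label ? label : address;
  return '<a href="mailto:' + address + '">' + text + "</a>";
}

// Write the link into the page at the point where the script tag appears.
// The check lets the file load outside a browser without throwing an error.
function stealthmail(label) {
  if (typeof document !== "undefined") {
    document.write(buildMailLink(label));
  }
}
```

A script like this only stops bots that read the raw HTML. A bot that actually runs JavaScript would still see the assembled address.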