SAC Python Checker Home

I wrote a little Python script to run from cron every so often. It pulls down a copy of the steepandcheap.com website and parses it to grab the current item, price, and discount info. I had a few reasons for writing it. First, I wanted a checker that would send me a text message any time the item changed and store each day's items in a log file for reference. Second, the one I had written as a bash script a couple of months back was super inefficient: it used wget and lynx and made 3 calls to the site, which often resulted in wrong information. Lastly, I figured it was about time to learn Python anyway (this is my first Python script).

As you can see in the code below, the items get logged to /home/<user>/public_html/saclogs/, where <user> is the user the script runs as. Adjust this as necessary, of course. Each log file is named SAClog-<mm/dd/yyyy>.html. The script also pulls a thumbnail image link into the log file above the item info. Each time it runs, the script checks for both the logDir and the logFile and creates them if necessary. Justin, a buddy of mine, suggested adding a way to allow or disallow the sending of text messages based on a schedule, in case you don't want to get alerted at 3:00 am or whatever. The workStart and workEnd variables take care of that. The script figures time based on your server's timezone setting, so if your timezone is different you'll have to adjust the times accordingly or rewrite the script as needed.
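To make the log setup and the alert schedule a bit more concrete, here is a minimal sketch in Python 3 syntax. It is not the script's actual code: the function names are made up for illustration, work_start and work_end are assumed to be whole hours on a 24-hour clock, and the date in the file name uses dashes because a slash can't appear in a file name.

```python
import datetime
import os


def ensure_log_file(log_dir, today):
    """Create log_dir and today's log file if missing; return the file path."""
    os.makedirs(log_dir, exist_ok=True)
    # Assumption: dashes in the date, since "/" is not valid inside a file name.
    name = "SAClog-" + today.strftime("%m-%d-%Y") + ".html"
    log_file = os.path.join(log_dir, name)
    # Opening in append mode creates the file without touching existing content.
    with open(log_file, "a"):
        pass
    return log_file


def alerts_allowed(now, work_start, work_end):
    """Return True when now's hour (server local time) is in [work_start, work_end)."""
    return work_start <= now.hour < work_end
```

With work_start = 8 and work_end = 22, for example, alerts_allowed(datetime.datetime.now(), 8, 22) would hold back a text at 3:00 am but let one through at noon.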

Oh, and on the SMTP stuff: this script uses smtplib.SMTP instead of pointing to a sendmail binary. I don't know if that makes it less or more portable, but that's the way it is. Be sure to change the example email addresses in the server.sendmail() call to some that will work for you.
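For anyone who hasn't used smtplib before, sending a message looks roughly like this. This is a sketch in Python 3 syntax, not the script's own code: the function names, the "SAC alert" subject line, and the "localhost" mail server are all assumptions, so swap in whatever fits your setup.

```python
import smtplib


def build_alert(sender, recipient, item_text):
    """Return a minimal message: a few headers, a blank line, then the body."""
    headers = "From: %s\r\nTo: %s\r\nSubject: SAC alert\r\n" % (sender, recipient)
    return headers + "\r\n" + item_text


def send_alert(sender, recipient, item_text, host="localhost"):
    """Hand the message to the SMTP server listening on host (assumed local)."""
    server = smtplib.SMTP(host)
    try:
        server.sendmail(sender, [recipient], build_alert(sender, recipient, item_text))
    finally:
        server.quit()
```

Most carriers offer an email-to-SMS gateway address, so the recipient here can be the address that forwards to your phone.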

Update (05JUN2008): I added some nice keyword matching functionality (also suggested by my buddy Justin), so you will only get text messages about items that match one of your keywords. At some point in the near future, I'll probably add a switch to turn this functionality on or off, so you can opt for keyword matching or receive notifications about all new items.
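The keyword check boils down to something like the sketch below (Python 3 syntax). The function name is made up, and the case-insensitive substring match is an assumption for illustration rather than a description of what the script does.

```python
def matches_keyword(item_text, keywords):
    """Return True if any keyword appears inside item_text, ignoring case."""
    lowered = item_text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)
```

So an item like "Marmot Precip Jacket" would trigger a text if "jacket" is in your list, and an empty keyword list would never trigger one.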

Update (18JUN2008): I changed some of the regex stuff to simplify the script more. Instead of using re.findall and searching for 3 different patterns to get the item description and price/discount info, it now uses re.search and just grabs the page title from the <title> tag, since it contains all the info we need. Going this route we don't get the original price, but that's not a biggie in my opinion. Other than that, just some minor cleanup stuff here and there. Enjoy!
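Here is roughly what grabbing the title with re.search looks like, as a sketch in Python 3 syntax. The pattern and function name are assumptions, not the script's exact code, and the sample title in the usage note is invented since the site's real title format isn't shown here.

```python
import re

# Non-greedy match so only the first <title>...</title> pair is captured.
TITLE_PATTERN = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html):
    """Return the text inside the first <title> tag, or None if there is none."""
    match = TITLE_PATTERN.search(html)
    if match is None:
        return None
    return match.group(1).strip()
```

Calling extract_title on a page whose head contains <title>Some Jacket - $49.99 - 50% off</title> would hand back that one string, which is the whole point of the change: one search instead of three.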

Update (25FEB2010): I made a small change to the log file output. The small image is now a link (opens in a new tab/window) that searches the Backcountry website for the item. This could be helpful for finding an item you missed or if you want to look up reviews or whatnot about it.
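A log entry with a linked thumbnail could be put together something like this (Python 3 syntax). Treat it purely as a sketch: the function name, the "q" query parameter, and the search_base URL you pass in are all hypothetical, since the actual search URL the script uses isn't reproduced here.

```python
from html import escape
from urllib.parse import urlencode


def log_entry(item_text, thumb_url, search_base):
    """Return an HTML snippet: a thumbnail linking to a search for the item."""
    # Assumption: the search page takes the item text in a "q" parameter.
    search_url = search_base + "?" + urlencode({"q": item_text})
    return '<a href="%s" target="_blank"><img src="%s" alt=""></a><br>%s<br>\n' % (
        escape(search_url, quote=True),
        escape(thumb_url, quote=True),
        escape(item_text),
    )
```

The target="_blank" attribute is what makes the link open in a new tab or window.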

Update (03MAY2010): Made another change to the log file output to clean it up some and make it easier to read.

Update (03MAR2011): I've made some additional changes over the past few months. Since I run a separate script for each BC site (they have some minor code differences that I'm sure could be worked out, but I'm mostly lazy), I decided to have each script read in a "keywords" file so they can all be looking for the same things. The keywords file is just plain text with one keyword per line, and the path to the file is specified on the 4th line of the script. I haven't figured out for sure if it's case-sensitive, but I don't think it is, so that shouldn't matter. I also made a few other minor changes to the logs and whatnot. I don't have any real error-catching code in there, but that could be added pretty easily if you need it. Anyway, have fun.
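Reading a one-keyword-per-line file takes only a few lines. The sketch below is in Python 3 syntax with a made-up function name; skipping blank lines and trimming whitespace are assumptions added for robustness, not something the script is known to do.

```python
def load_keywords(path):
    """Read one keyword per line, dropping blank lines and stray whitespace."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Since every script reads the same file, adding a keyword in one place updates what all of them look for on their next run.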

Update (09NOV2017): This script no longer works due to changes in the layout/functionality of the SaC website. I may rewrite it some day, but I also may not. It all depends on time and desire.

Requirements: Linux, Python (you may be able to run it on Windows; I haven't tried and don't really care to)
Modules used: re, urllib2, BaseHTTPServer, os, smtplib, time, datetime

If anyone has any comments or suggestions to improve it, please feel free to email me at sac-python AT woodenpickle DOT com.



Click here to download 953b (zipped)

Sample log output:


Source code: