Thursday, August 30, 2007

INQ7 front page image retrieval

According to statistics, the most important page of a newspaper is the front page. It is where major news is emphasized, given attention, and sensationalized, since it is the outside cover and the first thing consumers look at in a daily newspaper.

With that, INQ7, the website of one of the major newspapers in the Philippines, publicly posts a soft copy of its front page on its website every day for consumers to view.

This entry covers a simple approach to building a date string with the Linux date command inside a bash shell script, in order to retrieve a newspaper's front page image, specifically from INQ7. The image's filename changes based on the date the image was scanned.

Say the actual URL is
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
http://images.inquirer.net/img/thumbnails/new/hea/pag/img/2007/08/20070830.jpg
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From the above, the image filename is 20070830 in JPG file format. Notice that the folder location (2007/08) is also dynamic, since it is based on the current date as well.
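To make the URL pattern concrete, here is a minimal sketch that rebuilds the URL above from a date stamp alone. The helper name frontpage_url is made up for illustration, and it assumes the stamp is always in YYYYMMDD form.

```bash
#!/bin/bash
# Build the front page image URL from a date stamp in YYYYMMDD form.
# frontpage_url is a hypothetical helper name used only for illustration.
frontpage_url() {
    local stamp="$1"
    local year="${stamp:0:4}"    # first four characters, e.g. 2007
    local month="${stamp:4:2}"   # next two characters, e.g. 08
    echo "http://images.inquirer.net/img/thumbnails/new/hea/pag/img/$year/$month/$stamp.jpg"
}

# Reproduces the sample URL shown above
frontpage_url 20070830
```

The year and month folders are sliced out of the same stamp, so the folder path and the filename cannot disagree.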

Here's the actual simple bash script for handling this kind of dynamic URL and image filename based on the current date string.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#!/bin/bash
# script to retrieve the front page image from the INQ7 website

# get the date stamp, year, and month, e.g. date=20070830, year=2007, month=08
date=$(date "+%Y%m%d")
year=$(date "+%Y")
month=$(date "+%m")

# assign the dynamic URL to the URL variable
URL="http://images.inquirer.net/img/thumbnails/new/hea/pag/img/$year/$month/$date.jpg"

# download the image into the current directory, discarding wget's output
wget -c "$URL" > /dev/null 2>&1

# send the image file as a mail attachment to my mailbox
# (newer mutt versions need "--" between the attachment list and the recipient)
echo "my email body" | mutt -s "INQ7 Front Page image subject" -a "$date.jpg" -- myemail@domain.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
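Because wget's output is thrown away, the script still calls mutt even when the download fails, for example when the day's image has not been posted yet. One possible guard, as a rough sketch only: the names image_ready and fetch_and_mail are made up here, and the wrapper assumes wget saves the file into the current directory under the name passed as the second argument.

```bash
#!/bin/bash
# Succeed only if the given file exists and is not empty.
# image_ready is a hypothetical helper, not part of the original script.
image_ready() {
    [ -s "$1" ]
}

# Hypothetical wrapper: download first, mail only when the image is there.
# Arguments: URL, local filename, recipient address.
fetch_and_mail() {
    local url="$1" file="$2" recipient="$3"
    wget -q -c "$url" || return 1        # stop if the download failed
    image_ready "$file" || return 1      # stop if nothing usable was saved
    echo "my email body" | mutt -s "INQ7 Front Page image subject" -a "$file" -- "$recipient"
}
```

Inside the script, the last two commands could then become something like fetch_and_mail "$URL" "$date.jpg" myemail@domain.com.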

Change the file access permissions to make the bash script executable like so

# chmod 700 samplenews.sh

Then create a crontab job that runs the script daily at 7:01 AM like so

01 07 * * * /myfolder/samplenews.sh > /dev/null 2>&1

That's it, a very simple approach to retrieving the INQ7 front page image file every day at 7:01 AM and delivering it to your inbox.
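If the cron job misses a day, the same date stamp can be produced for another day. This is only a sketch: stamp_for is a made-up helper name, and the -d option it relies on belongs to GNU date, so it may not exist on non-GNU systems.

```bash
#!/bin/bash
# Print the YYYYMMDD stamp for any date description GNU date understands.
# stamp_for is a hypothetical helper; -d is a GNU date option.
stamp_for() {
    date -d "$1" "+%Y%m%d"
}

stamp_for "2007-08-30"    # prints 20070830
stamp_for "yesterday"     # prints the stamp for the previous day
```

The year and month folders can be obtained the same way by changing the format string to "+%Y" or "+%m".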

This approach is also applicable to the Abante and Tonite Philippine sites.

Done.

Linux commands used:
chmod, wget, date, bash commands, mutt, crontab

All news websites are the property of, and managed by, their respective companies.

ILoveTux - howtos and news