Ruby to the rescue

December 04, 2009  |  Posted in: Programming  |  0 comments

I got a request from a client today. They had a site that was made up mostly of static HTML pages (with a smattering of PHP and a WordPress blog). They wanted to drop <meta> description tags into their site. Excellent idea. They helpfully sent along a Word doc containing the URL of each page and the description they would like added. Of course, this was a list of 30 discrete URLs, which would be something of a pain to do by hand with ctrl-c/ctrl-v. Like most humans I hate repetitive tasks like this, so I figured I should write a script to do it for me.

Luckily for me the URLs mapped to the html/php files themselves (no fancy routing) and looked something like this:

http://www.someurl.com/ourproducts.html  This page is about wonderful unicorns that enable your business to churn out molten rivers of gold

http://www.someurl.com/aboutus.html  We're a gang of very exciting entrepreneurs making fabulous widgets in our Elven caves.

(etc.)

This gave me an idea: I could export the Word doc to plain text and write a script to parse said textfile, grabbing filenames & descriptions. The script would then open the correct files and drop the correct <meta> tags into them.

Pre-processing

To make my life easier I pre-processed the textfile with TextMate (any text editor will do) and used a find/replace to convert the urls into just the filenames wrapped with some tokens. So http://www.someurl.com/ourproducts.html became @ourproducts.html@. Wrapping the filenames with @’s made it easier to differentiate the filenames from the descriptions.
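For anyone who would rather script this step than do it in an editor, here is a rough sketch of the same find/replace in Ruby. It assumes each line starts with a single URL like the samples above; tokenize_line is a hypothetical helper name, not part of the original workflow.

```ruby
# Hypothetical helper: turn "http://host/page.html  Description" into
# "@page.html@  Description". Assumes the URL sits at the start of the
# line; everything after the host name is kept as the file path.
def tokenize_line(line)
  line.sub(%r{\Ahttps?://[^/\s]+/(\S+)}) { "@#{$1}@" }
end

puts tokenize_line("http://www.someurl.com/aboutus.html  We make widgets.")
```

Lines that do not start with a URL pass through unchanged, so running it over the whole textfile should be safe.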

I then used TextMate’s Find in Project command to add an empty <meta name="description" content="" /> tag to every file. Again, I did this to make it easier to place the descriptions later with my script. To pull this off I wrote a regex to find the <title> tags on each page and added the empty <meta> after them:

find: <title>(.*?)<\/title>

replace: <title>$1</title>\n<meta name="description" content="" />
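The same substitution can be sketched in Ruby for editors without a project-wide find/replace. This is only an illustration: add_empty_meta and EMPTY_META are hypothetical names, and the placeholder uses the name="description" attribute that HTML defines for description metadata.

```ruby
# Placeholder tag that a later script can search for and fill in.
EMPTY_META = '<meta name="description" content="" />'

# Hypothetical helper: insert the empty description tag on a new line
# after the first <title>...</title> pair in an HTML string.
# The m flag lets the title span more than one line.
def add_empty_meta(html)
  html.sub(%r{<title>.*?</title>}m) { |title| "#{title}\n#{EMPTY_META}" }
end

puts add_empty_meta("<head><title>About us</title></head>")
```

A page with no <title> tag comes back untouched, which makes such pages easy to spot afterwards.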

Ruby to the rescue

I’ve been playing around with Ruby and Python lately, to expand my horizons past PHP. I decided to use Ruby for this particular task. The following script opens my textfile and reads it line by line grabbing filenames and descriptions. It then attempts to open the file specified, finds the empty <meta> tags and drops in the description using gsub!.

I ran it from TextMate with command-R, but I could have added a shebang and run it from the CLI. It worked like a champ:

# Open a file and look for the empty meta tag;
# if found, drop the description into it.

def get_and_open_file(filename, desc)
  # Build the finished tag once; it is the same for every line.
  newline = '<meta name="description" content="' + desc + '" />'

  File.open(filename, "r+") do |f| # block form closes the file for us
    lines = f.readlines

    # Check each line of the HTML file for the empty placeholder tag.
    lines.each do |html_line|
      # Replace the empty meta tag with the loaded tag. The block form of
      # gsub! inserts the description literally, backslashes and all.
      if html_line.gsub!(/<meta name="description" content=""\s*\/?>/) { newline }
        puts html_line
      end
    end

    f.pos = 0            # back to start
    f.print(lines.join)  # write out modified lines
    f.truncate(f.pos)    # truncate the file to its new length & save
  end
rescue SystemCallError => e
  # A missing or unreadable file should not stop the whole run.
  warn "Skipping #{filename}: #{e.message}"
end

# Open the textfile and parse it line by line looking for
# filenames & descriptions with a regex: /@(.*?)@(.*)/

File.foreach("Metadata.txt") do |line|
  if line =~ /@(.*?)@(.*)/
    filename = $1
    desc = $2.strip # drop the whitespace between the token and the text
    get_and_open_file(filename, desc)
  end
end
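One thing the script does not handle is a description containing a double quote or an ampersand, either of which would produce a broken attribute. Below is a minimal sketch of how the parsing and tag building could be pulled into small functions with basic escaping. The names parse_metadata_line, build_meta_tag and HTML_ESCAPES are hypothetical and not part of the script above.

```ruby
# Characters that would break out of a double-quoted HTML attribute.
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;" }.freeze

# Hypothetical helper: split "@file.html@ description" into its two parts,
# or return nil when the line does not contain the @...@ token.
def parse_metadata_line(line)
  match = line.match(/@(.*?)@(.*)/)
  return nil unless match
  [match[1], match[2].strip]
end

# Hypothetical helper: build the finished tag with the description escaped.
def build_meta_tag(desc)
  escaped = desc.gsub(/[&<>"]/, HTML_ESCAPES)
  %(<meta name="description" content="#{escaped}" />)
end

filename, desc = parse_metadata_line(%(@aboutus.html@  Makers of "fabulous" widgets & more\n))
puts filename
puts build_meta_tag(desc)
```

Keeping these two steps free of file access also makes them easy to try out in irb before letting the script loose on real pages.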

I know it’s not the prettiest Ruby code (and I’m sure I could have done it more efficiently), but it did what I needed in a very short amount of time. I find that I’m using regexes all the time lately, and they’re really helping to automate a lot of very boring tasks.


© 2010 Darren Newton, all rights reserved - Revision 322