granth green sumo

turkey sandwiches!

Sake Music Grab
25 May 2008

sake, ruby, mp3

This is a little Sake task for grabbing all the mp3s (and m4as) off a web page. It's handy when someone sends you a link to a directory of music and all you get is a bare Apache index.

Install Sake and the task.

gem install sake
sake -i http://granth.ca/2008/05/sake-music-grab/code/3

Start grabbing!

sake grab:mp3s http://example.com/musics/
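
Whether the URL ends in a slash matters, because the task resolves every link on the page against the URL exactly as you typed it, using `URI.join`. Here's a quick sketch of the difference. The href is a made-up example of what an Apache index might contain.

```ruby
require 'uri'

# A relative href as it might appear in a directory index (made-up value).
href = "01%20Intro.mp3"

# With a trailing slash, the link resolves inside the directory.
with_slash = URI.join("http://example.com/musics/", href)

# Without it, the last path segment is treated as a file and replaced.
without_slash = URI.join("http://example.com/musics", href)

puts with_slash     # http://example.com/musics/01%20Intro.mp3
puts without_slash  # http://example.com/01%20Intro.mp3
```

So if the downloads 404, check for a missing slash first.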

Here’s the source:

desc "Grab all the mp3/m4a files from a web page"
task "grab:mp3s" do
  require 'hpricot'
  require 'open-uri'
  require 'fileutils'  # for FileUtils below; already loaded under Rake, but explicit is safer

  taskname = ARGV.shift

  if ARGV.empty?
    $stderr.puts "usage: #{File.basename($0)} #{taskname} <mp3dir-uri>"
    exit(1)
  end

  uri = ARGV.first

  doc = open(uri) {|f| Hpricot(f) }
  links = (doc/"a").map do |a|
    a.get_attribute("href")
  end.compact.select do |link|  # drop anchors that have no href
    link.match(/\.(mp3|m4a)$/)
  end.map do |link|
    URI.join(uri, link)
  end

  dirname = URI.unescape(File.basename(uri))
  FileUtils.mkdir_p(dirname)  # don't fail if the directory already exists

  links.each do |link|
    filename = File.join(dirname, URI.unescape(File.basename(link.to_s)))
    puts filename
    # binary mode, and the block closes the file when the write finishes
    File.open(filename, 'wb') {|f| f.write(link.read) }
  end
end
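
If you want to see what the task would fetch without touching the network, the filtering and naming steps can be pulled out into a plain function. This is only a sketch: `plan_downloads` is a hypothetical helper, not part of the task, and it uses `URI.decode_www_form_component` as a stand-in for `URI.unescape`, which was removed in Ruby 3.0. The stand-in also turns `+` into a space, which `URI.unescape` did not.

```ruby
require 'uri'

# Hypothetical helper: pick the audio links out of a list of hrefs and pair
# each absolute URL with the local path the download would be written to.
def plan_downloads(base, hrefs)
  # Directory name is the last path segment of the base URL, unescaped.
  dirname = URI.decode_www_form_component(File.basename(base))

  hrefs.compact.grep(/\.(mp3|m4a)$/).map do |href|
    url  = URI.join(base, href)
    # Assumption: decode_www_form_component replaces URI.unescape here.
    name = URI.decode_www_form_component(File.basename(url.path))
    [url.to_s, File.join(dirname, name)]
  end
end

# Made-up hrefs, including a nil for an anchor with no href.
hrefs = ["../", "01%20Intro.mp3", nil, "cover.jpg", "02%20Outro.m4a"]
plan = plan_downloads("http://example.com/musics/", hrefs)
plan.each { |url, path| puts "#{url} -> #{path}" }
```

With the sample hrefs above, only the two audio files make it into the plan, and both land under `musics/` with the `%20` turned back into a space.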