This is a little Sake task for grabbing all the mp3s (and m4as) off a web page, like when someone sends you a link to a directory of music and all you get is an Apache index.
Install Sake and the task.
gem install sake
sake -i http://www.granth.ca/2008/05/sake-music-grab/code/3
Start grabbing!
sake grab:mp3s http://example.com/musics
Here’s the source:
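desc "Grab all the mp3/m4a files from a web page"
task "grab:mp3s" do
  require 'hpricot'
  require 'open-uri'
  require 'fileutils'

  taskname = ARGV.shift
  if ARGV.empty?
    $stderr.puts "usage: #{File.basename($0)} #{taskname} <mp3dir-uri>"
    exit(1)
  end

  uri = ARGV.first
  # Remember the final URI after redirects (e.g. /musics -> /musics/) so that
  # relative links resolve against the directory itself.
  base = uri
  doc = open(uri) { |f| base = f.base_uri.to_s; Hpricot(f) }

  # Collect the hrefs, skip anchors without one, keep the audio files,
  # and make each link absolute.
  links = (doc/"a").map do |a|
    a.get_attribute("href")
  end.compact.select do |link|
    link.match(/\.(mp3|m4a)$/)
  end.map do |link|
    URI.join(base, link)
  end

  # Save everything into a directory named after the last path segment.
  dirname = URI.unescape(File.basename(uri))
  FileUtils.mkdir_p(dirname)

  links.each do |link|
    filename = File.join(dirname, URI.unescape(File.basename(link.to_s)))
    puts filename
    # Binary mode plus the block form closes the file once it is written.
    open(filename, 'wb') { |f| f.write(link.read) }
  end
end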
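If you want to poke at the link-filtering part without Hpricot or a network connection, here is a rough sketch using only the standard library. The helper names `audio_links` and `local_name` and the sample hrefs are made up for illustration; they are not part of the task above, and `URI.decode_www_form_component` stands in for `URI.unescape` as an assumption about what a newer Ruby would want.

```ruby
require 'uri'

# Hypothetical helper, not part of the task above: turn raw href values
# into absolute URIs, keeping only the audio files.
def audio_links(page_uri, hrefs)
  hrefs.compact
       .select { |href| href.match(/\.(mp3|m4a)$/) }
       .map    { |href| URI.join(page_uri, href) }
end

# Hypothetical helper: local filename for a link, percent-escapes decoded.
# Note: decode_www_form_component also turns "+" into a space.
def local_name(link)
  URI.decode_www_form_component(File.basename(link.path))
end

# Made-up hrefs, roughly what an Apache index might contain.
hrefs = ["../", "01%20Intro.mp3", nil, "cover.jpg", "02%20Outro.m4a"]
links = audio_links("http://example.com/musics/", hrefs)
links.each { |link| puts "#{link} -> #{local_name(link)}" }
```

One thing this makes easy to see: `URI.join` only resolves relative links inside the directory when the page URI ends in a slash. Without the trailing slash, the last path segment gets dropped.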