Shrub: S3 Proxy Service on Google App Engine

Tuesday, September 23rd, 2008

Shrub is a simple proxy to S3 that provides an RSS feed or HTML listing of the files in a bucket or directory. It also does JSON, mixtape, and ID3 lookups (more on that below).

My friends and I have been using it for a little while to keep track of when we upload files (via RSS), and it seems to work pretty well.

Last week, I was playing around with the idea of a MuxTape/OpenTape-style format where you point it at a directory of mp3s and play them straight from the browser. I have a preliminary example at: http://shrub.appspot.com/m1xes/sub-pop-mix-1/?format=tape (songs from the official Sub Pop Records free downloads page). It seems like an OK way to share mixtapes by hosting them on your own S3 account, or to just stream your backed-up mp3s from a web browser.

To show the track information, I wrote a handler that pulls ID3v2 tag info from the mp3s. It requests just the start of the file using the Range header, which S3 supports (Range: bytes=0-1024, roughly the first kilobyte). ID3v2 tags are normally at the beginning of the mp3, while ID3v1 tags sit at the end. For example, if you have the S3 file:

sub-pop-mix-1/01-Dntel-The_Distance_(ft._Arthur&Yu).mp3

in the m1xes bucket, you can do an ID3 lookup using format=id3-json:

http://shrub.appspot.com/m1xes/sub-pop-mix-1/01-Dntel-The_Distance_%28ft._Arthur%26Yu%29.mp3?format=id3-json

which returns:

{ "album": "Dumb Luck",
  "performer": "Dntel",
  "title": "The Distance (Ft. Arthur & Yu)",
  "track": "5/9",
  "year": null,
  "isTruncated": false }
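The percent-encoded path in the lookup URL above can be produced with the standard library. This is a minimal sketch in modern Python 3, not Shrub's own code: the helper name `shrub_url` and its parameters are made up for illustration, and it assumes the URL layout is simply base, bucket, key, and an optional `format` query parameter, as in the examples in this post.

```python
from urllib.parse import quote

BASE_URL = "http://shrub.appspot.com"


def shrub_url(bucket, key, fmt=None):
    """Build a lookup URL for an S3 key (hypothetical helper, not part of Shrub)."""
    # quote() leaves "/" alone by default but escapes "(", ")" and "&".
    path = quote("%s/%s" % (bucket, key))
    url = "%s/%s" % (BASE_URL, path)
    if fmt:
        url += "?format=" + quote(fmt, safe="")
    return url


print(shrub_url("m1xes",
                "sub-pop-mix-1/01-Dntel-The_Distance_(ft._Arthur&Yu).mp3",
                "id3-json"))
```

Escaping the `&` matters most here: left raw, a client or server could mistake it for a query-string separator if the key ever ends up in one.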

It parses as much as it can from the first 1024 bytes of the mp3 (and isTruncated will be true if there was more data after that). ID3 lookups are cached for 5 minutes.
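To make the ranged lookup concrete, here is a rough sketch of the idea in modern Python 3. It is not Shrub's actual handler, and it rests on several assumptions: the names `fetch_head` and `parse_id3v2` are invented, only ID3v2.3 and ID3v2.4 text frames are handled (no extended headers, no unsynchronisation), and `isTruncated` is taken to mean "the tag extends past the bytes we fetched". Byte ranges are inclusive, so the sketch asks for `bytes=0-1023` to get exactly 1024 bytes.

```python
import struct
import urllib.request

# Text frames mapped to the JSON keys shown in the example response.
FRAME_KEYS = {"TALB": "album", "TPE1": "performer", "TIT2": "title",
              "TRCK": "track", "TYER": "year", "TDRC": "year"}
ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def fetch_head(url, length=1024):
    """Fetch only the first `length` bytes of a public file via a Range request."""
    headers = {"Range": "bytes=0-%d" % (length - 1)}
    req = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(req) as resp:
        return resp.read()


def synchsafe(b):
    """Decode a 4-byte synchsafe integer (7 usable bits per byte)."""
    return (b[0] << 21) | (b[1] << 14) | (b[2] << 7) | b[3]


def parse_id3v2(data):
    """Pull what we can from a (possibly cut-off) ID3v2.3/2.4 tag."""
    info = dict.fromkeys(("album", "performer", "title", "track", "year"))
    info["isTruncated"] = False
    if len(data) < 10 or data[:3] != b"ID3" or data[3] not in (3, 4):
        return info  # no tag, or a version this sketch does not handle
    major = data[3]
    tag_end = 10 + synchsafe(data[6:10])  # tag size excludes the 10-byte header
    info["isTruncated"] = tag_end > len(data)
    pos, limit = 10, min(tag_end, len(data))
    while pos + 10 <= limit:
        frame_id = data[pos:pos + 4]
        if frame_id == b"\x00\x00\x00\x00":
            break  # reached padding
        raw = data[pos + 4:pos + 8]
        # v2.4 frame sizes are synchsafe; v2.3 sizes are plain big-endian.
        size = synchsafe(raw) if major == 4 else struct.unpack(">I", raw)[0]
        body = data[pos + 10:pos + 10 + size]
        if len(body) < size:
            break  # this frame runs past the bytes we have
        key = FRAME_KEYS.get(frame_id.decode("latin-1"))
        if key and size > 0:
            encoding = ENCODINGS.get(body[0], "latin-1")
            info[key] = body[1:].decode(encoding, errors="replace").rstrip("\x00")
        pos += 10 + size
    return info
```

Calling `parse_id3v2(fetch_head(url))` on a public mp3 URL would give a dictionary shaped like the JSON above. Frames that start before the cutoff but end after it are skipped rather than half-decoded, which is one way to read "parse as much as it can".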

Right now it doesn’t do any authentication (I'm not sure I want to store secret keys), so it's only useful for public buckets and files. I think it makes sense to open source it and then add auth capability, so people can configure it with their own keys and host it on their own App Engine space. I will try to throw some code up on GitHub really soon, after I refactor some of my noobish Python code.

Mainly, it's just an excuse to help me learn Python and play with Mako and Google App Engine.

All this and more detailed info is at: http://shrub.appspot.com/. If you see anything wacky, let me know.

Update: The source is now available on GitHub: gabriel/shrub