So, I was working on a rewrite of icesDJ, a DJ interface to ices (it allows for playlist manipulation and such), and I decided to do it in Python. The basic idea is that ices will call one of my scripts, songpicker.py, which will spit out the next song to play based on 1) DJ requests (highest priority), 2) user requests (second priority), and finally 3) random selection based on user ratings. My dilemma was that the ratings and requests are stored in a database, but I didn't have a way of mapping between song titles and the file names they correspond to. What I wanted to do was go through the media directory periodically and populate the database with the tag data from the files, but I was extremely disappointed to find that there were very few Python Vorbis libraries of any sort, and the ones that existed depended on multiple other libraries. Once I finally got one to build, it crashed upon importing the module.
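A rough sketch of that priority order might look like the following. Everything here is hypothetical: the function name pick_next_song, the in-memory lists standing in for the database tables, and the choice to treat "random selection based on user ratings" as a rating-weighted draw are all assumptions for illustration, not the actual songpicker.py. It also assumes Python 3.6 or newer for random.choices.

```python
import random


def pick_next_song(dj_requests, user_requests, ratings, rng=random):
    """Return the next file name to play, or None if nothing is available.

    dj_requests, user_requests: lists of file names, oldest request first
        (hypothetical stand-ins for the request tables in the database).
    ratings: dict mapping file name -> numeric rating (also hypothetical).
    """
    # 1) DJ requests always win.
    if dj_requests:
        return dj_requests.pop(0)

    # 2) User requests come next.
    if user_requests:
        return user_requests.pop(0)

    # 3) Otherwise fall back to a random pick weighted by rating.
    if not ratings:
        return None
    songs = list(ratings)
    # Assumption: clamp weights to at least 1 so low-rated songs still play.
    weights = [max(ratings[song], 1) for song in songs]
    return rng.choices(songs, weights=weights, k=1)[0]
```

Popping from the front of each list means a request is consumed once it has been handed to ices, so the same request is never returned twice.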
So, what do I do when I'm fed up with existing software (or the lack thereof) and don't feel like doing my (boring) schoolwork? I go write some code! After deeply meditating on the header section of the Vorbis file specification for a very long time, I finally realized how simple the header layout is. Armed with my trusty (not so) dusty hex editor, I opened up an Ogg file, played around with my ideas in the Python interpreter, and soon came up with a working function that will read any Ogg/Vorbis file and extract the comment header information as a dictionary. The code is pasted below for the benefit of whoever may need it. (Yes, I realize it is somewhat inefficient, and I will add stuff later to only read in the necessary header info.)
import struct

def getVorbisComments(filename):
    '''
    Parses the given input file and returns the vorbis comments
    as a dictionary
    '''
    comments = {}
    # Open in binary mode, since Ogg data is not text
    fp = open(filename, 'rb')
    # Read in the file; the comment header body follows the second 'vorbis' marker
    data = fp.read().split(b'vorbis')[2]
    fp.close()
    # Read in the length of the first field (vendor string),
    # stored as a 32-bit little-endian unsigned integer
    fieldLen = struct.unpack('<I', data[:4])[0]
    # Remove the first field (vendor string)
    data = data[fieldLen+4:]
    # Read in the number of comment fields
    numComments = struct.unpack('<I', data[:4])[0]
    # Remove the comment count data
    data = data[4:]
    # Read in the comment fields
    for i in range(numComments):
        fieldLen = struct.unpack('<I', data[:4])[0]
        # Split on the first '=' only, since values may contain '='
        fieldData = data[4:fieldLen+4].split(b'=', 1)