AppleScript String Manipulation
Because I am the sort of person who alphabetises his collections of things for fun, I’ve been using AppleScript to help tidy up my iTunes library recently. It’s actually an efficient tool for the job, and not too horrible to work with either once you get used to the syntax; for someone used to C-based languages, a typical statement seems at first glance to have a load of extraneous words and far too little punctuation, but you soon get the hang of it. Thanks to the integration provided by the iTunes dictionary you can assign genres by artist name, split compilation album track names into correct artist and title settings, sort video files as TV shows, and so on, all with a single click (and a few hours reading the docs and writing the code, of course). It’s one of those gratifyingly useless, boring pastimes that dorks like me prefer to the crucially important, fascinating things that normal people do, like playing golf and watching telly and talking about cars and shopping. A normal person would probably just put up with having all their songs in some sort of soul-revolting, multiple-artist-spelling, TV-shows-in-the-movies-section multiple metadata pile-up, but I am not normal, and I will not tolerate such carnage on my own computers. Nor am I willing to spend days typing all that stuff in manually: I’m a geek, not a lunatic.
Anyway. It’s all been going swimmingly, apart from one big obstacle: AppleScript’s capacity for string manipulation is rather feeble, to say the least. Perhaps I’m missing something obvious, but apart from text item delimiters, the language has almost no built-in provision for messing with bits of text. Say you want to get a variable-length number from the name of each of several TV show files, for example. In pure AppleScript, we’d have to mess about looping along the string and checking each character for matches or something like that, which means writing many lines of boring and inefficient code that should really be part of the basic command set. Happily, there is a better, lazier way: do shell script lets us call tools like grep instead of relying on AppleScript’s limited string processing abilities, or our own equally limited coding skills. To get the last one or two digits off the end of a string, we can do something like this:
set MyString to "TV Show, Episode 56"
set MyRegExp to "[0-9]{1,2}$"
set MyCommand to "echo \"" & MyString & "\" | grep -o -E \"" & MyRegExp & "\""
set MyNumber to do shell script MyCommand
Voilà: MyNumber now has a value of “56”, to do with what you will. With a bit of work and more reading of documentation you can extract lots of useful data from one string and plug it all into the relevant places as required; the possibilities are literally quite extensive.
Technical stuff: you have to use "echo " & MyString & " | " because there’s no way to pass input to StdIn from AppleScript, but piping the output of echo works well enough for this purpose. The “-o” option tells grep to return just the matched string, and “-E” forces grep to use extended regular expressions, which the unescaped {1,2} repetition syntax needs. You could use Perl or Awk or whatever other command line tool you fancy to do your string manipulation; grep just happened to be the first thing I typed. You need AppleScript 1.8 or higher, and one of those fancy new OS X based Macs with all the lovely Unix shell tools built in, obviously.
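If you want to poke at the shell half of this on its own before wiring it into a script, something like the following should work in Terminal. It is only a rough sketch: the function name trailing_digits is made up for illustration, and it uses printf rather than echo on the assumption that you would rather not have odd strings (leading dashes, backslashes) reinterpreted along the way.

```shell
#!/bin/sh
# trailing_digits is a hypothetical helper name, not something from the post.
# It prints the last one or two digits at the end of its first argument.
trailing_digits() {
    # printf '%s\n' passes the string through unchanged;
    # -o prints only the matched text, -E enables extended regular expressions.
    printf '%s\n' "$1" | grep -o -E '[0-9]{1,2}$'
}

trailing_digits "TV Show, Episode 56"        # prints 56
trailing_digits "TVShowNameEpisodeTitle15"   # prints 15
```

When building the same command inside AppleScript, applying quoted form of to the string is probably a safer way to get it past the shell than hand-escaping the double quotes, since it also copes with names that contain quotes or dollar signs.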
Amerella says:
*blink*
I was with you until ‘grep’.
2008-01-20 11:56
Tom Ryan says:
Grep. Handy command-line program for searching text, built into Unix based OSes.
2008-01-20 18:08
Mike says:
Didn’t applescript inherit the stuff from Hypercard that allows you to iterate over words and lines (and “items” which you can set the delimiter for)? Works a bit like split() does in things like JavaScript and php.
Damned if I can remember the syntax as it’s, gosh, 19 YEARS since I wrote any hypercard stuff :)
It’s no RegExp, and the above trick is rather neat, but it’s worth knowing.
If you want to try sorting out my itunes directory, you’d be most welcome. Most of it doesn’t have ANY identifying information as it all got trashed by accident. All I’ve got is anonymous tracks in order grouped into anonymous albums.
2008-01-21 14:07
Tom Ryan says:
Aaaugh, nightmare!
I’ve often wondered about the possibility of using a service like Shazam for exactly that purpose though. Given access to their system, you could build something that could take snippets of songs and do whatever incredibly clever thing it is that Shazam does to identify them, and then tag them accordingly. It’d work for digitising recordings from vinyl too, which would be handy.
You’re right about the delimiter stuff, and I’ve been using that quite a bit on text data that have spaces or punctuation in useful places. But if you have a load of filenames like, say, “TVShowNameEpisodeTitle15”, there’s no easy way to split the string up using just text delimiters; you really need RegExps.
2008-01-21 18:59
Dennis Wurster says:
set MyString to "TV Show, Episode 56"
set EpisodeNumber to word 4 of MyString
2008-07-08 02:53