Honeypot – noun (from Encarta)
1. something that is inviting: anything that attracts or appeals to large numbers of people (informal)
2. Internet server used to entice hackers: a server connected to the Internet that is used as a decoy to attract potential hackers in order to study their activities and techniques
At its recent WWDC11 keynote, Apple announced a new service called iCloud Music Match. For $24.99 per year, it will scan the user’s machine and mirror all of the user’s music files to Apple’s new data center for streaming anywhere. Where it finds a match with one of the songs in its own catalog, even if the song was not purchased from Apple, it will make a record of the song and then stream Apple’s 256 kbps AAC version to the user. Apple presented this as a convenience to users, saying that setup will take ‘minutes, not weeks’, in a jab at competitors like Amazon and Google that offer cloud-based storage lockers.
The unspoken flip side is that users are voluntarily granting Apple the right to scan their systems and store the personally identifiable results on Apple’s servers. Presuming that Apple restricts its scan strictly to the information that is absolutely necessary for Music Match to work, what will that be?
Quite obviously, Music Match cannot work without scanning your files. For example, assume I take any old file, rename it LadyGaga:BornThisWay.mp3, and add it to my library. Apple is not going to send me the music just because of the file name. I also doubt that any process is going to ‘listen to’ the music to see if it sounds like a recognized song. Instead, chances are that Music Match will, at a minimum, examine the header information in the MP3 file and run a hash calculation over the entire contents of the file.
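To make that concrete, here is a minimal sketch in Python of what "examine the header and hash the whole file" could look like. It is an illustration only, not a description of anything Apple has published: the function names `file_digests` and `id3v2_tag_size` are hypothetical, and the header check assumes the file starts with a standard ID3v2 tag (the 10-byte header that begins with the bytes "ID3").

```python
import hashlib


def file_digests(path, chunk_size=65536):
    """Return (md5_hex, sha256_hex) computed over the entire file contents."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large music files do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def id3v2_tag_size(path):
    """Return the declared size of a leading ID3v2 tag, or None if absent.

    The ID3v2 header is 10 bytes: b"ID3", two version bytes, one flags byte,
    and a 4-byte "synchsafe" size in which only 7 bits per byte are used.
    """
    with open(path, "rb") as f:
        header = f.read(10)
    if len(header) < 10 or header[:3] != b"ID3":
        return None
    size = 0
    for byte in header[6:10]:
        size = (size << 7) | (byte & 0x7F)
    return size
```

Calling `file_digests("song.mp3")` on any local file returns its two hex digests. Because the hash covers every byte, renaming a file changes nothing, while altering even one byte of the tag or the audio produces a completely different value.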
Although the ‘DRM free’ MP3s now being provided by many of the major music download companies can be played anywhere, each download is watermarked with header information specific to the exact purchase and purchaser. This article from TechCrunch gives more details on ‘dirty’ MP3s. Consequently, if you purchase a ‘DRM free’ MP3 file from iTunes and then share it, and the person who received it saves it to their iCloud, then Apple will know both (i) who shared their copy and (ii) whose copy is illegal. For files from other retailers that watermark, obtaining the same information would only require coordination with the other site.
Next, consider music purchased from sites that sell legal but ‘clean’ MP3s without watermarks. These files will have unique MD5 or SHA-2 hash values that can tie them to a particular company. They will certainly have different hashes than the watermarked versions (because of the addition of the watermark), and they will be distinct from versions of the same song encoded by others. When Apple’s servers detect a number of copies far in excess of the ‘clean’ MP3 company’s reported sales, they will know where to suspect illegal copying.
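The comparison described here is simple bookkeeping. The sketch below shows one way it could be done, purely as an assumption about how such a check might work: the account names, hash strings, and sales figures are made-up placeholders, and `copies_per_hash` and `hashes_exceeding_sales` are hypothetical helper names.

```python
from collections import Counter


def copies_per_hash(libraries):
    """Count how many distinct accounts hold a file with each hash.

    `libraries` maps an account name to the list of file hashes in it.
    """
    counts = Counter()
    for hashes in libraries.values():
        # Use a set so duplicates within one account count only once.
        counts.update(set(hashes))
    return counts


def hashes_exceeding_sales(libraries, reported_sales):
    """Return {hash: (copies_seen, copies_sold)} where copies exceed sales."""
    counts = copies_per_hash(libraries)
    return {
        h: (counts[h], sold)
        for h, sold in reported_sales.items()
        if counts[h] > sold
    }


# Made-up example data: three accounts and two tracks with reported sales.
libraries = {
    "user1": ["aaa", "bbb"],
    "user2": ["aaa"],
    "user3": ["aaa", "ccc"],
}
reported_sales = {"aaa": 2, "bbb": 5}

print(hashes_exceeding_sales(libraries, reported_sales))  # {'aaa': (3, 2)}
```

In this toy data, the file with hash "aaa" appears in three accounts although only two copies were reported sold, so it is the one that gets flagged.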
Then there will be MP3s that individuals created themselves by, for example, ‘ripping’ their CD collections. While these are not watermarked to the individual, they appear to be unique for each ‘rip’. To confirm this, I ran a test with fresh installations of the exact same CD ripping software on two different computers. I then had them rip the same track from the exact same CD using the unchanged system default settings on both computers. The MD5 hashes did not match. Small differences between the two reads, the internal timestamps, the system metadata, etc. likely resulted in the mismatch. The hash of a ripped file will almost certainly also differ from the file hashes from legal download sites, both those that watermark and those that do not. In short, if you and thousands of other people have MP3s of the same song with the same file hash value, you will not be able to credibly claim it occurred because all of you ripped it from your CD collections.
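The effect is easy to reproduce without a CD drive. The following Python sketch uses stand-in byte strings rather than real MP3 data (the "TAG:" prefix and the timestamps are invented placeholders for embedded metadata): two files with identical audio but a slightly different embedded timestamp hash differently, while a byte-for-byte copy hashes the same.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of a byte string as hex."""
    return hashlib.md5(data).hexdigest()


# Stand-in for identical encoded audio produced by both rips.
audio = bytes(range(256)) * 64

# Each rip embeds slightly different metadata ahead of the same audio.
rip_a = b"TAG:ripped 2011-06-07 10:00" + audio
rip_b = b"TAG:ripped 2011-06-07 10:05" + audio

# A shared file is a byte-for-byte copy of the original.
copy_of_a = bytes(rip_a)

print(md5_hex(rip_a) == md5_hex(rip_b))      # False
print(md5_hex(rip_a) == md5_hex(copy_of_a))  # True
```

A single changed character in the metadata is enough to produce an unrelated hash, which is why independent rips of the same CD would not be expected to collide while shared copies always do.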
MD5 hash values are a cornerstone of computer forensics and fully accepted as evidence that two files are identical copies of each other. You could claim that you didn’t download the song from the file sharing network because you were the one who uploaded it, but I doubt that will help your legal predicament.
Some people I have mentioned this concern to have essentially accused me of heresy and paranoia because “there is no way Apple would do that to their users”. Apple would not have to. It would simply have to comply with an information demand from the RIAA, which has had no problem with being seen as the bad guy in hardball enforcement against file sharing. Moreover, consider this:
- Apple is the largest music retailer on the planet.
- Apple believes, possibly justifiably, that it loses billions of dollars annually to illegal music file sharing.
- The easiest way out of a legal jam over challenged content in your iCloud storage would be to convert the suspect music by buying it from Apple. In the process, Apple becomes almost like a white knight.
Several notable commentators, such as Berklee Music chief David Kusek and publisher rights lawyer Micheal Speck, have called the iTunes Music Match service ‘amnesty for pirates’, some in favor of it and some against. I think they may be surprised at how this really plays out.