You probably won’t find a person more qualified to talk about music metadata than Chris McMurtry. As Head of Music Product at payments company Exactuals, he spends his time fixing one of the industry’s most challenging problems: incomplete or incorrect metadata that ultimately prevents artists and creators from getting paid properly.
We recently sat down with Chris to discuss how his musical background led him to become passionate about metadata, and the role that machine learning and AI technologies play in fixing the industry’s enormous data problem.
What’s your background in music and how did it lead you to Exactuals?
I’m Nashville born and raised so I’ve been in music my entire life. After high school I was on the road touring and this thing called the iPod came out, which exposed me to so much more music. I fell in love with classical music – I was introduced to Arvo Pärt and from there I got into other composers like Penderecki and Lutosławski. I wanted to become a classical composer, so I ended up quitting the band I was in and going to school to study classical composition.
When my wife suggested I would need a job in addition to composing, I responded, “Well, the only company I’d ever work for is Apple.” The next thing I know I’m at Apple for nine years as part of the Information Systems and Technology team, which was incredible. I would wake up at 4am every day so that I could compose as well. The only thing that could have taken me away from Apple was getting to work with classical music and technology, which is exactly what happened when the largest distributor in classical music brought me in to help them fix their metadata. Apple had gotten pretty strict on metadata standards by that point, and if you didn’t hit those standards the music simply would not go live and you would miss your street date. Which was pretty intimidating and could have a huge impact on your business, especially if you’re serving hundreds of labels. It took a huge amount of manual effort to fix. With classical music it’s so complex; the sheer number of performers, number of languages, number of diacritics. If it’s Beethoven’s Fifth is it the version where you’ve got four individual movements? Or did you cram the third and fourth together into one?
I really dove into metadata at that point and began writing metadata standards, specifically around classical music. I knew the human approach wasn’t going to scale, so I set out to automate the process. That’s how my first company Dart was born. Billboard called us TuneCore for classical because we were basically a distributor, which I think is a great description. It was going really well, but there was one big problem: our business model. We were charging $40 per album per year, and our target market was independent composers.
We also found that the rules engine would only get us so far. How were we going to account for anomalies like mashing the third and fourth movements together? That’s when we got into entity resolution, or the matching of disparate datasets, and machine learning to account for changes in the data. Then someone said to us, “Have you thought about using this technique to match master recording and publishing information? That would be a viable business!” Our business model then very quickly went from $40 per album per year to what a SaaS model should be.
After Dart went bankrupt we immediately started rebuilding everything, and Exactuals approached us. I’ve known Mike Hurst (co-founder and CEO) for a long time, he’s a phenomenal entrepreneur, and he was looking to bring Exactuals into the music space. They ended up acquiring RAI (Royalties.AI), our machine learning AI tool, to help them figure out who to pay.
Can you give us a bit of background on Exactuals and your role as Head of Music Product?
Exactuals has been around for nearly eight years and in September of last year Exactuals was acquired by City National Bank. Exactuals started in film and TV and the first client was the Screen Actors Guild (SAG) and AFTRA. They began digitising processes to make it easier for the payers and the payees in those industries to manage payments. They automated everything and provided transparency which the industry hadn’t seen before.
As Head of Music Product I oversee RAI. PaymentHub is our flagship payment product, and RAI resolves data for it. Our clients are publishers, labels, distributors and aggregators, as well as some third parties.
Can you explain further how RAI is helping to fix the music industry’s data problems, and the role of machine learning in that?
As I mentioned earlier, part of the reason we went to machine learning was to supplement the rules-based engine, but really there were two things that made it complex. One is the disparate nature of the data; so, all the different places that the data lives, and the differences in the data between those disparate datasets. To use a really simple example, one database might have Paul James McCartney and another one will have Paul McCartney or Paul J McCartney, and maybe those databases do or don’t have musical identifiers like an IPI number or maybe it’s a proprietary ID; and if they do how does that tie correctly to the ISNI? It’s all about resolving all of that data.
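To make the Paul McCartney example concrete, here is a minimal Python sketch of how two name strings from different databases might be flagged as candidates for the same person. It is an illustration only, not how RAI works: the function names (`normalize`, `tokens_compatible`, `names_may_match`) and the matching rule (same surname, compatible first name, middle names allowed to be missing or abbreviated) are assumptions made for this example.

```python
import unicodedata


def normalize(name: str) -> list[str]:
    """Lowercase a name, strip accents and punctuation, and split it into tokens."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in no_marks)
    return cleaned.lower().split()


def tokens_compatible(a: str, b: str) -> bool:
    """Two tokens agree if they are equal or one is an initial of the other."""
    if a == b:
        return True
    if len(a) == 1:
        return b.startswith(a)
    if len(b) == 1:
        return a.startswith(b)
    return False


def names_may_match(left: str, right: str) -> bool:
    """Hypothetical rule: same surname, compatible first name, and any
    middle names present on both sides must also be compatible."""
    l, r = normalize(left), normalize(right)
    if not l or not r:
        return False
    if l[-1] != r[-1] or not tokens_compatible(l[0], r[0]):
        return False
    # zip() stops at the shorter list, so a missing middle name is tolerated.
    return all(tokens_compatible(a, b) for a, b in zip(l[1:-1], r[1:-1]))


if __name__ == "__main__":
    for other in ("Paul McCartney", "Paul J McCartney", "Linda McCartney"):
        print(other, names_may_match("Paul James McCartney", other))
```

A string match like this can only propose candidates. Two different people can share a name, which is why the identifiers mentioned above (an IPI number, an ISNI or a proprietary ID) still have to be tied in before anyone is paid.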
The other issue is that standards are always changing, and I use the word “standards” lightly. RAI was built on the assumption that there is no such thing as a universal standard – it doesn’t exist and it will never exist, so how are we going to account for that? What we landed on was the approach we saw in social graphs and social networks. It works very well for metadata problems, because what you build is not a database of the data itself, but a database of the relationships between the data. So, when Paul James McCartney is tied to Paul McCartney on a graph, you’re looking at the edge, the relationship between the two, and that’s how it gets resolved. Machine learning was an obvious choice at that point.
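The idea of storing relationships rather than the data itself can be sketched in a few lines of Python. The example below is an assumption-laden illustration, not RAI's implementation: it uses a plain union-find structure in place of a machine-learned graph, and the class name `RelationshipGraph` and its methods are invented for this sketch. Each record is a node, each "same as" relationship is an edge, and records that are connected through any chain of edges resolve to one entity.

```python
from collections import defaultdict


class RelationshipGraph:
    """Hypothetical store of 'same as' edges between records (union-find)."""

    def __init__(self) -> None:
        self._parent: dict[str, str] = {}

    def _find(self, node: str) -> str:
        """Return the representative record for the cluster containing node."""
        self._parent.setdefault(node, node)
        root = node
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node straight at the root.
        while self._parent[node] != root:
            next_node = self._parent[node]
            self._parent[node] = root
            node = next_node
        return root

    def add_edge(self, a: str, b: str) -> None:
        """Record that a and b refer to the same entity."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_entity(self, a: str, b: str) -> bool:
        """True if a and b are linked through any chain of edges."""
        return self._find(a) == self._find(b)

    def clusters(self) -> list[set[str]]:
        """Group all known records by the entity they resolve to."""
        groups: dict[str, set[str]] = defaultdict(set)
        for node in list(self._parent):
            groups[self._find(node)].add(node)
        return list(groups.values())


if __name__ == "__main__":
    graph = RelationshipGraph()
    graph.add_edge("Paul James McCartney", "Paul McCartney")
    graph.add_edge("Paul McCartney", "Paul J McCartney")
    print(graph.same_entity("Paul James McCartney", "Paul J McCartney"))
    print(graph.clusters())
```

The useful property is that no record ever has to be rewritten to match a standard. When a new spelling or identifier turns up, it is added as another node with an edge, and the cluster it joins determines who it resolves to.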
What are some typical use cases for RAI?
There are four use cases that we see over and over. If you’re a rightsholder you have the rights information, but what you don’t know is all of the usage of your catalogue. We have great data sources and we’re able to take in both the publishing and the repertoire and then provide all of the usage details back to you. Our rightsholder clients are always surprised to find out about cover songs or lullaby versions or karaoke versions they never knew existed. So, they’re able to find money basically.
The inverse is also true. Maybe you’re a label or a distributor and you just acquired a catalogue of master recordings but the data is limited. We are able to pull in all the missing information from the repertoire side – all the UPCs, all the ISRCs, the details of who distributed them before, the P lines and the C lines, and then all of the publishing information as well. So, you’re starting with the master recording and matching to the publishing to pull it all together. With those two use cases you’re looking at a complete record from both sides of the equation.
The third use case is when distributors need to hit the metadata standards of services like Spotify or Apple in order for it to go live in the right timeframe. They will use us not only for the complete information, but to clean it up and to make sure it meets that data standard. And then the final use case is how third parties use us, which is any of the three use cases but as an API within their system.
You recently partnered with Dot Blockchain – how does that relationship work?
I’m a big fan of Benji Rogers and the entire Dot Blockchain team. They’re an excellent partner. DotBC is really going about blockchain the right way, which is for claims management. I’ve talked about the importance of having the right data, and I’ve also talked about the dynamic nature of the data. Our partners and RAI technologies are able to work together to account for dynamic, yet authoritative, information.
What do you see as the biggest issues facing music royalty systems?
I think this is an area of the music industry that is ripe for change. What we do is aggregate and resolve data, and that word resolve is very important because what it means is there are different representations and we are basically saying: OK, ignore all of these and go with this one, because it’s the correct representation.
If you’re going to pay people, you have to know who to pay, and that is based on the identifier that is registered. It’s not uncommon to see fifteen ISWC codes – which of those is the right one? The same goes for IPI numbers or any other identifier. So, in the royalties space getting that data clean and resolved is the first step, and that will then open up the door for better royalty calculations. And what that also equates to is not only getting paid accurately but getting paid faster. We all talk about this utopia of instant payments. That’s how we get there.
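As a rough illustration of what picking one identifier out of many candidates can look like, the Python sketch below counts how many independent sources report each code and only returns a winner when one code clearly leads. This is a simple voting rule assumed for the example, not the method RAI uses, and the function name `resolve_identifier`, the source names and the placeholder codes are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def resolve_identifier(
    claims: Iterable[tuple[str, str]],
) -> tuple[Optional[str], Counter]:
    """Pick the code backed by the most distinct sources.

    claims is an iterable of (source, code) pairs. Returns (winner, support),
    where winner is None if there are no claims or the top codes are tied,
    so that a person can review the ambiguous case.
    """
    # set() drops repeated (source, code) pairs so a source votes once per code.
    support = Counter(code for _source, code in set(claims))
    if not support:
        return None, support
    ranked = support.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None, support
    return ranked[0][0], support


if __name__ == "__main__":
    # Placeholder sources and codes, not real ISWC values.
    claims = [
        ("publisher_db", "CODE-A"),
        ("society_db", "CODE-A"),
        ("label_db", "CODE-B"),
        ("publisher_db", "CODE-A"),  # duplicate claim, counted once
    ]
    winner, support = resolve_identifier(claims)
    print(winner, dict(support))
```

Returning no winner on a tie is a deliberate choice in this sketch: when money depends on the answer, an unresolved record that gets reviewed is safer than a confident guess.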
As you mentioned before, there is no universally agreed-upon standard for music data, and all previous attempts to create a global music database have failed. Where do we go from here?
I don’t believe in a global rights database; I believe in using machine learning with an adaptable graph where you essentially end up with standards. While of course it is worthwhile to strive for standards in the traditional sense, and the efforts by DDEX around standards and the IFPI and CISAC regarding identifiers are very important, there’s still tremendous opportunity for human error, and the pressure to get music to the market can override the proper adherence to things like song splits for example. I am hopeful that compliance with standards and identifiers will become more automatic on a going-forward basis, but there are billions of lines of song data from the past 15-20 years that need to be corrected, and catalogues of music that stretch back a century. This is exactly the right application for AI – it can vet the data better than any human ever could.
What are your thoughts on the Music Modernization Act’s planned mechanical licensing database? Is it a futile project?
I don’t think it will be futile. I think a lot of times you’ve got to have the mission in order to accomplish the vision. Let’s say the MMA is a Ferrari. It will still be that beautiful car and do the things they want it to, but there are times you need a pickup truck or a mini-van. Don’t forget, the MMA is limited to mechanical rights, and all the direct licensing deals may continue to be administered outside of the Mechanical Licensing Collective that’s mandated in this legislation. We want to be the gas in the engine – or perhaps the better analogy would be the road – that gets all of these vehicles moving smoothly in the same direction.
This technology is now affordable and scalable, and it just takes folks to really focus on these problems with these solutions. And it’s not just us – there are others out there doing it and I think that’s a good sign.
Another thing to keep in mind, along the lines of the Paul McCartney example, is that many systems do not recognize tildes and other non-English characters in their databases. In my many years with Billboard, SGAE, SoundExchange, RMM/Universal, and as a music publisher and administrator, I have seen tildes omitted so that José Feliciano becomes ? Feliciano or Jos Feliciano. There is also the question of how, without an identifier, a payer can tell the difference between two prominent artists, Jose Feliciano and Jose (Cheo) Feliciano. There are many other similar examples. With streaming, new music discovery, and the increasing acceptance of Latin American artists and repertoire, this should also be addressed and rectified.
So delighted that Chris McMurtry is a classical composer. The word “composer” has fallen from its once vaunted heights: Spotify defines “composer” in the hip-hop mold, as any person who contributes to the music. And online distributors consider the performers of our music to be the only important search tag. THANK YOU CHRIS. We independent composers really need your help!