The fast-changing tech landscape keeps raising new challenges and opportunities, particularly in artificial intelligence. One pivotal question concerns AI’s appetite for data, specifically data from the big Hollywood studios. As AI models grow more sophisticated and need more data to train on, many find it surprising that the major film studios have not yet signed licensing agreements to feed these hungry models. The reasons behind that choice are more complex and layered than they first appear.
Legal concerns, the fear of repeating a strategic blunder like the one Hollywood made in its early dealings with Netflix, and, notably, the unresolved question of price all contribute to this hesitance. On valuation, the studios are in the dark about what their content is worth from an AI perspective. No ballpark figure or dollar range exists to help them price their content for use in training AI models.
Many have advocated for price transparency in future agreements, since the first deal could effectively set the financial bar for those that follow. With each technological advance, Hollywood has been reluctant to pin down the true worth of its offerings. That reluctance slows progress toward deals with AI companies. The studios are not idle, however; they are trying to identify the best course of action.
Asking how Hollywood’s films and TV shows should be priced for AI training draws varied responses, a sign of the disagreement and confusion surrounding the issue. The major studios are either staying silent or have declined to comment. One suggestion is that the price should be set by the maximum amount developers are willing to pay in the AI training data market.
To an AI model, the origin of a live-action shot is irrelevant; the model does not care whether the shot came from a blockbuster or an indie film. As a result, a market price would tend to value all content equally, which becomes a sticking point. The models are primarily interested in learning how the world looks, and they seek variety and volume in their data.
Others argue that leaving pricing to market forces risks undervaluing studio content. They propose that the real value be assessed in the context of existing distribution channels, such as broadcast, pay TV, and subscription video on demand (SVOD). Alternatively, the value could be calculated from the worth the studio content might confer on the developer’s product, both now and in the future.
This calculation could factor in the scale of the data licensing market, which could be 5% or 10% of the total amount AI companies spend on development. That could translate to a sum between $17.5 billion and $35 billion. The underlying question remains: do Hollywood studios genuinely understand the potential value they control?
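The range above is consistent with a total development spend of roughly $350 billion, a figure the text implies but does not state. As a rough sketch of that arithmetic, with the function and variable names being illustrative assumptions rather than anything from the source:

```python
# Back-of-the-envelope check of the licensing market range.
# All amounts are in millions of US dollars; shares are whole percentages.
# The total spend is inferred from the quoted figures, not stated in the text.

def licensing_range(total_spend_musd, low_pct=5, high_pct=10):
    """Return the (low, high) licensing market size for a given total spend."""
    low = total_spend_musd * low_pct // 100
    high = total_spend_musd * high_pct // 100
    return low, high

# Infer the total from the low end of the range: $17.5 billion at a 5% share.
low_estimate_musd = 17_500
implied_total_musd = low_estimate_musd * 100 // 5

low, high = licensing_range(implied_total_musd)
print(f"Implied total spend: ${implied_total_musd / 1000:.1f} billion")
print(f"Licensing range: ${low / 1000:.1f} billion to ${high / 1000:.1f} billion")
```

Working in whole millions keeps the arithmetic exact; a different assumed total or share simply scales the range proportionally.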
One argument suggests studios could segment their film libraries into data tokens for consumption by AI models. However, assessing a content library’s value for generative AI is unfamiliar territory for the studios, which could lead them to undervalue it. A deep understanding of that value, both current and projected, is crucial.
An upfront lump sum for data might not make for a satisfactory agreement. Consideration should also be given to a developer’s future plans to build and distribute products or services. Contract terms should therefore include an ongoing share of revenues from future AI products trained on the data the studios contribute.
Unknown factors in the ongoing development of generative AI could also lead studios to underestimate the value of their content. One significant area that many overlook is the potential role of synthetic data. Many in the AI community believe models will soon be trained exclusively on high-quality synthetic material. Such possibilities should be considered when setting the price for studio content used in AI training.
Likewise, it is crucial to ask whether synthetic data generated from original IP and name, image, and likeness (NIL) data should carry a permanent royalty and warrant protections on its usage. Agreement on price could be the vital bridge to closing a deal.
Closing a deal comes down to the price being high enough to justify the risks for the studios, including intensifying competition, straining the industry, and potentially violating existing contracts. The question then becomes what market value would be sufficient to motivate the studios to sign. A trivial amount certainly won’t entice them, but a significant sum might encourage problem-solving and prompt action.