“Move fast and break things” was Facebook’s motto until 2014.
This phrase has long encapsulated how Silicon Valley’s disruptive startups approach innovation. Today, it is more relevant than ever, with AI companies rapidly advancing and challenging the status quo of the news industry once again.
Like the tech giants before them, AI companies have swiftly moved forward with little regard for existing copyright laws or ethical principles. Some are scraping the entire internet, including copyrighted material, books, and news articles behind paywalls. This aggressive approach aims to maintain momentum in a race between leading AI companies.
We have reached a critical juncture: AI companies keep pushing boundaries with increasingly powerful models, while publishers and the traditional establishment struggle to keep pace. Publishers are caught between a rock and a hard place, either striking licensing deals with AI companies or pursuing legal action against them.
So, what exactly is happening with these AI deals, and what can publishers learn from them?
AI companies, most notably OpenAI, are forming licensing agreements with major publishers to use their content to train AI models and provide real-time, authoritative information through AI applications like ChatGPT. Other AI companies, such as Perplexity AI, are lagging behind.
OpenAI has strategically partnered with major media companies worldwide, integrating their trusted journalism into its AI models like ChatGPT. These partnerships provide OpenAI access to high-quality, curated content, ensuring that users receive accurate, timely, and contextually relevant information. Below is a comprehensive list of these partnerships, showcasing OpenAI’s global reach and its commitment to enhancing AI-driven information delivery.
OpenAI’s Preferred Publisher Program, detailed in a leaked pitch deck reported by AdWeek, includes:
As the AI industry continues to grow, several major news publishers have escalated their efforts to protect their copyrighted content by filing lawsuits and issuing cease and desist notices against AI firms like OpenAI and Perplexity AI. These legal actions are central to defining how copyrighted materials can be used in AI training and whether these practices infringe upon publishers’ rights.
The New York Times has been at the forefront of legal battles against AI companies. Dissatisfied with licensing offers, NYT filed a lawsuit against OpenAI, claiming that the company used millions of its articles without permission to train its language models. The lawsuit argues that scraping and repurposing NYT’s content for commercial gain without compensation violates copyright law. NYT contends that such actions undermine its business model by diverting traffic and revenue away from its site, while OpenAI profits from the content.
In addition to the ongoing lawsuit against OpenAI, NYT also issued a cease and desist notice to Perplexity AI on October 2, 2024. The letter, a copy of which was shared with Reuters, demanded that Perplexity immediately stop using its content for generative AI purposes. NYT alleges that despite assurances from Perplexity that it would cease “crawling” its website, the newspaper’s content continues to appear on the AI platform. NYT claims that Perplexity’s use of its material, including generating summaries and other outputs, violates copyright law.
Perplexity responded by stating that it is not scraping data for building foundation models but rather indexing web pages to provide factual citations in response to user queries. Despite these claims, NYT’s notice demands transparency from Perplexity, requesting detailed information on how it accesses the publisher’s website despite the newspaper’s efforts to prevent scraping. The notice gave Perplexity until October 30 to respond to these demands.
These legal actions emphasize NYT’s aggressive approach in defending its content and setting a precedent for other publishers who may face similar issues with AI firms. The outcome of these cases will likely impact not just OpenAI and Perplexity, but the entire industry’s approach to data usage.
News Corp, the parent company of the Wall Street Journal (WSJ) and the New York Post, filed a lawsuit against Perplexity AI in October 2024, accusing the AI startup of copyright infringement. News Corp’s complaint alleges that Perplexity uses its copyrighted material without authorization, harming its business by diverting traffic that would otherwise go to its websites. The publishers argue that Perplexity’s practice of reproducing full articles and falsely attributing quotes to WSJ content constitutes a violation of copyright law.
News Corp’s lawsuit follows a cease and desist letter it sent to Perplexity earlier this year, which offered an opportunity to negotiate a licensing agreement. Perplexity did not respond, prompting the lawsuit. News Corp is seeking an injunction to block Perplexity’s use of its content, the destruction of any databases containing its material, and damages of up to $150,000 per infringement.
Smaller digital publishers, such as The Intercept and Raw Story, have also taken legal action against OpenAI, citing unauthorized use of their articles. These publishers argue that the use of their content to train AI models threatens their revenue streams, as AI-generated summaries compete directly with their original work, reducing site traffic and ad revenue. The lawsuits aim to compel AI firms to either license the content or stop using it entirely.
These actions reflect a broader strategy among independent media organizations to protect their intellectual property rights and ensure that their journalism is not repurposed without compensation.
Choosing between partnering with or fighting tech giants is not a new dilemma for news publishers. Google has been siphoning traffic away from publishers for over a decade. Similarly, Facebook convinced publishers to use its platform with promises of fair compensation, only to later reduce the traffic share to external websites to a trickle. There is nothing to prevent AI companies from acting similarly, potentially leaving publishers with further losses in traffic and revenue.
Integrating AI into search engines and content platforms provides new potential revenue streams for publishers through licensing, but it also poses a threat by eventually reducing direct traffic to publisher websites. What will happen once these deals expire in a few years? Will AI companies still be interested in renewing them, or will they, as Facebook has in the past, blacklist or completely ignore news publishers unless they agree to even more unfavorable terms?
The industry is divided on whether to pursue legal action or strike deals with AI companies, a decision that will shape the future dynamics between AI developers and content publishers.
Insights from a panel at the International Journalism Festival 2024 reaffirm that publishers have struggled with platforms like Google and Meta over fair compensation for the news they use and profit from. Studies suggest that Google and Meta downplay the value news provides to their platforms. Payments to publishers vary greatly, with large publishers receiving more favorable deals, while smaller and middle-income publishers get much less.
Here’s a quick recap of what is happening between publishers, Google, and Meta around the world:
What can publishers do to ensure their journalism is compensated fairly? Publishers should seek non-exclusive deals to maintain leverage and avoid being locked into unfavorable terms with a single AI company. Despite its challenges, collective negotiation might provide a better bargaining position for smaller publishers.
Transparency in deals and fair distribution of funds are essential to ensure money does not just reach the largest media companies or hedge fund-owned entities. A potential solution to this issue is TollBit, a marketplace allowing publishers to license their verified content to AI companies dynamically, based on keyword trends, helping monetize content and provide AI firms with high-quality data.
The “move fast and break things” mentality is once again reshaping the news industry in profound ways. While tech giants challenge the status quo and force innovation, the persistent issue of fair compensation for publishers remains critical.
What will happen once the two-, three-, or five-year deals that are being signed now expire? For high-quality journalism to continue playing a vital role in society, several steps are essential:
The future of journalism hinges on a balanced approach that benefits both AI companies and publishers. Large publishers can secure lucrative deals now, but they must remain vigilant to ensure these agreements continue to be favorable in the long term. Smaller publishers should collaborate to enhance their bargaining power for fair compensation.
Ultimately, demand for high-quality, real-time data is likely to persist. Publishers must be ready to negotiate and adapt, ensuring that journalism is fairly compensated. These strategies will allow the news industry to maximize the benefits of AI technology while protecting the value and integrity of quality journalism.