News In Brief Technology and Gadgets

Meta Plans to Pay News Outlets for Higher Quality AI Training Data

27 May 2024

News Synopsis

Meta is considering paying news outlets for access to their content to improve its AI models. Internal discussions are ongoing, but no agreements have been made. High-quality data is seen as crucial to staying competitive in AI development.

Meta's New Strategy

Potential Payment to News Organizations

Meta, the parent company of Facebook, is reportedly considering paying news organizations to access their content to better train its AI language models.

Objective:

The move aims to enhance Meta's generative AI tools, including Meta AI, to make them more effective and competitive against similar tools offered by Google and Microsoft.

Internal Discussions and Sources

Ongoing Discussions

Report by Business Insider: Citing two sources familiar with the matter, the outlet reports that internal discussions are ongoing within Meta about potentially striking deals with news publishers.

Nature of Agreements: Any agreements for data access would be distinct from previous deals where Meta compensated publishers to host links on its platforms.

Current Status: Meta has not yet approached any news outlets about licensing content.

Possibility of Compulsion

  • Source Insights: One of the sources mentioned that Meta might eventually be compelled to pay for access to high-quality news, photo, and video content.

Current AI Training Practices

Existing Data Utilization

Meta's Proprietary Data: Meta's AI model training currently involves using its own data. CEO Mark Zuckerberg has claimed that the company has a larger dataset for training its Llama models than Common Crawl, a widely used collection of scraped web data.

Quality Concerns: Despite having a large dataset, there are internal concerns about the quality of Meta's proprietary data.

Quality of Data Sources

  • Data Quality Issues: Posts and comments on Facebook and Instagram may not provide the high-quality training data needed for generative AI chatbots and search tools.

  • Structured Sources: More structured sources such as books, news articles, and essays are considered better for producing quality outputs.

OpenAI's Agreement with News Corp

  • Recent Deal: OpenAI recently signed a multi-year agreement with News Corp, a media giant that owns publications such as The Wall Street Journal, MarketWatch, and The Sun.

  • Deal Details: The deal is reportedly valued at $250 million over five years and grants OpenAI access to content from over a dozen news publishers.

  • Impact on AI Training: This partnership underscores the importance of high-quality content in training advanced AI models.

Meta's Competitive Position

  • Trend Highlight: The agreement between OpenAI and News Corp highlights a growing trend toward using high-quality content to train AI models, and the perceived need for it.

  • Meta's Navigation: Meta is navigating a competitive landscape where access to superior training data can significantly impact the effectiveness and competitiveness of AI tools.

Future Focus

  • Enhancing AI Tools: By potentially paying news organizations for content, Meta aims to enhance its AI tools to remain competitive.

  • Strategic Importance: Access to high-quality data is increasingly recognized as a critical factor in developing advanced AI models.

Next Steps

  • Awaiting Agreements: Meta has not yet formalized any agreements with news outlets.

  • Ongoing Evaluation: The company continues to evaluate the best strategies for improving its AI training data quality to ensure its generative AI tools can produce high-quality outputs.

Conclusion:

Access to high-quality data is pivotal for Meta's AI advancements. The company is navigating a competitive landscape and evaluating strategies to secure top-tier training data for its generative AI tools.