Contributing Writer
https://x.com/ScottNover
https://www.linkedin.com/in/scottnover/
Scott Nover
Scott Nover is the lead writer for GZERO AI. He's a contributing writer for Slate and was previously a staff writer at Quartz and Adweek. His writing has appeared in The Atlantic, Fast Company, Vox.com, and The Washington Post, among other outlets. He currently lives near Washington, DC, with his wife and pup.
May 07, 2024
Artificial intelligence systems are trained on massive troves of data, but they could use some expert advice. After all, not all data is created equal.
Take the written word: There’s a difference between training on tweets, New York Times articles, classic literature, Wikipedia entries, and academic journals.
AI companies seem to be missing the good stuff: expertise. But expertise is often proprietary, sold by companies or kept behind tight paywalls, and isn’t easy to scrape for training. One company, Gretel, told Fast Company that its platform can be used to anonymize expert data so it can be sold to AI firms.
We’ve already seen a land grab for high-quality training data, with AI firms trying to sign licensing deals with news publishers, though some publishers have opted for copyright litigation rather than taking the money. Could we see an AI company buy up a news publisher, a social media site, or even a publishing house? Facebook parent company Meta reportedly explored buying the publishing giant Simon & Schuster to train its AI systems. A similar acquisition might not be too far off.