Looking inside the black box

Looking into the code.
DPA via Reuters
One of the biggest challenges facing artificial intelligence companies is that they don’t know everything about their algorithms. This so-called black box problem is exacerbated by the fact that deep learning models do precisely that — they learn. And when they learn they change. They take in enormous troves of data, detect patterns, and spit something out: How a sentence should read, what an image should look like, how a voice should sound.

But now researchers at Anthropic, the AI startup that makes the chatbot Claude, claim they’ve had a breakthrough in understanding their own model. In a blog post, Anthropic researchers disclosed that they’ve found 10 million “features” of their Claude 3 Sonnet language model: patterns that activate when a user inputs something the model recognizes. They’ve been able to map features that are close to one another: One for the Golden Gate Bridge, for example, is close to others for Alcatraz Island, the Golden State Warriors, California Governor Gavin Newsom, and the Alfred Hitchcock film Vertigo, which is set in San Francisco. Knowing about these features allows Anthropic to turn them on or off, manipulating the model to break out of its typical mold.

This development offers hope that the companies behind powerful generative AI models will soon have much more control over their creations, as MIT professor Jacob Andreas told the New York Times. “In the same way that understanding basic things about how people work has helped us cure diseases,” Andreas said, “understanding how these models work will both let us recognize when things are about to go wrong and let us build better tools for controlling them.”
