YOU ARE AT:AI InfrastructureBookmarks: Trusting AI — confidence without comprehension

Bookmarks: Trusting AI — confidence without comprehension

Editor’s note: I’m in the habit of bookmarking on LinkedIn and X (and in actual books) things I think are insightful and interesting. What I’m not in the habit of doing is ever revisiting those insightful, interesting bits of commentary and doing anything with them that would benefit anyone other than myself. This weekly column is an effort to correct that.

We are deploying artificial intelligence (AI) systems at remarkable speed—for consumers, for enterprises, for governments. But we still don’t fully understand how these systems work. So how do we develop trust in tools whose inner logic remains largely invisible?

We want to trust AI, but we don’t yet understand how it thinks

In a recent Newsweek op-ed, former Nokia CTO and Nokia Bell Labs President Marcus Weldon makes a compelling case for a set of foundational principles to guide future AI development. These include concepts like agency, societal alignment, and transparency. For the piece, Weldon interviewed luminaries Rodney Brooks, David Eagleman, and Yann LeCun.

As Brooks put it to Weldon, in a section discouraging “magical thinking” as it relates to how humans appraise the intelligence of an artificial intelligence solution, “When we don’t have a model and can’t even conceive of the model, we of course say it’s magic. But if it sounds like magic, then you don’t understand…and you shouldn’t be buying something you don’t understand.” 

That juxtaposition is telling. On one hand, we are calling for constitutional frameworks, ethics declarations, and principled development. On the other hand, we admit we don’t know how our most advanced systems reach conclusions. We’re writing rules for tools whose reasoning processes remain opaque. It’s not wrong to do this—but it is unsettling. And it invites a question: can trust be built on principles alone, without understanding?

Trust is rising, comprehension is not

According to Stanford’s 2025 AI Index, produced by the university’s Human-centered AI (HAI) inter-disciplinary institute, trust “remains a major challenge.” This despite ramping investments from virtually every stakeholder in global commerce. And “AI optimism” is also on the rise. “Since 2022, optimism has grown significantly in several previously skeptical countries—including Germany (+10%), France (+10%), Canada (+8%), Great Britain (+8%), and the United States (+4%).”

However, as investment and “optimism” increase, “Trust remains a major challenge,” the report authors wrote. “Fewer people believe AI companies will safeguard their data, and concerns about fairness and bias persist…In response, governments are advancing new regulatory frameworks aimed at promoting transparency, accountability, and fairness.” 

To say that in a different way, there’s a gap between confidence and comprehension. This is a potentially dangerous divergence. Trust is often earned through consistency and, in many ways, is regarded as a proxy for utility—if a solution works well enough, often enough, we trust it more and more. However, as AI systems become increasingly complex, the ability of humans to understand how these systems work is not keeping pace; it’s potentially declining as use increases and the utility-as-proxy for trust effect becomes embedded in users’ minds. 

Models can appear aligned, but that doesn’t mean they are

A recent research paper, Randomness, Not Representation: The Unreliability of Evaluating Cultural Alignment in LLMs, adds another wrinkle to the trust dilemma. The authors tested how reliably large language models aligned with different cultural values. What they found was deeply inconsistent. The models’ responses varied wildly depending on prompt phrasing, sampling method, and even randomness in generation.

What does that mean? It means that, “It is appealing to assume that modern LLMs exhibit stable, coherent, and steerable preferences…However, we find that state-of-the-art LLMs display surprisingly erratic cultural preferences. When LLMs appear more aligned with certain cultures than others, such alignment tends to be nuanced and highly context-dependent…Our results caution against drawing broad conclusions from narrowly scoped experiments. In particular, they highlight that overly simplistic evaluations, cherry-picking, and confirmation biases may lead to an incomplete or misleading understanding of cultural alignment in LLMs.” 

Whether we’re considering how well we understand the inner workings of a particular model or solution, or if we’re zooming in on cultural alignment, the throughline is that state-of-the-art models don’t really operate consistently. That presents a challenge. If we don’t know how it works, and it doesn’t work with any consistency, how do we reliably audit it? And, barring the ability to reliably audit it, who becomes the arbiter of trust? In many ways this is more a question of epistemology than technology. 

Trust is a spectrum

This column is generally about trusting artificial intelligence. To bring that concept into the real world, I’m a heavy AI user. In preparing this column, I asked ChatGPT 4o to summarize the Newsweek article by Weldon and the research paper on cultural alignment. ChatGPT confidently attributed the article to Geoffrey Hinton, not Marcus Weldon, and peppered its summary with quotations that were not present in the article. For the research paper, ChatGPT provided a believable summary of a paper titled, Reflections on GPT-4 on Logical Reasoning Tasks. Not only is that not the paper I asked it to summarize, it’s not a paper that seems to exist. 

I don’t trust AI even though I use it a good deal. I don’t have that utility-as-proxy for trust problem because, by the nature of my work, I verify everything. Good thing too, because my lack of trust and drive to verify consistently proves essential to producing things that are correct (I hope) and grounded in reality rather than in hallucination. 

Back to Weldon, not Hinton. Based on his interviews, he came away with two ways to evaluate the intelligence of a system: first, establishing the “level of intelligence demonstrated in any given domain by identifying whether a system is just curating information, creating knowledge or generating new conceptual or creative frameworks in different domains of expertise.

“Second…look[ing] at the type of intelligence process that the AI uses to ‘think’ about a problem. The two are complementary—one focusing on the ‘what’ (was demonstrated) and the other focused more on the ‘how’ (it was produced). But there is clearly more thinking to be done to create a general-purpose methodology for accurately judging the intelligence of any system in any domain, and to eliminate hyperbolic or ‘magical’ conjecture.” 

Policies and principles matter. But until we can open the black box a little further, we should be clear-eyed that trust is not binary. It’s a spectrum. We’ve taught machines to predict. Now we have to decide how much to believe them—and how much to verify.

For a big-picture breakdown of both the how and the why of AI infrastructure, including 2025 hyperscaler capex guidance, the rise of edge AI, the push to artificial general intelligence (AGI), and more, check out this long read.

ABOUT AUTHOR

Sean Kinney, Editor in Chief
Sean Kinney, Editor in Chief
Sean focuses on multiple subject areas including 5G, Open RAN, hybrid cloud, edge computing, and Industry 4.0. He also hosts Arden Media's podcast Will 5G Change the World? Prior to his work at RCR, Sean studied journalism and literature at the University of Mississippi then spent six years based in Key West, Florida, working as a reporter for the Miami Herald Media Company. He currently lives in Fayetteville, Arkansas.