What’s actually working, what’s failing, and what to do about it
By Mike DiNapoli
Let me save you some time.
If you work in government and you’ve been to a conference in the last year, you’ve heard some version of this: “AI is going to transform public service.” Probably accompanied by a slide deck full of buzzwords and a vendor in the lobby ready to sell you the future.
I’m not here to sell you anything. I’m here to tell you what I’m actually seeing — the good, the bad, and the stuff nobody wants to talk about on a panel.
Because here’s the truth: AI in government is simultaneously more promising and more dangerous than most people realize. And the leaders who figure out the difference are going to have a massive advantage over the ones who don’t.
What’s Actually Working
Let’s start with the good news. Because there is some.
Government agencies are quietly getting real results from AI in a handful of areas. Not theoretical results. Not pilot program results. Actual, measurable improvements.
Document processing. Federal agencies are using AI-driven document processing to digitize and classify millions of tax returns, with significant reductions in manual data entry. If you’ve ever watched a government employee manually key data from a paper form into a database — and I have, many times — you understand why this matters. It’s not glamorous. But it’s saving thousands of hours of human labor that can be redirected to work that actually requires judgment.
Citizen services. AI-powered routing systems are automatically directing resident inquiries to the right department, categorizing permit applications, and bridging legacy databases with modern systems. The boring plumbing work that makes government actually function.
Workforce augmentation. With chronic staffing shortages across the public sector, agencies are using AI to handle routine processing so that limited human staff can focus on complex, high-empathy case management. As Darryl Polk of the Public Technology Institute put it, “The most valued use of AI that I’m seeing is augmenting human services.” This is the use case that makes the most sense to me — not replacing people, but freeing them up to do the work that only people can do.
The pattern across all of these? They’re high-volume, rules-based, low-risk processes. Nobody is using AI to make sentencing decisions or allocate emergency resources. The wins are coming from automating the paperwork so humans can focus on the people.
What’s Going Wrong
Now the part that should keep you up at night.
New York City launched an AI chatbot called MyCity to help residents and business owners navigate city services. It cost nearly $600,000 to build. When users asked whether buildings were required to accept Section 8 housing vouchers, the bot told them no — landlords don’t have to accept these tenants. That’s not just wrong. In New York City, it’s illegal for landlords to discriminate by source of income. An official city resource was giving illegal advice at scale.
It got worse. The same chatbot told employers they could fire employees who filed harassment complaints. It told restaurants they could refuse cash payments. All in direct conflict with city law. When ten reporters from The Markup independently asked the same Section 8 question, the bot told all ten of them the wrong answer. The city’s eventual response? Adding a disclaimer that the chatbot “may sometimes” give “inaccurate or incomplete” responses.
In Canada, the Canada Revenue Agency spent over $18 million building a tax chatbot called “Charlie.” When the Auditor General tested it, Charlie gave accurate answers only 44 percent of the time. Let that sink in: a government tax tool, built with public money, that was wrong more often than it was right.
And in Sweden last August, Prime Minister Ulf Kristersson admitted to using ChatGPT and the French chatbot Le Chat "quite often" for "second opinions" on policy decisions. Think about that: a head of government running policy questions through the same class of tools that just gave New Yorkers illegal housing advice and Canadians wrong tax answers.
These aren’t hypothetical risks. They already happened. And they share a common pattern: organizations deployed AI tools without fully understanding what they could and couldn’t do, without adequate testing, without human oversight, and without thinking through what happens when the system is confidently, authoritatively wrong.
The Governance Gap Nobody Wants to Talk About
Here’s what concerns me most.
According to the NASCIO 2025 survey, 88 percent of state CIOs say they’ve completed guidelines for responsible AI use. That sounds great on paper. But having guidelines and having actual governance are two very different things.
A guideline says “we will use AI responsibly.” Governance means someone specific is accountable when things go wrong. It means there are processes for testing, monitoring, and pulling the plug. It means someone has asked — and answered — questions like:
Who reviews what the AI produces before citizens see it?
What happens when it’s wrong?
How do we know it’s wrong if nobody checks?
Who is personally accountable?
That last question sounds harsh. But it’s the one that separates real governance from a PDF nobody reads. If nobody’s job is on the line when the AI gives illegal housing advice to thousands of people, you don’t have governance. You have a press release.
I spent years managing risk in financial institutions where regulators could — and did — hold individuals personally accountable. That kind of clarity focuses the mind. Government AI needs the same. And that third question, how do we know it's wrong, has a concrete, unglamorous answer, sketched below.
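You check, on a schedule, with math. Here's a minimal sketch of what that can look like: pull a random sample of the system's answers each day, have humans grade them, and estimate the true error rate before anyone declares victory. Every name in it is hypothetical; this illustrates the approach, not a production tool.

```python
import math
import random

def sample_for_review(responses: list[str], n: int = 50) -> list[str]:
    """Pull a random daily sample of AI responses for human grading."""
    return random.sample(responses, min(n, len(responses)))

def error_rate_interval(errors: int, sampled: int, z: float = 1.96):
    """95% confidence interval (normal approximation) for the error rate."""
    p = errors / sampled
    margin = z * math.sqrt(p * (1 - p) / sampled)
    return max(0.0, p - margin), min(1.0, p + margin)

# Example: human reviewers flag 12 of 50 sampled answers as wrong.
low, high = error_rate_interval(errors=12, sampled=50)
print(f"Observed error rate: 24% (95% CI: {low:.0%} to {high:.0%})")
# If even the low end of that interval exceeds your tolerance,
# a named person escalates. That's governance, not a PDF.
```

Fifty graded answers a day won't catch everything. But it beats finding out from a newspaper.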
Why “Agentic AI” Should Make You Nervous and Excited
The latest development is “agentic AI” — systems that don’t just recommend actions but actually take them. Autonomously.
Instead of an AI suggesting that a permit application should be fast-tracked, an agentic system would actually fast-track it. Instead of flagging a suspicious transaction, it would freeze the account.
According to Gartner, 40 percent of enterprise applications will integrate task-specific AI agents by the end of 2026, up from less than 5 percent in 2025. State and local agencies are already deploying these systems for workflow automation, citizen services, and resource optimization.
But here’s what the International AI Safety Report — published February 3, 2026, led by Turing Award winner Yoshua Bengio and authored by over 100 AI experts from more than 30 countries — makes clear: AI agents pose heightened risks because they act autonomously, making it harder for humans to intervene before failures cause harm.
Read that again. Harder for humans to intervene before failures cause harm.
We’re talking about government systems — systems that affect people’s benefits, their housing, their taxes, their freedom — operating with increasing autonomy and decreasing human oversight. The report also notes that current AI systems sometimes fabricate information, produce flawed code, and give misleading advice. Current techniques can reduce failure rates, but not to the level required in many high-stakes settings.
That’s not inherently bad. But it requires a level of governance sophistication that most agencies haven’t built yet.
A Practical Framework (No Buzzwords)
If you’re a government leader trying to figure out what to actually do, here’s the framework I’d suggest. It’s not revolutionary. It’s just practical.
Start with the boring stuff. The best AI use cases in government are the ones nobody writes articles about: data entry, document classification, routing, scheduling. Start there. Get wins. Build confidence. Build institutional knowledge about how AI systems actually behave in your environment.
Audit before you buy. Before you sign a contract with any AI vendor, answer these questions: What happens when the system is wrong? How will we know? Who’s responsible? What’s our rollback plan? If the vendor can’t answer these clearly, walk away.
Build literacy, not just tools. The biggest challenge isn’t technology. It’s people. Your workforce needs enough AI understanding to use the tools effectively, spot problems, and know when to escalate. This doesn’t mean everyone needs to become a data scientist. It means everyone needs to know what the AI can and can’t do — and what to do when it gets something wrong.
Separate augmentation from automation. Augmentation means AI helps a human make a better decision. Automation means AI makes the decision. These require fundamentally different governance approaches. Don't blur them; the sketch after this framework shows why.
Create real accountability. Not a committee. Not a task force. A person. Someone whose name goes on it when the AI tells citizens something wrong. Someone who has the authority to shut it down. The NASCIO data showing 88 percent of states with AI guidelines is a start, but guidelines without individual accountability are just paper.
Plan for the politics. AI in government isn’t just a technology decision. It’s a political one. Public trust is at stake. When the AI gets it wrong — and it will — how will you communicate? Who will answer the media’s questions? Plan for that.
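To make the augmentation-versus-automation point concrete, here's a minimal sketch. Everything in it is hypothetical, down to the function names; what matters is where the controls live, not the specific code.

```python
from dataclasses import dataclass

@dataclass
class Recommendation:
    action: str        # e.g., "fast_track_permit"
    confidence: float  # model's self-reported confidence, 0.0 to 1.0
    rationale: str     # plain-language explanation a reviewer can read

def augmented_flow(rec: Recommendation, reviewer: str) -> str:
    """Augmentation: the AI recommends; a named human decides."""
    print(f"Routing to {reviewer}: {rec.action} ({rec.rationale})")
    return "pending_human_review"  # nothing happens until a person acts

KILL_SWITCH = False  # one accountable owner has the authority to flip this

def automated_flow(rec: Recommendation, threshold: float = 0.95) -> str:
    """Automation: the AI decides, so the controls live in the pipeline."""
    if KILL_SWITCH or rec.confidence < threshold:
        # Fall back to the augmentation path instead of acting alone.
        return augmented_flow(rec, reviewer="case_worker_on_duty")
    print(f"AUDIT: {rec.action} confidence={rec.confidence:.2f}")  # log first
    print(f"EXECUTING: {rec.action}")  # the system acts with no human gate
    return "executed"

# The same recommendation creates two very different governance problems.
rec = Recommendation("fast_track_permit", 0.97, "complete application, low risk")
augmented_flow(rec, reviewer="permits_supervisor")
automated_flow(rec)
```

Notice the asymmetry. In the augmented path, governance is mostly about training the reviewer. In the automated path, the threshold, the audit log, and the kill switch are the governance, and someone has to own them.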
The Opportunity Is Real. So Is the Risk.
I’m not an AI skeptic. I’ve written about how AI is transforming the workplace, and I believe it will keep doing so.
But I’m also not an AI evangelist. I’ve spent enough time in both finance and government to know that technology is never the hard part. People are. Process is. Accountability is.
The agencies that will get this right are the ones approaching AI the way good government approaches anything: carefully, transparently, with clear accountability and genuine concern for the people being served.
The ones that will get it wrong are the ones chasing headlines, signing vendor contracts they don’t understand, and deploying systems nobody is monitoring.
Gartner also predicts that over 40 percent of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls. That tells you something important: even the industry analysts expecting massive growth are also expecting massive failure.
The stakes are too high. The people depending on these systems deserve better than a chatbot that’s right only 44 percent of the time, built with $18 million of their tax dollars.
They deserve leaders who take this seriously enough to do it right.
Mike DiNapoli is President & COO of Marcman Solutions. He spent 23 years on Wall Street before serving in Florida state government, including as Director at FloridaCommerce and Chairman of the Florida Development Finance Corporation. He writes about technology, leadership, and public sector transformation.
Connect on LinkedIn.