Mr. Aayush Bhatt
June 11, 2026 · 10 min read
The US Government Is Now Using ChatGPT to Audit All 50 States for Healthcare Fraud
HHS launched AERO on May 21, 2026 — using ChatGPT to scan five years of healthcare audits across all 50 states. Here's what it means.
Introduction: The Audit Reports Nobody Was Reading
Every year, thousands of reports land on desks inside the United States Department of Health and Human Services. States, local governments, universities, nonprofits, and hospitals that receive at least one million dollars in federal funding are legally required to submit annual audit reports documenting how every dollar was spent. The system exists to catch fraud, waste, and financial mismanagement before they compound over years into losses that the taxpayer can never recover.
The problem is that nobody was actually reading them.
"It's classic big government," said Gustav Chiarello, HHS Assistant Secretary for Financial Resources, in a May 2026 interview. "Everyone files an audit and it lands with a thud and no one does anything about it." On May 21, 2026, Chiarello announced that the United States government had found a solution to that problem — and it is called ChatGPT. The initiative, formally named AERO — Audit Enforcement and Risk Oversight — is now using generative AI to systematically scan at least five years of audit history across all 50 states, covering every entity that receives federal healthcare funding. The scale of what the government is attempting, and the questions it raises, are both enormous.
What AERO Is and Who Is Running It
AERO is a department-wide program launched by HHS's Office of the Assistant Secretary for Financial Resources. Its stated mission is to hold states and federal grantees accountable for a backlog of audit noncompliance that, in many cases, has been accumulating for years without consequence. Initial findings from the program revealed that states and grantees have consistently failed to fix serious internal control problems, with some deficiencies persisting for three, four, or even five or more consecutive years without resolution. Hundreds of organizations had not submitted required audits on time, and some reports were overdue by more than two years.
Chiarello, a career financial executive who was appointed to lead HHS's financial resources office under the Trump administration, is the public face of the initiative. He described the problem in direct terms: "Years of audit reports documented serious vulnerabilities and failures in oversight, yet states and grantees faced little to no consequences. Following revelations of fraud in various states, we examined their audits more closely and found years of unresolved findings hiding in plain sight." HHS has already sent formal letters to all 50 state governors and treasurers putting them on notice. The letters were blunt in tone, warning that "chronic audit noncompliance, unresolved findings, and delinquent audit submissions will no longer be tolerated." What the letters did not include was a specific deadline for corrective action or a precise timeline for when HHS would begin withholding funds — a deliberate ambiguity that compliance experts say significantly increases the pressure on states to act preemptively.
How ChatGPT Is Being Used to Analyze Audit Reports
The AI deployment at the center of AERO is not a custom-built government system developed over years of procurement. The Wall Street Journal, which first reported the details of the program, confirmed that the tool was built in part using ChatGPT — the commercial large language model developed by OpenAI. HHS describes its approach as using "next-generation AI analytical tools" to ingest and analyze the kind of lengthy, exhaustive audit documents that human reviewers historically lacked the time and capacity to work through at scale.
In practice, the system reads audit reports and scans for patterns that indicate financial irregularities: internal control weaknesses that recur across multiple years, compliance failures that were flagged and never resolved, spending categories with unusual variance, and grantees whose documentation does not support the claims they are making. The volume of material involved is significant. Every entity that receives at least one million dollars annually in federal funds must file a Single Audit, a standardized, comprehensive report covering all federal program expenditures. Thousands of these reports are submitted each year, and the backlog that AERO is working through represents years of accumulated data that no human team had ever fully reviewed. Chiarello's summary of the AI advantage is accurate in its basic premise: "Here, with AI, we're able to dig into it."
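To make the "recurring across multiple years" pattern concrete, here is a minimal Python sketch. HHS has not published how its tool works, so this is not AERO's method. It assumes findings have already been extracted from the reports into simple records, and every name and record in it (the grantees, the finding codes, `longest_streaks`, `repeat_findings`) is hypothetical. It shows only the deterministic step that could follow extraction: counting how many consecutive years the same finding shows up for the same grantee.

```python
from collections import defaultdict

# Hypothetical, simplified records: (grantee, fiscal_year, finding_code).
# Real Single Audit findings are far richer; these values are illustrative only.
findings = [
    ("State A", 2021, "internal-control-01"),
    ("State A", 2022, "internal-control-01"),
    ("State A", 2023, "internal-control-01"),
    ("State A", 2023, "reporting-07"),
    ("State B", 2021, "eligibility-03"),
    ("State B", 2023, "eligibility-03"),
]


def longest_streaks(records):
    """Return {(grantee, finding_code): longest run of consecutive years}."""
    years_by_key = defaultdict(set)
    for grantee, year, code in records:
        years_by_key[(grantee, code)].add(year)

    streaks = {}
    for key, years in years_by_key.items():
        best = 0
        for year in years:
            # Only start counting from the first year of a run.
            if year - 1 in years:
                continue
            length = 1
            while year + length in years:
                length += 1
            best = max(best, length)
        streaks[key] = best
    return streaks


def repeat_findings(records, min_years=3):
    """Keep only findings that recur at least min_years in a row."""
    return {k: n for k, n in longest_streaks(records).items() if n >= min_years}


flagged = repeat_findings(findings)
print(flagged)
```

With the sample records, only State A's internal control finding is flagged, because it appears in three consecutive years. State B's finding appears in 2021 and 2023 but not 2022, so it does not count as a run. A check like this is cheap to verify by hand, which matters when the output may feed an enforcement decision.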
The enforcement consequences for states and grantees found to have unresolved deficiencies vary widely in severity. HHS has the authority to temporarily withhold payments, disallow costs, suspend or terminate awards, and initiate debarment proceedings that could permanently exclude an organization from receiving federal funds. In May 2026, CMS Administrator Mehmet Oz and Vice President JD Vance formally threatened to withhold Medicaid funds from all 50 states if they fail to comply with federal anti-fraud statutes. The administration has already demonstrated it is willing to act on that threat: Medicaid funding deferrals were issued to Minnesota and California in February 2026, and a freeze on new Medicare enrollment for certain provider categories followed in May.
How Much Fraud Is Actually Being Targeted
The numbers Chiarello has cited publicly are large. He estimated that wasteful or fraudulent spending across the programs HHS oversees runs between $100 billion and $200 billion a year. That range, which he gave to the Wall Street Journal, encompasses Medicaid, Medicare, research grants, addiction services funding, and a broad ecosystem of federal healthcare support that flows through every state.
The 2025 enforcement baseline that CMS cited in its own documentation gives a sense of the scale of activity even before AERO launched. In 2025, CMS suspended $5.7 billion in Medicare payments, denied 122,658 individual claims, revoked 5,586 billing privileges, and generated 372 fraud referrals worth a combined $3.7 billion. AERO is designed to function as an upstream early warning system that identifies the conditions that produce those losses before the money leaves the federal treasury. The shift away from what HHS itself describes as a "pay and chase" model (disbursing funds and then attempting to recover them after discovering fraud) toward a real-time detection model is the fundamental strategic change the program represents.
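A quick back-of-the-envelope calculation puts those figures in proportion. The sketch below uses only numbers quoted in this article. The comparison is rough: the suspension figure covers Medicare alone, while Chiarello's $100 billion to $200 billion estimate spans every program HHS oversees. The variable names are my own.

```python
# Figures as quoted in the article; the comparison itself is illustrative.
referrals = 372
referral_value = 3.7e9                      # combined value of fraud referrals, USD
suspended = 5.7e9                           # Medicare payments suspended in 2025, USD
estimate_low, estimate_high = 100e9, 200e9  # Chiarello's annual estimate, USD

# Average dollar value attached to a single fraud referral.
avg_per_referral = referral_value / referrals

# Suspended payments as a share of the estimated annual waste and fraud.
share_of_high = suspended / estimate_high
share_of_low = suspended / estimate_low

print(f"Average per referral: ${avg_per_referral / 1e6:.1f} million")
print(f"Suspensions vs. estimate: {share_of_high:.2%} to {share_of_low:.2%}")
```

The average referral works out to roughly $10 million, and the suspended payments amount to about 3 to 6 percent of the estimated annual loss. That gap is the case for moving detection upstream.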
AERO is also embedded in a broader government-wide fraud enforcement push that has accelerated sharply in the first half of 2026. HHS's Office of Inspector General commenced simultaneous audits of every Medicaid Fraud Control Unit across all 50 states — an action described by healthcare policy experts as unprecedented in scope. In February 2026, CMS issued a Request for Information seeking public input on AI methodologies applicable to fraud prevention across Medicare, Medicaid, and other federal health programs. Chiarello has also been in contact with his counterparts at other federal departments, with an explicit goal of exporting AERO's approach to agencies beyond HHS. "It would be fairly easy for the other agencies to use our technology and jump on it," he said.
What Experts and Critics Are Warning About
The concerns being raised about AERO fall into two broad categories: accuracy and accountability.
On accuracy, the criticism is grounded in well-documented behavior of large language models. AI tools like ChatGPT are powerful at pattern recognition across large volumes of text, but they also make mistakes — including confident, well-formatted mistakes that look credible on the surface. Healthcare compliance experts have warned that AI systems could generate inaccurate findings, flag legitimate transactions as suspicious, or miss subtle forms of fraud that do not follow recognizable text patterns. Some experts argue that human oversight remains essential at every stage where a funding decision or an enforcement action could result from an AI-generated finding. The concern is not hypothetical: when the consequence of a false positive is the loss of federal Medicaid funding for a state program that serves millions of low-income patients, the tolerance for AI error is essentially zero.
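The false-positive worry is easier to see with numbers. The sketch below is a standard base-rate calculation, not a measurement of AERO: every input (the report count, the share of reports with real problems, the detection rate, and the false-positive rate) is a made-up value chosen purely for illustration, and `flag_precision` is a hypothetical helper.

```python
def flag_precision(total, prevalence, sensitivity, false_positive_rate):
    """Return (true flags, false flags, share of flags that are correct)."""
    true_cases = total * prevalence
    clean_cases = total - true_cases
    true_flags = true_cases * sensitivity
    false_flags = clean_cases * false_positive_rate
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision


# All four inputs are hypothetical, NOT figures from HHS or this article.
true_flags, false_flags, precision = flag_precision(
    total=10_000,              # reports scanned
    prevalence=0.02,           # share of reports with a real, serious problem
    sensitivity=0.90,          # share of real problems the tool catches
    false_positive_rate=0.05,  # share of clean reports wrongly flagged
)

print(f"Correct flags: {true_flags:.0f}")
print(f"Wrong flags:   {false_flags:.0f}")
print(f"Share of flags that are correct: {precision:.0%}")
```

Under these invented assumptions, the tool raises about 180 correct flags and about 490 wrong ones, so only around a quarter of its flags point at a real problem. When real problems are rare, even a tool that is right most of the time produces mostly false alarms, which is why human review before any enforcement step matters.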
On accountability, the criticism is both more specific and more damaging. HHS has its own published standards governing the use of AI in public benefits administration — a Plan for Promoting Responsible Use of AI in Public Benefits Administration and a document called the Trustworthy AI Playbook. Those internal standards require bias testing, meaningful human oversight, transparency about how the AI system works, and pre-clearance from the Office of Management and Budget before any rights-impacting AI system is deployed. Legal analysts reviewing AERO found no public evidence that the program met any of those requirements before it launched. As the National Law Review put it plainly: "That is the government's own playbook, and its own program certainly appears to fail it."
The political dimension adds another layer of concern. Critics have argued that AERO is being applied disproportionately to Democratic-led states, pointing to the early enforcement actions taken against Minnesota and California. The letters sent to all 50 governors did not include deadlines or specific timelines, which effectively gives the administration discretionary control over when and how hard to press any particular state. Healthcare policy researchers have also flagged that an AI system trained to find fraud in audit documents will find what it is designed to find, and that the framing of what counts as a "deficiency" worth escalating is a political and legal judgment that no AI system is equipped to make.
What This Signals for AI in Government
AERO represents something genuinely new in the history of US federal governance. It is, by any reasonable measure, the largest deployment of a commercial generative AI tool in a federal enforcement context ever attempted — launched without a formal procurement process, without a multi-year authorization, and without a public evidence trail showing that HHS's own AI governance standards were followed. The initiative moved from concept to operational deployment at a speed that traditional government technology programs cannot match. Whether that speed is a virtue or a liability depends almost entirely on whether the AI outputs are accurate and whether the enforcement actions that follow from them are applied fairly.
Chiarello has made clear that he sees AERO as a template for the entire federal government, not just HHS. If other agencies adopt similar approaches — using commercially available large language models to process regulatory filings, grant applications, and compliance reports at scale — the federal government's capacity to identify and act on fraud will expand dramatically. So will its capacity to make consequential decisions based on AI outputs that no one outside the agency can independently verify.
Conclusion: The Audit Has a New Auditor — and Nobody Elected It
The problem that AERO is trying to solve is real. Hundreds of billions of dollars flow through federal healthcare programs every year, and the evidence that large amounts of that money have been mismanaged or stolen without consequence is not in dispute. The audit reports have existed for years. The findings inside them have been real. The failure to act on them has been genuine and costly.
What is also real is the concern that deploying a commercial AI tool at federal enforcement scale — without meeting the government's own standards for bias testing, human oversight, and transparency — introduces a new category of risk alongside the one it is trying to reduce. An AI system that wrongly flags a legitimate state program, or that is applied with more force to some states than others, does not just make a technical mistake. It makes a political and legal decision with real consequences for the people who depend on those programs.
The audit reports used to land with a thud and go unread. That problem is now being solved. The harder problem — ensuring that the new auditor is accurate, transparent, and applied fairly — has not been solved yet, and nobody in Washington has been particularly loud about the fact that it needs to be.
Written by
Mr. Aayush Bhatt
Software Engineer interested in how models work and where they fail.