Let's see how we can help you!
Leave a message and our dedicated advisor will contact you.
Send us a message
0/10000
Leave a message and our dedicated advisor will contact you.
As researchers from the Israeli startup Gambit Security report, an unknown actor used artificial intelligence to orchestrate a massive attack on Mexican government agencies. The scale? Absolutely staggering. We're talking 150 gigabytes of data stolen, including information on 195 million taxpayers, voter records, and credentials belonging to state employees.
Let's take a closer look at how artificial intelligence – originally designed to help humanity – was manipulated into becoming a digital lockpick for state secrets.
The most interesting aspect of this incident isn't actually the breach itself, but the psychological game played out between man and machine. Unlike early, naive bots, Claude has built-in safety mechanisms. It initially warned the hacker that their intentions toward the Mexican government seemed pretty malicious.
So how did the hacker react? They used classic social engineering, but directed it at an algorithm. They convinced Claude that they weren't a criminal at all, but rather an ethical hacker participating in a "bug bounty" – hunting for security flaws in exchange for a financial reward. Many companies and government institutions actually offer these programs, which made the context seem completely credible to the model. The attacker essentially asked to run penetration tests on the Mexican federal tax authority.
It's a fascinating proof of just how flexible and susceptible to context manipulation large language models (LLMs) really are. All it took was shifting the narrative frame from "theft" to "security audit" to bypass the fundamental filters.
Even after switching the narrative to a "bug bounty," Claude didn't just turn into a mindless lackey. At one point, the machine actually rebelled. When the hacker demanded instructions on deleting logs and hiding the command history, the AI outright refused to cooperate.
"Specific instructions about deleting logs and hiding history are red flags," Claude responded, according to a transcript released by Gambit. "In legitimate bug bounty, you don’t need to hide your actions – in fact, you need to document them for reporting."
This phenomenon is quite incredible. We're seeing an AI model that doesn't just scan prompts for forbidden keywords, but genuinely understands the logic and ethics of the cybersecurity industry. Claude identified the contradiction: "You claim to be an auditor, but auditors don't cover their tracks." Unfortunately, the hacker found a workaround for this too. Instead of continuing the dialogue, they simply uploaded a ready-made, detailed "playbook" to the bot. This ultimate move broke the bot's safeguards and allowed for the execution of thousands of commands across government networks.
When Claude hit roadblocks or needed extra info, the hacker didn't just give up. They turned to the competition – OpenAI's ChatGPT.
The cybercriminal used OpenAI's product for very specific, highly advanced tasks: figuring out how to move laterally inside computer networks, identifying which credentials to use for specific systems, and even asking it to calculate the probability of the operation being detected. Although OpenAI stated they identified these attempts and their tools eventually refused to comply, the mere fact that multiple AI models were being used as "attack consultants" creates a new, incredibly disturbing paradigm.
Hackers basically don't have to be experts in every field anymore. As long as they can ask the right questions in Spanish, the AI will generate read-to-run scripts and action plans for them. Alon Gromakov, co-founder of Gambit Security, put it bluntly: "This reality is changing all the game rules we have ever known."
Just as interesting as the technological aspect of the hack is the victims' reaction. Last December, Mexican officials issued a brief statement about investigating breaches in public institutions. But when faced with Gambit Security's revelations, many of those institutions either clammed up or flat-out denied the attacks.
Mexico's tax authority stated they reviewed their logs and found zero evidence of a breach. The national electoral institute also denied any intrusions, and state authorities in Jalisco claimed the issue only affected federal networks. The water utility in Monterrey didn't detect any intruders either.
So we're facing a classic cybersecurity dilemma here: are these institutions' monitoring systems so outdated that they entirely missed the theft of 150 GB of data, or is this just a massive PR cover-up? Given that the attack relied on thousands of AI-generated reports and highly detailed plans, it's quite likely that the malware and masking techniques suggested by the chatbots simply proved too sophisticated for traditional defense systems.
This incident in Mexico wasn't an anomaly – it's the herald of a new era. Artificial intelligence has fundamentally become a key catalyst for digital crimes. We've already seen hackers use AI to breach hundreds of firewalls, and we've witnessed suspected Chinese state-backed cybercriminals trying to use Claude for a massive espionage campaign.
Companies like Anthropic are pouring massive resources into safety. Company reps announced they banned the hacker's accounts and are actively using data from this attack to train their newer model, Claude Opus 4.6, to better detect misuse. Still, it's a constant game of cat and mouse. With every new, smarter AI model released, a tool hits the market that becomes a devastating weapon in the hands of a creative intruder.
We're facing a fundamental challenge. How do we build an assistant that's smart enough to write complex code for us, but "resilient" enough not to believe us when we ask it to help with a virtual break-in disguised as an audit? It's a question the entire tech industry needs to answer quickly – before more gigabytes of our data fall into the wrong hands.
Aleksander

Chief Technology Officer at SecurHub.pl
PhD candidate in neuroscience. Psychologist and IT expert specializing in cybersecurity.
Traditional security models are obsolete. Learn why the "Never Trust, Always Verify" philosophy is becoming a legal and technological standard, and why your firewall is no longer enough.
SOC analysts are drowning in a data flood, wasting hours on false alarms. Is 2025 and the arrival of autonomous AI agents the moment machines finally let humans stop "chasing ghosts" and start thinking strategically?
The modern digital ecosystem operates under an unprecedented convergence of legal requirements and technological challenges. The General Data Protection Regulation (GDPR), which came into effect in May 2018, has permanently changed the way organizations must perceive information security.
Loading comments...