How easy is it to use ChatGPT for hacking? A team of Cambridge University researchers decided to find out — and the results were more alarming than expected.
🔬 The Experiment
The researchers used GPT-4o, Claude 3.5, and Gemini 1.5 Pro in a series of tests: can these models write malware, phishing emails, or exploit code? The study, published in February 2026, examined 150 different scenarios.
The results: all three models can assist in creating malicious software if the user knows how to ask. Safety guardrails block the most obvious requests: typing “make me ransomware” gets a refusal. But when the same request is split into parts (“write a script that encrypts files,” “how do I send files via HTTP to a server”), the models cooperate.
📊 What They Found
On a scale of 1-10, GPT-4o scored 6.2 in usefulness to hackers, Claude 3.5 scored 5.8, and Gemini 1.5 Pro scored 5.4. No model created complete, ready-to-deploy malware, but in every case the assistance was sufficient for someone with basic programming knowledge.
🎭 The “Salami Slicing” Technique
The most effective technique is called “salami slicing”: the user breaks a malicious request into small, seemingly innocent pieces. Each piece looks legitimate on its own (e.g., “how do I encrypt files in Python”). But stitched together, they form ransomware.
The models don't need to know they're helping with hacking. They just need to answer technical questions — which is their core purpose.
💡 The bottom line: AI chatbots aren't “hacking tools” per se. But they dramatically lower the bar. Someone who would otherwise need months of learning can now do the same in hours, with help from a chatbot that doesn't understand what it's building.
🔮 What's Changing
AI companies are strengthening safety filters. OpenAI announced “multi-turn safety analysis” that examines the entire conversation rather than individual messages. Anthropic already has a similar system in Claude. But researchers are skeptical: every time a filter is strengthened, someone finds a new bypass.
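To make the difference between per-message and whole-conversation checks concrete, here is a toy sketch in Python. It is not how OpenAI's or Anthropic's filters work; the phrases, weights, threshold, and function names are all invented for illustration, and a real system would rely on far more than keyword matching.

```python
# Toy illustration only: phrases, weights, and threshold are assumptions,
# not taken from any vendor's actual safety system.
SIGNALS = {
    "encrypt files": 0.4,
    "send files via http": 0.4,
}
THRESHOLD = 0.7  # assumed cutoff above which a request gets flagged


def message_score(message: str) -> float:
    """Score a single message by summing the weights of phrases it contains."""
    text = message.lower()
    return sum(weight for phrase, weight in SIGNALS.items() if phrase in text)


def per_message_flags(conversation: list[str]) -> list[bool]:
    """Check each message in isolation, as a single-turn filter would."""
    return [message_score(m) >= THRESHOLD for m in conversation]


def conversation_flag(conversation: list[str]) -> bool:
    """Check the conversation as a whole, counting each phrase once."""
    seen = set()
    for message in conversation:
        text = message.lower()
        seen.update(phrase for phrase in SIGNALS if phrase in text)
    return sum(SIGNALS[phrase] for phrase in seen) >= THRESHOLD


conversation = [
    "How do I encrypt files in Python?",
    "How do I send files via HTTP to a server?",
]
print(per_message_flags(conversation))  # each message stays under the cutoff
print(conversation_flag(conversation))  # the combined conversation crosses it
```

In this sketch neither question is flagged on its own, but the two together are. The same toy also hints at why bypasses keep appearing: rephrasing a request so it no longer matches a known signal defeats the check.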