Deepseek's AI model proves easy to jailbreak

Admin · Feb 01, 2025, 11:43 AM

DeepSeek, a Chinese startup, has been stirring up excitement and concern in the AI world. While people are impressed with its performance, security risks have also been raised.

On Thursday, Palo Alto Networks' cybersecurity team, Unit 42, released a report about jailbreaking DeepSeek's V3 and R1 models. They found that it was easy to bypass these models' security with little to no technical expertise. The report also revealed that the models could be manipulated to guide users in dangerous activities, like creating keyloggers, stealing data, writing phishing emails, and even making Molotov cocktails.

Cisco also tested DeepSeek R1 and found that it didn't block harmful prompts at all. In fact, they reported a 100% success rate for attacks, which shows that DeepSeek has serious security flaws.

A third security company, Wallarm, looked into DeepSeek as well. They discovered potential weaknesses in the model's security and found that DeepSeek might have used OpenAI's technology to train its models, which could be a violation of OpenAI's terms. After jailbreaking the system, Wallarm was able to extract sensitive details about how DeepSeek's models were trained, something AI models usually hide to protect their data.

Despite these concerns, DeepSeek has already patched some of these vulnerabilities. However, the fact that their database was once found exposed online shows there are still major security gaps that weren't addressed before the models were released.

It's worth noting that similar jailbreaking issues have been found in other popular AI models, including those from big U.S. companies like OpenAI. But these latest findings with DeepSeek highlight significant risks that need to be carefully considered.

News:

Deepseek's AI model proves easy to jailbreak

Admin