Chinese AI Firm DeepSeek Exposes Over One Million Records in Major Data Breach

A recent discovery by cybersecurity firm Wiz Research revealed that DeepSeek, an artificial intelligence company based in China, exposed over one million sensitive records through an unsecured ClickHouse database. The exposed system allowed public, unauthenticated access to full chat logs, API keys, internal secrets, and backend metadata—posing significant security and privacy risks.

This was not a typical consumer breach. It demonstrates how AI companies, in their rush to scale, now handle sensitive personal data in ways that bypass basic cybersecurity controls.

What Was Exposed

Wiz researchers discovered two database instances that allowed full access without any form of authentication. Inside were more than one million log entries generated in January 2025. Among the data:

- Full AI assistant chat histories

- API tokens, environment variables, and software credentials

- Backend system architecture metadata

- User identifiers and behavioral logs

The databases were fully queryable, allowing outsiders not only to read the data but potentially to run commands and escalate privileges inside DeepSeek's systems. This was more than passive exposure; it was an open door into the company's infrastructure.

DeepSeek secured the database within approximately one hour after Wiz made contact, but it remains unknown whether the data was accessed or exfiltrated beforehand.

Timing and Context

The leak came just days after DeepSeek had experienced a distributed denial-of-service (DDoS) attack that forced it to temporarily shut down new user registrations. That disruption now appears to have coincided with—or possibly triggered—a series of operational errors, culminating in this database exposure.

Simultaneously, malicious actors published fake packages on the Python Package Index (PyPI) mimicking DeepSeek's SDK. These packages, named “deepseeek” and “deepseekai,” were designed to steal credentials and environment data from developers who installed them. While it’s unclear whether this campaign was coordinated with the database exposure, both incidents point to a lack of security maturity inside the company.

Regulatory Fallout

The leak prompted immediate scrutiny from international regulators. Italy’s data protection authority opened a GDPR investigation, while South Korea suspended access to the DeepSeek app pending a security review. In the United States, the National Security Council began a national security inquiry, and multiple state agencies—including those in Texas—banned the app on government devices.

These responses were driven in part by reports that DeepSeek was transmitting user data to servers operated by ByteDance in China, raising concerns over surveillance, foreign data access, and misuse of personally identifiable information.

Why It Matters

This breach is not just a story about one AI startup making technical mistakes. It’s an indictment of a broader trend: AI companies gathering massive amounts of sensitive personal data without implementing even the most basic safeguards. In this case, a database logging user inputs—many of which likely included private information—was left wide open to the internet.

For Patriot Protect members, this breach illustrates several key points:

- AI providers, especially those based overseas, should not be treated as trustworthy by default.

- Exposure of chat logs and API credentials can lead to cascading security failures, especially when AI tools are integrated with cloud systems or business processes.

- Even highly technical teams are capable of critical configuration failures.

Recommendations

Patriot Protect advises all members—especially those using AI tools internally or via vendors—to take the following steps:

1. Audit all public-facing databases and services. Ensure proper authentication is required and access is limited by network restrictions.

2. Review your use of AI platforms, particularly those with unclear data handling policies or offshore infrastructure.

3. Scrub logs for sensitive data. Never store API keys, passwords, or user inputs in plaintext log files.

4. Vet third-party packages. Use hash validation and monitor for impersonation or typo-squatting in software dependencies.

5. Ask hard questions of your vendors. Demand clear answers on data storage locations, breach notification policies, and regulatory compliance.
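Step 1 above can be partially automated. As a minimal sketch, the check below probes a host for an HTTP service that answers without credentials, assuming a ClickHouse-style HTTP interface like the one exposed in this incident (ClickHouse's default HTTP endpoint on port 8123 replies `Ok.` to a bare unauthenticated GET). The host name is hypothetical; only probe systems you own or are authorized to test.

```python
from urllib.request import urlopen
from urllib.error import URLError


def looks_open(status: int, body: str) -> bool:
    """ClickHouse's HTTP interface answers a bare GET with 'Ok.'
    when it accepts the request without credentials."""
    return status == 200 and body.strip() == "Ok."


def probe(host: str, port: int = 8123, timeout: float = 3.0) -> bool:
    """Return True if the endpoint responds like an open ClickHouse
    instance. Network failures are treated as 'not exposed'."""
    try:
        with urlopen(f"http://{host}:{port}/", timeout=timeout) as resp:
            return looks_open(resp.status, resp.read().decode("utf-8", "replace"))
    except (URLError, OSError):
        return False  # closed, filtered, or not speaking HTTP


if __name__ == "__main__":
    # probe("db.internal.example.com")  # hypothetical host you control
    print(looks_open(200, "Ok.\n"))          # prints True: open instance
    print(looks_open(401, "Auth required"))  # prints False
```

A `True` result means anyone on the network can run queries against that database, which is exactly the condition Wiz found at DeepSeek.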
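Step 3 (scrubbing logs) can be enforced with a redaction pass before anything is written to disk. The sketch below matches a few common secret formats; the specific prefixes (`AKIA`, `sk-`, `Bearer`) are widespread conventions, not an exhaustive list, so extend the patterns for your own token schemes.

```python
import re

# Common secret formats; extend for your environment's own token schemes.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key IDs
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                 # "sk-"-prefixed API keys
    re.compile(r"(?i)bearer\s+[A-Za-z0-9._\-]{20,}"),   # bearer tokens
    re.compile(r"(?i)(api[_-]?key|password|secret)\s*[=:]\s*\S+"),  # key=value pairs
]


def redact(line: str) -> str:
    """Replace anything matching a known secret pattern with a marker."""
    for pat in SECRET_PATTERNS:
        line = pat.sub("[REDACTED]", line)
    return line


print(redact("user=42 api_key=sk-abc123def456ghi789jkl"))
# the key is replaced with [REDACTED]; "user=42" survives
```

Running every log line through a filter like this costs little and means that even a leaked log file, like DeepSeek's, does not hand out working credentials.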
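Step 4 (vetting dependencies) includes catching typo-squats like the fake "deepseeek" package. One simple sketch: compare each incoming package name against an allow-list of trusted names and flag near-misses. The allow-list and threshold below are illustrative assumptions.

```python
from difflib import SequenceMatcher

# Illustrative allow-list; in practice, use your organization's approved packages.
TRUSTED = {"deepseek", "requests", "numpy"}


def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two package names."""
    return SequenceMatcher(None, a, b).ratio()


def typosquat_suspects(name: str, threshold: float = 0.85) -> list[str]:
    """Flag names that are nearly, but not exactly, a trusted name."""
    if name in TRUSTED:
        return []
    return [t for t in TRUSTED if similarity(name, t) >= threshold]


print(typosquat_suspects("deepseeek"))   # → ['deepseek']: near-miss, flag it
print(typosquat_suspects("deepseekai"))  # → ['deepseek']
```

For hash validation, pip's hash-checking mode (`pip install --require-hashes -r requirements.txt`) refuses any package whose archive does not match a pinned hash, which blocks substituted or tampered uploads.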

Final Thoughts

The DeepSeek leak wasn’t the result of a sophisticated attack. It was a preventable mistake—one that exposed how quickly unvetted AI platforms can become liabilities.

As artificial intelligence tools are increasingly deployed across the private and public sectors, the surface area for exposure will continue to expand. Breaches like this aren’t isolated—they are warnings.

Patriot Protect will continue to monitor developments and notify members of actionable risks. For support with infrastructure audits, vendor security reviews, or incident readiness, contact the Patriot Protect Response Team.
