Presentation Flashcards

1
Q

What is DeepSeek?

A

DeepSeek is a Chinese company with many generative AI models under its wing. The most famous is DeepSeek-R1, a reasoning model that rivals OpenAI’s o1 in performance while being more cost-effective and efficient. Researchers claim that it cost only about $6 million to train, as opposed to OpenAI’s $100 million.

2
Q

What is ClickHouse?

A

ClickHouse is an open-source database management system (DBMS) designed for fast queries on large datasets. It was developed by the company Yandex and is used for real-time data processing, log storage, and big data analytics, which suggests that this exposure contained potentially very valuable and sensitive information.

3
Q

What is Wiz?

A

Wiz is a cloud security platform that scans cloud environments for vulnerabilities. It has dedicated researchers who test online platforms for vulnerabilities, some of whom were the first to document this exposure. It was made public, after it had been resolved, in an article published on January 29th, 2025.

4
Q

How was the database found?

A

Wiz researchers found the database within minutes. It was completely open and required no authentication. They started by accessing DeepSeek’s public domains, mapping the external attack surface with both passive and active discovery of subdomains.

5
Q

What is a passive discovery of a subdomain?

A

Passive discovery is the process of identifying subdomains belonging to a target domain from publicly available information sources, like search engines, DNS records, and certificate transparency logs, without directly interacting with the target domain itself. Active discovery, by contrast, identifies subdomains by directly interacting with the target’s infrastructure.
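
As a rough illustration of the passive approach, the sketch below extracts subdomains from certificate transparency records without ever contacting the target. The record shape (a `name_value` field holding newline-separated names) and the `example.com` sample data are assumptions made for this example, not data from the DeepSeek investigation.

```python
def subdomains_from_ct(records, domain):
    """Collect unique subdomains of `domain` from CT-log style records."""
    suffix = "." + domain.lower()
    found = set()
    for record in records:
        # One certificate can list several names, separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            # Wildcard entries such as *.dev.example.com still reveal a subdomain.
            if name.startswith("*."):
                name = name[2:]
            # Keep only names strictly below the target domain.
            if name.endswith(suffix):
                found.add(name)
    return sorted(found)


# Hypothetical records, shaped like the JSON a CT search service might return.
sample = [
    {"name_value": "www.example.com\nstatus.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "example.com"},
    {"name_value": "WWW.example.com"},
    {"name_value": "other.example.org"},
]

print(subdomains_from_ct(sample, "example.com"))
```

Because the input comes from public logs, nothing in this step touches the target's servers, which is what makes it passive.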

6
Q

How was the database found? (Cont.)

A

Through this, the researchers found about 30 internet-facing subdomains. Most of these simply hosted elements of the chatbot, like the interface, status page, etc.
However, they also found two unusual open ports (8123 and 9000) associated with the following hosts. These ports led to the then-exposed ClickHouse database.
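
A minimal sketch of how one might check whether specific TCP ports accept connections, for example ClickHouse's default HTTP port (8123) and native protocol port (9000). The function name `open_ports` is made up for this example, and it should only be run against hosts you are authorized to test.

```python
import socket


def open_ports(host, ports, timeout=1.0):
    """Return the ports from `ports` on `host` that accept a TCP connection."""
    reachable = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code on a failed
            # connection attempt (name-resolution failures can still raise).
            if sock.connect_ex((host, port)) == 0:
                reachable.append(port)
    return reachable
```

For instance, `open_ports("127.0.0.1", [8123, 9000])` would list whichever of the two ports a local ClickHouse instance is listening on.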

7
Q

Image Slide

A

Here, you can see they were able to access the ‘/play’ path of ClickHouse’s HTTP interface on these hosts, allowing them to enter and execute basically any SQL query they wanted directly from the browser.

8
Q

Exposure walkthrough

A

A simple ‘SHOW TABLES;’ query opened up this whole thing, revealing a list of accessible datasets. An important one is this log_stream table, which contains over 1 million log entries with highly sensitive data. This posed a critical risk to both DeepSeek as a company and the security of its end-users. Not only could hackers access plaintext chat messages, they could also potentially retrieve plaintext passwords directly from the server.
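
To show why an unauthenticated instance is so easy to query, here is a small sketch that builds the request URL for ClickHouse's HTTP interface, which accepts SQL in a `query` parameter on port 8123 by default. The host name `clickhouse.example.com` and the helper name `clickhouse_query_url` are placeholders for this example; this is not necessarily the exact request the researchers used.

```python
from urllib.parse import quote, urlencode


def clickhouse_query_url(host, sql, port=8123):
    """Build a URL that sends `sql` to a ClickHouse HTTP interface."""
    # quote_via=quote encodes spaces as %20 rather than '+'.
    params = urlencode({"query": sql}, quote_via=quote)
    return f"http://{host}:{port}/?{params}"


# Placeholder host; only query servers you own or are authorized to test.
print(clickhouse_query_url("clickhouse.example.com", "SHOW TABLES"))
# prints http://clickhouse.example.com:8123/?query=SHOW%20TABLES
```

If no credentials are required, anyone who can reach that URL can run the query, which is why such interfaces should never be exposed to the internet without authentication.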

9
Q

Why is this significant?

A

AI is being adopted at a rapid pace, often too quickly for proper security measures to be implemented. More often than not, the major security risks are the simple, basic ones that are overlooked in favor of perceived major threats down the line. More thought must be given to how to keep these models secure, as individuals and organizations alike are giving them very sensitive information.
