Content Discovery Flashcards
What is Content Discovery?
When we talk about Content Discovery in a hacking sense, we’re interested in content that we can’t immediately see on the page. Some examples of things we’d look for during Content Discovery are pages or portals used by staff, older versions of the site, backup and configuration files, and even admin panels.
What are the three ways that we’ll be doing Content Discovery?
Manual, Automated, and OSINT (Open-Source Intelligence)
Manual Discovery: What is a robots.txt file?
It is a text document that tells search engines which pages they are and aren’t allowed to show in their search results. It can also ban specific search engines from crawling the website altogether.
Manual Discovery Information:
Each of these methods can be carried out in multiple ways.
Why are robots.txt files common?
It’s common to restrict certain areas of a website, like admin portals, so that not just anyone can simply find them through a search engine. That also makes the file a handy list of locations the owner doesn’t want discovered.
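To make the robots.txt cards concrete, here is a minimal Python sketch that pulls the restricted paths out of a robots.txt file. The function name, the sample file, and the paths in it are all made up for illustration; a real file would be fetched from the site's /robots.txt.

```python
def disallowed_paths(robots_txt):
    """Return the paths named in Disallow rules, in order of appearance."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # Field names are case-insensitive; an empty Disallow restricts nothing.
        if field.strip().lower() == "disallow" and value:
            paths.append(value)
    return paths


# Hypothetical robots.txt content, not taken from a real site.
SAMPLE_ROBOTS = """\
User-agent: *
Allow: /
Disallow: /staff-portal
Disallow: /admin/  # hypothetical path
Sitemap: http://example.com/sitemap.xml
"""

print(disallowed_paths(SAMPLE_ROBOTS))
```

Each path it returns is a place the site owner would rather search engines skip, which is exactly why it is worth a look during manual discovery.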
Manual Discovery: What is sitemap.xml?
sitemap.xml is basically the opposite of robots.txt: it gives a list of every file the website owner wishes to be listed by search engines.
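As a rough illustration of reading a sitemap, the Python sketch below lists every URL found in the file's loc elements. The function name and the sample XML are assumptions for the example; the namespace shown is the one used by the standard sitemap format.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(sitemap_xml):
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Tags carry the namespace, e.g. "{http://www.sitemaps.org/...}loc",
        # so compare only the part after the closing brace.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


# Hypothetical sitemap content, not taken from a real site.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.com/</loc></url>
  <url><loc>http://example.com/old-news</loc></url>
</urlset>
"""

print(sitemap_urls(SAMPLE_SITEMAP))
```

Sitemaps sometimes list pages that are no longer linked from the site itself, so comparing this list against what you can reach by clicking around can surface older content.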
Manual Discovery: What are Framework Stacks?
Once you discover the framework that a website is built on, you can look up that framework’s documentation and any publicly known vulnerabilities. This information is usually tracked on the framework’s own website and on public sites that catalogue vulnerabilities.
Manual Discovery: What are HTTP Headers?
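HTTP Headers are sent back to the client whenever you request something from the web server. They can reveal useful information, such as the web server software and the programming language in use.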
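Response headers such as Server and X-Powered-By often name the web server software and the scripting language behind a site. The Python sketch below filters a set of response headers down to those two; the function name, the choice of headers to watch, and the sample values are assumptions for illustration.

```python
# Header names (lowercase) that commonly disclose software details.
# This short list is an assumption for the example, not an exhaustive set.
REVEALING_HEADERS = ("server", "x-powered-by")


def revealing_headers(headers):
    """Keep only the headers that tend to disclose server software."""
    return {
        name: value
        for name, value in headers.items()
        # Header names are case-insensitive, so compare in lowercase.
        if name.lower() in REVEALING_HEADERS
    }


# Hypothetical response headers, written out by hand.
SAMPLE_HEADERS = {
    "Server": "nginx/1.18.0 (Ubuntu)",
    "X-Powered-By": "PHP/7.4.3",
    "Content-Type": "text/html; charset=UTF-8",
}

print(revealing_headers(SAMPLE_HEADERS))
```

Against a site you have permission to test, the same function could be fed the headers from a real response, for example `dict(urllib.request.urlopen(url).headers.items())`.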
OSINT: What is Google Hacking/Dorking?
This method uses a search engine’s advanced filters to pick out specific content from its results.
Types of Search Engine Filters:
site: - Returns results only from the specified website address
inurl: - Returns results that have the specified word in the URL
filetype: - Returns results that have a particular file extension
intitle: - Returns results that contain the specified word in the title
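The filters above can be combined in a single search. As a small sketch, the Python helper below joins whichever filters you supply into one query string; the function name and the example domain are made up for illustration.

```python
def build_dork(site=None, inurl=None, filetype=None, intitle=None):
    """Combine the supplied search filters into one query string."""
    filters = [
        ("site", site),
        ("inurl", inurl),
        ("filetype", filetype),
        ("intitle", intitle),
    ]
    # Skip any filter that was not supplied.
    return " ".join(f"{name}:{value}" for name, value in filters if value)


print(build_dork(site="example.com", inurl="admin"))
print(build_dork(site="example.com", filetype="pdf"))
```

The first query would ask for results from example.com that have the word admin in the URL, and the second for PDF files hosted on example.com.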
OSINT: What is Wappalyzer?
Wappalyzer is an online tool and browser extension that helps identify the technologies used on a website.
OSINT: What are S3 Buckets?
S3 Buckets are a storage service provided by Amazon AWS that lets you save files, and even static website content, in the cloud to be accessed over HTTP or HTTPS.
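Bucket contents are commonly reachable at an address of the form https://{name}.s3.amazonaws.com, where {name} is chosen by the owner. The Python sketch below generates a few candidate addresses from a company name. The function name and the suffix list are assumptions for the example: they are guesses at common naming habits, not anything a site is guaranteed to use.

```python
# Guessed naming patterns (an assumption for this example, not a standard).
DEFAULT_SUFFIXES = ("", "-assets", "-www", "-public", "-private")


def candidate_bucket_urls(name, suffixes=DEFAULT_SUFFIXES):
    """Build possible S3 bucket URLs for a company or project name."""
    return [f"https://{name}{suffix}.s3.amazonaws.com" for suffix in suffixes]


for url in candidate_bucket_urls("example"):
    print(url)
```

Whether any of these exist, and whether their access permissions are set too loosely, can only be confirmed by requesting them, which should be done only against targets you are authorised to test.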
What’s Automated Discovery?
Automated Discovery is the process of using tools to do content discovery as opposed to doing it manually. The process is automated because it usually involves hundreds, thousands, or even millions of requests to a web server.
Automated Discovery: What do these requests do?
These requests check whether or not a file or directory exists on a website, and this process is made possible by a resource called a wordlist.
Automated Discovery: What’s a Wordlist?
Wordlists are just text files that contain a long list of commonly used words.
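To show how a wordlist drives automated discovery, here is a minimal Python sketch of the idea: turn each word into a URL, request it, and keep the ones that don't come back as "not found". Every name here is hypothetical, and the status lookup is a stand-in so the example runs without touching a real server. Treating anything other than a 404 as a hit is also a simplification.

```python
def candidate_urls(base_url, wordlist_text):
    """Turn each word in the wordlist into a URL under base_url."""
    urls = []
    for line in wordlist_text.splitlines():
        word = line.strip()
        # Skip blank lines and comment lines.
        if not word or word.startswith("#"):
            continue
        urls.append(base_url.rstrip("/") + "/" + word.lstrip("/"))
    return urls


def discover(base_url, wordlist_text, get_status):
    """Return the candidate URLs whose status code is not 404.

    get_status is any callable that takes a URL and returns an HTTP
    status code, so a real HTTP client can be swapped in later.
    """
    return [
        url
        for url in candidate_urls(base_url, wordlist_text)
        if get_status(url) != 404
    ]


# Hypothetical wordlist and canned responses standing in for a web server.
SAMPLE_WORDLIST = "admin\nbackup\n# a comment\nrobots.txt\n"
FAKE_RESPONSES = {
    "http://example.com/admin": 200,
    "http://example.com/robots.txt": 200,
}


def fake_status(url):
    # Anything not listed above is treated as "not found".
    return FAKE_RESPONSES.get(url, 404)


print(discover("http://example.com", SAMPLE_WORDLIST, fake_status))
```

Real tools do the same thing at a much larger scale, which is why a long wordlist can mean thousands or millions of requests.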