DAY1 RECON & OSINT Flashcards
WHAT IS RECONNAISSANCE
“Recon is the science of gathering information about a target”
T/F Profile a target (a user, a company or any victim) in depth
T
T/F Reconnaissance relies heavily on OSINT
T
Relies heavily on OSINT
Open-Source Intelligence (publicly available information that can prove to be helpful to the attacker)
• Collect and analyze everything that’s “out there” pertaining to the target
» Direct communication with the target mostly does not happen
T/F Reconnaissance is used in various domains, including cybersecurity, law enforcement, business intelligence, and national security
T
PHASES OF HACKING
RECONNAISSANCE STAGE:
-FOOTPRINTING
-SCANNING
-ENUMERATION
SYSTEM HACKING:
-GAINING ACCESS
-MAINTAINING ACCESS
-COVERING TRACKS & BACKDOORING
TYPES OF RECON
› Passive
» Relies heavily on OSINT techniques
» Does not reveal the source of the activity (anonymity)
» Information can be inaccurate or out-of-date
› Active
» Interact with the system directly (tools communicate with the target)
• Direct victim profiling via scanning and enumeration-based invasive techniques
» Information is accurate and up-to-date
» Can reveal the source of the activity (identity is compromised)
WE WILL SEE BOTH ACTIVE & PASSIVE RECON (OSINT) TOOLS & TECHNIQUES
› Valuable information on web pages
» HTML comments
• Sensitive information left there by developers
» Website Mirroring & Web Spiders/Crawlers
• Technique used to copy publicly available and linked content for offline analysis
» Directory Brute Forcing and Forced Browsing
• Technique used to discover hidden, restricted and unlinked content on the web server
» Google Hacking via Dorks & Advanced Search
• Advanced search queries that return very specific data from websites
» Email Harvesting
• Gathering email addresses of individual victims or potential targets in an organization
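The "HTML comments" item above can be illustrated with a minimal sketch using Python's standard `html.parser`. The page content and the names `CommentCollector` and `extract_comments` are hypothetical, chosen only for this example.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collects the text of every HTML comment in a page."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called once per <!-- ... --> block; data excludes the delimiters.
        self.comments.append(data.strip())


def extract_comments(html_text):
    parser = CommentCollector()
    parser.feed(html_text)
    parser.close()
    return parser.comments


# Hypothetical page content, used only for illustration.
SAMPLE_PAGE = """
<html><body>
<!-- TODO: remove test account before launch -->
<p>Welcome</p>
<!-- build 2024-01 -->
</body></html>
"""

for comment in extract_comments(SAMPLE_PAGE):
    print(comment)
```

Running the same function over a mirrored copy of a site is one way to review developer comments offline.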
WEBSITE MIRRORING: HOW DOES IT WORK?
› Site Ripping
» Download the entire website to your local machine for offline browsing
• Retrieves the content that is publicly available or linked on the site
• Does not look for hidden directories or content (no brute force or dictionaries are used)
• Tools follow/explore links/references on the main page and then sub-pages
» Much easier to parse and analyze the website for useful information
• No further interaction with the live site (no need to send repeated requests for content)
› Many tools exist
» PageNest
» BlackWidow
» HTTrack
› Web Spider or Crawler
» Visit website homepage and follow/open links to sub-pages recursively
• Content needs to be publicly accessible for this
» Downloads relevant content in an automated fashion matching predefined search criteria
• Regular expressions
• Certain file extensions (like JPG or PNG)
• Metadata of Office (Word, PowerPoint, Excel) & PDF documents
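The recursive link-following described above can be sketched as a breadth-first crawl. To keep the example self-contained, pages come from an in-memory dictionary (`FAKE_SITE`, a hypothetical stand-in for HTTP requests); `crawl`, `LinkCollector`, and the `example.test` URLs are likewise assumptions for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects href/src attribute values from a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch, wanted_extensions):
    """Breadth-first crawl; returns (pages_visited, matching_files)."""
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([start_url])
    pages, files = [], []
    while queue:
        url = queue.popleft()
        html_text = fetch(url)          # returns None if nothing is there
        if html_text is None:
            continue
        pages.append(url)
        collector = LinkCollector()
        collector.feed(html_text)
        for link in collector.links:
            absolute = urljoin(url, link)
            # Stay on the same host and never revisit a URL.
            if urlparse(absolute).netloc != host or absolute in seen:
                continue
            seen.add(absolute)
            if absolute.lower().endswith(wanted_extensions):
                files.append(absolute)   # matches the search criteria
            else:
                queue.append(absolute)   # treat as a sub-page to explore
    return pages, files


# Hypothetical two-page site used instead of real network access.
FAKE_SITE = {
    "http://example.test/": '<a href="/about.html">About</a><img src="/logo.png">',
    "http://example.test/about.html": (
        '<a href="/">Home</a><a href="/report.pdf">Report</a>'
        '<a href="http://other.test/">Elsewhere</a>'
    ),
}

pages, files = crawl("http://example.test/", FAKE_SITE.get, (".png", ".pdf"))
print(pages)
print(files)
```

Only linked content is reached, which matches the point that spiders do not discover hidden or unlinked resources.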
DIRECTORY BRUTE FORCING: WHAT IS IT?
› Systematically trying different directory and file names to see if they exist on the server.
» Used for accessing hidden, restricted or unlinked content on a website
» Often a contextually relevant list of common directory and file names (dictionary) is used
» For instance, for a university web server, the potential entries in the dictionary could be:
• Academics
• Grades
• Registrar
• Student Affairs
• Courses
» Another option to discover content is to attempt all possible combinations (brute forcing)
• A → Z
• 0 → 9
• Combinations of letters and digits to cover the entire space of possible directory and file names
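The two ways of generating candidate names (dictionary versus exhaustive combinations) can be contrasted with a short sketch. It only builds candidate strings and sends no requests; the function names and the `example.test` base URL are hypothetical.

```python
import string
from itertools import product


def dictionary_candidates(base_url, words):
    """Turn a contextual wordlist into candidate directory URLs."""
    for word in words:
        slug = word.lower().replace(" ", "-")
        yield f"{base_url.rstrip('/')}/{slug}/"


def brute_force_names(alphabet, max_length):
    """Yield every name of length 1..max_length over the alphabet."""
    for length in range(1, max_length + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)


university_words = ["Academics", "Grades", "Registrar", "Student Affairs", "Courses"]
for url in dictionary_candidates("http://example.test", university_words):
    print(url)

# The brute-force space grows exponentially with name length.
alphabet = string.ascii_lowercase + string.digits   # a-z and 0-9: 36 symbols
for n in range(1, 5):
    print(n, len(alphabet) ** n)
```

The size comparison shows why a contextual dictionary is usually tried first: five targeted guesses versus 36^n names of length n.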
DIRECTORY BRUTE FORCING: MORE TOOLS
Lots of Web Content Scanners
» BurpSmartBuster (plug-in for Burp Suite)
» Dirsearch
» DIRB (available in Kali with built-in dictionaries)
» Cansina (available with BlackArch Linux) – Good one!
» Meg (does not overwhelm the servers)
» Wfuzz (available in Kali with much more functionality)
» Gobuster
FORCED BROWSING: WHAT IS IT?
› Directory brute forcing is a resource-intensive activity (aggressive)
» May trigger security alerts on the target server
› Instead, strategically manipulate URLs to take advantage of vulnerabilities in
the application’s input validation or authorization mechanisms
» Attackers attempt to navigate to directories or resources that should be protected but are
not due to flawed security configurations (improper access control)
» Targeted approach which is more stealthy
» Feroxbuster is useful for forced browsing
• Uses brute forcing as well as wordlists (dictionaries)
GOOGLE DORKS: SEARCH ENGINES
› Advanced Google queries and operators
» cache: Display results from pages stored in Google cache
» link: Display results with links to the specified page
» related: Display similar results
» site: Display results from the queried website only
» intitle: Display results that have searched keywords in title
» inurl: Display results that have searched keywords in the URL
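The operators above combine into a single query string. The following sketch composes a dork and the corresponding search URL; `build_dork`, `search_url`, and the `example.com` target are hypothetical names for illustration, and only the operators listed above are accepted.

```python
from urllib.parse import quote_plus

# Operators taken from the list above.
KNOWN_OPERATORS = {"cache", "link", "related", "site", "intitle", "inurl"}


def build_dork(keywords="", **operators):
    """Compose 'operator:value' pairs followed by free-text keywords."""
    parts = []
    for name, value in operators.items():
        if name not in KNOWN_OPERATORS:
            raise ValueError(f"unknown operator: {name}")
        parts.append(f"{name}:{value}")
    if keywords:
        parts.append(keywords)
    return " ".join(parts)


def search_url(query):
    """URL-encode the query for a Google search link."""
    return "https://www.google.com/search?q=" + quote_plus(query)


dork = build_dork("login", site="example.com", inurl="admin")
print(dork)
print(search_url(dork))
```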
EMAIL HARVESTING: HOW DOES IT WORK?
› Email Harvest
» Gathering emails of potential victims
» Step 1: Guess email IDs, since companies usually follow a naming pattern
• ali.hassan.1@kaust.edu.sa (first name, a dot, the last name, and a numeric suffix)
• Do this for as many users as possible (dictionaries of common names,
employee lists, brute force, etc.)
» Step 2: Send an email to the guessed email ID
» Step 3: Analyze the response of the SMTP server
• If email is accepted, add to the database of harvested email IDs
• If email is rejected, discard it (Delivery Status Notification msg)
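Steps 1 and 3 above can be sketched without sending any mail: generate candidate addresses from a name, then decide what to do with a given SMTP reply code. The patterns, function names, and the `example.test` domain are assumptions for illustration; the reply-code classes (2xx accepted, 4xx temporary failure, 5xx permanent failure) follow standard SMTP semantics.

```python
def candidate_addresses(first, last, domain, max_suffix=1):
    """Step 1: guess addresses from a few common naming patterns."""
    first, last = first.lower(), last.lower()
    local_parts = [
        f"{first}.{last}",       # ali.hassan
        f"{first[0]}.{last}",    # a.hassan
        f"{first[0]}{last}",     # ahassan
    ]
    # Numeric suffixes distinguish users who share a name.
    local_parts += [f"{first}.{last}.{n}" for n in range(1, max_suffix + 1)]
    return [f"{local}@{domain}" for local in local_parts]


def classify_smtp_reply(code):
    """Step 3: interpret the server's reply code for a guessed address."""
    if 200 <= code < 300:
        return "keep"      # accepted: add to the harvested list
    if 400 <= code < 500:
        return "retry"     # temporary failure: inconclusive
    return "discard"       # 5xx: rejected


for address in candidate_addresses("Ali", "Hassan", "example.test"):
    print(address)
print(classify_smtp_reply(250), classify_smtp_reply(550))
```

An accepted reply is not conclusive on its own, because some servers accept mail for any address and bounce it later.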
EMAIL HARVESTING: OTHER OPTIONS
› Spider or Crawler Scans
» Use web crawlers and spiders to go search through the entire website,
forums, blogs, etc., for email addresses
› Search Engines
» Use Google and other search engines to return all email addresses having
a certain suffix, such as “@kaust.edu.sa”
› Email Address Lookup Services
» Hunter.io - https://hunter.io/
» Phonebook.cz - https://phonebook.cz/
» VoilaNorbert - https://www.voilanorbert.com/
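For the crawler and search-engine options above, the core step is pulling addresses with a given suffix out of collected text. A minimal sketch, assuming a simplified address pattern and a hypothetical `example.test` domain:

```python
import re


def harvest_emails(text, domain):
    """Return the unique addresses in text that end in @domain."""
    pattern = re.compile(
        r"[A-Za-z0-9._%+-]+@" + re.escape(domain)
        # Reject longer look-alike domains such as example.test.evil.com.
        + r"(?!\.?[A-Za-z0-9-])",
        re.IGNORECASE,
    )
    return sorted({match.lower() for match in pattern.findall(text)})


# Hypothetical scraped text, used only for illustration.
SCRAPED = (
    "Contact Ali.Hassan@example.test or bob@example.test; "
    "not carol@other.test or dave@example.test.evil.com. "
    "Again: bob@example.test."
)
print(harvest_emails(SCRAPED, "example.test"))
```

The local-part pattern is deliberately simplified and will miss some valid but unusual addresses.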
RECON: LOCATION DETAILS
› Google Maps & Google Earth
» Used to plot data points and cross-reference with known landmarks, addresses, or publicly available datasets
› OpenStreetMap (OSM) Geographic DB & Wikimapia
» Queryable open-source database with loads of features (geographic encyclopedia)
› Quantum Geographic Info System (QGIS)
» Perform detailed spatial analysis and visualization
› World Imagery Wayback
» A digital archive of different versions of World Imagery created over time (online historical atlas)
› IP Geolocation Services
» Translates IP addresses to the corresponding physical location of a system
› Social Networking Sites
» Users share geolocation tags or hashtags; movement patterns of users can be inferred if they post frequently
› Shodan
» Search engine for Internet-connected devices that can also provide geographic location based on IP information
› Maltego
» A data mining tool used to connect location data with other OSINT findings
FINDING LOCATION CAN BE TRICKY
› Even if users don’t explicitly share their location, background or minor
details can provide intelligence:
» Landmarks (Eiffel Tower, Burj Khalifa, road signs)
» Shadows and time of day (can help estimate time zone)
» License plates, billboards, or street signs for geographic hints
› Reverse image search can match an image to a known place or location
» Use AI services to enhance image quality
› Explore video reviews and vlogs on YouTube for certain locations to look for clues and information
RECON: EMPLOYEE INFORMATION
› Lots of people-based search engines out there:
» Pipl, snitch.name, That’sThem, Intelius, myLife, etc.
› Provide the following information:
» Biodata (name, age, address, sex etc.)
» Emails
» Social media presence
» Friends
» Preferences/Interests
» Marital status
» Education
» Court records
» Credit history
» And much more
RECON: ARCHIVED INFORMATION
› Wayback Machine is a digital archive (collection) of the Web
› Useful tool for various reconnaissance scenarios
» Uncovering Deleted Information:
• Archived versions can help recover sensitive information that has been removed from
websites
» Tracking Website Evolution:
• By examining how a website has changed over time, attackers can identify the security
patterns and plan accordingly
» Discovering Deprecated APIs and Endpoints:
• In API reconnaissance, Wayback Machine can help identify endpoints or functionalities
that were once publicly accessible but have since been deprecated or hidden
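Endpoint discovery of this kind is commonly done through the Wayback Machine's CDX index. The sketch below only builds the query and snapshot URLs and makes no request; the parameter choices (`fl`, `collapse`, `limit`) are one reasonable configuration rather than the only one, and `example.com` and the function names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain, limit=50):
    """URL that lists archived captures under a domain, one row per unique URL."""
    params = {
        "url": f"{domain}/*",                   # everything under the domain
        "output": "json",
        "fl": "timestamp,original,statuscode",  # fields to return
        "collapse": "urlkey",                   # de-duplicate repeated captures
        "limit": limit,
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_url(timestamp, original):
    """URL of one archived capture, given its 14-digit timestamp."""
    return f"https://web.archive.org/web/{timestamp}/{original}"


query = cdx_query_url("example.com")
print(query)
print(parse_qs(urlsplit(query).query)["url"])
print(snapshot_url("20200101000000", "http://example.com/"))
```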
RECON: SOCIAL MEDIA INTELLIGENCE
› Profile information
› Photos and videos
› Friend and connection lists
› Status updates and posts
› Groups and communities
› Check-ins and locations
› Likes and interactions
Also RECON: SOCIAL MEDIA INTELLIGENCE
› Impersonation, Sock Puppets, and Sybil
Identities:
» Assume identity of someone the target knows or
trusts or someone they could easily learn to trust
» A fake online identity or persona is called a sock
puppet or sybil identity
• E.g., a male attacker joining a female-only WhatsApp
group by pretending to be a female
» Hides true identity of the attacker while
simultaneously tricking the victim into revealing sensitive information
TOPOLOGY MAPPING: TRACEROUTING
› Trace the route to a host
› Direct interaction with the victim
› How ‘traceroute’ works:
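» Send packet with TTL 1
» First router decrements the TTL to 0, drops the packet, and returns an ICMP Time Exceeded message
» Send packet with TTL 2
» Second router does the same (drops the packet and returns ICMP Time Exceeded)
» Keep incrementing the TTL until the destination replies or the max number of hops is reached
› We learn the identity of each router from the source address of its ICMP response message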
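The TTL mechanism above can be simulated without touching the network. In this sketch the path is a hypothetical list of addresses (the last entry is the destination), and `send_probe` and `traceroute` are illustrative names, not a real traceroute implementation.

```python
def send_probe(path, ttl):
    """Simulate one probe; returns (responder, message)."""
    for router in path[:-1]:
        ttl -= 1                      # every router decrements the TTL
        if ttl == 0:
            # TTL expired here: the router drops the packet and reports back.
            return router, "ICMP Time Exceeded"
    return path[-1], "reply from destination"


def traceroute(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... and record who answers each one."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, message = send_probe(path, ttl)
        hops.append((ttl, responder, message))
        if responder == path[-1]:     # destination reached, stop probing
            break
    return hops


# Hypothetical route: two routers, then the target host.
ROUTE = ["10.0.0.1", "192.0.2.1", "198.51.100.7"]
for hop in traceroute(ROUTE):
    print(hop)
```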
SERVICES/APPS REQUIRE PORTS
Services/Apps run on a specific port(s) over a particular protocol
FTP 21 (TCP)
SSH 22 (TCP)
DNS 53 (TCP, UDP)
Vulnerable service software allows hackers to break into a system
Unpatched Web server software
Buggy DNS server software
Etc.
Hence, an important step in reconnaissance is to discover which ports are open
RFC 793:
Any TCP segment with an out-of-state flag sent to an open port is discarded, whereas segments sent to closed ports should be handled with a RST in response.
RFC 793:
For the ACK flag, out-of-state segments sent to an open/listening port or to closed ports should both be handled with a RST in response.
TCP CONNECT SCAN (VANILLA SCAN) – OPEN STATE
Attacker – SYN –> Target
Attacker <–SYN/ACK Target
Attacker –ACK –> Target
TCP CONNECT SCAN (VANILLA SCAN) – CLOSE STATE
Attacker – SYN –> Target
Attacker <– RST – Target
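A connect scan needs no raw packets, because the operating system performs the full handshake. The sketch below probes a listener it creates itself on 127.0.0.1, so nothing leaves the machine; `tcp_connect_scan` and the state labels are hypothetical names for this example.

```python
import socket


def tcp_connect_scan(host, port, timeout=1.0):
    """Attempt a full three-way handshake and report the port state."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))   # SYN, SYN/ACK, ACK
            return "open"
        except ConnectionRefusedError:   # the target answered with RST
            return "closed"
        except OSError:                  # timeout or network error
            return "filtered or unreachable"


# Local demo: listen on a free port, scan it, close it, scan again.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

open_state = tcp_connect_scan("127.0.0.1", demo_port)
listener.close()
closed_state = tcp_connect_scan("127.0.0.1", demo_port)
print(open_state, closed_state)
```

Because the handshake completes, this scan is easy for the target to log, which is the motivation for the half-open SYN scan.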
TCP SYN SCAN – OPEN STATE
Attacker –SYN –> Target
Attacker <–SYN/ACK Target
Attacker –RST–> Target
TCP SYN SCAN – CLOSE STATE
Attacker – SYN –> Target
Attacker <– RST – Target
TCP FIN SCAN – CLOSE STATE
Attacker – FIN –> Target
Attacker <– RST – Target
TCP FIN SCAN – OPEN STATE
Attacker – FIN –> Target
Attacker <– No response– Target
TCP XMAS SCAN – CLOSE STATE
Attacker – PSH,FIN,URG –> Target
Attacker <– RST– Target
TCP XMAS SCAN – OPEN STATE
Attacker – PSH,FIN,URG –> Target
Attacker <– No response– Target
TCP NULL SCAN – CLOSE STATE
Attacker – NULL–> Target
Attacker <– RST – Target
TCP NULL SCAN – OPEN STATE
Attacker – NULL–> Target
Attacker <– No response– Target
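The FIN, XMAS, and NULL probes differ only in which TCP flag bits are set, and all three are interpreted the same way. A minimal sketch of the flag bytes and the decision rule follows; the names are hypothetical. It reports silence as "open|filtered" because a firewall that drops the probe also produces no response.

```python
# TCP flag bits as defined in the TCP header.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20

PROBE_FLAGS = {
    "fin": FIN,
    "xmas": FIN | PSH | URG,   # packet "lit up like a Christmas tree"
    "null": 0,                 # no flags set at all
}


def interpret_stealth_probe(response_flags):
    """response_flags: flag byte of the reply, or None if nothing came back."""
    if response_flags is None:
        return "open|filtered"   # an open port silently discards the segment
    if response_flags & RST:
        return "closed"          # a closed port answers with RST
    return "unexpected"


for name, flags in PROBE_FLAGS.items():
    print(f"{name}: 0x{flags:02x}")
print(interpret_stealth_probe(None))
print(interpret_stealth_probe(RST | ACK))
```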
TCP ACK SCAN – UNFILTERED STATE
Attacker – ACK–> Target
Attacker <– RST – Target
What we do know is that either no firewall is preventing us from reaching the machine, or the firewall allows ACK packets through
TCP ACK SCAN – FILTERED STATE
Attacker – ACK–> Target
Attacker <– ICMP Error or No Response – Target (behind a firewall or ACL)
TCP ACK Scan
TO CHECK IF PORT IS FILTERED OR UNFILTERED
USEFUL FOR MAPPING FIREWALL RULES
NOT USEFUL ALONE
OFTEN COMBINED WITH SYN SCAN
Two Common UDP Scans
UDP Empty Packet Scan
UDP Application Data Scan
UDP EMPTY PACKET SCAN – OPEN STATE
Attacker – UDP–> Target
Attacker <– No response– Target
UDP EMPTY PACKET SCAN – CLOSE STATE
Attacker – UDP–> Target
Attacker <– ICMP Error– Target
UDP APPLICATION DATA SCAN – OPEN STATE
Attacker – UDP–> Target
Attacker <– UDP response (if the service understands the payload) – Target
UDP APPLICATION DATA SCAN – CLOSE STATE
Attacker – UDP–> Target
Attacker <– ICMP Error– Target
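A UDP probe can be sketched with an ordinary connected datagram socket, which lets the operating system surface an ICMP port-unreachable error as an exception. The demo talks only to a local echo responder it starts itself; `udp_probe` and the state labels are hypothetical, and silence is reported as "open|filtered" since a dropped probe looks the same as an open port that does not answer.

```python
import socket
import threading


def udp_probe(host, port, payload=b"", timeout=1.0):
    """Send one datagram (empty or application data) and classify the result."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.connect((host, port))   # connected socket: ICMP errors are reported
        try:
            sock.send(payload)
            sock.recv(4096)
            return "open"            # the service answered
        except (ConnectionRefusedError, ConnectionResetError):
            return "closed"          # ICMP port unreachable came back
        except socket.timeout:
            return "open|filtered"   # silence: open, or dropped on the way


def _echo_once(server):
    """Hypothetical stand-in for a UDP service: echo a single datagram."""
    data, peer = server.recvfrom(4096)
    server.sendto(data, peer)


server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))        # port 0 lets the OS pick a free port
demo_port = server.getsockname()[1]
threading.Thread(target=_echo_once, args=(server,), daemon=True).start()

open_result = udp_probe("127.0.0.1", demo_port, b"ping")
server.close()
print(open_result)
```

Sending a payload the service understands makes a reply more likely, which is what separates the application data scan from the empty packet scan.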