Web Scraper Project Flashcards
BeautifulSoup constructor from bs4
syntax and function
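BeautifulSoup(markup, parser): parses the HTML string (or open file) with the named parser, e.g. "html.parser", and returns a BeautifulSoup object you can search.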
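A minimal sketch of this card, assuming the bs4 package is installed. The sample HTML string and the variable names are made up for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, invented for this example.
html = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

# First argument: the markup to parse. Second argument: the parser name.
# "html.parser" is the parser that ships with Python.
soup = BeautifulSoup(html, "html.parser")

page_title = soup.title.string        # text inside <title>
first_paragraph = soup.p.get_text()   # text inside the first <p>
print(page_title, first_paragraph)
```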
.li
returns the FIRST OCCURRENCE of a list item tag <li> in the parsed HTML (None if there is none); use find_all("li") to get every list item
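A short sketch of the difference between .li and find_all("li"), assuming bs4 is installed. The list markup is a made-up example.

```python
from bs4 import BeautifulSoup

# Hypothetical list with two items, invented for this example.
html = "<ul><li>one</li><li>two</li></ul>"
soup = BeautifulSoup(html, "html.parser")

# .li gives only the first <li> tag in the document.
first_item_text = soup.li.get_text()

# find_all("li") gives every <li> tag, in document order.
all_item_texts = [li.get_text() for li in soup.find_all("li")]

print(first_item_text)
print(all_item_texts)
```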
.head attribute from bs4
attach to variable containing a BeautifulSoup object
returns the <head> element of the page (title, meta tags, stylesheet links), NOT the headings; headings are the <h1> to <h6> tags
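A small sketch that separates the two ideas, assuming bs4 is installed: .head reaches the <head> element, while headings have to be looked up by their own tag names. The sample page is invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical page with a <head> section and two headings.
html = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Main heading</h1><h2>Sub heading</h2></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

# .head is the <head> element (metadata), not a heading.
head_tag = soup.head
title_in_head = head_tag.title.string

# Headings are found by tag name; find_all accepts a list of names.
heading_texts = [h.get_text() for h in soup.find_all(["h1", "h2"])]

print(head_tag.name, title_in_head)
print(heading_texts)
```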
what are html tags (2)
the building blocks of HTML documents, defining the structure and content of the webpage.
Tags are enclosed in angle brackets (< >) and usually come in pairs: an opening tag and a closing tag.
what is the basic structure of an html tag? (3)
opening tag, content, and closing tag, in that order.
what about self-closing tags?
Tags such as <img> and <br> that have no content and no separate closing tag. They can be written with a forward slash before the closing angle bracket.
Example: <img src="image.jpg" alt="Image description" />
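One way to see the opening tag, content, closing tag order (and how a self-closing tag differs) is to log what a parser reports. This sketch uses Python's standard-library html.parser rather than bs4; the TagLogger class and its event labels are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Records each piece of a tag in the order the parser meets it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("open", tag))

    def handle_data(self, data):
        self.events.append(("content", data))

    def handle_endtag(self, tag):
        self.events.append(("close", tag))

    def handle_startendtag(self, tag, attrs):
        # Called for tags written in the <tag ... /> form.
        self.events.append(("self-closing", tag))


logger = TagLogger()
logger.feed('<p>Hello</p><img src="image.jpg" alt="Image description" />')
logger.close()

for event in logger.events:
    print(event)
```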
what is a div tag?
<div>: Defines a division or section in an HTML document.
what is a span tag?
<span>: Defines an inline section of a document, usually for styling purposes.
what is the body tag?
<body>: Contains the content of the HTML document that is visible to users.
what is the title tag?
<title>: Sets the title of the webpage (displayed in the browser tab).
what is the head tag?
<head>: Contains meta-information about the HTML document (e.g., title, meta tags, links to stylesheets).
what are h1, h2, etc. tags?
<h1> to <h6>: Define headings, with <h1> being the highest level and <h6> the lowest.
what is a p tag?
<p>: Defines a paragraph.
what is an a tag?
<a>: Defines a hyperlink; the href attribute holds the link's destination.
what is an img tag?
<img>: Embeds an image; the src attribute holds the image location. It is a self-closing tag, so there is no </img>.
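For a scraper, the a and img cards usually come down to reading attributes. A minimal sketch, assuming bs4 is installed; the URLs and file name are placeholders, not from a real site.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two links and one image.
html = (
    "<body>"
    '<a href="https://example.com/about">About</a>'
    '<a href="https://example.com/contact">Contact</a>'
    '<img src="image.jpg" alt="Image description">'
    "</body>"
)
soup = BeautifulSoup(html, "html.parser")

# .get("href") returns None instead of raising if the attribute is missing.
links = [(a.get_text(), a.get("href")) for a in soup.find_all("a")]
image_sources = [img.get("src") for img in soup.find_all("img")]

print(links)
print(image_sources)
```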