File Formats Flashcards
What does CSV stand for?
Comma Separated Values
What type of file is plain text with fields separated by commas (,) typically storing data in a tabular format with the same number of fields on each line (row) of the file?
CSV
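As a rough illustration of the tabular layout described above, the sketch below parses a small CSV string with Python's standard csv module. The sample data and the helper name read_rows are made up for this example.

```python
import csv
import io

# Hypothetical sample data: every line has the same number of fields.
CSV_TEXT = "title,author,year\nEmma,Jane Austen,1815\nDracula,Bram Stoker,1897\n"


def read_rows(text):
    """Parse CSV text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)
header, records = rows[0], rows[1:]  # first row holds the column names
```

Note that csv.reader returns every field as a string, so the year would still need int() if a number is wanted.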
Since the file suffix .yaml is the official file extension for YAML files, it is the only extension that is valid.
False. The .yml extension is also valid.
XML has been a W3C standard since:
1998
XML’s syntax is very similar to what other format?
HTML
What file type is used by Anaconda for its configuration files?
YAML
The ZIP file format was originally created and implemented by PKWARE via the utility:
PKZIP
What file format is often associated with Microsoft Excel and/or Google Sheets?
CSV
What file type can easily be created using Python dictionaries {key: value, key: value, …}?
JSON
What file type uses Python-style indentation for structure?
YAML
What does XML stand for?
Extensible Markup Language
What does YAML stand for?
YAML Ain’t Markup Language
XML is:
- typically designed to carry data, not display it like HTML
- frequently used to save and share structured data, often over the internet
- designed to be self-descriptive (you define your own tags)
JSON is:
- language independent, even though it is derived from JavaScript
- lightweight, text-based format used for data interchange
- used to transmit data between a server and web apps (think about REST APIs)
- a popular alternative to XML
YAML is:
- a human readable data-serialization language
- commonly used for configuration files and data storage or transmission
- supports 3 basic data types:
  - scalars (strings, ints, floats)
  - lists
  - associative arrays (maps, dicts, hashes)
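The three basic data types can be seen side by side in a small document. This is only a sketch: it assumes the third-party PyYAML package (imported as yaml), the sample document is invented, and the import is guarded so the snippet still runs where PyYAML is not installed.

```python
try:
    import yaml  # PyYAML, a third-party package (assumed to be installed)
except ImportError:
    yaml = None

# Hypothetical YAML document showing scalars, a list, and a map.
YAML_TEXT = """\
title: Emma          # scalar (string)
year: 1815           # scalar (int)
rating: 4.5          # scalar (float)
genres:              # list
  - romance
  - comedy
author:              # associative array (map)
  first: Jane
  last: Austen
"""

# The Python structure the document above should correspond to.
EXPECTED = {
    "title": "Emma",
    "year": 1815,
    "rating": 4.5,
    "genres": ["romance", "comedy"],
    "author": {"first": "Jane", "last": "Austen"},
}

# safe_load builds only plain Python types (str, int, float, list, dict, ...).
parsed = yaml.safe_load(YAML_TEXT) if yaml is not None else None
```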
ZIP:
- format commonly used for archives and compression
- was quickly supported by many other utilities, including built-in versions in Windows and macOS (formerly Mac OS X)
To open a TXT file:
- use open()
- using “with open(…)” closes the file automatically, so you do not have to call close() yourself
- Example:
import csv

classics = []
with open('classic_books.txt') as file:
    reader = csv.reader(file)
    for row in reader:
        # csv.reader splits each line on commas; rejoin the pieces, then split on "|"
        book_data = "".join(row).split("|")
        classics.append(book_data)
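The sketch below shows the same idea end to end. It writes a small pipe-delimited file to a temporary directory (the file name and contents are made up), reads it back inside a with block, and checks that the file is closed afterwards. It passes delimiter="|" to csv.reader instead of the join-and-split approach on this card, which avoids dropping any commas that appear inside a field.

```python
import csv
import os
import tempfile

# Hypothetical sample file with pipe-delimited fields.
sample = "Emma|Jane Austen|1815\nDracula|Bram Stoker|1897\n"
path = os.path.join(tempfile.mkdtemp(), "classic_books_sample.txt")
with open(path, "w", encoding="utf-8", newline="") as out:
    out.write(sample)

classics = []
with open(path, encoding="utf-8", newline="") as file:
    # Tell csv.reader to split on "|" rather than on commas.
    reader = csv.reader(file, delimiter="|")
    for row in reader:
        classics.append(row)

# Leaving the with block closed the file; no explicit close() was needed.
file_closed = file.closed
```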
How do you display the first or last five records in a DataFrame?
df.head() or df.tail()
json.load()
- JSON -> Python object
- deserializes fp (a file-like object that supports .read() and contains a JSON document) into a Python object
json.dumps()
- Python object -> JSON string
- serializes an object as a JSON string
json.dump()
- Python object -> JSON-formatted stream
- serializes an object as a JSON-formatted stream written to fp (a file-like object that supports .write())
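A short round trip may help keep the three functions apart. This sketch uses an in-memory io.StringIO buffer as a stand-in for a real file, and the book dictionary is invented sample data.

```python
import io
import json

# Hypothetical sample data.
book = {"title": "Emma", "author": "Jane Austen", "year": 1815}

# dumps: Python object -> JSON string (the "s" stands for string).
json_text = json.dumps(book)

# dump: Python object -> JSON written to a .write()-supporting object.
buffer = io.StringIO()
json.dump(book, buffer)

# load: JSON read from a .read()-supporting object -> Python object.
buffer.seek(0)  # rewind so load() reads from the start
loaded = json.load(buffer)
```

With a real file, the buffer would be replaced by the object returned from open(), ideally inside a with block.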
How do you read a YAML file?
- use yaml.load()
- Example:
import yaml

with open('classic_books.yaml') as file:
    classics_dict = yaml.load(file, Loader=yaml.FullLoader)
How do you read and parse an XML file?
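- ElementTree.parse() parses the file and returns an ElementTree object (for example, tree = ET.parse(path), with xml.etree.ElementTree imported as ET)
- tree.getroot() gets the root element of the tree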
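As a minimal sketch of the parse-then-getroot pattern, the snippet below parses a tiny XML document from an in-memory buffer (a stand-in for a file on disk) and pulls out some values. The library and book tags are invented for the example.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical XML document; the tag names are our own, as XML allows.
XML_TEXT = """<library>
  <book year="1815"><title>Emma</title></book>
  <book year="1897"><title>Dracula</title></book>
</library>"""

# parse() accepts a file name or a file object and returns an ElementTree.
tree = ET.parse(io.StringIO(XML_TEXT))
root = tree.getroot()  # the top-level <library> element

# Walk the direct <book> children and collect their data.
titles = [book.findtext("title") for book in root.findall("book")]
years = [book.get("year") for book in root.findall("book")]
```

Attribute values such as year come back as strings, so they need converting if numbers are wanted.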
How do you write a zip file?
- zipfile.ZipFile()
- Example:
import zipfile

# files_to_zip is a list of paths to the files being archived
with zipfile.ZipFile('CSC221Lab7.zip', 'w') as outfile:
    for file in files_to_zip:
        outfile.write(file)
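To try the pattern without existing files, the sketch below creates two small files in a temporary directory, zips them, and reads the archive back. The file names and contents are made up. The compression=zipfile.ZIP_DEFLATED and arcname arguments are optional additions not used in the example above.

```python
import os
import tempfile
import zipfile

# Create two hypothetical files to archive.
workdir = tempfile.mkdtemp()
files_to_zip = []
for name, text in [("notes.txt", "hello"), ("data.csv", "a,b\n1,2\n")]:
    path = os.path.join(workdir, name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    files_to_zip.append(path)

archive_path = os.path.join(workdir, "example.zip")

# Mode "w" creates (or overwrites) the archive; ZIP_DEFLATED compresses entries.
with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as outfile:
    for file in files_to_zip:
        # arcname stores only the base name instead of the full temp path.
        outfile.write(file, arcname=os.path.basename(file))

# Reopen in the default read mode to inspect what was stored.
with zipfile.ZipFile(archive_path) as infile:
    archived_names = infile.namelist()
    notes_text = infile.read("notes.txt").decode("utf-8")
```

Without compression=..., ZipFile stores files uncompressed (ZIP_STORED), which is why the argument is passed explicitly here.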