Introduction to data science Flashcards

1
Q

Web scraping tools: ready-made

A
import.io
ScraperWiki
Tabula
Google Sheets (=IMPORTHTML)
Excel
2
Q

Web scraping tools: custom

A
R
Python
Bash
Java
PHP
3
Q

Steps to explore data

A

Review data
Check assumptions
Check for anomalies
Data Suggestions

4
Q

Exploratory Graphics

A

Coding: R, Python, JavaScript
Applications: Tableau, Qlik

5
Q

Exploratory Graphics

A

Bar Charts: for categories and can be grouped
Box Plots: for quantitative variables, in quartiles, show outliers
Histograms: show shape of distribution
Scatter plot matrices: show pairwise relationships among several variables
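As a rough illustration of what a box plot encodes, the sketch below computes the quartiles and flags outliers. The sample values are made up, and the 1.5 × IQR fence is a common convention assumed here, not something the card specifies.

```python
import statistics

def box_plot_summary(values):
    """Return the quartiles and the points a box plot would flag as outliers."""
    # "inclusive" treats the data as the whole population of interest.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}

data = [2, 3, 4, 5, 6, 7, 8, 9, 50]  # hypothetical sample with one extreme value
summary = box_plot_summary(data)
print(summary)
```

The extreme value 50 lands outside the upper fence, which is the kind of point a box plot draws separately from the whiskers.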

6
Q

Questions to ask in exploratory process

A

Do you have what you need?
Are there clumps or gaps?
Are there exceptional cases?
Are there errors in the data?

7
Q

Exploratory Statistics

A

Robust statistics: stable, less affected by outliers, skewness, kurtosis
Resampling: empirical estimate of sampling variability (jackknife, bootstrap, permutation, cross-validation)
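A minimal sketch of both ideas, using made-up numbers: the median stands in for a robust statistic, and a simple bootstrap estimates how much that median varies from sample to sample. The function name `bootstrap_medians` and its parameters are hypothetical, chosen only for this example.

```python
import random
import statistics

data = [10, 11, 12, 13, 14, 15, 200]  # hypothetical sample with one outlier

mean_value = statistics.mean(data)      # pulled strongly toward the outlier
median_value = statistics.median(data)  # robust: stays near the bulk of the data

def bootstrap_medians(values, n_resamples=1000, seed=0):
    """Resample with replacement and record the median of each resample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.median(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]

medians = bootstrap_medians(data)
# The spread of the resampled medians is an empirical estimate of
# the sampling variability of the median.
spread = statistics.stdev(medians)
print(mean_value, median_value, spread)
```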

8
Q

Advantages of Excel

A

Good for browsing, sorting, rearranging, getting a visual picture, finding and replacing
More uses: formatting, transposing, making pivot tables

9
Q

SQL

A

First, SQL is the language used for getting data from relational databases. Second, a few commands go a very long way. And third, the data is usually pulled out of a database and then sent to other programs like R or Python for analysis.
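The workflow described here can be sketched with Python's built-in `sqlite3` module. The `sales` table, its columns, and its rows are invented for illustration; a real project would connect to an existing database instead of an in-memory one.

```python
import sqlite3
import statistics

# In-memory database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 100.0), ("west", 95.0)],
)

# A single SELECT with WHERE and ORDER BY already covers a lot of ground.
rows = conn.execute(
    "SELECT region, amount FROM sales WHERE amount >= ? ORDER BY amount",
    (95.0,),
).fetchall()
conn.close()

# Once the data is pulled out, the analysis happens in Python.
amounts = [amount for _, amount in rows]
average_amount = statistics.mean(amounts)
print(rows)
print(average_amount)
```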

10
Q

HTML

A

HTML, or HyperText Markup Language, is the language of web pages; it says what the text is, what the headings are, and where to put links. The information on a web page is styled with CSS, which stands for Cascading Style Sheets.
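To make the "headings and links" idea concrete, here is a small sketch that reads both out of an HTML snippet with the standard-library `html.parser` module. The page content, the example.com URL, and the class name `HeadingAndLinkParser` are all made up for this example.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3")

class HeadingAndLinkParser(HTMLParser):
    """Collect heading text and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self.links = []
        self._in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._in_heading = True
            self.headings.append("")  # start collecting this heading's text
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self._in_heading = False

    def handle_data(self, data):
        if self._in_heading:
            self.headings[-1] += data

# Hypothetical page used only for this sketch.
page = """<html><body>
<h1>Data science notes</h1>
<p>See the <a href="https://example.com/sql">SQL card</a>.</p>
</body></html>"""

parser = HeadingAndLinkParser()
parser.feed(page)
print(parser.headings, parser.links)
```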

11
Q

XML

A

XML, or Extensible Markup Language, is a data encoding that is simultaneously human-readable and machine-readable, which is not true of every data format.
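A short sketch of the "human-readable and machine-readable" point: the XML below can be read by eye, and the standard-library `xml.etree.ElementTree` module can parse it in a few lines. The `cards`/`card`/`term` element names are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the element and attribute names are assumptions.
xml_text = """<cards>
  <card id="1"><term>SQL</term></card>
  <card id="2"><term>HTML</term></card>
</cards>"""

root = ET.fromstring(xml_text)
# Map each card's id attribute to the text of its <term> child.
terms = {card.get("id"): card.findtext("term") for card in root.findall("card")}
print(terms)
```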
