Chapter 6 - Data Analytical Tools Flashcards
List the 3 reasons the R language is so popular.
1) it’s open source
2) the machine learning packages are free on CRAN, the R code repository
3) the tidyverse data-analysis packages simplify the R language and make it easier for data analysts to use
Python has a specialist tool specifically for data analysts to use. What is it called and what kind of data structures does it work with?
pandas = Python Data Analysis Library.
It’s specifically for structured/tabular data
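As a rough illustration of what "structured/tabular data" means in pandas, the sketch below builds a small table (a DataFrame), derives a column and aggregates by group. The column names and values are invented for the example and do not come from the book.

```python
import pandas as pd

# Hypothetical sales records; names and numbers are made up for illustration.
sales = pd.DataFrame(
    {
        "region": ["North", "South", "North", "East"],
        "units": [10, 4, 6, 8],
        "unit_price": [2.5, 3.0, 2.5, 4.0],
    }
)

# Derive a new column from two existing columns, row by row.
sales["revenue"] = sales["units"] * sales["unit_price"]

# Sum revenue within each region.
revenue_by_region = sales.groupby("region")["revenue"].sum()
print(revenue_by_region)
```

Each column holds one kind of value and each row is one record, which is the shape pandas is built around.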
Which of the following commands are Data Definition Language (DDL) and which are Data Manipulation Language (DML)?
CREATE
UPDATE
ALTER
DELETE
DROP
SELECT
INSERT
CREATE - DDL
UPDATE - DML
ALTER - DDL
DELETE - DML
DROP - DDL
SELECT - DML
INSERT - DML
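One way to see the split in practice is to run one statement of each kind. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the `staff` table, its columns and the `classify` helper are hypothetical names chosen for the example. DDL statements change the structure of the database, while DML statements work with the rows inside it.

```python
import sqlite3

# Classification taken from the flashcard above.
DDL = {"CREATE", "ALTER", "DROP"}
DML = {"SELECT", "INSERT", "UPDATE", "DELETE"}


def classify(statement: str) -> str:
    """Return 'DDL' or 'DML' based on the statement's first keyword."""
    keyword = statement.strip().split()[0].upper()
    if keyword in DDL:
        return "DDL"
    if keyword in DML:
        return "DML"
    return "other"


# Hypothetical statements against a made-up 'staff' table.
statements = [
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE staff ADD COLUMN team TEXT",
    "INSERT INTO staff (name, team) VALUES ('Ada', 'Analytics')",
    "UPDATE staff SET team = 'BI' WHERE name = 'Ada'",
    "SELECT name, team FROM staff",
    "DELETE FROM staff WHERE name = 'Ada'",
    "DROP TABLE staff",
]

conn = sqlite3.connect(":memory:")
labels = []
for sql in statements:
    # fetchall() drains any result rows so the statement is fully finished.
    conn.execute(sql).fetchall()
    labels.append(classify(sql))
    print(f"{labels[-1]}: {sql}")
conn.close()
```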
Which 3 DDL commands does the book mention?
CREATE
ALTER
DROP
Which 2 ways can tools interact with databases using SQL?
1) provide a GUI to either reconfigure the database or retrieve data from the database
2) directly via computer software.
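The second route, software talking to the database directly, might look like the minimal sketch below. It assumes Python's built-in sqlite3 module and an invented `orders` table; the `orders_over` function is a hypothetical helper, not something from the book.

```python
import sqlite3


def orders_over(conn: sqlite3.Connection, minimum: float) -> list:
    """Return (customer, total) rows whose total exceeds `minimum`."""
    # The ? placeholder passes the value as a parameter rather than
    # pasting it into the SQL text.
    cursor = conn.execute(
        "SELECT customer, total FROM orders WHERE total > ? ORDER BY total DESC",
        (minimum,),
    )
    return cursor.fetchall()


# Set up a throwaway in-memory database with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("Acme", 120.0), ("Birch", 45.5), ("Cedar", 300.0)],
)

large_orders = orders_over(conn, 100.0)
print(large_orders)
conn.close()
```

A GUI tool issues the same kind of SQL behind the scenes; here the program sends it itself.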
Which statistics packages created in the 1960s and 1970s are still widely in use today?
IBM SPSS
SAS
SPSS Statistics is a different product from SPSS Modeler!
Which two statistics packages (or statistical analysis software) aren’t used as much today?
Minitab and Stata
If you saw a machine learning tool interface that showed decision trees in the form of flowcharts, which tool would you be looking at?
IBM SPSS Modeler
Which machine learning tool offers pre-built analytics templates for common business scenarios and access to hundreds of different algorithms?
RapidMiner
Farrow wants to analyse data, but the analysis requires a bespoke solution. Which of the methods below should they go with?
Spreadsheets
IBM SPSS
Python or R
Using Python or R will allow Farrow to build an analysis package that suits their specific needs
List the analytics suites discussed in the book
AWS QuickSight
BusinessObjects
Domo
Datorama
IBM Cognos
MicroStrategy
Power BI
Qlik
Tableau
Which analytics suite is specifically designed for Sales and Marketing functions of the business?
Datorama
Which analytics suite from SAP allows companies to integrate it with other applications?
BusinessObjects
Raheela needs to create bespoke analytics but she’s not very familiar with programming. Which solution should she choose?
R
List the SQL Data Manipulation Commands
UPDATE
SELECT
INSERT
DELETE
What’s the key benefit machine learning software or packages bring to analysts of today?
They don’t require analysts to know programming languages or write their own scripts/algorithms
What’s considered the most popular data visualization tool?
Tableau
Which data visualization tool enables easy ingestion of data from a wide variety of sources?
Tableau
Fill in the blanks: Qlik ___ is the company’s original analytics platform and Qlik ___ is a more advanced platform
QlikView is the company’s original analytics platform and Qlik Sense is a more advanced platform
Regarding IBM Cognos, what’s the difference between the two modules below?
Query Studio
Report Studio
Query Studio provides access to data querying and basic reporting
Report Studio is for complex reporting needs
Which ‘Studio’ module in IBM Cognos enables advanced modelling and analytics for large data sets?
Analysis Studio
If you’re a business executive and want to create a scorecard to analyse key metrics across the business, which module would you use in IBM Cognos?
Metric Studio
Which Power BI module allows developers to create paginated reports that are designed for printing and email?
Power BI Report Builder
Think: building reports to publish
An organization wants to host its own Power BI capability. What offering would they purchase?
Power BI Report Server
List the analytics suites beginning with A and B
AWS QuickSight
Business Objects (SAP)
List the analytics suites beginning with D
Domo
Datorama
List the analytics suites beginning with C and P
Cognos
Power BI
List the analytics suites beginning with Q and T
Qlik
Tableau
Power BI Report Builder has a key function to use specifically when printing or emailing reports. What is it?
Pagination
Which 2 products in the book are built especially for machine learning capabilities?
IBM SPSS Modeler
RapidMiner
Which name is given to a group of packages within R that is specifically designed for data manipulation, analysis and visualization?
Tidyverse
Where are the machine learning and tidyverse analytics packages stored for R?
In CRAN (Comprehensive R Archive Network)
These provide an advanced statistical environment via a GUI or built-in scripting language and are generally the domain of professional statisticians
Statistics packages