Instructor Created Flashcards
What is the purpose of affinity grouping?
To evaluate relationships or associations between data elements that demonstrate some kind of affinity between objects
How is estimation defined in data analysis?
A process of assigning a continuous numeric value to an object
What is a key benefit of the estimation process?
Results can be ranked by score
What does classification involve?
Organizing data into predefined classes
What is the goal of the classification process?
To build a model that can accurately classify new records
What is clustering in data mining?
The task of dividing a large collection of entities into smaller groups based on similarity
What is the main process in data mining?
Assemble information, prepare it for mining, apply algorithms, and analyze results
What does data mining rely on?
Using one set of data for training and another for testing
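A minimal sketch of the training/testing split, assuming the records fit in a Python list. The helper name `train_test_split` and the 25% test fraction are illustrative choices, not part of the course material.

```python
import random

def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle a copy of the records and split it into (train, test) lists."""
    rng = random.Random(seed)          # fixed seed keeps the split repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

records = list(range(20))              # stand-in for real data records
train, test = train_test_split(records)
print(len(train), len(test))
```

The model is built on `train` only; `test` is held back to check how well it classifies records it has not seen.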
What is data type conversion?
Parsing strings representing values and transforming them into the proper form for the target machine
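A small sketch of data type conversion, assuming the source delivers every field as a string. The field names (`order_id`, `amount`, `order_date`) are hypothetical.

```python
from datetime import date
from decimal import Decimal

def convert_row(raw):
    """Parse string fields into typed values for the target system."""
    return {
        "order_id": int(raw["order_id"]),
        "amount": Decimal(raw["amount"]),                    # exact decimal, no float rounding
        "order_date": date.fromisoformat(raw["order_date"]), # expects YYYY-MM-DD
    }

row = convert_row({"order_id": "42", "amount": "19.99", "order_date": "2024-03-01"})
print(row)
```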
What is data cleansing?
Correcting known data errors and automating corrections
What is the purpose of integration in data processing?
To represent linkage between different tables and maintain metadata
What does referential integrity checking ensure?
That referential integrity constraints are not violated
What are derivations in data processing?
Transformations based on business rules applied during data movement
What is the difference between denormalization and renormalization?
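Denormalization breaks normalized source data out into a flatter, simpler form (for example, when building dimension tables), while renormalization restores a normalized structure to data that arrived denormalized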
What is aggregation in data processing?
Populating summaries or cube dimensions in the staging area
What is the purpose of audit information?
To provide a reference for integrity checking
What is null conversion?
Transforming different forms of nulls from disparate systems
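A sketch of null conversion, assuming the source systems mark missing values with the placeholder strings listed in `NULL_TOKENS`. That list is an assumption for illustration; a real one would come from profiling each source.

```python
# Hypothetical placeholder strings that different systems use for "no value".
NULL_TOKENS = {"", "null", "n/a", "na", "none", "-"}

def normalize_null(value):
    """Map every known null representation to Python's None."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip().lower() in NULL_TOKENS:
        return None
    return value

print([normalize_null(v) for v in ["N/A", "  ", "NULL", "Smith", 0]])
```

Real values such as `0` or `"Smith"` pass through unchanged; only recognized placeholders become `None`.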
What are the two key questions in the extraction phase of ETL?
- What data should be extracted?
- How should that data be extracted?
What is a data mart?
A subject-oriented data repository for decision support and BI needs of a specific department
What does the ETL process stand for?
Extract, Transform, Load
What are the main steps in the ETL process?
- Get data from the source location
- Map data into a suitable model
- Validate and clean data
- Apply transformations
- Move data to the repository
- Load data into the warehouse
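The steps above can be sketched end to end in a few lines. This is a toy pipeline under stated assumptions: the source is an in-memory list, the "warehouse" is an in-memory SQLite database, and the table and column names are made up for the example.

```python
import sqlite3

# Hypothetical source data; every field arrives as a string.
source_rows = [
    {"id": "1", "name": " Alice ", "amount": "10.50"},
    {"id": "2", "name": "Bob", "amount": "not-a-number"},  # fails validation
    {"id": "3", "name": "Carol", "amount": "7.25"},
]

def extract():
    """Get data from the source location."""
    return list(source_rows)

def transform(rows):
    """Validate, clean, and convert rows into the target model."""
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue                                  # drop rows that fail validation
        clean.append((int(r["id"]), r["name"].strip(), round(amount * 100)))
    return clean

def load(conn, rows):
    """Load the prepared rows into the target table."""
    conn.execute("CREATE TABLE sales (id INTEGER PRIMARY KEY, name TEXT, amount_cents INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
loaded = conn.execute("SELECT id, name, amount_cents FROM sales ORDER BY id").fetchall()
print(loaded)
```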
What is the financial value associated with increased profitability?
Derived from lowered costs or increased revenues
Financial value is crucial for evaluating business performance.
What does productivity value refer to?
Decreased workloads and high-quality outcomes
It emphasizes efficiency in processes like manufacturing.
What is trust value in a business context?
Greater customer, employee, or supplier satisfaction and confidence in forecasting
Trust value also includes better management reports and decision-making.
What does risk value encompass?
Improved visibility into credit exposure and confidence in capital investments
It also involves maintaining auditable records.
Define a data warehouse.
A centralized repository of information
It is essential for data analysis and reporting.
How is a data warehouse organized?
Around relevant subject areas important to the organization
This organization supports targeted data analysis.
What is the primary use of a data warehouse?
For analysis and not for transaction processing
This distinction is critical for understanding its role in data management.
What characteristic does the data in a data warehouse have?
Nonvolatile
Nonvolatility means that data remains stable and is not frequently changed.
What is the target location for integrating data from multiple sources?
A data warehouse
It consolidates data from both internal and external sources.
Fill in the blank: A data warehouse provides a platform for different _______ to submit queries about enterprise information.
[consumers]
Consumers can be both human and automated.
What is the purpose of MySQL Workbench?
A visual database design and management tool for MySQL databases.
What are the features of MySQL Workbench?
Provides a user-friendly interface for:
- database modeling
- SQL development
- server configuration
- user administration
- backup and data recovery
In which use case is MySQL Workbench ideal?
Managing MySQL databases graphically, creating visual representations of database schemas, and managing complex SQL statements.
What is the purpose of the MySQL Command-Line Client?
A command-line interface for interacting with MySQL databases.
What are the features of the MySQL Command-Line Client?
Allows users to execute SQL queries, manage databases, and perform administrative tasks directly through the terminal.
Who is the target user for MySQL Command-Line Client?
Users who prefer command-line operations or are working on servers without a graphical interface.
What is SQL Server Management Studio (SSMS) used for?
A management tool for Microsoft SQL Server databases.
What are the features of SQL Server Management Studio (SSMS)?
Offers a graphical interface for SQL Server database development, management, and administration, including querying and performance monitoring.
What is phpMyAdmin?
A web-based application to manage MySQL databases.
What are the features of phpMyAdmin?
Offers a browser interface for executing SQL queries, managing tables, and handling database administration tasks.
What is the key difference in database compatibility between MySQL Workbench, phpMyAdmin, and SQL Server Management Studio?
- MySQL Workbench and phpMyAdmin are built for MySQL databases (phpMyAdmin also supports MariaDB).
- SQL Server Management Studio is exclusively for Microsoft SQL Server.
What is a primary key?
A unique identifier for each record in a table.
What are the characteristics of a primary key?
- Cannot contain null values
- Must contain unique values
What type of relationship does a primary key establish between tables?
Establishes a one-to-one or one-to-many relationship with foreign keys in other tables.
What is a foreign key?
A column or a set of columns in one table that references the primary key in another table.
What are the characteristics of a foreign key?
- Can contain null values
- Can have duplicate values
What type of relationship does a foreign key establish between tables?
Establishes a many-to-one relationship with the table containing the primary key.
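A short sketch of how primary and foreign keys behave, using Python's built-in `sqlite3` as a stand-in for MySQL. The `departments` and `employees` tables are hypothetical, and SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite needs this to enforce foreign keys
conn.execute("CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""CREATE TABLE employees (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER REFERENCES departments(dept_id))""")

conn.execute("INSERT INTO departments VALUES (1, 'Sales')")
conn.execute("INSERT INTO employees VALUES (1, 'Ann', 1)")
conn.execute("INSERT INTO employees VALUES (2, 'Ben', 1)")    # duplicate foreign key value: allowed
conn.execute("INSERT INTO employees VALUES (3, 'Cy', NULL)")  # null foreign key: allowed

def rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

dup_pk_rejected = rejected("INSERT INTO departments VALUES (1, 'Ops')")      # primary key reused
orphan_fk_rejected = rejected("INSERT INTO employees VALUES (4, 'Di', 99)")  # no department 99
print(dup_pk_rejected, orphan_fk_rejected)
```

Many employees point at one department, which is the many-to-one relationship described above.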
What is a composite key?
A combination of two or more columns in a table that together uniquely identify a record.
What does a composite key allow; what does it consist of?
- May allow null values in some columns when used as a unique key, but not when it serves as the primary key
- Always consists of multiple columns
What type of relationship does a composite key establish between tables?
Establishes relationships where each part of the composite key must be considered in conjunction with others.
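A sketch of a composite primary key, again using `sqlite3` as a stand-in. The `order_items` table is hypothetical: neither column is unique alone, but the pair must be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE order_items (
    order_id   INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id))""")

conn.execute("INSERT INTO order_items VALUES (1, 10, 2)")
conn.execute("INSERT INTO order_items VALUES (1, 11, 1)")  # same order, different product: allowed
conn.execute("INSERT INTO order_items VALUES (2, 10, 5)")  # same product, different order: allowed

try:
    conn.execute("INSERT INTO order_items VALUES (1, 10, 9)")  # same (order, product) pair
    duplicate_pair_rejected = False
except sqlite3.IntegrityError:
    duplicate_pair_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM order_items").fetchone()[0]
print(duplicate_pair_rejected, row_count)
```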
What is a candidate key?
A column or set of columns that can qualify as a potential primary key for a table.
What are the characteristics of a candidate key?
- Should not contain null values (a key chosen as the primary key cannot be null)
- Must contain unique values
What SQL statement is used to order the result set?
ORDER BY
What is the purpose of the WHERE clause in SQL?
Filters records based on a specified condition.
What SQL statement is used to group rows that have the same values in specified columns?
GROUP BY
What is the purpose of the LIMIT clause in SQL?
Restricts the number of rows returned by the query.
What does the IN operator do in SQL?
Checks if a column’s value exists within a list of specified values.
What does the BETWEEN operator do in SQL?
Checks if a value falls within a specified inclusive range.
What is the purpose of the EXISTS operator in SQL?
Evaluates whether a subquery returns any rows.
What does the ALL operator do in SQL?
Compares a value against every value returned by a subquery; the condition is true only if the comparison holds for all of them.
What is the syntax of the UPDATE statement in SQL?
UPDATE table_name SET column1 = value1, column2 = value2, ... WHERE condition;
UPDATE table_name SET … is the core. However, always use a WHERE condition to specify which rows to update; without it, every row in the table is updated.
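A sketch of why the WHERE condition matters in an UPDATE, using `sqlite3` and a hypothetical `employees` table. `rowcount` reports how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)",
                 [(1, "Ann", 50000), (2, "Ben", 60000), (3, "Cy", 70000)])

# With WHERE: only the matching row changes.
targeted = conn.execute("UPDATE employees SET salary = 65000 WHERE id = 2")
rows_changed_with_where = targeted.rowcount
salaries_after_targeted = conn.execute("SELECT salary FROM employees ORDER BY id").fetchall()

# Without WHERE: every row in the table changes.
unfiltered = conn.execute("UPDATE employees SET salary = 0")
rows_changed_without_where = unfiltered.rowcount

print(rows_changed_with_where, rows_changed_without_where)
```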
What is the function of the COMMIT command in SQL?
Saves all changes made in the current transaction to the database permanently.
What does the ROLLBACK command do?
Undoes all changes made in the current transaction.
What is the function of the SAVEPOINT command?
Creates a savepoint within a transaction allowing for partial rollbacks.
What is the UNDO command in SQL?
Not a standard SQL command; behavior is generally covered by the ROLLBACK command.
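A sketch of COMMIT, ROLLBACK, and SAVEPOINT working together, using `sqlite3` with `isolation_level=None` so the transaction statements are issued explicitly. The `accounts` table and the savepoint name `before_bonus` are hypothetical, and exact savepoint syntax can vary slightly between database systems.

```python
import sqlite3

# isolation_level=None: Python does not open transactions on its own here.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 100)")

conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = 150 WHERE id = 1")
conn.execute("SAVEPOINT before_bonus")
conn.execute("UPDATE accounts SET balance = 999 WHERE id = 1")
conn.execute("ROLLBACK TO SAVEPOINT before_bonus")   # partial rollback: undoes only the 999 update
conn.execute("COMMIT")                               # makes the 150 update permanent
after_commit = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

conn.execute("BEGIN")
conn.execute("UPDATE accounts SET balance = 0 WHERE id = 1")
conn.execute("ROLLBACK")                             # undoes everything since BEGIN
after_rollback = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]

print(after_commit, after_rollback)
```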
What is the purpose of the BETWEEN operator?
Filters results within a specific range (inclusive).
What is the LIKE operator used for in SQL?
Used for pattern matching in SQL.
What is the function of the BETWEEN operator in SQL?
The BETWEEN operator is used to filter results within a specific range (inclusive).
It checks if a column’s value is within a specified lower and upper limit.
What types of values can the BETWEEN operator be used with?
Numeric, date, or text values.
Provide an example of a query using the BETWEEN operator.
SELECT * FROM products WHERE price BETWEEN 50 AND 100;
What is the function of the LIKE operator in SQL?
The LIKE operator is used for pattern matching in SQL.
What wildcard character represents zero or more characters in the LIKE operator?
%
What wildcard character represents a single character in the LIKE operator?
_
Provide an example of a query using the LIKE operator.
SELECT * FROM employees WHERE last_name LIKE 'Smi%';
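A sketch comparing the two LIKE wildcards on a hypothetical `employees` table, run through `sqlite3`. The sample names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (last_name TEXT)")
conn.executemany("INSERT INTO employees VALUES (?)",
                 [("Smith",), ("Smithers",), ("Smyth",), ("Jones",)])

# % matches zero or more characters after the prefix 'Smi'.
prefix_matches = [r[0] for r in conn.execute(
    "SELECT last_name FROM employees WHERE last_name LIKE 'Smi%' ORDER BY last_name")]

# _ matches exactly one character in the third position.
single_char_matches = [r[0] for r in conn.execute(
    "SELECT last_name FROM employees WHERE last_name LIKE 'Sm_th' ORDER BY last_name")]

print(prefix_matches, single_char_matches)
```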
What is the function of the IN operator in SQL?
The IN operator is used to specify multiple values in a WHERE clause.
What types of values can the IN operator be used with?
Numeric, character, or date values.
Provide an example of a query using the IN operator.
SELECT * FROM customers WHERE country IN ('USA', 'Canada', 'Mexico');
What is the function of the EXISTS operator in SQL?
The EXISTS operator is used to check for the existence of any records in a subquery.
When is the EXISTS operator often used?
Frequently used with subqueries to filter records based on conditions in related tables.
Provide an example of a query using the EXISTS operator.
SELECT * FROM products p WHERE EXISTS (SELECT * FROM orders o WHERE o.product_id = p.product_id);
What does the EXISTS operator return if the subquery returns one or more records?
True
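A sketch of EXISTS and its negation, assuming hypothetical `products` and `orders` tables loaded into `sqlite3`. A product is kept when the correlated subquery finds at least one order for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, product_id INTEGER)")
conn.executemany("INSERT INTO products VALUES (?, ?)",
                 [(1, "Widget"), (2, "Gadget"), (3, "Gizmo")])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(100, 1), (101, 1), (102, 3)])

# Products with at least one order.
ordered = [r[0] for r in conn.execute("""
    SELECT name FROM products p
    WHERE EXISTS (SELECT * FROM orders o WHERE o.product_id = p.product_id)
    ORDER BY p.product_id""")]

# Products with no orders at all.
never_ordered = [r[0] for r in conn.execute("""
    SELECT name FROM products p
    WHERE NOT EXISTS (SELECT * FROM orders o WHERE o.product_id = p.product_id)
    ORDER BY p.product_id""")]

print(ordered, never_ordered)
```

Widget appears once even though it has two orders, because EXISTS only asks whether any matching row exists.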
What does the BETWEEN operator do in SQL?
Filters data within a specified inclusive range.
What operators can achieve the same effect as the BETWEEN operator?
>= and <= (Greater Than or Equal To, Less Than or Equal To)
Provide an example of using >= and <= instead of BETWEEN.
SELECT * FROM employees WHERE salary >= 50000 AND salary <= 100000;
What do the > and < operators do in relation to the BETWEEN operator?
They filter a range like BETWEEN, but exclude the boundary values.
Provide an example of using > and < instead of BETWEEN.
SELECT * FROM employees WHERE salary > 50000 AND salary < 100000;
When should you use the BETWEEN operator?
When you want to specify a clear range with inclusive boundaries.
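A sketch that checks the boundary behavior described above on a hypothetical `employees` table in `sqlite3`. Two of the sample salaries sit exactly on the range limits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ann", 50000), ("Ben", 75000), ("Cy", 100000), ("Di", 120000)])

def salaries(where):
    """Return the salaries selected by the given WHERE condition, lowest first."""
    sql = f"SELECT salary FROM employees WHERE {where} ORDER BY salary"
    return [r[0] for r in conn.execute(sql)]

between = salaries("salary BETWEEN 50000 AND 100000")          # boundaries included
ge_le = salaries("salary >= 50000 AND salary <= 100000")       # same rows as BETWEEN
gt_lt = salaries("salary > 50000 AND salary < 100000")         # boundaries excluded

print(between, ge_le, gt_lt)
```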
What is the role of the GROUP BY keyword in SQL?
Used to group rows that have the same values in specified columns into summary rows, usually so that aggregate functions can be applied to each group.
Which SQL functions are commonly used with GROUP BY?
COUNT, SUM, AVG
Provide an example of a query using GROUP BY.
SELECT department, COUNT(*) FROM employees GROUP BY department;
What is the function of the ORDER BY keyword in SQL?
Used to sort the result set of a query by one or more columns.
What options can you specify when using ORDER BY?
Ascending (ASC) or descending (DESC) order.
Provide an example of a query using ORDER BY.
SELECT name, salary FROM employees ORDER BY salary DESC;
Is SORT BY a defined keyword in standard SQL?
No
What should you use instead of SORT BY in standard SQL?
ORDER BY
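What is FILTER used for in SQL?
The FILTER (WHERE ...) clause restricts which rows an aggregate function processes, including aggregates used as window functions; FILTER BY itself is not a standard SQL keyword.
Can FILTER BY be used as a standalone keyword in standard SQL?
No
Provide an example of a query using the FILTER clause.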
SELECT department, COUNT(*) FILTER (WHERE salary > 50000) FROM employees GROUP BY department;
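Some databases, MySQL among them, do not accept the FILTER (WHERE ...) syntax. A common portable alternative is a CASE expression inside the aggregate, sketched here with `sqlite3` and a hypothetical `employees` table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Sales", 40000), ("Sales", 60000), ("Sales", 70000),
                  ("IT", 55000), ("IT", 45000)])

# COUNT ignores NULLs, so the CASE yields 1 only for rows that pass the condition.
high_earners = conn.execute("""
    SELECT department,
           COUNT(CASE WHEN salary > 50000 THEN 1 END) AS high_earners
    FROM employees
    GROUP BY department
    ORDER BY department""").fetchall()

print(high_earners)
```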
What is the key difference between GROUP BY and ORDER BY?
GROUP BY aggregates data based on one or more columns, while ORDER BY sorts the results of the query.
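A sketch that uses both clauses in one query, assuming a hypothetical `employees` table in `sqlite3`: GROUP BY builds one summary row per department, ORDER BY then sorts those rows, and LIMIT trims the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", [
    ("Ann", "Sales", 50000), ("Ben", "Sales", 60000), ("Cy", "Sales", 70000),
    ("Di", "IT", 80000), ("Ed", "IT", 90000), ("Flo", "HR", 55000),
])

# One row per department, largest headcount first.
by_headcount = conn.execute("""
    SELECT department, COUNT(*) AS headcount
    FROM employees
    GROUP BY department
    ORDER BY headcount DESC""").fetchall()

# Same query, keeping only the two largest departments.
top_two = conn.execute("""
    SELECT department, COUNT(*) AS headcount
    FROM employees
    GROUP BY department
    ORDER BY headcount DESC
    LIMIT 2""").fetchall()

print(by_headcount, top_two)
```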