Databases Flashcards
Structured Data
data is often organized to support transactional and analytical applications. Structured data is most commonly stored in relational databases but can also be stored in non-relational databases.
Unstructured Data
data is not organized in any distinguishable or predefined manner. Common stores for unstructured data are non-relational key-value databases. Unstructured data is full of irrelevant information, which means the data must first be processed before any meaningful analysis can be performed.
Semistructured
data can be just as predictable and organized as structured data. The difference is that semistructured data is flexible and can be updated without the requirement to change the schema for every single record in a table. Typically kept in non-relational stores.
OLAP
Relational DB Online Analytical Processing. Optimize for Read
OLTP
Relational DB Online Transaction Processing. Optimize for Write
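Example (a minimal sketch, not tied to any AWS service): the two workload styles can be contrasted with Python's built-in `sqlite3` module. The `orders` table and the function names are made up for illustration.

```python
import sqlite3

# An in-memory SQLite database stands in for a relational engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")


def place_order(customer, amount):
    """OLTP-style work: a small write wrapped in its own transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO orders (customer, amount) VALUES (?, ?)",
            (customer, amount),
        )


def revenue_by_customer():
    """OLAP-style work: a read that aggregates across many rows."""
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    ).fetchall()
    return dict(rows)


place_order("ana", 20.0)
place_order("ben", 35.5)
place_order("ana", 4.5)
print(revenue_by_customer())
```

The write path touches one row at a time; the read path scans and summarizes the whole table.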
Aurora
Aurora is MySQL and PostgreSQL compatible.
log-structured distributed storage layer
automatically keeps 6 copies of data across 3 AZs
backed up to S3 with snapshots
serverless option available
Redshift
Amazon Redshift delivers 10 times faster performance than other data warehouses by using machine learning, massively parallel query execution, and columnar storage on high-performance disk. You can set up and deploy a new data warehouse in minutes. Run queries across petabytes of data in your Amazon Redshift data warehouse and exabytes of data directly from your data lake built on Amazon Simple Storage Service (Amazon S3) with Amazon Redshift Spectrum.
Steps to create RDS
- VPC
- Subnets (requires 2 in 2 different AZs)
- EC2
- Security Group
Aurora
- DB Instance Class (memory optimized or burstable)
Foreign Key
Used to create relationships between tables in a relational database
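Example (a sketch using Python's built-in `sqlite3`; the `customers` and `orders` tables are hypothetical): a foreign key links each order to an existing customer, and the database rejects rows that would break the link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders (customer_id, item) VALUES (1, 'book')")


def insert_orphan_order():
    """Try to insert an order for a customer that does not exist."""
    try:
        conn.execute("INSERT INTO orders (customer_id, item) VALUES (99, 'pen')")
        return True
    except sqlite3.IntegrityError:
        return False  # the foreign key constraint blocked the insert


# The relationship is what makes the join between the two tables meaningful.
joined = conn.execute(
    "SELECT c.name, o.item FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()
```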
updating the schema
Adding a column (attribute) in a nonrelational DB does not require a schema update. In a relational DB, the schema must be updated.
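Example (a sketch; the relational side uses the built-in `sqlite3` module and the nonrelational side is modeled as a plain list of dicts, which is an assumption for illustration only):

```python
import sqlite3

# Relational: a new attribute means altering the table schema for every row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ana')")
conn.execute("ALTER TABLE users ADD COLUMN email TEXT")  # schema change
relational_columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]

# Nonrelational (toy model): each item carries its own set of attributes.
items = [{"id": 1, "name": "Ana"}]
items.append({"id": 2, "name": "Ben", "email": "ben@example.com"})  # no schema change
```

After the `ALTER TABLE`, every row in `users` has an `email` column (NULL for existing rows), while the first item in `items` simply has no `email` attribute.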
T/F Nonrelational databases are optimized for storage?
False: they are optimized for compute.
Which DB type scales vertically
Relational
T/F both types of DB use OLTP and OLAP?
False. Only relational databases use OLAP.
Key Value
Used in nonrelational databases
Typically stored in one table.
Can handle varied data
flexible
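Example (a toy sketch; `KeyValueStore` is a made-up class, not a real database client): one table maps keys to values, and the values can have different shapes.

```python
class KeyValueStore:
    """Toy single-table key-value store; values can have any shape."""

    def __init__(self):
        self._table = {}

    def put(self, key, value):
        self._table[key] = value

    def get(self, key, default=None):
        return self._table.get(key, default)


store = KeyValueStore()
store.put("user#1", {"name": "Ana", "plan": "pro"})  # a nested record
store.put("session#abc", "user#1")                   # a plain string
store.put("scores#1", [10, 42, 7])                   # a list
```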
Document DBs
Nonrelational
Python and Node.js
Elements are a person, place, or thing. MongoDB is the keyword.
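Example (a sketch in plain Python; the documents and the `find` helper are hypothetical and do not use any MongoDB driver): each document describes one person, place, or thing and may have its own fields.

```python
import json

# Each document is self-describing; fields can differ between documents.
documents = [
    {"_id": 1, "type": "person", "name": "Ana", "address": {"city": "Lima"}},
    {"_id": 2, "type": "place", "name": "Lima", "tags": ["capital"]},
    {"_id": 3, "type": "thing", "name": "laptop"},
]


def find(docs, **criteria):
    """Return documents whose top-level fields match every criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


people = find(documents, type="person")

# Documents serialize naturally to JSON and back.
round_tripped = json.loads(json.dumps(documents))
```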
In-Memory DB
Real time access to data
nonrelational
frequently accessed, not frequently updated
caching, gaming, and session stores
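Example (a sketch of caching in front of a slower store; `CachedReader` and `slow_lookup` are made-up names, and a Python dict stands in for the in-memory database):

```python
class CachedReader:
    """Serve repeated reads from memory; go to the backing store only on a miss."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._cache = {}
        self.db_reads = 0  # counts how often the slow store was hit

    def get(self, key):
        if key not in self._cache:
            self.db_reads += 1
            self._cache[key] = self._load(key)
        return self._cache[key]


def slow_lookup(key):
    """Hypothetical stand-in for a query against a disk-based database."""
    return key.upper()


reader = CachedReader(slow_lookup)
first = reader.get("session-1")   # miss: loads from the backing store
second = reader.get("session-1")  # hit: served from memory
```

This pattern pays off when data is read often and changed rarely, because cached entries stay valid longer.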
Graph DB
Nonrelational
store data as nodes
visualize data
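Example (a toy sketch; the node names and the `within_hops` helper are hypothetical): data is stored as nodes connected by edges, and queries walk those connections.

```python
from collections import deque

# Nodes are keys; each value lists the nodes that the key connects to.
edges = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": [],
}


def within_hops(start, max_hops):
    """Breadth-first walk: nodes reachable from start in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    reached = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                reached.append(neighbor)
                queue.append((neighbor, hops + 1))
    return reached
```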
ElastiCache for intensive apps
Use Amazon ElastiCache to support data-intensive apps or improve the performance of your existing apps by retrieving data from high-throughput, low-latency in-memory data stores. This service offers fully managed Redis and Memcached cache engines for in-memory data stores. ElastiCache is a popular choice for gaming, advertising technology (ad tech), financial services, healthcare, and Internet of Things (IoT) apps.
Redis and In-Memory
"ElastiCache" is the keyword. Used as nonrelational engines for quick access to data.
Amazon DynamoDB
can handle more than 10 trillion requests per day and support peaks of more than 20 million requests per second. More than 100,000 AWS customers have chosen DynamoDB as their key-value and document store database for mobile, web, gaming, ad tech, IoT, and other applications that need low-latency data access at any scale. DynamoDB supports ACID-compliant transactions
uses a partition key for each table and can optionally use a sort key to order items
If there is no sort key, the primary key and the partition key are the same.
billed for each read and write
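Example (a toy model in plain Python, not the DynamoDB API or boto3; `ToyTable` and its method names are assumptions chosen for illustration): items are grouped by partition key and, within a partition, ordered by sort key.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: items grouped by partition key, ordered by sort key on query."""

    def __init__(self):
        self._partitions = defaultdict(dict)

    def put_item(self, partition_key, sort_key, item):
        # Partition key + sort key together identify exactly one item.
        self._partitions[partition_key][sort_key] = item

    def get_item(self, partition_key, sort_key):
        return self._partitions.get(partition_key, {}).get(sort_key)

    def query(self, partition_key):
        # All items in one partition, returned in sort-key order.
        partition = self._partitions.get(partition_key, {})
        return [partition[key] for key in sorted(partition)]


table = ToyTable()
table.put_item("user#1", "2024-03-01", {"event": "logout"})
table.put_item("user#1", "2024-01-15", {"event": "login"})
table.put_item("user#2", "2024-02-01", {"event": "login"})
events = [item["event"] for item in table.query("user#1")]
```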
Neptune
fully managed graph database service that makes it easy to build and run applications that work with highly connected data sets used to discover potential fraudulent behavior before it happens. This starts with finding interactions between products, locations, and devices and then mapping those data points to individual users, customers, and/or employees.
Neptune graph use cases include recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
Can be structured, semistructured, or unstructured.
Amazon DocumentDB
reliable, and fully managed database service that allows you to set up, operate, and scale MongoDB-compatible databases in the cloud. With Amazon DocumentDB, you can run the same application code and use the same drivers and tools that you use with MongoDB.
Semistructured data
RDS security groups facts
They don’t exist. Use DB security groups.
What are clustered indexes used with
Relational DBs only or document DB
heterogeneous migration
Heterogeneous migrations, where you migrate between different database engines, require use of the AWS Schema Conversion Tool (AWS SCT) to translate your database schema to the new platform.
Redshift comprised of
Amazon Redshift clusters are comprised of nodes. Compute nodes divide work among slices. Each slice is assigned a portion of the node’s memory and drive space. When you connect to an Amazon Redshift cluster, you use the SQL endpoint.
There are only two types of nodes in Amazon Redshift: a single leader node and one or more compute nodes. Amazon Redshift cannot use OLE DB drivers.
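Example (a toy sketch of the idea only; real Redshift distribution and query planning are more involved, and the function names here are hypothetical): rows are spread across slices, each slice computes a partial result, and the leader combines them.

```python
def distribute(rows, slice_count):
    """Spread rows across slices in round-robin order (a simplifying assumption)."""
    slices = [[] for _ in range(slice_count)]
    for index, row in enumerate(rows):
        slices[index % slice_count].append(row)
    return slices


def leader_sum(rows, slice_count):
    """Each slice sums its own rows; the leader adds up the partial sums."""
    partials = [sum(slice_rows) for slice_rows in distribute(rows, slice_count)]
    return sum(partials), partials


total, partials = leader_sum([1, 2, 3, 4, 5, 6], slice_count=2)
```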
Document DB
Amazon DocumentDB clusters can only be run in an Amazon VPC. Amazon DocumentDB decouples storage and compute, enabling each to scale independently. All instances within the cluster support data reads.
The basic component of Amazon DocumentDB is the cluster, which contains a storage volume and instances. Once the cluster is provisioned, you can add and remove instances as needed.
Stores data as JSON, in semistructured documents
Benefits of Document DB
Flexible indexing
Ad-hoc querying
Powerful analytics