Database Flashcards
What is the main purpose of the vector database in the movie application?
a) To store movie reviews
b) To support searching by data embedded as a vector
c) To store movie scripts
d) To manage user accounts
b) To support searching by data embedded as a vector
Which of the following is NOT a functional requirement of the movie application?
a) Look up a movie
b) Add/update an image for a movie
c) Stream movies in real time
d) Suggest similar movies based on the current movie
c) Stream movies in real time
Why was Apache Cassandra chosen as the database for this application?
a) It supports distributed storage and scalability
b) It is the only database available on Astra
c) It has built-in streaming capabilities
d) It does not require indexing
a) It supports distributed storage and scalability
What type of similarity algorithm is commonly used in vector searches?
a) Binary Search Algorithm
b) K-Nearest Neighbor (KNN)
c) Merge Sort
d) Dijkstra’s Algorithm
b) K-Nearest Neighbor (KNN)
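To make the KNN answer concrete, here is a minimal brute-force sketch using cosine similarity. The movie names and 3-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions, and production vector indexes typically approximate this search rather than comparing against every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def k_nearest(query, vectors, k):
    """Return the k keys whose vectors are most similar to the query."""
    ranked = sorted(
        vectors,
        key=lambda name: cosine_similarity(query, vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embeddings, not taken from the application.
movie_vectors = {
    "space_adventure": [0.9, 0.1, 0.0],
    "space_sequel": [0.8, 0.2, 0.1],
    "romantic_comedy": [0.0, 0.2, 0.9],
}

nearest = k_nearest([1.0, 0.0, 0.0], movie_vectors, k=2)
print(nearest)
```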
What is the purpose of the ‘movies’ table in the database schema?
a) To store detailed information about each movie, including vector embeddings
b) To store user reviews and ratings
c) To manage movie streaming data
d) To store only movie titles
a) To store detailed information about each movie, including vector embeddings
Why do we create a ‘movies_by_title’ table instead of using a secondary index?
a) It improves query performance by avoiding the high resource consumption of secondary index queries
b) Secondary indexes do not work in Apache Cassandra
c) The database does not support queries by title
d) It allows us to store duplicate movie titles
a) It improves query performance by avoiding the high resource consumption of secondary index queries
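The reasoning behind the lookup table can be sketched with plain dictionaries standing in for tables. This is a toy model, not Cassandra code: the row fields and helper names are assumptions, and a real secondary index query costs more because it may have to consult many nodes, which the scan below only imitates.

```python
# Each dict stands in for a table keyed by its partition key.
movies = {}           # movie_id -> row
movies_by_title = {}  # title -> row (denormalized copy for title lookups)


def add_movie(movie_id, title, imdb_id):
    """Write the same row to both tables, as a denormalized schema would."""
    row = {"movie_id": movie_id, "title": title, "imdb_id": imdb_id}
    movies[movie_id] = row
    movies_by_title[title] = row


def find_by_title_scan(title):
    """Without a title-keyed table, every row has to be inspected."""
    return [row for row in movies.values() if row["title"] == title]


def find_by_title_lookup(title):
    """With the lookup table, the title leads straight to the row."""
    return movies_by_title.get(title)


# Hypothetical sample data.
add_movie(1, "Example Movie", "tt0000001")
add_movie(2, "Another Movie", "tt0000002")

print(find_by_title_lookup("Another Movie"))
```

Because the title is the key in this sketch, two movies with the same title would overwrite each other.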
What is the primary key of the ‘movies_by_title’ table?
a) movie_id
b) title
c) imdb_id
d) movie_vector
b) title
What is the significance of using a token-aware load-balancing policy in Apache Cassandra?
a) It ensures that queries are sent directly to the nodes responsible for the requested data
b) It improves the accuracy of vector searches
c) It allows for better video streaming
d) It removes the need for partitioning
a) It ensures that queries are sent directly to the nodes responsible for the requested data
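The idea of token-aware routing can be sketched as a small token ring. This is a simplified model under stated assumptions: one token per node, MD5 as a stand-in hash (Cassandra's default partitioner uses Murmur3), and hypothetical node names. A real driver also takes replication into account.

```python
import bisect
import hashlib


def token_for(partition_key):
    """Stand-in hash that maps a partition key to an integer token."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens):
        # node_tokens maps a node name to its (single) token.
        self._entries = sorted((tok, node) for node, tok in node_tokens.items())
        self._tokens = [tok for tok, _ in self._entries]

    def owner_of_token(self, token):
        # First node whose token is >= the key's token, wrapping around.
        idx = bisect.bisect_left(self._tokens, token)
        return self._entries[idx % len(self._entries)][1]

    def owner(self, partition_key):
        return self.owner_of_token(token_for(partition_key))


# Hypothetical three-node cluster.
ring = Ring({"node_a": 10, "node_b": 20, "node_c": 30})

# A token-aware client computes the owner itself and contacts that node
# directly, instead of asking an arbitrary node to forward the request.
print(ring.owner_of_token(15))
```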
Which cloud provider is selected for the Astra DB database in this setup?
a) Microsoft Azure
b) Amazon Web Services
c) Google Cloud
d) IBM Cloud
c) Google Cloud
What is the purpose of the ‘StorageAttachedIndex’ in the database schema?
a) To enable efficient vector search queries
b) To store user authentication tokens
c) To manage movie streaming data
d) To enforce uniqueness constraints on movie titles
a) To enable efficient vector search queries
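As a rough illustration of where the index fits, the CQL below is held in Python strings. The table and column names come from the cards above, but the column types, the vector dimension, and the index name are assumptions, not the application's actual schema.

```python
# Assumed embedding size; the real dimension depends on the embedding model.
VECTOR_DIMENSIONS = 384

# Assumed column types; only the names appear in the flashcards.
CREATE_TABLE = f"""
CREATE TABLE IF NOT EXISTS movies (
    movie_id int PRIMARY KEY,
    imdb_id text,
    title text,
    movie_vector vector<float, {VECTOR_DIMENSIONS}>
)"""

# The index name movies_vector_idx is hypothetical.
CREATE_INDEX = """
CREATE CUSTOM INDEX IF NOT EXISTS movies_vector_idx
ON movies (movie_vector)
USING 'StorageAttachedIndex'"""


def ann_query(limit):
    """Build a similarity query; the vector is supplied as a bind parameter."""
    return (
        "SELECT movie_id, title FROM movies "
        f"ORDER BY movie_vector ANN OF ? LIMIT {int(limit)}"
    )


print(ann_query(5))
```

The ORDER BY ... ANN OF clause relies on the index on movie_vector, which is why the index is created before any similarity query is run.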