Technical Questions Flashcards
What is an Index?
A data structure that makes it faster to retrieve records from a database
- -> Can be built on one or more fields, so records can be looked up or sorted by those fields
- -> Holds the value of the indexed field, as well as a pointer to the related record
- -> The index is kept sorted, letting you do binary searches on it
Example:
Database of students where the primary key is the student ID, which is sorted. Let’s say you also store names and grades.
If you want to query a student with a certain name, without an index you’d have to look through every single row.
If you index on the student name, that field is sorted in the index, so you can use a binary search, which is much faster: O(log n) comparisons instead of O(n).
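A minimal Python sketch of the student example, assuming a small in-memory table. The data and the find_by_name helper are made up for illustration, and real databases typically use B-tree indexes rather than a flat sorted list.

```python
from bisect import bisect_left

# Hypothetical table: each row is (student_id, name, grade), sorted by student_id.
students = [
    (101, "Maria", 88),
    (102, "Alex", 92),
    (103, "Zoe", 75),
    (104, "Ben", 81),
]

# The index holds (field value, pointer to row) pairs, sorted by the field value.
name_index = sorted((name, row) for row, (_, name, _) in enumerate(students))
names = [name for name, _ in name_index]


def find_by_name(name):
    """Binary-search the index, then follow the pointer to the full record."""
    pos = bisect_left(names, name)
    if pos < len(names) and names[pos] == name:
        return students[name_index[pos][1]]
    return None


print(find_by_name("Ben"))
```

The table itself stays sorted by student ID; only the index is ordered by name.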
What is a Compound Index?
A single index that references more than one field
- -> If you query on more than one field, one compound index can find the matching record faster than separate single-field lookups.
- -> Can also save time when sorting results: if the sort fields are indexed, the entries are already in order, so you don’t have to sort them after retrieving them.
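A rough sketch of a compound index in Python, assuming the index key is a (last_name, first_name) tuple. The rows and helper names are hypothetical, and this is only a model of the idea, not how any particular database stores it.

```python
from bisect import bisect_left

# Hypothetical rows: (student_id, last_name, first_name)
rows = [
    (1, "Smith", "Zed"),
    (2, "Jones", "Amy"),
    (3, "Smith", "Ann"),
    (4, "Brown", "Kim"),
]

# Compound index: keys are (last_name, first_name) tuples, kept sorted.
index = sorted(((last, first), i) for i, (_, last, first) in enumerate(rows))
keys = [key for key, _ in index]


def find(last, first):
    """Exact match on both fields with a single binary search."""
    pos = bisect_left(keys, (last, first))
    if pos < len(keys) and keys[pos] == (last, first):
        return rows[index[pos][1]]
    return None


def find_by_last(last):
    """Match on the leading field only; results come back sorted by first name."""
    pos = bisect_left(keys, (last,))
    matches = []
    while pos < len(keys) and keys[pos][0] == last:
        matches.append(rows[index[pos][1]])
        pos += 1
    return matches


print(find("Jones", "Amy"))
print(find_by_last("Smith"))
```

Because the key is ordered by last name first, the index helps queries on the last name alone or on both fields, but not on the first name alone.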
What is a Correlated Query?
A nested query (a query inside a query) where the inner query uses values from the outer query, so the inner query is logically evaluated once for each row of the outer query.
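A small example using Python's built-in sqlite3 module, assuming a made-up students table. The inner query refers to s.class from the outer query, which is what makes it correlated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, class TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("Ann", "math", 90), ("Ben", "math", 70), ("Cal", "art", 60), ("Dee", "art", 80)],
)

# Students who score above the average of their own class.
# The inner SELECT depends on the current outer row through s.class.
above_average = conn.execute(
    """
    SELECT s.name
    FROM students AS s
    WHERE s.grade > (
        SELECT AVG(i.grade) FROM students AS i WHERE i.class = s.class
    )
    ORDER BY s.name
    """
).fetchall()

print(above_average)
```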
What happens when I submit a query to a database?
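- The syntax is checked (parsing)
- The database checks that the tables and columns you are referencing exist
- The query optimizer chooses an execution plan, for example whether to use an index or scan the whole table
- The plan is executed; data is served from memory (cache) when it is already there, otherwise it is read from the storage system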
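These stages can be observed with Python's built-in sqlite3 module. This is a sketch using SQLite only: the table and index names are made up, and other databases report errors and plans in their own formats.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_name ON students (name)")


def try_query(sql):
    """Return the error message the engine reports, or None if the query runs."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


syntax_error = try_query("SELEC name FROM students")    # rejected while parsing
missing_table = try_query("SELECT name FROM teachers")  # parses, but the table is unknown

# Ask SQLite which plan it would pick for a valid query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM students WHERE name = 'Ann'"
).fetchall()

print(syntax_error)
print(missing_table)
for row in plan:
    print(row[-1])  # last column is the human-readable plan detail
```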
How would you go about troubleshooting a performance issue?
Start by looking at
- -> Disk Usage
- -> Memory Usage
- -> CPU Usage
- -> Network Bandwidth
Monitor any running applications to see if they are hogging resources
Is the issue consistent or intermittent?
You might need to log performance and try to replicate the issue to narrow it down to a specific piece of software or process
Linux –> top or htop, vmstat
Windows –> Task Manager
What is TCP?
Transmission Control Protocol
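- -> Connection-oriented: sets up a connection (three-way handshake) before sending data
- -> Breaks application data down into segments, which the network layer carries as packets
- -> Hands segments to, and accepts them from, the network layer
- -> Handles error checking with a checksum and retransmits lost or corrupted segments
- -> Acknowledges when segments arrive and delivers the data in order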
What is UDP?
User Datagram Protocol
- -> Less latency than TCP, since there is no connection setup or acknowledgement
- -> Doesn’t check whether datagrams were delivered successfully
- -> Discards datagrams that fail the checksum and relies on the application to deal with any loss
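To compare the two protocols in practice, here is a small Python sketch using the standard socket module on the loopback interface. It assumes local sockets are allowed. On loopback the UDP datagram almost always arrives, but nothing in UDP itself guarantees that.

```python
import socket

HOST = "127.0.0.1"  # loopback only; port 0 lets the OS pick a free port

# TCP: connection-oriented, delivery is acknowledged by the protocol.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind((HOST, 0))
listener.listen(1)

tcp_client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_client.connect(listener.getsockname())  # the handshake happens here
tcp_server, _ = listener.accept()
tcp_server.settimeout(2)
tcp_client.sendall(b"hello over tcp")
tcp_received = tcp_server.recv(1024)

# UDP: connectionless, no handshake and no acknowledgement.
udp_receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_receiver.bind((HOST, 0))
udp_receiver.settimeout(2)  # give up instead of waiting forever for a lost datagram

udp_sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sender.sendto(b"hello over udp", udp_receiver.getsockname())
udp_received, _ = udp_receiver.recvfrom(1024)

for sock in (listener, tcp_client, tcp_server, udp_sender, udp_receiver):
    sock.close()

print(tcp_received, udp_received)
```

Note that the UDP sender never learns whether the datagram arrived; any retry logic would have to live in the application.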
What is DNS?
Domain Name System
–> Translates domain names into IP addresses
Example: when you enter www.mongodb.com in a web browser, DNS resolves the name to the public IP address of the website
What is DHCP?
Dynamic Host Configuration Protocol
- -> A DHCP server automatically assigns IP addresses (and settings such as the gateway and DNS servers) to clients
- -> Helps ensure you don’t have IP conflicts
- -> Addresses are leased, so a client’s IP address may change over time
- -> You can set a static IP address or a DHCP reservation to make sure it stays the same for that client
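A toy Python model of the address pool behind these points. This is not the real DHCP protocol (no discover/offer/request messages or lease timers), and the LeasePool class and MAC strings are invented for illustration.

```python
import ipaddress


class LeasePool:
    """Toy model of a DHCP server's address pool (not the real protocol)."""

    def __init__(self, network, reservations=None):
        self.free = list(ipaddress.ip_network(network).hosts())
        self.reservations = dict(reservations or {})  # MAC -> fixed IP string
        self.leases = {}                              # MAC -> leased address
        # Reserved addresses are never handed out dynamically.
        for ip in self.reservations.values():
            self.free.remove(ipaddress.ip_address(ip))

    def request(self, mac):
        """Return the client's current, reserved, or next free address."""
        if mac in self.leases:
            return self.leases[mac]
        if mac in self.reservations:
            ip = ipaddress.ip_address(self.reservations[mac])
        else:
            ip = self.free.pop(0)  # raises IndexError if the pool is exhausted
        self.leases[mac] = ip
        return ip

    def release(self, mac):
        """Return a dynamic address to the pool when the lease ends."""
        ip = self.leases.pop(mac)
        if mac not in self.reservations:
            self.free.append(ip)


pool = LeasePool("192.168.1.0/29", {"aa:aa": "192.168.1.5"})
first = pool.request("bb:bb")     # dynamic address
second = pool.request("cc:cc")    # different dynamic address, so no conflict
reserved = pool.request("aa:aa")  # always the reserved address
pool.release("bb:bb")
again = pool.request("bb:bb")     # may differ from the first lease
print(first, second, reserved, again)
```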
Explain RAID
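RAID 0, 1, 5 and 10
RAID –> Redundant Array of Independent Disks, combines multiple disks into one logical unit
RAID 0 –> Data is striped across multiple disks, so reads and writes use all the disks at once, making it faster. No redundancy: if one disk fails, the data is lost
RAID 1 –> Data is mirrored across disks, so you have redundancy
RAID 5 –> Data is striped across at least three disks along with distributed parity. Survives a single disk failure and uses less capacity for redundancy than RAID 1. Reads are fast because of striping; writes carry a parity overhead
RAID 10 –> Mirrored pairs of disks that are striped together (at least four disks, half the capacity usable). If one disk fails it is rebuilt from its mirror, so you don’t have to read from all the disks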
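The fault tolerance in RAID 5 comes from parity: the parity block is the XOR of the data blocks in a stripe, so any one missing block can be recomputed from the rest. A minimal Python sketch of a single stripe follows; the block contents are made up, and real RAID 5 rotates the parity block across the disks.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe spread over three data disks plus one parity block.
d0, d1, d2 = b"ABCD", b"EFGH", b"IJKL"
parity = xor_blocks(d0, d1, d2)

# Simulate losing the disk holding d1 and rebuilding it from the survivors.
rebuilt_d1 = xor_blocks(d0, d2, parity)
print(rebuilt_d1 == d1)
```

The rebuild has to read every surviving disk in the stripe, which is why RAID 5 rebuilds are slower than copying from a mirror.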
What Utilities can you use in Linux to monitor performance?
top or htop –> display CPU usage, memory usage, swap usage, process PIDs and other per-process details
lsof –> lists all open files, can help to figure out which files are in use and by which process
tcpdump –> lets you capture and analyze network packets
netstat –> gives network statistics
iostat –> part of the sysstat package; shows CPU and disk I/O statistics
How do you troubleshoot a connection issue between an application and a database server?
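- Check the connection string (host, port, credentials, database name)
- Ensure the network connection is alive and the database port is reachable (e.g. not blocked by a firewall)
- Make sure your drivers are up to date and compatible with the server version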
How does Swap Space work?
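Disk space used as an extension of RAM when memory runs low
- -> The kernel looks for pages of memory that are rarely used and moves them to disk
- -> This frees up RAM for active processes
- -> Pages are read back into RAM when they are needed again, which is much slower than normal RAM access
That’s the extent of my understanding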
How do you configure RAM and CPU in Docker?
I’m not super experienced using Docker
–> You can set limits from the CLI when starting a container, for example: docker run --memory=512m --cpus=1.5 <image>
How do you split data across MongoDB servers?
“Shard” the collection of data
- -> Choose a shard key; MongoDB partitions the collection into chunks based on shard key values, aiming for an even distribution
- -> The chunks are spread out over different machines (shards) to utilize more hardware for faster performance
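A simplified Python sketch of spreading entries across machines by hashing a key. This is plain hash-modulo routing, not MongoDB's actual chunking or balancing, and the shard names and the student_id key are assumptions for illustration.

```python
import hashlib

SHARDS = ["shard-a", "shard-b", "shard-c"]  # hypothetical shard names


def shard_for(key_value):
    """Map a key value to a shard using a stable hash."""
    digest = hashlib.sha256(str(key_value).encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


# Distribute hypothetical documents keyed by student_id.
placement = {}
for student_id in range(1, 1001):
    placement.setdefault(shard_for(student_id), []).append(student_id)

for shard, docs in sorted(placement.items()):
    print(shard, len(docs))
```

A query that includes the key can be routed to a single shard; a query without it has to ask every shard.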
Why does Mongo/NoSQL scale better?
MongoDB stores data in flexible-schema documents
BSON (binary-encoded JSON)
Self-contained documents, so they can be spread out across multiple nodes (horizontal scaling) with less need for joins across machines
What are cgroups?
Control groups: a Linux kernel feature that groups processes so you can limit and monitor the resources (CPU, memory, I/O) that group of processes is allowed to use