SDI Terminology Flashcards
Bandwidth
Bandwidth is how much information you can receive every second. You can compare it to a bathtub faucet: a faucet with a wide opening lets more water flow per second than a narrow one. The width of the opening is like bandwidth, and the water is like the data.
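To make the idea concrete, here is a rough sketch of the arithmetic. The function name and the example figures are assumptions chosen for illustration, and the calculation ignores latency and protocol overhead.

```python
def transfer_seconds(size_bytes, bandwidth_bits_per_second):
    """Ideal time to move size_bytes over a link of the given bandwidth."""
    # Bandwidth is usually quoted in bits per second, file sizes in bytes.
    return size_bytes * 8 / bandwidth_bits_per_second


# A 500 MB file (500 * 10**6 bytes) over a 100 Mbit/s link.
seconds = transfer_seconds(500 * 10**6, 100 * 10**6)
```

Doubling the bandwidth in this model halves the transfer time, which is the "wider faucet" effect.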
Memory
An electronic holding place for the instructions and data a computer needs to reach quickly. It’s where information is stored for immediate use. Without memory, a computer wouldn’t function.
Capacity
When referring to a disk/drive, capacity is the maximum amount of data a device such as a hard drive can hold.
Storage
A mechanism that enables a computer to retain data, either temporarily or permanently.
Horizontal Scaling
Also called scaling out. Refers to adding nodes or machines to your infrastructure to cope with new demands. For instance, adding servers.
Vertical Scaling
Also called scaling up. Describes adding resources (such as CPU, memory, or storage) to an existing machine so that it meets demands.
Load Balancers
A load balancer efficiently distributes incoming network traffic across a group of backend servers (called a server farm or server pool).
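One simple way to distribute traffic is round robin, where each new request goes to the next server in the pool. The sketch below is only an illustration of that one strategy; the class name and server names are hypothetical, and real load balancers also track server health and load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def pick(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)
```

With three servers, five consecutive calls to `pick()` visit the first server twice, the second twice, and the third once.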
Layer 4 vs. Layer 7
A layer 4 load balancer uses a single TCP connection from the client to the server, while a layer 7 load balancer uses two: one from the client to the load balancer and one from the load balancer to the server.
Layer 7 has application awareness and makes load-balancing decisions based on the content of the request (for example, the URL or headers), whereas layer 4 balances using only network-level information such as IP addresses and ports. Layer 7 is well suited to microservices.
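The difference can be sketched as two routing functions. This is a simplified model, not how any particular product works: the layer 4 function is assumed to see only the client's IP address and port, while the layer 7 function can read the request path. All server and pool names are hypothetical.

```python
import zlib

# Hypothetical names, used only for illustration.
SERVERS = ["server-a", "server-b", "server-c"]
POOLS = {"/api/": "api-pool", "/images/": "image-pool"}
DEFAULT_POOL = "web-pool"


def layer4_pick(client_ip, client_port, servers=SERVERS):
    """Choose a server from connection details only (no request content)."""
    key = f"{client_ip}:{client_port}".encode("utf-8")
    return servers[zlib.crc32(key) % len(servers)]


def layer7_pick(path):
    """Choose a pool by inspecting the request path."""
    for prefix, pool in POOLS.items():
        if path.startswith(prefix):
            return pool
    return DEFAULT_POOL
```

Routing `/api/...` requests to one pool and `/images/...` to another is the kind of content-based decision that makes layer 7 a good fit for microservices.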
Sharding
A “shard” is a small part of a whole, so sharding means dividing a large dataset into smaller parts, typically spread across multiple machines. Because each shard is smaller, it is faster to query and easier to manage.
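A common way to decide which shard holds a record is to hash a key and take the result modulo the number of shards. The sketch below assumes that approach; the function name and the key format are hypothetical, and other schemes (range-based, directory-based, consistent hashing) exist.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a key to a shard index in the range 0 .. num_shards - 1."""
    # A stable hash is used so the same key always lands on the same shard.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

A drawback of plain modulo hashing is that changing `num_shards` moves most keys to a different shard.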
Active-Active vs. Active-Passive
Both are used for high-availability configurations.
Active-Active - made up of at least two nodes, both actively running the same kind of service simultaneously. Used to achieve load balancing.
Active-Passive - made up of at least two nodes, but not all nodes are active. The passive (failover) server serves as a backup that is ready to take over as soon as the active (primary) server gets disconnected or is unable to serve.
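The active-passive failover decision can be sketched as "use the first healthy node in priority order". This is a minimal illustration; the function name, node names, and the health-check callback are assumptions, and real systems also handle failback and split-brain situations.

```python
def choose_node(nodes, is_healthy):
    """Return the first healthy node, where nodes[0] is the primary."""
    for node in nodes:
        if is_healthy(node):
            return node
    raise RuntimeError("no healthy node available")


# Hypothetical health status: the primary is down, the backup is up.
status = {"primary": False, "backup": True}
serving = choose_node(["primary", "backup"], lambda node: status[node])
```

Here the backup takes over because the primary fails its health check.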
DNS
Domain Name System - a hierarchical naming system built on a distributed database for computers, services, or any resource connected to the Internet or a private network. Most importantly, it translates human readable domain names into the numerical identifiers associated with networking equipment, enabling devices to be located and connected worldwide. Analogous to a network “phone book”, DNS is how a browser can translate a domain name (e.g. facebook.com) to the actual IP address of the server, which stores the information requested by the browser.
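From a program's point of view, a DNS lookup is usually a single call to the operating system's resolver. The sketch below uses Python's standard `socket.getaddrinfo`; the wrapper name `resolve` is hypothetical.

```python
import socket


def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses reported for hostname."""
    results = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the address is the first element of sockaddr.
    return sorted({sockaddr[0] for _family, _type, _proto, _canon, sockaddr in results})
```

Calling `resolve("facebook.com")` needs network access, and the addresses returned vary by location and over time.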
CDN
A network of servers that distributes content from an “origin” server throughout the world by caching content close to where each end user is accessing the internet via a web-enabled device. The content they request is first stored on the origin server and is then replicated and stored elsewhere as needed.
Caching
A high-speed data storage layer which stores a subset of data, typically transient in nature, so that future requests for that data are served up faster than is possible by accessing the data’s primary storage location. Caching allows you to efficiently reuse previously retrieved or computed data.
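A minimal sketch of the idea, assuming an in-memory dictionary as the cache and a caller-supplied `load` function standing in for the slower primary storage (both names are hypothetical):

```python
class ReadThroughCache:
    """Serves repeated reads from memory instead of the primary store."""

    def __init__(self, load):
        self._load = load      # function that fetches from primary storage
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self.hits += 1
            return self._data[key]
        # Cache miss: fetch from the slow source and remember the result.
        self.misses += 1
        value = self._load(key)
        self._data[key] = value
        return value
```

The first `get` for a key pays the full cost of the primary store; later calls for the same key are answered from memory.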
Database Replication
Database replication refers to the process of copying data from a primary database to one or more replica databases in order to improve data accessibility, fault tolerance, and reliability. It is typically an ongoing process that occurs in real time as data is created, updated, or deleted in the primary database, but it can also run as a one-time or scheduled batch job.
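One way to picture replication is a primary that records every change in an ordered log, and replicas that replay the entries they have not yet applied. The toy model below assumes that log-shipping design; the class and method names are hypothetical, and real databases add networking, durability, and conflict handling.

```python
class Primary:
    """Accepts writes and records them in an ordered change log."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries are (operation, key, value)

    def put(self, key, value):
        self.data[key] = value
        self.log.append(("put", key, value))

    def delete(self, key):
        self.data.pop(key, None)
        self.log.append(("delete", key, None))


class Replica:
    """Copies the primary by replaying log entries it has not seen yet."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries already replayed

    def sync(self, primary):
        for operation, key, value in primary.log[self.applied:]:
            if operation == "put":
                self.data[key] = value
            else:
                self.data.pop(key, None)
        self.applied = len(primary.log)
```

Calling `sync` after every write approximates real-time replication, while calling it on a timer approximates a scheduled batch. Between syncs the replica can return stale data.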
Redundancy
Redundancy is the duplication or mirroring of a device or data, which helps prevent data from being lost or a device from becoming unavailable.
MapReduce
An algorithm/technique built on two important tasks: Map and Reduce. Map takes a data set and converts it into another set of data, where individual elements are broken down into tuples (key-value pairs). Reduce takes the output from Map as its input and combines those tuples into a smaller set of tuples. The major advantage of MapReduce is that it is easy to scale data processing over multiple computing nodes.
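The classic illustration is word counting. The sketch below runs everything in a single process, so it only shows the shape of the Map and Reduce steps, not the distribution across nodes; the function names are hypothetical, and the grouping step in between is often called "shuffle".

```python
from collections import defaultdict


def map_phase(document):
    """Map: turn a document into (word, 1) key-value tuples."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))
```

Because `map_phase` handles each document independently, in a real cluster different nodes could run it on different documents at the same time.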
Cache Eviction
The process by which old, relatively unused, or excessively voluminous data can be dropped from the cache, allowing the cache to remain within a memory budget.
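One widely used eviction policy is least recently used (LRU): when the cache is full, drop the entry that has gone longest without being read or written. The sketch below assumes that policy and a budget measured in number of entries; the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Keeps at most `capacity` entries, evicting the least recently used."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry
```

With a capacity of 2, storing `a` and `b`, reading `a`, then storing `c` evicts `b`, because `b` is the entry used least recently.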
CAP Theorem
States that a distributed system can deliver only two of three desired characteristics: consistency, availability, and partition tolerance.
ACID
The presence of four properties - atomicity, consistency, isolation, and durability - ensures that a database transaction is processed reliably, even when errors or failures occur. When a database possesses these properties, it is said to be ACID compliant.
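Atomicity, the "A" in ACID, can be seen with Python's built-in `sqlite3` module. In this sketch the table, account names, and `transfer` helper are all hypothetical; the point is that when the second update fails, the first one is rolled back too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts as a single all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

Trying to transfer 500 from `alice` violates the `CHECK` constraint on the debit, so the credit to `bob` is undone as well and both balances stay unchanged.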
BASE
Stands for:
Basically Available - rather than enforcing immediate consistency, BASE-modelled NoSQL databases will ensure availability of data by spreading and replicating it across the nodes of the database cluster.
Soft State - due to the lack of immediate consistency, data values may change over time.
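Eventual Consistency - the system does not enforce immediate consistency, but if no new updates are made, all replicas eventually converge to the same value.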
Strong vs. Eventual Consistency
Strong consistency means the latest data is always returned, but, due to internal consistency methods, it may result in higher latency or delay. With eventual consistency, results may be stale early on, but they are provided much faster, with low latency.
CPU
The Central Processing Unit is the electronic circuitry that executes the instructions that make up a computer program. The CPU performs calculations using its billions of transistors. These calculations run the software that allows a device to perform its tasks.
HTTP vs. HTTP/2
TCP/IP Model
IPv4 vs. IPv6
IPv4, the fourth version of the Internet Protocol, was deployed in 1983. It uses 32-bit addresses, and the supply of available IPv4 addresses has become depleted. IPv6 uses 128-bit addresses, which allows far more unique addresses, and is thus becoming the standard.
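The size difference can be checked with Python's standard `ipaddress` module. The two addresses below are reserved documentation examples, chosen here only for illustration.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.1")
v6 = ipaddress.ip_address("2001:db8::1")

# max_prefixlen is the address length in bits: 32 for IPv4, 128 for IPv6.
v4_total = 2 ** v4.max_prefixlen
v6_total = 2 ** v6.max_prefixlen
```

`v4_total` is about 4.3 billion addresses, while `v6_total` is 2 to the power of 128, roughly 3.4 followed by 38 digits.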