Big Data Flashcards
What is AWS?
Amazon Web Services (AWS) is a secure cloud services platform that provides compute power, database storage, content delivery, and other functionality. AWS has the services to help you build sophisticated applications with increased flexibility, scalability and reliability.
What is the cloud/cloud computing?
The cloud is a network of servers, and each server has a different function. Some servers use computing power to run applications or “deliver a service.” Some servers provide an online service, like Adobe Creative Cloud, and others allow you to store and access data, like Instagram or Dropbox.
When you store information on a remote server over the Internet rather than in your phone’s internal storage, you’re storing it in the cloud.
Why use the cloud?
In the past, people ran applications from software installed on a physical computer or server in their building. Cloud computing lets people access the same kinds of applications over the Internet.
Working in the cloud allows your company to be nimble, efficient and cost-effective. If your company suddenly needs more resources, it can scale up quickly in the cloud. Conversely, if it needs to scale down, it can do so just as easily. Because of this scalability, the cloud’s elasticity is often compared to that of a rubber band.
What is a server?
A server is a computer program or a device that provides functionality for other programs or devices, called “clients”. This architecture is called the client–server model; in it, a single overall computation is distributed across multiple processes or devices. Servers can provide various functionalities, often called “services”, such as sharing data or resources among multiple clients, or performing computation for a client.
Client–server systems are today most frequently implemented by (and often identified with) the request–response model: a client sends a request to the server, which performs some action and sends a response back to the client, typically with a result or acknowledgement.
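The request–response exchange described above can be sketched with Python's standard library. This is a minimal illustration, not a production server: the handler name, the uppercase "action", and the loopback address are all assumptions made for the example.

```python
import socket
import socketserver
import threading


class UppercaseHandler(socketserver.StreamRequestHandler):
    """Server side: read one request line, act on it, send one response line."""

    def handle(self):
        request = self.rfile.readline().strip()  # the client's request
        response = request.upper()               # the "action" (illustrative only)
        self.wfile.write(response + b"\n")       # result sent back to the client


def send_request(address, message):
    """Client side: send one request and block until the response arrives."""
    with socket.create_connection(address, timeout=5) as conn:
        conn.sendall(message.encode("utf-8") + b"\n")
        with conn.makefile("rb") as reader:
            return reader.readline().strip().decode("utf-8")


def demo():
    # Port 0 asks the operating system for any free port on the loopback interface.
    with socketserver.TCPServer(("127.0.0.1", 0), UppercaseHandler) as server:
        worker = threading.Thread(target=server.serve_forever, daemon=True)
        worker.start()
        try:
            return send_request(server.server_address, "hello server")
        finally:
            server.shutdown()  # stop the serve_forever loop
```

Calling demo() starts the server in a background thread, sends one request from the client, and returns the server's response.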
Typical servers are database servers, file servers, mail servers, print servers, web servers, game servers, and application servers.
What is Hadoop?
Hadoop is an open-source software framework for storing data and running applications on clusters of commodity hardware. It provides massive storage for any kind of data, enormous processing power and the ability to handle virtually limitless concurrent tasks or jobs.
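The card does not say how Hadoop runs applications across a cluster. Its classic processing model is MapReduce, and the idea can be sketched on a single machine in plain Python. This sketch does not use Hadoop's API; the function names are illustrative, and on a real cluster the map and reduce steps would run in parallel on different nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine every value seen for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    pairs = []
    for document in documents:  # on a cluster, each split could be mapped on a different node
        pairs.extend(map_phase(document))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

For example, word_count(["big data", "Big cloud"]) counts "big" twice and "data" and "cloud" once each.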
What is SaaS?
Software as a service (SaaS) is a software distribution model in which a third-party provider hosts applications and makes them available to customers over the Internet. SaaS is one of three main categories of cloud computing, alongside infrastructure as a service (IaaS) and platform as a service (PaaS).
What is Redshift?
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools.
What is S3?
Amazon Simple Storage Service (Amazon S3) is storage for the Internet. It is designed to make web-scale computing easier for developers.
Amazon S3 has a simple web services interface that you can use to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.
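Storing and retrieving data through that interface might look like the sketch below, which assumes the boto3 SDK for Python is installed and AWS credentials are already configured. The helper names store_text and retrieve_text are hypothetical wrappers written for this card; put_object and get_object are the underlying boto3 client calls.

```python
def make_client():
    """Create a real S3 client; needs boto3 and configured AWS credentials."""
    import boto3  # third-party AWS SDK for Python, imported only when needed
    return boto3.client("s3")


def store_text(client, bucket, key, text):
    """Store a string as an object under the given bucket and key."""
    client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def retrieve_text(client, bucket, key):
    """Retrieve an object and decode its body back into a string."""
    response = client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")
```

With a bucket you own (the name here is a placeholder), usage would be along the lines of store_text(make_client(), "my-example-bucket", "notes/s3.txt", "hello") followed by retrieve_text with the same bucket and key.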
What is a distributed system?
A system in which computation occurs on multiple networked computers that communicate and coordinate their actions by passing messages.
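Coordination by message passing can be sketched on one machine, with threads standing in for networked computers and queues standing in for the network. That substitution is an assumption made so the example runs anywhere; the point is that the nodes share no state and interact only through messages.

```python
import queue
import threading


def worker(inbox, outbox):
    """A node: it reacts only to messages that arrive in its inbox."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel message: shut down
            break
        task_id, numbers = message
        outbox.put((task_id, sum(numbers)))  # reply with a partial result


def distributed_sum(numbers, node_count=3):
    """Coordinator: split the work, message each node, combine the replies."""
    results = queue.Queue()
    inboxes = [queue.Queue() for _ in range(node_count)]
    nodes = [threading.Thread(target=worker, args=(inbox, results)) for inbox in inboxes]
    for node in nodes:
        node.start()

    for task_id, inbox in enumerate(inboxes):
        inbox.put((task_id, numbers[task_id::node_count]))  # every node_count-th item
        inbox.put(None)

    total = sum(results.get()[1] for _ in range(node_count))  # one reply per node
    for node in nodes:
        node.join()
    return total
```

Each node receives one slice of the input, and the coordinator adds up the partial sums as the replies arrive, in whatever order they come back.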
What is a low-latency system?
“Low-latency systems” are those for which latency is the main measure of success and is usually the toughest constraint to design around.
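If latency is the measure of success, it has to be measured. One common approach, sketched here, is to time each call and report percentiles rather than an average, since slow outliers matter most. The choice of p50 and p99 and the nearest-rank percentile method are assumptions of this example, not something the card specifies.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def measure_latency(operation, runs=1000):
    """Time each call separately and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Report the median, the tail, and the worst case."""
    return {
        "p50": percentile(latencies, 50),
        "p99": percentile(latencies, 99),
        "max": max(latencies),
    }
```

A call such as summarize(measure_latency(some_function)) gives a quick picture of typical and worst-case latency, where some_function is whatever operation is being tested.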
What is service-oriented architecture?
A service-oriented architecture (SOA) is a style of software design where services are provided to the other components by application components, through a communication protocol over a network.
Service-oriented architecture is less about how to modularize an application, and more about how to compose an application by integrating distributed software components that are maintained and deployed separately. It is enabled by technologies and standards that make it easier for components to communicate and cooperate over a network, especially an IP network.
What is big data?
Extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.
What is a data warehouse?
A large store of data accumulated from a wide range of sources within a company and used to guide management decisions.