LO1 LO2 Flashcards
What is an Analytics Platform?
Tool to extract insight / value from data.
Integrated data platform, centralized data warehouse, ML / AI, data management platform.
What does Cloud Analytics Platform do?
- Enable access to different data sources
- Access to comprehensive hardware resources
- Make models deployable with seamless integration (real time, batch, streaming), available from any API
- Merge outcomes of different disciplines for one goal: insights.
What does Analytics Platform do for IT?
- Easy deployment
- Governance
- Self Service
- Security and Compliance
- Reliability and Performance
- Cost Effective
- Automation
- Easy Migration
Benefits of Analytics in Cloud
- Less time spent on preparation → get insights faster
- Confident decision making
- Pay as you go - cheaper.
- Less effort for insight generation
- Data Combination - more data, combination of internal and external data.
- Scalability - new computing resources are easy to add, you pay only for what you need, it is probably cheaper than on-premise computing, and it is faster.
- Security
- Efficient model development. Algorithms - AutoML, AutoAI; many algorithms are available, and some tools consider multiple algorithms.
- Ability to code in any language. Easy to change architecture - less process, faster than on premise.
Roles in Analytics Deployment
- Business Sponsors - decision makers, hold finances, example: Heads of Analytics, LOB.
- IT Decision Makers - work on software and hardware infrastructure, focus on data sources, integration, discovery and sharing.
- Data Scientists - writing code, manipulating data, looking for correlations. Subject matter experts in the analytics field. Their work is often affected by bad data, siloed data, and lead times for getting data.
Data Ecosystem PPT
People - DevOps culture, sharing, aligned incentives.
Process - automation, agility, continuous deployments.
Technology - API and Microservices, Code Pipelines.
A balance of people, processes, and technologies drives organizational change.
You need to balance the three components and maintain good relationships between them to maximize efficiency.
Skills Across Analytics (TBNC)
Traditional - integration, storytelling, statistics, reporting.
BigData - structured, unstructured data, data restructuring, experimentation.
New Data Economy - ML, Cloud Analytics, change management.
Cognitive - NLP, AI, NN
Cloud Computing Pros and Cons
| Pros | Cons |
| --- | --- |
| No upfront costs - pay as you go. | Reliance on internet |
| Easy software updates | Reliance on vendor / service provider |
| Reduces utility costs | Data Transfer - not easy. |
| Easy to learn | Bandwidth - billing can also be complex. |
| Ease of Access | Affected if many people use it |
| Centralization of Data | Security can still be an issue |
| Data Recovery | Non Negotiable Agreements |
| Sharing | Can grow costs with time |
| Security | Lack of full support |
| Free storage | Minimal Flexibility |
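
The trade-off between "No upfront costs - pay as you go" and "Can grow costs with time" can be made concrete with a small calculation. This is only a sketch: the function names and all cost figures are hypothetical, chosen to show how a break-even point might be found.

```python
def cumulative_costs(months, upfront, on_prem_monthly, cloud_monthly):
    """Return (on_prem_total, cloud_total) after the given number of months."""
    on_prem_total = upfront + on_prem_monthly * months
    cloud_total = cloud_monthly * months  # pay as you go: no upfront cost
    return on_prem_total, cloud_total


def break_even_month(upfront, on_prem_monthly, cloud_monthly, horizon=120):
    """First month in which cumulative cloud cost exceeds on-premise cost, or None."""
    for month in range(1, horizon + 1):
        on_prem, cloud = cumulative_costs(month, upfront, on_prem_monthly, cloud_monthly)
        if cloud > on_prem:
            return month
    return None


# Hypothetical figures: 60,000 upfront plus 1,000/month on premise,
# versus 3,000/month in the cloud.
print(break_even_month(60_000, 1_000, 3_000))  # prints 31
```

With these made-up numbers the cloud is cheaper for the first 30 months and more expensive afterwards, which is one way the "can grow costs with time" point shows up.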
What is cloud computing?
Cloud Computing - a class of network-based computing that takes place over the Internet. It hides the complexity and details of the infrastructure from users and applications and provides a simple graphical interface or API.
Characteristics of Cloud Computing (RACS)
- Remote: Services or data are hosted remotely.
- Available anywhere: Services or data are available from anywhere.
- Commodified: Pay only for what you want to use.
- Self-managing: Services are provisioned automatically, without interaction with the provider.
Cloud Computing Workload Patterns
- On and Off - batch job, over provisioned capacity is wasted.
- Growing fast - keeping up with growth is difficult, complex lead time for deployment.
- Unpredictable Bursting - unexpected jumps in demand, impacts performance but cannot overprovision for extremes.
- Predictable Bursting - seasonality trends, periodically increasing demand, added complexity and wasted capacity.
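
The wasted-capacity point in these patterns can be illustrated with a small calculation. This is a sketch under stated assumptions: the demand series are invented, and `fixed_provisioning` is a hypothetical helper that compares a fixed-size deployment against hourly demand.

```python
def fixed_provisioning(demand, capacity):
    """Return (wasted, unmet) server-hours for a deployment of fixed capacity."""
    wasted = sum(max(capacity - d, 0) for d in demand)  # idle servers
    unmet = sum(max(d - capacity, 0) for d in demand)   # demand not served
    return wasted, unmet


# Hypothetical hourly demand, measured in servers.
on_and_off = [0, 0, 10, 10, 0, 0]         # batch job
predictable_burst = [2, 2, 8, 2, 2, 8]    # periodic peaks

# Provisioning for the peak wastes capacity whenever demand is low.
print(fixed_provisioning(on_and_off, max(on_and_off)))                # (40, 0)
print(fixed_provisioning(predictable_burst, max(predictable_burst)))  # (24, 0)

# Provisioning for the baseline leaves the bursts unserved.
print(fixed_provisioning(predictable_burst, min(predictable_burst)))  # (0, 12)
```

An idealised elastic deployment, where capacity follows demand hour by hour, would report zero wasted and zero unmet server-hours in all three cases.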
Cloud Computing – Service Models (2)
Application Focused (SAD)
- Services - business services, PayPal, Google Maps, Alexa etc.
- Application - Google Apps, Microsoft Online (eliminates need for local installation)
- Development - software used to build custom cloud apps (SalesForce)
Infrastructure Focused (PSH)
- Platform - cloud based platforms, Amazon EC2 (Elastic Compute Cloud).
- Storage - data storage, iDisk etc.
- Hosting - physical data centers, like IBM.
Three Service Models (SPI)
- SaaS - application accessed online and managed not by your company but by the software provider. This relieves the organization from the constant pressure of software maintenance, infrastructure management, network security, data availability, etc. Consumer does not manage OS, storage, applications, but has control over the functionalities of the service. Example: Google Apps, Zoho, HubSpot, SalesForce, Google Docs.
- PaaS - build and deliver custom applications on the cloud, without the need to install and work with IDEs. Consumer does not manage or control the underlying cloud infrastructure, OS, storage, but has control over deployed applications. Example: Azure, Hadoop.
- IaaS - rent processing, storage, computing resources, virtual private servers on demand and over the web. Consumer does not manage the cloud infrastructure, but can manage OS, storage, deployed applications, databases, security etc. Example: Amazon EC2.
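
The split of responsibilities described in the three bullets above can be written down as a lookup table. This is a minimal sketch: the layer names and the `who_manages` helper are illustrative choices, and the table only encodes what the notes state.

```python
LAYERS = ["applications", "operating system", "storage", "infrastructure"]

# Layers the consumer manages under each model, following the notes above.
CONSUMER_MANAGES = {
    "SaaS": set(),                                            # provider manages everything
    "PaaS": {"applications"},                                 # deployed applications only
    "IaaS": {"applications", "operating system", "storage"},  # all but the infrastructure
}


def who_manages(model, layer):
    """Return 'consumer' or 'provider' for a layer under a service model."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "consumer" if layer in CONSUMER_MANAGES[model] else "provider"


for model in ("SaaS", "PaaS", "IaaS"):
    print(model, {layer: who_manages(model, layer) for layer in LAYERS})
```

Reading the output top to bottom shows the consumer taking on more layers as you move from SaaS to PaaS to IaaS, while the underlying infrastructure stays with the provider in all three.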
IaaS Enabled / Service
Enabled with: Virtualization
Service: Resource Management Interface & System Monitoring Interface
What is Virtualization?
Abstraction of logical resources away from physical resources.
Multiple operating systems share the same physical hardware and provide different services.
Benefits: security, convenience, availability.
Virtual Workspaces
Abstraction of an execution environment that can be made available to authorized clients: a cloud-based desktop. Users are provisioned virtual machines that they connect to over the VPN, with the applications and access they need, and these machines can be properly secured. Runs on virtual machines.
Benefits of Virtual Machines
- Run systems that the available hardware cannot handle directly.
- Easier to create new machines.
- Test software on clean installs of OS.
- More machines than are physically available.
- Easy migration.
Properties of Virtual Machines (MIARSE)
- Manageability and Interoperability
- Availability and Reliability
- Scalability and Elasticity
Two Elements of IaaS
- Resource Management Interface (MSN)
- System Monitoring Interface
Resource Management Interface (MSN) 3 Elements
- Virtual Machine - provides basic virtual machine operations, e.g. creation, suspension, termination etc.
- Virtual Storage - basic virtual storage operations, e.g. space allocation, space release, data writing, reading.
- Virtual Network - basic virtual network operations, e.g. IP allocation, domain name registration, connection, bandwidth.
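
The kind of operations a resource management interface exposes can be sketched with a toy in-memory class. This is an illustration only: `ResourceManager` and its method names are hypothetical and do not correspond to any real provider's API, and virtual network operations are left out to keep it short.

```python
import itertools


class ResourceManager:
    """Toy in-memory stand-in for an IaaS resource management interface."""

    def __init__(self, storage_gb):
        self.vms = {}              # vm_id -> "running" or "suspended"
        self.volumes = {}          # volume_id -> size in GB
        self.free_gb = storage_gb  # unallocated virtual storage
        self._ids = itertools.count(1)

    # Virtual machine operations: creation, suspension, termination
    def create_vm(self):
        vm_id = f"vm-{next(self._ids)}"
        self.vms[vm_id] = "running"
        return vm_id

    def suspend_vm(self, vm_id):
        if self.vms[vm_id] != "running":
            raise ValueError(f"{vm_id} is not running")
        self.vms[vm_id] = "suspended"

    def terminate_vm(self, vm_id):
        del self.vms[vm_id]

    # Virtual storage operations: space allocation and release
    def allocate_storage(self, size_gb):
        if size_gb > self.free_gb:
            raise ValueError("not enough free storage")
        volume_id = f"vol-{next(self._ids)}"
        self.free_gb -= size_gb
        self.volumes[volume_id] = size_gb
        return volume_id

    def release_storage(self, volume_id):
        self.free_gb += self.volumes.pop(volume_id)


manager = ResourceManager(storage_gb=100)
vm = manager.create_vm()
manager.suspend_vm(vm)
volume = manager.allocate_storage(40)
print(manager.vms, manager.free_gb)
```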
System Monitoring Interface Elements (MSN)
- Virtual Machine - monitor CPU usage, memory, network loading
- Virtual Storage - space usage, data duplication, bandwidth.
- Virtual Network - network bandwidth, load balancing.
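
A monitoring interface ultimately turns raw measurements like these into signals someone can act on. The sketch below assumes hypothetical CPU utilisation samples (fractions between 0 and 1) and a made-up 80% threshold, and flags the virtual machines whose average load is too high.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def overloaded_vms(cpu_samples, threshold=0.8):
    """Return the names of VMs whose mean CPU utilisation exceeds the threshold."""
    return sorted(
        name for name, values in cpu_samples.items()
        if values and mean(values) > threshold
    )


# Hypothetical samples collected by a monitoring interface.
cpu_samples = {
    "vm-1": [0.90, 0.95, 0.85],
    "vm-2": [0.20, 0.30, 0.10],
}
print(overloaded_vms(cpu_samples))  # ['vm-1']
```

The same pattern would apply to the storage and network metrics listed above, for example flagging volumes whose space usage crosses a limit.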
PaaS Enabled and Provide Service
Enabled through: Runtime Environment Design - collection of software services available.
Provide Service: Programming IDE
What services does PaaS provide?
What should it offer?
Programming APIs & Development tools
Users can use the Programming IDE to develop their service. It should provide the full functionality of a development environment.
Should offer a debugger, testing environment, profiler, etc.
Offers computation, storage, and communication resource operations.
SaaS Enabling and Providing
Enabling technique: Web Service
Provide Services: Web Based Applications & Web Portals
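
To make "web service" as an enabling technique a little more concrete, here is a minimal sketch of one using only the Python standard library's WSGI support. The `/status` endpoint, the `demo` service name, and the `call` helper are all hypothetical and exist only for illustration.

```python
import json
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A tiny WSGI web service that answers JSON on /status."""
    if environ.get("PATH_INFO") == "/status":
        status, payload = "200 OK", {"service": "demo", "status": "up"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(body)))]
    start_response(status, headers)
    return [body]


def call(path):
    """Invoke the app in-process, without opening a network socket."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], json.loads(body)


print(call("/status"))
```

To serve it over HTTP, `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would expose the same function to any browser or client, which is the sense in which a web-based application is delivered as a service.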