Cloud Data Security Flashcards
List data lifecycle phases
Create, Store, Use, Share, Archive, Destroy
Mnemonic: Colorado State University Stinks at Dodgeball
In the cloud data lifecycle, what are examples of data in the Create phase?
New Data can be:
Freshly generated
Imported data new to the cloud
Data that has been updated/modified and has a new shape or state
- Describe the Create phase.
- What actions should be performed on data in the Create phase?
- Data/digital content is CREATED, ACQUIRED, VERSIONED, OR MODIFIED
- Classify and tag/label data; ensure the right security controls are implemented; tag data with important attributes; assign access restrictions as needed
- Describe the Store phase.
- What activities happen during the Store phase of the cloud data lifecycle?
- Committing the data to some sort of STORAGE repository
- Assign and PROTECT with security controls (e.g. ENCRYPTION, ACLs, logging, monitoring)
- Consider backups
- What happens in the Use phase of the cloud data lifecycle?
- What is not included?
- Viewing, processing, or consuming data previously stored
- Data is protected with DLP and DRM/IRM
- Not included: modification (the Use phase is read-only)
What state must data be in order to be in the Use Phase?
Given this state what security mitigations should be taken?
Data must be unencrypted to be used
File access monitors, logging and monitoring, or Information Rights Management systems are important to detect and prevent unauthorized access during this phase
Should only be used based on need-to-know (NTK) and least privilege
Share Phase of Cloud Data Lifecycle
Data is made available for use by others, such as employees, customers, and partners.
Should only be shared based on need-to-know (NTK) and least privilege
data is PROTECTED with DLP, DRM/IRM
What security mitigations should be taken during the share phase of the cloud data lifecycle?
Proper encryption (in transit) is important during this phase, as well as IRM and Data Loss Prevention (DLP) technologies that help ensure sensitive data stays out of the wrong hands.
Archive Phase of Cloud Data Lifecycle
Example activity during Archive Phase
The Archive phase involves data transitioning from active use to long-term “cold” storage. Archiving can entail moving data from a primary storage tier to a slower, less redundant tier that is less expensive or can include moving data off the cloud to a separate medium altogether (backup tape, for example).
Cost and availability considerations can affect data access (it can be hard or slow to retrieve data from long-term storage)
Example activities: enacting the data retention policy, encrypting archived data
Destroy Phase of Cloud Data Lifecycle
Destroying data involves completely removing it from the cloud by means of logical erasure or physical destruction (like disk pulverizing or degaussing).
In cloud environments, customers generally have to rely on logical destruction methods like crypto-shredding, purge, clearing or data overwriting, but many CSPs have processes for physical destruction, per contractual agreements and regulatory requirements.
Data dispersion
Data dispersion is the process of replicating data throughout a distributed storage infrastructure that can span several regions, cities, or even countries around the world.
Each storage block is fragmented, and the storage application writes the fragments to different physical storage containers
Erasure coding
A more specific implementation of data dispersion, it enhances data security by segmenting a file, encrypting the segments and then spreading the segments out across multiple locations — meaning a compromise of any location would yield only a portion of the file.
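As an illustration only, the reconstruction property can be shown with a toy (2+1) XOR-parity scheme in Python. Real erasure coding uses Reed-Solomon codes and encrypts the fragments before dispersal; this sketch just demonstrates that any single fragment can be rebuilt from the other two, while no single fragment reveals the whole file.

```python
# Toy (2+1) erasure-coding sketch: split data into two fragments plus an
# XOR parity fragment stored in a third location. Losing any one fragment
# is recoverable; holding any one fragment reveals only part of the data.

def encode(data: bytes):
    """Split data into two equal halves plus an XOR parity fragment."""
    if len(data) % 2:
        data += b"\x00"                      # pad so the halves are equal length
    half = len(data) // 2
    a, b = data[:half], data[half:]
    parity = bytes(x ^ y for x, y in zip(a, b))
    return a, b, parity                      # store each in a different location

def recover(a, b, parity):
    """Rebuild a missing fragment (passed as None) from the other two."""
    if a is None:
        a = bytes(x ^ y for x, y in zip(b, parity))
    elif b is None:
        b = bytes(x ^ y for x, y in zip(a, parity))
    return a + b
```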
Volume
A volume is a virtual hard drive that can be attached to a Virtual Machine (VM) and utilized similar to a physical hard drive. The VM Operating System views the volume the same way any OS would view a physical hard drive in a traditional server model.
AKA Block Storage
Object?
Object Storage?
An object is file storage that can be accessed directly through an API or web interface, without being attached to an Operating System
Object storage consists of files that are actually just virtual objects in an independent storage structure, relying on key values to reference and retrieve them.
What type of storage does PaaS utilize?
Structured storage - RDBMS and other databases that support searching and running operations on the data
Unstructured storage
What type of data storage is used by SaaS
Information storage and management - customers enter data into the app via a web interface; the app stores it in a back-end database and also generates data on the customer's behalf, stored internally
Content and file storage
Content Delivery Network (CDN)
Ephemeral storage
Ephemeral storage is temporary storage that accompanies more permanent storage. Ephemeral storage is useful for temporary data, such as buffers, caches, and session information.
Raw-disk storage
Raw-disk storage is storage that allows data to be accessed directly at the byte level, rather than through a filesystem.
Threats to storage types
Unauthorized access or usage (external or malicious insider)
Data Leakage and exposure - CSPs must protect data when replicated/distributed across regions
Denial of Service - CSP must manage spikes in bandwidth, otherwise data availability is at risk
Corruption or loss of data - read the CSP's data terms, which include availability and durability SLOs and SLAs; corruption or loss can happen intentionally or accidentally
Durability (or reliability)
Durability (or reliability) is the concept of using data redundancy to ensure that data is not lost, compromised, or corrupted.
Durability vs Availability
Availability focuses on uptime (data accessible when needed) through redundancy; durability focuses on data not being lost or corrupted, also through redundancy
Data loss prevention (DLP)
Data loss prevention (DLP), also known as data leakage prevention, is the set of technologies and practices used to identify and classify sensitive data, while ensuring that sensitive data is not lost or accessed by unauthorized parties.
Components of DLP?
Discovery and classification - first stage of DLP
Monitoring - monitoring the data against one or more policies; if policy is violated, the system provides an alert
Enforcement - policy violations are logged/alerted, or blocked from unauthorized exposure or loss and kept inside boundary
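The monitoring and enforcement stages can be sketched as follows. This is a minimal illustration, not a real DLP product: the policy patterns and the alert/block behavior are assumptions chosen for the example.

```python
# Minimal DLP monitor/enforce sketch: scan outbound text against patterns
# for sensitive data; alert on any match and block the message so the data
# stays inside the boundary. Patterns here are illustrative only.
import re

POLICIES = {
    "ssn":  re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # US SSN format
    "card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),  # 16-digit card number
}

def inspect(message: str):
    """Return (allowed, violations) for an outbound message."""
    violations = [name for name, pat in POLICIES.items() if pat.search(message)]
    for name in violations:
        print(f"ALERT: policy '{name}' violated; message blocked")  # monitoring
    return (len(violations) == 0, violations)                       # enforcement
```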
DLP Implementation/Architecture
Data At Rest - At Rest Implementation runs where the data is stored (cloud or endpoint device, e.g. workstation, file server, and other storage system)
Data in Motion/Transit - Network-based DLP protect data in transit, monitors outbound traffic (HTTP, HTTPS, FTP, SMTP) near network perimeter
Data in Use - data being processed in RAM and/or CPU. DLP should be installed on endpoint device (host/client-based DLP) preventing sharing or processing of data that violates policy
Host-based or endpoint-based DLP run on workstation or other endpoint device
Data de-identification/Anonymization
Data de-identification (or anonymization) is the process of removing information that can be used to identify a specific individual from a dataset.
Data sanitization technique with an intent to protect privacy
Name Techniques available to de-identify or anonymize sensitive information
Masking (obfuscation)
Tokenization
Masking
Masking is the process of partially or completely replacing sensitive data with random characters or other nonsensitive data.
Masking entails hiding, replacing, or omitting specific fields or data in particular user views in order to limit data exposure in the production environment.
List masking techniques
- Substitution: SECRET → $3(837
- Scrambling/shuffling: SECRET → TEESRC
- Deletion or nulling: SECRET → EE
Substitution
Substitution is a de-identification or anonymization masking technique that mimics the look of real data, but replaces (or appends) it with some unrelated value.
Substitution can either be RANDOM or ALGORITHMIC, with the latter allowing two-way substitution — meaning if you have the algorithm, then you can retrieve the original data from the masked dataset
Scrambling
Scrambling is a de-identification or anonymizing masking technique that mimics the look of real data, but simply jumbles the characters into a random order.
Deletion or nulling
Deletion or nulling is a de-identification or anonymization masking technique that is just what it sounds like: data appears blank or empty to anyone who isn't authorized to view it.
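The three masking techniques above can be sketched in a few lines of Python. These are illustrative toys (the substitution here is random-only, not algorithmic, and no format preservation is attempted):

```python
# Toy implementations of the three masking techniques: substitution,
# scrambling/shuffling, and deletion/nulling.
import random

def substitute(value: str) -> str:
    """Replace each character with a random, unrelated one."""
    pool = "abcdefghijklmnopqrstuvwxyz0123456789$#(!"
    return "".join(random.choice(pool) for _ in value)

def scramble(value: str) -> str:
    """Keep the original characters but jumble them into a random order."""
    chars = list(value)
    random.shuffle(chars)
    return "".join(chars)

def null_mask(value: str) -> str:
    """Show the field as blank/empty to anyone not authorized to view it."""
    return ""
```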
Data masking approach?
Static data masking - a new copy of the data is created with the masked values
Dynamic data masking - on-the-fly masking that adds a masking layer between the application and the database
Static Data masking - Typical Use-case
Static masking is the better option when you need to use “real” data in a development or test environment.
Dynamic data masking - Typical Use-case
Requires a masking layer in between the storage component and the application. This type of masking is great when you need to use production environments in a confidential or private manner.
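A dynamic-masking layer can be sketched as a function sitting between the application and the stored record. The role names and field names below are hypothetical examples; the point is that the stored data never changes, and masking happens per caller at read time.

```python
# Dynamic-masking sketch: the stored record is untouched; masking is
# applied on the fly based on the caller's role (roles are illustrative).
def read_record(record: dict, role: str) -> dict:
    """Return the record, masking sensitive fields for non-privileged roles."""
    if role == "dba":                                   # privileged role sees real data
        return dict(record)
    masked = dict(record)
    masked["ssn"] = "***-**-" + record["ssn"][-4:]      # partial mask for everyone else
    return masked
```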
Tokenization
Tokenization is the process of substituting a sensitive piece of data with a nonsensitive replacement, called a token. The token is merely a reference back to the sensitive data, but has no meaning or sensitivity on its own.
Tokenization can map back to the original data (a two-way function), but it is expensive.
Important to make sure proper authentication is being done when storing and accessing sensitive data
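A minimal token-vault sketch shows why the token has no meaning on its own and why the vault is the crown jewel. This is an in-memory toy; a real vault would authenticate every detokenize call and persist securely.

```python
# Token-vault sketch: the token is a random reference with no mathematical
# relationship to the original value; only the vault can map it back.
import secrets

class TokenVault:
    def __init__(self):
        self._vault = {}                    # token -> original sensitive value

    def tokenize(self, sensitive: str) -> str:
        token = secrets.token_hex(8)        # random, meaningless on its own
        self._vault[token] = sensitive
        return token

    def detokenize(self, token: str) -> str:
        # In practice this lookup must sit behind strong authentication.
        return self._vault[token]
```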
Data discovery
Data discovery is the process of finding and identifying sensitive information in your environment.
Approaches to data discovery
Metadata - use the data's metadata to find it
Labels - label sensitive data to find it
Content - analyze the actual content
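The content-based approach can be sketched as a pattern scan over raw values (the pattern below is an illustrative US-SSN regex; metadata- and label-based approaches would inspect tags instead of the content itself):

```python
# Content-based data discovery sketch: scan raw content for a sensitive
# pattern and report which records match.
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def discover(records: dict) -> list:
    """Return the keys of records whose content matches the sensitive pattern."""
    return [key for key, text in records.items() if SSN.search(text)]
```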
Data classification
Data classification is the process of categorizing and organizing data based on level of sensitivity or other characteristics.
critical component for risk management, data security, and compliance.
Categories of Sensitive Data
Protected Health Information (PHI)
Personally Identifiable Information (PII)
Cardholder data
Describe types of PII
Nonsensitive PII includes data that can be used to identify an individual but is publicly available
Sensitive PII is information that is not publicly available
Direct identifiers can be used on their own to identify an individual. (e.g. SSN) 1:1 identification
Indirect identifiers can help narrow down a set of individuals but cannot identify a single individual on their own (e.g. birthdate)
Aggregate risk
aggregate risk — multiple pieces of information can be combined to create something more sensitive than any of its individual components.
PHI
Protected health information (PHI) is information related to the past, present, or future health status of an individual that was created, used, or obtained in the course of providing healthcare services, including payment for such services.
Can also include PII
IRM
Information Rights Management (IRM) is a data security technology that protects data (typically files, but also emails, web pages, and other information) from unauthorized access by limiting who can view, copy, forward, delete, or otherwise modify or access information (e.g. print, download, etc).
Commonly implemented with IAM, ACLs and encryption
Digital rights management (DRM)
Uses ENCRYPTION to enforce COPYRIGHT restriction on digital media
Done to protect intellectual property
Two types: Consumer DRM and Enterprise DRM
What recourse does the cloud customer have if inadvertent or malicious disclosure of PII occurs in the cloud WRT the cloud provider? Why?
None. Under current law, a cloud customer cannot transfer the risk or liability associated with the inadvertent or malicious disclosure of PII
Consumer Digital Rights Management
Controlling the access, execution, copying and alteration of copyrighted information from a publisher to the consumer
Enterprise Digital Rights Management
A solution used by the organization to protect assets such as documents and/or email from within the organization or partners
List Key Data Functions
Access
Process
Store
Align Key Data Functions to Data Lifecycle
Access - happens in all phases (Create, Store, Use, Share, Archive, Destroy) (ALL)
Process - happens in Create and Use (CPU)
Store - happens in Store and Archive (SAS)
List Cloud IaaS Storage
- Volume storage (block or file)
- Object storage
- Raw storage
- Ephemeral storage
- Long-term storage
List PaaS Storage
Unstructured - Non Relational or NoSQL
Structured - SQL or Relational (RDBMS)
List SaaS Data Storage, Describe each
Information Storage and Management System - data entered into web interface and stored in SaaS back end DB
Content or File Storage - file based content stored within application
Content Delivery Network (CDN) - content in object storage distributed to multiple geolocations to improve consumption speed
Data Anonymization - Indirect vs Direct
Direct - sanitization of data that uniquely identifies a subject (e.g. name, email, PII)
Indirect - sanitization of data that consists of something other than direct identifiers, e.g. demographic, socioeconomic, or event data
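The two cases can be sketched together: drop direct identifiers outright and generalize indirect ones. The field names below are hypothetical examples, and generalizing a birthdate to a year is just one possible indirect-identifier treatment.

```python
# De-identification sketch: remove direct identifiers; generalize indirect
# identifiers (here, birthdate -> birth year) to reduce re-identification risk.
def deidentify(record: dict) -> dict:
    out = dict(record)
    for field in ("name", "email", "ssn"):        # direct identifiers: remove
        out.pop(field, None)
    if "birthdate" in out:                        # indirect: generalize
        out["birth_year"] = out.pop("birthdate")[:4]
    return out
```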
List Media Sanitization Techniques
- Physical Destruction
- Clearing/Overwriting
- Purging
- Cryptographic erasure/Crypto Shredding
- Degaussing
Degaussing
Magnetically scrambling data on a conventional hard drive or tape drive
Cannot degauss an SSD
Clearing/Overwriting
Prepping the media for re-use at the same classification level
Data cannot be recovered by using normal system functions or utilities
However the data may not be immune to recovery with special tools in a lab environment
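A single-pass overwrite can be sketched as below. This is an illustration of the idea, not a certified sanitization tool: on SSDs, wear-leveling can leave copies the filesystem API never touches, which is one reason lab recovery may remain possible.

```python
# Clearing sketch: overwrite every byte of a file with zeros, force the
# write to the device, then delete the file so normal utilities cannot
# recover the contents.
import os

def clear_file(path: str, passes: int = 1) -> None:
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            f.write(b"\x00" * size)        # overwrite contents in place
            f.flush()
            os.fsync(f.fileno())           # flush OS buffers to the device
    os.remove(path)
```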
Purging
Removing sensitive data from a system with the intent that the data cannot be reconstructed by any known technique
Crypto Shredding
Involves encrypting data and, when the data is ready to be retired, destroying the keys so the data can't be accessed because it can't be decrypted
Best used with symmetric-key encryption because there is only one key to delete, and symmetric encryption is cheaper, faster, and simpler
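The idea can be sketched with a toy symmetric cipher built from a SHA-256 counter keystream. This cipher is for illustration only (real crypto-shredding uses vetted ciphers such as AES): the point is that once the single symmetric key is destroyed, the ciphertext is permanently unreadable.

```python
# Crypto-shredding sketch with a toy keystream cipher (illustration only,
# NOT for real use): encrypt the data, then destroy the only key to make
# the data unrecoverable.
import hashlib, secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (same call encrypts/decrypts)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        block = hashlib.sha256(key + i.to_bytes(8, "big")).digest()
        out += bytes(b ^ k for b, k in zip(data[i:i + 32], block))
    return bytes(out)

key = secrets.token_bytes(32)
ciphertext = keystream_xor(key, b"retired customer records")
key = None    # crypto-shred: with the only key gone, ciphertext is unreadable
```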
Information Rights Management (IRM)
AKA
is a data security technology that protects data (typically files, but also emails, web pages, and other info) from unauthorized access by limiting who can view, copy, forward, delete or otherwise modify information
Provides:
- Encryption
- Protection of not just sensitive data but can also be used for the protection of web pages, emails, databases, etc.
- Access controls, even based on location
- A baseline for an information protection policy
AKA Data Rights Management (a different DRM, not to be confused with Digital Rights Management)
Feature Goals of Security Architecture When IRM Is Incorporated
Continuous protection - protection follows the data regardless of location, lifecycle phase, or whether it is at rest or in transit
Automatic expiration - using IRM, set policy that automatically revokes access to data after a predetermined lifetime
Dynamic control - using IRM, manage access permissions even after data has been distributed
Auditability - IRM provides a continuous audit trail
Integration and support - IRM is usually interoperable with email filters, data formats, and other infrastructure or software in the enterprise
Data Retention Policies
Retaining information for organization or regulatory COMPLIANCE
Data Backup vs Data Archive
Data backup - the process of backing up data the organization is CURRENTLY using/processing for disaster recovery purposes
Data archive - retain data per the retention policy when it is NO LONGER being used/processed
Continuous Monitoring
NIST 800-137: Information Security Continuous Monitoring (ISCM) is maintaining ongoing AWARENESS of information security, VULNERABILITIES, and threats to support organizational risk management decisions
NOT monitoring from a 24x7x365 concept; this is security monitoring as policy on a cyclical basis
Information Security Continuous Monitoring (ISCM) Strategy
- Clear understanding of the organization risk tolerance/appetite
- METRICS to understand organizations security status
- Ensure security controls are effective
- Verify compliance with corporate governance
- Continuous awareness of threats and vulnerabilities
Continuous Operations
The organization is following these principles:
- Audit Logging - giving Sr. Management ASSURANCE the organization is adhering to legal, statutory and regulatory compliance
- Contract/authority maintenance: having liaisons between organizations and applicable authorities (regulatory, local, national, legal, jurisdiction) and to be ready to engage with them in case of a forensic investigation
- Data governance: POLICIES and PROCEDURES
(Make sure we are always available)