3.3 Flashcards
Data at rest
is data that has arrived at a destination in a file system, database, or object storage (disk, tape) and is not being accessed or used
- It typically refers to stored data and excludes data that is moving across a network or is temporarily in computer memory or Redis cache waiting to be read or updated
- Data at rest is data that is not dynamically moving from device to device or network to network
Data in transit
is data being packet-forwarded or switched over a wireless or wired network in a unicast, broadcast, multicast, or anycast fashion
Examples include:
* Wired Ethernet
* Cable (DOCSIS)
* Fiber optic
* 802.11 wireless
* Cellular
* Satellite
* Personal area networking using RFID, Bluetooth, Infrared, Zigbee, and more
Data in use
This is active data undergoing processing, translation, analysis, change, or other manipulation
Examples include:
* Data in system RAM memory
* CPU registers
* Caches and buffers
* Data in Memcached or Redis clusters
* Database transactions
* Cloud-based file or code being modified in real time by one or more users
There are five common categories used for data classification in various business and commercial
sectors:
- Public data
- Private data
- Internal data
- Confidential data
- Restricted data
Public data
Public data may be important, but it is accessible to the public
- Since this data is openly shared, it is the lowest classification level
Private data
Private data requires a greater level of security than public data
- It should not be available for public access and is often protected through common security measures such as passwords
Internal data
Internal data is usually limited to employees only and often has different security requirements that affect who can access it and how it can be used
Confidential data
This information should only be accessed by a limited audience that has obtained proper authorization using strict identity management
Restricted data
This classification is reserved for an organization’s most sensitive information
- Access to this data is strictly controlled to prevent its unauthorized use
Regulated data
Information whose use and protection are dictated by a government agency or third-party agreements
Trade secrets
Any practice or process of a company that is generally not known outside of the company
Intellectual property
Creations of the mind, such as inventions, literary and artistic works, designs and symbols, names, and images used in commerce
Personal health information (PHI)
The demographic information, medical histories, test and lab results, mental health conditions, insurance information, and other data
Personally identifiable information (PII)
Any representation of data that allows the identity of an individual to whom the information applies to be reasonably inferred by either direct or indirect means
Legal information
Involves the careful reading of specific clauses or stipulations, which does not itself constitute “advice”
Financial data
Quantitative information used by organizations to make financial decisions, and data concerning a company’s financial health and performance
Human and nonhuman readable
Some human-readable formats, such as PDF, are not machine-readable because they are not structured or semi-structured data (such as JSON or YAML)
Data life cycle
- Create
- Store
- Use
- Share
- Archive
- Destroy
Create phase (mandatory)
Data is either generated from scratch, inputted, acquired, purchased, or modified into another format
- The data owner, stewards, and custodians (if applicable) are identified in this earliest phase
- Other key activities of phase one include:
* Data discovery
* Data categorization
* Data classification
* Data mapping
* Data labeling (tagging)
Store phase
The data is put onto a volume (block), object (blob), or file storage system or into one of several types of database systems
- This phase relates to the optional, transactional, near-term usage of data as opposed to long-term cold data storage
- Protection of data at rest and data in transit will often occur in this phase unless default encryption is implemented in the Create phase
Use phase (mandatory)
Data is utilized by people, applications, services, and tools, as well as being changed from its original state
- If data is used remotely, then protection mechanisms must be in place (virtual private network (VPN), secure endpoints, digitally signed application programming interface (API) calls)
- The systems that “use” the data must be secured as well; for example, with endpoint detection and response (EDR) or host-based intrusion prevention system (IPS) agents (Palo Alto Cortex XDR)
Share phase
In this optional phase, data is visible, analyzed, and apportioned among users, systems, and applications
- Data can be shared in a client-server, peer-to-peer, or distributed manner
- Most of the controls used in the previous phases will be implemented here in phase four (such as information rights management (IRM) and data loss prevention (DLP) services)
- Stringent Identity and Access Management (IAM) and/or Identity Management (IdM) should be used to enforce least privilege
Archive phase
In this optional phase, data is stored for the long term and removed from active usage
- Archiving is based on regulations, governance policies, and/or best practices
- Archiving is often automated and based on intelligent tiering or storage gateway management over a high-speed connection to cloud providers
Destroy phase
Data is no longer accessible or usable based on lifetime, utility, policy, governance, and/or regulations
- The organization should have its own established methods for disposal of data and media, often using military-grade programs or physical destruction such as crushers and furnaces
- Although data can be disposed of using a variety of methods, when storing data at a cloud provider, crypto-shredding (cryptographic erasure) is the only practical and comprehensive solution
Methods to secure data
- Geographical restrictions
- Encryption
- Hashing
- Masking
- Tokenization
- Obfuscation
- Segmentation
- Compartmentalization
Geographical restrictions
Limit access to users or devices based in a specific region
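A minimal sketch of such a restriction in Python, assuming the client's region has already been resolved elsewhere (e.g. by a GeoIP lookup) and that the allowlist below is a hypothetical policy:

```python
# Hypothetical policy: only clients in these regions may access the data.
ALLOWED_REGIONS = {"US", "CA"}

def is_access_allowed(client_region: str) -> bool:
    """Return True only when the client's region is on the allowlist."""
    return client_region.upper() in ALLOWED_REGIONS

print(is_access_allowed("us"))  # True
print(is_access_allowed("RU"))  # False
```

Real deployments enforce this at the network or identity layer (firewall rules, CDN geo-blocking, IAM conditions) rather than in application code alone.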
Encryption
Transforms data from plaintext to ciphertext that can only be deciphered with the correct key, known as the decryption key
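As a toy illustration of that round trip (this is NOT a secure cipher; production systems use algorithms such as AES), a repeating-key XOR shows how the same key that produces the ciphertext also recovers the plaintext:

```python
from itertools import cycle

def xor_transform(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key.

    Applying the function twice with the same key returns the original
    data, which mirrors the encrypt/decrypt relationship. Illustrative
    only -- repeating-key XOR is trivially breakable.
    """
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

key = b"secret"  # shared key (assumed)
ciphertext = xor_transform(b"plain text", key)
plaintext = xor_transform(ciphertext, key)  # same key reverses it
print(plaintext)  # b'plain text'
```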
Hashing
By hashing the data before storing it in a database, one can prevent unauthorized parties from reading or changing it without knowing the original data or the hashing algorithm
- It is common for systems like directory services to hash the passwords of users so that they can be verified without exposing the plain text
- Choose a hashing algorithm that meets all policy requirements and that is supported by tools and utilities
- Generate a salt for each data input using a built-in function or library
- Hash the data input and salt with the chosen algorithm
- It is essential to use the same hashing algorithm and salt for the same data input every time it is hashed
- Employ a secure connection to the data storage or database to protect data in transit
Masking
Data masking often involves using characters like “X” to hide some or all data
- An example is to only display the last four digits of:
* Social security number
* Credit card number
* National ID number
* Bank account number
* Username or email address
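A minimal masking helper along these lines (the mask character and last-four rule mirror the examples above; the function name is hypothetical):

```python
def mask_last_four(value: str, mask_char: str = "X") -> str:
    """Replace every character except the last four with the mask character."""
    if len(value) <= 4:
        return value  # too short to mask meaningfully
    return mask_char * (len(value) - 4) + value[-4:]

print(mask_last_four("4111111111111111"))  # XXXXXXXXXXXX1111
print(mask_last_four("123-45-6789"))       # XXXXXXX6789
```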
Tokenization
Involves sending sensitive data through an API call (or batch file) to a system or cloud provider service that replaces the data with nonsensitive, pseudorandom placeholders called tokens
- Unlike encrypted data, a token cannot be mathematically reversed to recover the original value and is unintelligible on its own
- The practice involves two distinct databases:
* One with the actual sensitive data
* One with tokens mapped to each chunk of data
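The two-database model can be sketched as follows; the in-memory dict stands in for the token vault, which in practice is a hardened, access-controlled service or cloud provider API:

```python
import secrets

# Token vault: maps pseudorandom tokens back to the sensitive values.
# In production this lives in a separate, tightly secured system.
token_vault: dict[str, str] = {}

def tokenize(sensitive: str) -> str:
    """Replace a sensitive value with a pseudorandom token."""
    token = secrets.token_hex(8)  # placeholder with no mathematical link to the data
    token_vault[token] = sensitive
    return token

def detokenize(token: str) -> str:
    """Only a system with vault access can recover the original value."""
    return token_vault[token]

card_token = tokenize("4111111111111111")
print(detokenize(card_token))  # 4111111111111111
```

The token itself can be stored and passed around freely; without the vault mapping, it reveals nothing about the original value.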
Obfuscation
Applies to any mechanism that makes data less decipherable
- The goal is to render data unreadable or to hide aspects of personally identifiable, personal health, or corporate intellectual property information
- “Obscuring” is a concept where static or dynamic techniques are used on the original data or a representational data set
- “Shuffling” describes utilizing characters from the same data set to disguise the data
- “Randomization” is when all or some of the data is replaced with indiscriminate characters
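Hypothetical sketches of the shuffling and randomization techniques described above, applied to a sample value:

```python
import random
import string

def shuffle_chars(value: str) -> str:
    """Shuffling: reorder characters drawn from the same data set."""
    chars = list(value)
    random.shuffle(chars)
    return "".join(chars)

def randomize(value: str) -> str:
    """Randomization: replace every character with an indiscriminate one."""
    return "".join(random.choice(string.ascii_letters) for _ in value)

print(shuffle_chars("Jane Doe"))  # e.g. a reordering of the same characters
print(randomize("Jane Doe"))      # e.g. eight arbitrary letters
```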
Segmentation
A process of dividing and organizing data and information into defined groups to enable:
* Handling
* Labeling
* Sorting
* Viewing
* Securing
- Segmented data provides a team or group with segregated, clear, actionable information
- Data segmentation involves grouping data into at least two subsets, although more separations may be necessary on a large network with sensitive data
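A small sketch of grouping records into classification-based subsets so each group can be handled, labeled, and secured separately (the labels and records are hypothetical):

```python
from collections import defaultdict

# Hypothetical records, each tagged with a classification label.
records = [
    {"id": 1, "label": "public",     "value": "press release"},
    {"id": 2, "label": "restricted", "value": "merger plan"},
    {"id": 3, "label": "public",     "value": "product page"},
]

# Group records into one subset per classification label.
segments: dict[str, list[dict]] = defaultdict(list)
for record in records:
    segments[record["label"]].append(record)

print(sorted(segments))         # ['public', 'restricted']
print(len(segments["public"]))  # 2
```

Each resulting subset can then be stored, viewed, and protected under controls appropriate to its label.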
Compartmentalization
is regarded as a very powerful way to protect personal information
- It involves limiting access to information to only those people or organizations who need it to perform a certain task
- It is widely used by the military and agencies such as the FBI