Domain 4: Information Security Incident Management Flashcards

Question 1

Q

Incident Management

Answer

A

Incident Management - the capability to effectively manage unexpected, operationally disruptive events, with the objective of minimizing impacts and maintaining or restoring normal operations within the time limits defined in the SLA
- ISM is responsible for developing and testing IRPs, and for ensuring they correlate with BCPs and DRPs in case the IRP is insufficient to resolve an event
- Organization should have agreed-upon definition of what constitutes an incident and apply categorization levels
- Effectiveness is based on timeliness and accuracy; Timeliness is the time between incident identification and acceptance as a valid incident;
- Effective documentation ensures participants constantly understand the incident and can work together and respond appropriately
- When investigations involve individuals, legal and HR should be involved
- An important consideration is knowing when an incident becomes a problem and when that problem becomes a disaster. There should be declaration criteria (who has authority) and severity criteria (incident ranking)
- It is critical to achieve stakeholder and senior management support. Support can be gained through historical incidents and their impacts, or business cases of other organization incidents and their impact
- An effective incident management program can increase the risk accepted by senior management due to a demonstrated capacity to handle incidents

Question 2

Q

Incident Management Life Cycle Phases

Question 3

Q

Incident Handling vs Incident Management

Answer

A

Incident Handling - all tasks associated with handling events and incidents

Detection and Reporting - ability to receive and review event and incident information
Triage - categorize, prioritize, and assign events and incidents to maximize limited resources
Analysis - determine what happened, the impact and threat, the damage resulted, recovery or mitigation steps
Incident response - actions taken to resolve or mitigate incident, disseminate information, follow up, prevent recurring incidents

Incident Management - incidents are detected, recorded, and managed to limit impacts. Incidents are classified, prioritized, and routed to the correct resources. Incident mgmt provides a structure to handle incidents and ensures incidents are owned, tracked, and monitored. It also accounts for incidents that may need to be escalated to activate the BC/DR plan. It includes vulnerability management, security awareness training (SAT), and other proactive activities that help prevent incidents.

Question 4

Q

Incident Management Systems
- Distributed vs Centralized IMS
  - Operating Costs
  - Recovery Costs

Answer

A

Incident Management Systems (IMS) - incident management systems automate many manual processes that provide filtered information that can identify possible technical incidents and alert the incident mgmt team (IMT)
- Distributed IMS - contains multiple specific incident detection capabilities (NIDS, HIDS, logs)
- Centralized IMS - security information and event manager; mines data from multiple systems into a single database providing near real-time notifications and identifies policy violations; can prioritize and escalate or notify on incidents based on impact; ability to track incident during entire life cycle until it is closed.
  - From ISM perspective, there are two things to consider:
    - Operating Costs - for a manual process, staff would need to be trained and would need to monitor every system manually; higher probability of human error vs automation
    - Recovery Costs - automated systems can detect, notify, and escalate faster than a manual process, saving immense amounts of time; a longer analysis contributes to further damage
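
The centralized approach described above can be sketched roughly as follows. This is only an illustration, not how any particular SIEM product works: the `Event` fields, the source names, the 1 to 5 impact scale, and the escalation threshold are all hypothetical.

```python
from dataclasses import dataclass


# Hypothetical event record; real detection tools emit far richer data.
@dataclass
class Event:
    source: str   # e.g. "NIDS", "HIDS", "syslog"
    host: str
    impact: int   # assumed scale: 1 (low) to 5 (critical)
    message: str


ESCALATION_THRESHOLD = 4  # assumed cut-off for notifying the IMT


def centralize(*feeds):
    """Merge events from several detection sources into one list,
    highest impact first (the 'single database' of a centralized IMS)."""
    merged = [event for feed in feeds for event in feed]
    return sorted(merged, key=lambda e: e.impact, reverse=True)


def to_escalate(events):
    """Return the events whose impact meets the escalation threshold."""
    return [e for e in events if e.impact >= ESCALATION_THRESHOLD]


nids = [Event("NIDS", "web01", 5, "possible SQL injection")]
hids = [Event("HIDS", "db01", 2, "config file changed")]
logs = [Event("syslog", "web01", 4, "repeated failed logins")]

events = centralize(nids, hids, logs)
for e in to_escalate(events):
    print(f"ESCALATE [{e.source}] {e.host}: {e.message}")
```

In a distributed IMS, each of the three feeds would have to be watched separately; merging them is what makes impact-based prioritization across sources possible.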

Question 5

Q

IR Technology Concepts
IMT Personnel
IRT Organization

Answer

A

IR Technology Concepts IRT should be familiar with:
- Security Principles - CIA, authentication, nonrepudiation, privacy, access control, compliance
- Security Vulnerabilities
- The Internet - network protocols (e.g. IPv4, IPv6, TCP, UDP), how they’re used, and common attacks; network applications and services (e.g. DNS, NFS, SSH), how they work, and common attacks
- Operating Systems and their configuration and common attacks
- Malicious Code - these can cause DoS attacks, can be transferred via USB etc
- Programming Skills - understanding code and how poor programming, such as lack of input validation, leads to flaws like SQL injection and XSS
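
To make the SQL injection point above concrete, here is a small sketch using Python's built-in `sqlite3` module. The table, the user names, and the malicious input are made up for illustration; the contrast between string concatenation and a parameterized query is the part that matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

# Hypothetical attacker-supplied input.
malicious = "nobody' OR '1'='1"

# Vulnerable: user input is concatenated into the SQL text,
# so the injected OR clause matches every row.
unsafe_sql = "SELECT name FROM users WHERE name = '" + malicious + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a parameterized query treats the input purely as data.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (malicious,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```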
IMT Personnel - the ISM usually leads the team; the security steering group approves the charter and deviations from normal practice; there are permanent and virtual/temporary team members. Permanent members are full-time and perform primary tasks (e.g. forensics expert); temporary members are recruited when necessary to fill gaps (legal reps, HR staff, etc)
IRT Organization - incident handlers analyze incident data, determine the impact, respond to limit damage and restore normal services; The following IRT models have worked:
- Central IRT - a single IRT handles everything, usually in small organizations or centrally located
- Distributed IRT - each of several teams is responsible for a logical or physical segment of the infrastructure; usually in a large organization or one that is geographically dispersed
- Coordinating IRT - the central team provides guidance to distributed IRTs and develops policies and SOPs; distributed teams manage and implement IR
- Outsourced IRT - can be partially or fully outsourced

Question 6

Q

IMT Roles and Responsibilities

Answer

A

IMT Roles and Responsibilities (image)

Question 7

Q

Outsourced Security Providers
- ISM Considerations

Answer

A

Outsourced Security Providers - for smaller organizations; they will still require an IRP overseen by the outsourced IRT; essential to understand the outsourcer’s capabilities, response times, indemnity clauses, SLA etc
- ISM Considerations
  - Match organization’s reference # with vendor’s for each applicable incident - this ensures common understanding of incident details between organizations
  - Integration of Change Management - contributes awareness to system changes for both organizations
  - Requirement from vendor of periodic review of incidents that occur on a regular basis (i.e. monthly) - all incidents/events are reviewed annually; follow up items to prevent recurring incidents

Question 8

Q

Incident Management Objectives
- Strategic Alignment
- Risk Management
- Assurance Process Integration
- Value Delivery
- Resource Management

Answer

A

Incident Management Objectives - next to last safety net after controls fail; purpose is to respond to and contain incidents or quickly restore to normal operations; failing to do so will cause a disaster and recovery operations will move to an alternate site (BCP/DRP)
- The objectives are to handle incidents as they occur within the RTO, restore normal operations, prevent recurring incidents, and apply proactive countermeasures to minimize/prevent incidents
  - Strategic Alignment - like other functions, incident mgmt must align with strategic plan; the following help with accomplishing alignment:
    - Constituency - important to know who the stakeholders are, their expectations, and information needs
    - Mission - define purpose and primary objectives and goals
    - Services - should be clearly defined to manage stakeholder expectations;
    - Organizational Structure - IMT structure should support organizational structure;
    - Resources - incident mgmt covers wide range of services, it is not possible to cover everything, so good to have virtual/temporary team members
    - Funding - need sufficient funding
    - Mgmt Buy in - essential
  - Risk Management - successful outcomes of risk management include effective incident management and response capabilities; any risk that is not prevented by controls constitutes an incident and must be managed or it becomes a disaster
  - Assurance Process Integration - incidents may involve other functions such as legal, HR, physical security; plans should be in place to define how they’re involved and the plan should be tested under realistic conditions
  - Value Delivery - incident mgmt should be closely integrated with business functions; should integrate with BCP; optimize risk management; align with overall business strategy
  - Resource Management - time, people, budget, and other factors; not all objectives can be achieved, so it’s important to prioritize how limited resources are used; effective resource management involves ensuring appropriate oversight, monitoring of resources, and regular reporting

Question 9

Q

Incident Management Metrics and Indicators
- Performance Measurement
Defining Incident Management Procedures
- Detailed Plan of Action for Incident Management

Answer

A

Incident Management Metrics and Indicators - metrics are based on KPIs and KGIs. KPIs are quantitative (e.g. the # of incidents per year resolved within 2 minutes of occurrence). KGIs are quantitative or qualitative (e.g. the business goal is to have 1000 incidents resolved within two minutes)
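
As a rough illustration of the KPI/KGI example above, the sketch below counts incidents resolved within the 2-minute target and compares the count against the goal of 1000. The resolution times are invented sample data, and the function name is a placeholder.

```python
# Hypothetical resolution times, in minutes, for one year's incidents.
resolution_minutes = [1.5, 0.8, 3.2, 1.9, 12.0, 0.5]

TARGET_MINUTES = 2              # threshold from the KPI example
GOAL_RESOLVED_IN_TARGET = 1000  # figure from the KGI example


def kpi_resolved_within(times, target):
    """Count incidents resolved within the target time."""
    return sum(1 for t in times if t <= target)


resolved = kpi_resolved_within(resolution_minutes, TARGET_MINUTES)
goal_met = resolved >= GOAL_RESOLVED_IN_TARGET
print(f"KPI: {resolved} incidents resolved within {TARGET_MINUTES} minutes")
print(f"KGI met: {goal_met}")
```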
- Detailed Plan of Action for Incident Management -

Prepare - planning and design, establish vision, mission, requirements, funding, policies, resources, change management process etc
Protect - protect infrastructure
Detect - detecting events; proactive detection is conducted regularly through periodic vulnerability scanning, network monitoring, antivirus, firewall alerts, etc; reactive detection is conducted when there are reports from system users or advisories from other organizations
Triage - Triage events; process of sorting, categorizing, correlating, prioritizing, and assigning incoming events into (typically) 3 categories;

Events that can wait
Events that cannot be readily resolved
Events that can be resolved now

Triage prioritization - two approaches: tactical (based on a set of criteria) or strategic (based on impact to the business; this is more favored)

Triage categorization - use of predetermined criteria to classify incoming events (i.e. DoS, unauthorized access, malicious code)

Triage correlation - the higher the correlation, the more information that is useful to determining a response

Triage assignment - assigning incident to team member; can be based on workload of members, category of event, members who have handled similar incidents, etc
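
The triage steps above (categorize, prioritize, assign) can be sketched as follows. The category names, the 1 to 5 business-impact scale, and the "lightest workload" assignment rule are assumptions chosen for illustration; each organization defines its own criteria.

```python
from dataclasses import dataclass

# Assumed category list; taken from the examples in the categorization note.
CATEGORIES = {"dos", "unauthorized_access", "malicious_code"}


@dataclass
class Event:
    event_id: str
    category: str
    business_impact: int  # assumed scale: 1 (minor) to 5 (severe)


def prioritize(events):
    """Strategic prioritization: order events by business impact."""
    return sorted(events, key=lambda e: e.business_impact, reverse=True)


def assign(events, workload):
    """Assign each event to the handler with the lightest workload.
    `workload` maps handler name -> number of open incidents."""
    assignments = {}
    for event in prioritize(events):
        if event.category not in CATEGORIES:
            raise ValueError(f"unknown category: {event.category}")
        handler = min(workload, key=workload.get)
        assignments[event.event_id] = handler
        workload[handler] += 1
    return assignments


queue = [
    Event("E1", "malicious_code", 2),
    Event("E2", "dos", 5),
    Event("E3", "unauthorized_access", 4),
]
result = assign(queue, {"ana": 1, "raj": 0})
print(result)
```

Assignment could just as well be based on event category or prior experience with similar incidents, as the note on triage assignment says; workload is simply the easiest rule to show.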

Respond - There are three different types of response activities

Technical response
Management Response
Legal response

Question 10

Q

Current State of Incident Response Capability
Types of Threats
Vulnerabilities

Answer

A

Current State of Incident Response Capability - Three ways to assess current state of incident response capability

Survey of senior management, business managers, and IT reps
Self-Assessment
External Assessment or audit

Threats - an event that can cause harm; Types of threats include:
- Environmental - natural disasters; rare occurrences; insurance is best way to handle these
- Technical - fire, electrical failure, HVAC failure, etc; common; can be managed adequately, with the exception of APTs and zero-day attacks
- Man-made - damage by disgruntled employees, corporate espionage, political instability;
Vulnerabilities - a weakness that can be exploited and result in compromise is a vulnerability; a vulnerability that can be exploited by threats results in risk; vulnerability management is proactive in identifying weaknesses in incident management

Question 11

Q

Developing an IRP

SIX IRP Elements

Answer

A

Developing an IRP
- SIX IRP Elements
  - Preparation - develop IRP prior to an incident (i.e. handling incidents, communication, policies, warning banners, when to escalate, resources needed etc)
  - Identification - verify if incident has actually happened; things that may occur at this stage (assigning ownership of incident, verify reports or events, establish chain of custody, determine severity)
  - Containment - Once incident confirmed, IMT is activated; Purpose of this phase is to limit exposure; notify appropriate stakeholders like the SO; obtain agreement on actions; involve virtual team if needed; obtain and preserve evidence; Document backup actions; control and manage communication to public
  - Eradication - after containment measures have been deployed, it is time to determine root cause; ways to eradicate (restore backup, remove root cause, improve defenses, perform vulnerability analysis to find new vulnerabilities introduced by root cause etc);
  - Recovery - this phase ensures that affected systems or services are restored to a condition specified in the SDO or BCP; Time constraint is RTO; (restore service, validate actions were successful, SO test system, SO to declare normal operation)
  - Lessons Learned - report of actions taken, lessons learned, actions that can be improved, issues encountered; present to relevant stakeholders
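
One way to keep the order of the six elements straight is to model them as an ordered enumeration. The sketch below is only a memory aid; the loop from lessons learned back to preparation is an assumption about how the cycle repeats, and the names are placeholders.

```python
from enum import IntEnum


class IRPPhase(IntEnum):
    PREPARATION = 1
    IDENTIFICATION = 2
    CONTAINMENT = 3
    ERADICATION = 4
    RECOVERY = 5
    LESSONS_LEARNED = 6


def next_phase(current):
    """Return the phase that follows `current`; lessons learned feeds
    back into preparation for the next incident (an assumed loop)."""
    if current is IRPPhase.LESSONS_LEARNED:
        return IRPPhase.PREPARATION
    return IRPPhase(current + 1)


phase = IRPPhase.IDENTIFICATION
print(phase.name, "->", next_phase(phase).name)
```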

Question 12

Q

Developing an IRP

Business Impact Analysis
- Three Primary Goals
- Steps of BIA

Answer

A

BIA - after identifying all possible events, must determine potential impact; the team cannot plan for or prioritize response to an undesirable event if there is little idea of the likely impact; BIA calculates consequences of compromise and RA calculates probability of compromise;
- Three primary goals
  - Criticality Prioritization - every critical process needs to be identified and prioritized
  - Downtime Estimation - determine MTD (maximum time a system can be in alternate state); AIW (maximum time a system can be down before business is no longer viable)
  - Resource Requirement - most critical processes receive the highest priority for resource allocation
- Steps of BIA
  - Identify critical assets
  - Identify interdependencies, discover possible disruptions, identify and document potential threats, provide alternative methods to restoring functionality and communication, provide rationale for each threat
  - Document assessment results and present report;
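
The relationship between BIA (consequences) and RA (probability) can be illustrated with a small calculation. The processes and figures are invented, and multiplying impact by probability is one common way to combine the two outputs, not a formula stated in these notes.

```python
# Hypothetical processes: (name, estimated impact in dollars per day of
# outage from the BIA, annual probability of compromise from the RA).
processes = [
    ("payroll", 20_000, 0.10),
    ("order processing", 150_000, 0.05),
    ("internal wiki", 1_000, 0.30),
]


def rank_by_impact(items):
    """Criticality prioritization: highest impact first."""
    return sorted(items, key=lambda p: p[1], reverse=True)


def expected_loss(impact, probability):
    """Combine BIA and RA outputs as impact x probability
    (an assumed, simplified model)."""
    return impact * probability


for name, impact, prob in rank_by_impact(processes):
    loss = expected_loss(impact, prob)
    print(f"{name}: impact={impact}, expected loss={loss:.0f}")
```

Ranking by impact alone reflects the criticality prioritization goal; the expected-loss figure shows how a low-impact but frequent event can still deserve attention.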

Question 13

Q

Developing an IRP

Escalation Process for Effective Incident Management
Help/Service Desk in relation to Incidents
List of Incident Management and Response Teams

Answer

A

Escalation Process for Effective Incident Management - IRP should include detailed description of escalation process and who has to authorize recovery actions or disaster declaration; POCs and alternate POCs should be documented and estimated time for execution;
Help/Service Desk - users may report incidents to service desk; ISM should ensure proper training to service desk to identify and notify of possible incidents;
Incident Management and Response Teams
- Emergency Action Team - FIRST responders whose function is to deal with fires or other emergency response scenarios
- Damage Assessment Team - qualified individuals who assess the extent of damage to physical assets and make an initial determination of what is a complete loss vs what is restorable
- Emergency Management Team - responsible for coordinating the activities of all other recovery teams and handling key decision making
- Relocation Team - coordinates process of moving from affected location to alternate site or to restored original location
- Security Team - aka CSIRT; monitors security of systems and communication links, contains ongoing security threats, resolves issues, and assures proper installation and function

Question 14

Q

Developing an IRP

Training response staff
- What does a training program include?
Incident Notification Process
Challenges in Developing IRP

Answer

A

Training Response Staff - ISM should develop event scenarios and test the response and recovery plans to ensure team is familiar with their responsibilities; This will help identify the resources needed for responses and detecting and modifying ambiguous procedures;
- The Training Program Includes: Induction to IMT, mentoring team members regarding roles responsibilities procedures, on the job training, formal training
Incident Notification Process - timely notification process limits the potential loss and damage that can occur; notification activities are effective if personnel understand their responsibilities
Challenges in developing an IRP
- Lack of Management Support and Organizational Consensus - way to improve this is to have regular meetings between IMT and relevant personnel
- Mismatch to organizational goals and structure - changes in the organization may occur without the IMT adapting the IR process to match new regulations, e.g. if the organization expands into other countries. ISM needs to inform management of new regulations and how the IR process needs to change to match
- IMT Member Turnover
- Lack of Communication Process - under-communication or over-communication; under-communication leaves stakeholders confused; over-communication leaves stakeholders overwhelmed, and the IR process may compete with priorities already established
- Complex and Broad Plan - hard to get consensus

Question 15

Q

BCP vs DRP
Planning Process for BCP and steps for developing BCP
What occurs during recovery operations
Methods to Address Threats
List of Recovery Sites

Answer

A

BCP vs DRP - DRP is a subset of BCP;
- BCP - goals include prevention and mitigation; CONTINUOUS process that is implemented in business-as-usual scenarios
- DRP - focuses on restoring operations AFTER an incident; REACTIVE process that is implemented when a set of conditions is met
Planning process for BCP - prior to creating the BCP, perform a BIA to determine the incremental daily cost of losing different systems, which provides the basis for appropriate RTOs; steps include
- Conduct RA and BIA
- Define Response & Recovery (R&R) STRATEGY
- Document R&R PLANS
- Train R&R PROCEDURES
- Update R&R plans
- Test R&R plans
- Audit R&R plans
During recovery operations - BCP team should monitor restoration progress at primary site to assess when it is safe to return and perform tests to evaluate whether primary data center is capable of functioning at normal capacities; The relocation team is responsible for returning to primary sites; the recovery team updates the leader to declare normalcy and migrate back to primary site; if primary site is completely destroyed, there needs to be decision on transforming alternate site to primary; a RISK ASSESSMENT should be conducted to make senior management aware of potential impacts of the security risk introduced by executing plan; lessons learned gaps and recommendations should be documented; DR site should be restored after operations are reinstated at primary site
Methods to Address Threats
- Eliminate or Neutralize Threat - unlikely unless it is internal
- Minimize the likelihood of a threat’s occurrence - can be done through implementing effective security controls, e.g. firewalls, IDS, IPS, network segmentation or compartmentalization
- Minimize the effects of a threat if an incident occurs - compensating or corrective controls; i.e. automatic failover, redundant systems
List of Recovery Sites
- Hot Site - fully configured, ready to operate within several hours
- Warm Site - complete infrastructures, partially configured
- Cold Site - only a viable option when the organization can afford long downtime; has a very basic environment (flooring, electrical wiring, air conditioning) to operate an information processing facility (IPF); ready to receive equipment; activation may take several weeks; two options for equipping a cold site: vendor or 3rd party (usually used SW/HW), or off-the-shelf components
- Mobile Sites - trailers that can be quickly transported to a business location or an alternate site to provide a ready-conditioned IPF;
- Duplicate Site - standby hot site that is functionally similar or identical to primary site
- Mirror Site - if continuous uptime and availability is required; applications are switched between sites without interruption;
- Reciprocal Agreements - seldom used; participants agree on computing time and network operations when an emergency arises;

Question 16

Q

BCP and DRP contd.

Basis for Recovery Site Selections
- RTO, RPO, MTO, Proximity Factors, Locations, Nature of disruptions
Methods for providing Continuity of Network Services
- redundancy, alternate routing, diverse routing, long-haul network diversity, last-mile circuit protection, voice recovery

Answer

A

Basis for Recovery Site Selections
- AIW - total time organization can wait from point of failure to restoration of critical services; after this time, loss threatens existence of the organization
- RTO - length of time from interruption to functioning at a service level sufficient to limit impacts to an acceptable level
- RPO - the amount of data that can be lost and needs to be recreated; aka last known good data; this will be the starting point at the recovery site; if full backups are infrequent, it may take too much time to re-create the lost data, and as a result the RTO would not be met
- SDO - level of services to be supported during alternate process mode until normal situation is restored;
- MTO - maximum time the organization can support processing in the alternate mode;
- Proximity Factors - distance from potential hazards i.e. Flooding from nearby waterways
- Location - sufficient distance between primary and recovery facilities so that both are unlikely to be affected by the same event
- Nature of Probable Disruptions - i.e. Hurricanes, Earthquake
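
The RPO and RTO definitions above lend themselves to a simple check. The sketch below is a simplified model under stated assumptions: worst-case data loss is taken to equal the backup interval, and recovery time is split into three hypothetical parts (detection, restore, validation). All figures are invented.

```python
from datetime import timedelta


def meets_rpo(backup_interval, rpo):
    """Worst-case data loss equals the time since the last backup,
    so the backup interval must not exceed the RPO."""
    return backup_interval <= rpo


def meets_rto(detection, restore, validation, rto):
    """Total time from interruption to acceptable service must fit the RTO.
    The three-part breakdown is an illustrative assumption."""
    return detection + restore + validation <= rto


# Hypothetical objectives.
rpo = timedelta(hours=4)
rto = timedelta(hours=8)

print(meets_rpo(timedelta(hours=24), rpo))
print(meets_rto(timedelta(hours=1), timedelta(hours=5),
                timedelta(hours=1), rto))
```

Here a daily backup fails a 4-hour RPO even though the recovery steps fit within the RTO, which is why the two objectives are evaluated separately.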
Methods for providing Continuity of Network Services
- Redundancy - providing extra capacity should normal transmission not be available (e.g. two LANs); providing multiple paths between routers; using dynamic routing protocols (e.g. OSPF, EIGRP); providing failover to avoid SPOF; saving configuration files
- Alternative Routing - alternate medium such as copper cable or fiber optics; involves use of different networks, circuits, or end points
- Diverse Routing - routing traffic through split cable or duplicate cable facilities;
- Long-Haul Network Diversity - high speed data among major long-distance carriers; this ensures long-distance access if any single carrier experiences network failure; automatic rerouting and redundancy lines provide instantaneous recovery if a break in line occurs
- Last-mile circuit protection - redundant combination of local carrier high-speed data; enables facility to have access during local carrier communication disaster; alternate local carrier routing is also used
- Voice Recovery - redundant cabling and alternate routing for voice communication lines as well as data communication lines

Question 17

Q

BCP and DRP contd.

High-Availability Considerations
- DAS, NAS, SAN
- RAID
- if RTO and RPO are instantaneous
- if RTO and RPO are flexible
- Insurance
- Testing of Plans

Answer

A

High-Availability Considerations
- Direct Attached Storage (DAS) - data storage attached to server or client; each user needs direct access to the server; If more storage is needed, server or client needs to be taken offline so additional drives can be installed which affects availability
- Network Attached Storage (NAS) - data storage attached to the network that has its own operating system; can attach via Ethernet; adding storage has no downtime; availability not impacted
- Storage Area Network (SAN) - high-speed special purpose network that provides MASS storage using REMOTE interconnected devices (e.g. disk arrays, tape libraries, optical jukeboxes) that function as if they are attached locally; SANs support mirroring, backup, restore functions, and data migration
- These 3 storage solutions are compatible with RAID; RAID improves performance and provides fault-tolerant capabilities by breaking up data and writing it to a series of multiple disks; This allows for continuous data availability
- If RTO is instantaneous, and RPO is equally stringent, need FAULT-TOLERANT storage solution; LOAD BALANCING or CLUSTERING occurs where all servers take part in processing; this is high cost;
- If RTO and RPO are flexible, then high-availability is fine; failover processes are sufficient; if a server fails, the application restarts on the failover server; lower cost;
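
The fault tolerance that RAID provides can be illustrated with XOR parity, the idea behind parity-based levels such as RAID 5. This is a toy sketch operating on short byte strings, not a real disk driver; the function names and block contents are made up.

```python
def xor_parity(blocks):
    """Compute a parity block as the byte-wise XOR of equal-length blocks."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild(surviving_blocks, parity):
    """Recover one missing block by XOR-ing the survivors with the parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Data striped across three hypothetical disks, with parity on a fourth.
disks = [b"INCI", b"DENT", b"MGMT"]
parity = xor_parity(disks)

# Simulate losing the second disk and rebuilding its contents.
recovered = rebuild([disks[0], disks[2]], parity)
print(recovered)
```

Because XOR is its own inverse, any single lost block can be rebuilt from the remaining blocks plus parity, which is what allows data to stay available after one disk fails.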
Insurance - insurance should be included in plans; organizations can NOT insure against failure to comply with legal and regulatory requirements or any other breach of the law;
Testing of Plans - prior to each test, ensure the risk and impact of disruption from the test are minimal, the business understands and accepts the risk inherent in testing, and fallback arrangements exist to restore operations at any point during the test; all tests should be fully documented

Question 18

Q

Testing IR and BC/DRP

Types of Tests
Three main recovery testing categories
Phases of Testing

Answer

A

Types of Tests
- Checklist review - preliminary step; recovery checklists
- Structured Walkthrough - team members physically implement the plans on paper and review each step to assess effectiveness and identify enhancements, constraints, and deficiencies
- Simulation Test - role-plays a prepared disaster scenario without activating processing at the recovery site
- Parallel Test - recovery site brought to state of readiness, but operations at primary site continue normally
- Full Interruption Test - Operations shut down at primary site and shifted to recovery site in accordance to recovery plan; expensive and potentially disruptive
Three Main Recovery Testing Categories
- Paper Tests - on paper walkthrough involving partial or entire plan
- Preparedness Tests - localized versions of a full test, actual resources are used in the simulation of a system crash; performed regularly on different aspects of plan; cost effective; improve plan in increments
- Full operational tests - one step away from actual service disruption; organization should test well on paper and locally before completely shutting down operations; The full operational testing scenario is the disaster;
Phases of Testing: Pretest, Test, Posttest
Recovery Test Metrics
- Time - measuring response time; measuring it gives an opportunity to refine response time
- Amount - amount of work or operations performed at backup site
- Percentage and/or number - # of things requested and those that were actually received
- Accuracy - accuracy of data entry at recovery site vs normal accuracy
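
Two of the recovery test metrics above reduce to simple arithmetic. The sketch below uses invented test results, and the function names are placeholders.

```python
def percentage_received(requested, received):
    """Share of requested items (supplies, equipment) actually received."""
    return 100.0 * received / requested


def accuracy_gap(normal_accuracy, recovery_accuracy):
    """Difference between normal and recovery-site data-entry accuracy,
    both given as percentages."""
    return normal_accuracy - recovery_accuracy


# Hypothetical test results.
print(percentage_received(requested=40, received=34))
print(accuracy_gap(normal_accuracy=99.5, recovery_accuracy=97.0))
```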