Troubleshooting and Performance Optimization Flashcards

1
Q

troubleshooting methodology

A
  • identify problem
  • establish theory of probable cause
  • test the theory
  • establish plan of action
  • implement a solution/escalate
  • verify functionality
  • perform root cause analysis
  • document the solution
2
Q

refined troubleshooting

A
  • identify problem scope
  • reproduce the problem
  • check log files
  • read documentation
3
Q

BIOS failure possible causes

A
  • overheating
  • unsupported features
  • newer options may require UEFI
4
Q

BIOS failure possible solutions

A
  • keep server rooms/data centers properly ventilated
  • update (flash) BIOS
  • acquire UEFI motherboards/enable UEFI options
5
Q

POST failure possible causes

A
  • TPM firmware detects a boot configuration change
  • failed hardware components
6
Q

POST failure possible solutions

A
  • enter TPM recovery code/configure boot options
  • search for reported POST code to identify problem
  • replace failed components
7
Q

memory failure possible causes

A
  • POST failure message
  • random OS freezes/reboots
8
Q

memory failure possible solutions

A
  • run memory diagnostics
  • replace failed components
9
Q

processor failure/performance degradation possible causes

A
  • overheating
  • throttling slows CPU as temperature increases
  • VMs with manual CPU affinity specified are performing poorly
10
Q

processor failure/performance degradation possible solutions

A
  • ensure HVAC is running correctly
  • don’t manually link VMs to specific CPU cores
11
Q

boot sequence possible causes

A
  • OS not found due to changing disk order/partitions
  • booting from USB might fail if not enabled in BIOS
12
Q

boot sequence possible solutions

A
  • configure bootable disk order in BIOS
  • configure bootable disk partitions in OS
  • flash BIOS so USB boot is supported
13
Q

storage failure possible causes

A
  • drive failure
  • RAID array drive failures resulting in slow performance
14
Q

storage failure possible solutions

A
  • run disk diagnostics
  • replace failed drives
  • have hot spare disks in place
15
Q

power failure possible causes

A
  • power supply
  • power surge
16
Q

power failure possible solutions

A
  • use redundant power sources
  • use UPSs
  • use surge protectors
17
Q

environment failure possible causes

A
  • HVAC malfunctioning causes overheating
  • accumulated dust hampers airflow/adds a layer of insulation
  • low humidity increases ESD
18
Q

environment failure possible solutions

A
  • ensure HVAC is running properly
  • clear dust from components/air intake fans
  • ensure HVAC keeps consistent relative humidity
19
Q

crash cart tools

A
  • multimeter to test power supplies
  • hardware diagnostics tools for components
  • can of compressed air to remove dust
  • antistatic wrist strap/ESD mats
  • tools for testing bad RAM chips
20
Q

logon failure possible causes

A
  • incorrect credentials
  • corrupt user profile
  • can’t locate authentication server
21
Q

logon failure possible solutions

A
  • reset user password
  • save old user profile/remove corrupt user profile and registry references
  • ensure client station points to correct DNS server
22
Q

user unable to access resource possible causes

A
  • insufficient permissions
  • encryption is enabled
  • Windows UAC configuration is too restrictive
  • UNIX/Linux sudo is not configured to enable user access to certain commands
23
Q

user unable to access resource possible solutions

A
  • check user effective access
  • check group membership
  • ensure user has decryption key
  • loosen UAC settings
  • modify sudoers configuration file
24
Q

memory leak possible causes

A
  • poorly written software
  • malware
  • runaway processes
25
Q

memory leak possible solutions

A
  • reboot server to reclaim memory
  • run antimalware scan
  • patch software
  • find functionally equivalent software
26
Q

blue screen of death (BSOD)/hang/crashes possible causes

A
  • unstable device driver
  • bad RAM chips
  • memory buffer overrun due to unpatched software
27
Q

BSOD/hang/crashes possible solutions

A
  • update/replace/roll back driver
  • run memory diagnostics
  • patch software
  • replace failed RAM chip
  • restart Windows server/press F8/attempt to boot using last known good configuration (LKGC)
28
Q

purple screen of death (PSoD) in VMware ESXi possible causes

A

most commonly related to VMkernel critical errors

29
Q

PSoD in VMware ESXi possible solutions

A
  • apply OS and driver updates/roll back updates
  • remove recently added hardware/test stability
  • review memory core dump file
  • check ESXi scratch partition for the vmware support output file
30
Q

disk drive unmountable possible causes

A
  • file system corruption
  • not supported by local OS
31
Q

disk drive unmountable possible solutions

A
  • run disk scan to correct file system errors
  • format drive with file system supported by local OS
32
Q

logs can’t be written to possible causes

A

log disk volume is full

33
Q

logs can’t be written to possible solutions

A
  • free up disk space
  • store logs in alternate location
  • archive old log messages
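One way to keep a log volume from filling up is to cap log size and rotate old messages automatically. This is a sketch only, assuming a Python application that logs through the standard logging module; make_rotating_logger and its default sizes are hypothetical choices, not values from the cards.

```python
import logging
import logging.handlers


def make_rotating_logger(path: str, max_bytes: int = 1_000_000, backups: int = 5):
    """Return (logger, handler); data older than the kept backups is discarded."""
    logger = logging.getLogger(f"rotating-demo:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo messages out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger, handler


if __name__ == "__main__":
    log, _handler = make_rotating_logger("app.log")
    log.info("service started")
```

With these settings the log files occupy at most about max_bytes × (backups + 1) bytes on disk.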
34
Q

slow OS performance possible causes

A
  • OS disk is full
  • disks are fragmented
  • system resources lacking
  • CPUs are busy
  • VM memory swap file (page file)/partition is on slow disk/is corrupt
35
Q

slow OS performance possible solutions

A
  • free up space on OS drive
  • extend OS drive capacity
  • defragment drive
  • reduce number of processes running concurrently
  • place VM swap configuration file on fast disks
  • enable disk write caching
36
Q

software patches not being applied possible causes

A
  • previous software dependencies aren’t present
  • patches don’t match platform architecture
  • software has reached end of life
  • syncing updates with downstream servers is failing in an enterprise environment
37
Q

software patches not being applied possible solutions

A
  • apply previous dependencies first
  • acquire patches for appropriate platform architecture
  • acquire newer versions of software that are supported
  • check for network connectivity problems/changes
38
Q

service failure possible causes

A
  • dependency services failed to start
  • service account has insufficient permissions
  • service account password has expired
39
Q

service failure possible solutions

A
  • ensure dependent services are started first
  • grant service account required permissions
  • set service account password
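Starting dependencies first amounts to a topological ordering of the service dependency graph. A minimal sketch, assuming Python 3.9+ (for graphlib); the service names are hypothetical.

```python
from graphlib import TopologicalSorter


def start_order(dependencies: dict) -> list:
    """Return a start order in which every service follows its dependencies.

    `dependencies` maps a service name to the set of services it depends on.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical services, for illustration only.
    deps = {
        "web": {"app"},
        "app": {"db", "network"},
        "db": {"network"},
        "network": set(),
    }
    print(start_order(deps))
```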
40
Q

OS can’t be shut down possible causes

A
  • hangs caused by runaway background processes
  • updates still being applied
41
Q

OS can’t be shut down possible solutions

A
  • use task manager or Linux kill command to terminate processes
  • wait for updates to complete
42
Q

users can’t print possible causes

A
  • Windows print spooler service isn’t responsive
  • printer is offline
  • incorrect/corrupt driver
43
Q

users can’t print possible solutions

A
  • restart Windows print spooler service
  • ensure printer is correctly configured/online
  • uninstall/reinstall updated printer driver
  • remove/reconfigure printer in OS
44
Q

software packages can’t be installed in Linux possible causes

A

package dependencies aren’t installed/incorrect version

45
Q

software packages can’t be installed in Linux possible solutions

A
  • update package repositories
  • install/update dependent packages
46
Q

Windows resource monitor

A
  • shows which processes are consuming most disk I/O time
  • input/output operations per second (IOPS)
  • not provided by other tools
47
Q

DCSs

A
  • data collector sets
  • similar to performance monitor
  • can control when to start/stop collecting data
  • can configure alert notifications when thresholds have been exceeded
48
Q

Linux systems in single user mode

A
  • run level 1
  • only minimal set of services are running
  • involves interrupting boot process/modifying boot startup file
49
Q

top Linux command

A

lists top processes consuming resources

50
Q

ps Linux command

A

lists running processes

51
Q

kill Linux command

A

terminates processes
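
The same idea can be scripted. This is a sketch under the assumption that the process was started from Python with subprocess; terminate_process is a hypothetical helper name. On POSIX systems Popen.terminate() sends SIGTERM (the default signal of kill) and Popen.kill() sends SIGKILL (like kill -9).

```python
import subprocess
import sys


def terminate_process(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Ask a child process to exit, then force it if it ignores the request."""
    proc.terminate()  # polite request first
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # forceful termination as a last resort
        proc.wait()
    return proc.returncode


if __name__ == "__main__":
    # Stand-in for a runaway process: a child that just sleeps.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit code:", terminate_process(child))
```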

52
Q

df Linux command

A

shows disk free space
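
For scripted checks, similar figures are available from Python's standard library. A minimal sketch; disk_free_report is a hypothetical helper, and the output is not meant to mirror df's exact columns.

```python
import shutil


def disk_free_report(path: str = "/") -> dict:
    """Return total/used/free bytes and percent used for the volume at path."""
    usage = shutil.disk_usage(path)
    percent_used = (usage.used / usage.total * 100) if usage.total else 0.0
    return {
        "total": usage.total,
        "used": usage.used,
        "free": usage.free,
        "percent_used": round(percent_used, 1),
    }


if __name__ == "__main__":
    print(disk_free_report("."))
```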

53
Q

Windows commands to map drives

A
  • net use
  • new-psdrive
54
Q

PS get-volume

A

shows file system health status/size stats

55
Q

slow file access possible causes

A
  • failed RAID 5 array rebuilding data on demand in memory
  • failed RAID controller disk write cache or battery
  • disk array contains mismatched drive speeds
56
Q

slow file access possible solutions

A
  • ensure hot spare disks are always available
  • consider using RAID 6 (tolerate 2 simultaneous drive failures)
  • without write caching, RAID arrays can’t queue disk requests that can’t be serviced right away
  • replace faulty components
  • disk arrays with slow/fast disks will use the slower speed
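The trade-offs above can be put into numbers. This is a simplified sketch, not a vendor sizing tool: it assumes single parity for RAID 5 and double parity for RAID 6, treats every drive as the size of the smallest one, and models a mixed-speed array as running at its slowest drive's speed. raid_summary is a hypothetical helper name.

```python
PARITY_DRIVES = {5: 1, 6: 2}  # drives' worth of capacity used for parity
MIN_DRIVES = {5: 3, 6: 4}     # minimum members for each RAID level


def raid_summary(level: int, drive_sizes_tb: list, drive_speeds_rpm: list) -> dict:
    """Summarize usable capacity, fault tolerance, and effective speed."""
    if level not in PARITY_DRIVES:
        raise ValueError("only RAID 5 and RAID 6 are modeled here")
    n = len(drive_sizes_tb)
    if n < MIN_DRIVES[level] or n != len(drive_speeds_rpm):
        raise ValueError("not enough drives, or size/speed lists differ in length")
    parity = PARITY_DRIVES[level]
    return {
        "usable_tb": (n - parity) * min(drive_sizes_tb),
        "tolerated_failures": parity,
        "effective_rpm": min(drive_speeds_rpm),  # slowest drive sets the pace
    }


if __name__ == "__main__":
    print(raid_summary(6, [4, 4, 4, 4], [7200, 7200, 10000, 7200]))
```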
57
Q

data unavailable possible causes

A
  • failed server
  • failed HBA
58
Q

data unavailable possible solutions

A
  • ensure high availability with failover clustering/data backups/data replication to other sites
  • ensure redundant SAN paths
59
Q

failed backup possible causes

A
  • failed network connection
  • media failure
60
Q

failed backup possible solutions

A
  • ensure redundant network connections for LAN/cloud-based backup
  • ensure extra backup media is always available
  • perform periodic restore drills
  • have at least 2 backups of critical data
61
Q

unavailable drives possible causes

A
  • OS failure
  • physical disk failure
  • RAID controller failure
  • blade enclosure backplane failure
  • network connection failure
62
Q

unavailable drives possible solutions

A
  • view LED indicators/LCD displays/drive error lights to catch errors
  • ensure redundant network paths to critical apps/data
  • replace failed RAID components/attempt to rebuild array
  • replace failed hardware components
63
Q

unable to mount storage media possible causes

A
  • corrupt file system
  • corrupt mass storage driver
  • insufficient user permissions
  • incorrect partition type
64
Q

unable to mount storage media possible solutions

A
  • run Windows disk scan
  • ensure user permissions are correctly configured
  • some OSs can’t read disk partitions created with other OS versions
  • use correct partition type
65
Q

Windows command line disk management tools

A
  • diskpart.exe (replaces fdisk command in new OS versions)
  • defrag.exe
  • powershell cmdlets
66
Q

Windows GUI disk management tools

A
  • disk management
  • server manager
  • disk defragmenter
  • disk cleanup
  • error checking
67
Q

df Linux command

A

shows disk free space

68
Q

fsck Linux command

A

checks file system for corruption

69
Q

xfs_repair Linux command

A

checks for/repairs XFS file system

70
Q

iostat Linux command

A

shows disk I/O statistics for storage devices

71
Q

lsof Linux command

A

lists open files/provides further details

72
Q

mdadm Linux command

A

Linux software RAID array management

73
Q

TDR

A
  • time-domain reflectometer
  • used to measure continuity of electric signals through circuit boards/network cable wires
  • used to determine where problem exists (probes just identify there is a problem)
74
Q

OTDRs

A
  • optical time-domain reflectometers
  • show where fiber-optic cables are terminated
  • can show location of cable breaks
75
Q

cause of most network issues

A

incorrect software protocol configuration

76
Q

internet connectivity failure possible causes

A
  • service provider outage
  • incorrect IP address for subnet
  • incorrect subnet mask
  • incorrect default gateway
  • incorrect DNS server
77
Q

internet connectivity failure possible solutions

A
  • verify IP address is in correct range for subnet
  • ensure configured default gateway interface is on the LAN
  • ping by IP address instead of FQDN to isolate name resolution problems
  • check provider SLA to determine support options
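The address and gateway checks can be automated. A minimal sketch using Python's standard ipaddress module; check_ipv4_config is a hypothetical helper, and the addresses in the example are private-range placeholders.

```python
import ipaddress


def check_ipv4_config(address: str, netmask: str, gateway: str) -> list:
    """Return a list of problems found in a static IPv4 configuration."""
    iface = ipaddress.ip_interface(f"{address}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    net = iface.network
    problems = []
    if gw not in net:
        problems.append(f"default gateway {gw} is not on local subnet {net}")
    # Network and broadcast addresses are not usable host addresses
    # (skipped for /31 and /32, where that rule does not apply).
    if net.prefixlen < 31 and iface.ip in (net.network_address, net.broadcast_address):
        problems.append(f"{iface.ip} is not a usable host address in {net}")
    return problems


if __name__ == "__main__":
    print(check_ipv4_config("192.168.1.10", "255.255.255.0", "192.168.2.1"))
```

An empty list means the address, mask, and gateway are at least consistent with each other; it says nothing about DNS or the provider.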
78
Q

LAN connectivity only possible causes

A

IPv4 169.254 address is assigned when DHCP is not reachable
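
A quick scripted check for this symptom, as a sketch using Python's standard ipaddress module; is_apipa is a hypothetical helper name.

```python
import ipaddress

APIPA_RANGE = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """True if the IPv4 address is self-assigned (DHCP was likely unreachable)."""
    return ipaddress.ip_address(address) in APIPA_RANGE


if __name__ == "__main__":
    for addr in ("169.254.17.3", "192.168.1.20"):
        print(addr, "self-assigned" if is_apipa(addr) else "looks DHCP/static")
```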

79
Q

LAN connectivity only possible solutions

A
  • ensure DHCP server is running
  • ensure UDP port 67 isn’t blocked
  • ensure LAN’s DHCP relay is functional for DHCP servers on other subnets
80
Q

network service misconfiguration possible causes

A

DHCP server handing out invalid IP configurations

81
Q

network service misconfiguration possible solutions

A

correct DHCP misconfigurations

82
Q

network resource unreachable/unavailable possible causes

A
  • name resolution problems
  • IP misconfiguration
  • VLAN membership
  • incorrect subnet mask
  • incorrect route table entry
83
Q

network resource unreachable/unavailable possible solutions

A
  • use nbtstat (Windows) to troubleshoot NetBIOS name resolution issues
  • use nslookup or dig (Linux) for DNS unknown host messages
  • make sure computer is part of the correct VLAN
  • view routing table using route print (Windows) or ip route show (Linux)
84
Q

unable to connect to network possible causes

A
  • faulty network cable
  • switch port security
  • NIC speed set incorrectly
  • RADIUS authentication failure
  • MAC address filtering
85
Q

unable to connect to network possible solutions

A
  • replace faulty cables
  • configure switch ports to enable device access
  • set NIC speed/duplex settings to autodetect
  • ensure proper authentication credentials/methods are used
  • add device MAC address to filter list
86
Q

tracert (Windows)/traceroute (Linux)

A
  • display information as data moves along network
  • more useful than ping
  • use when destination hosts on different networks are unreachable
  • tracert uses ICMP which may be blocked by firewalls
87
Q

route (Windows)/ip route show (Linux)

A

use to display/modify routing table entries (route on Windows, ip route on Linux)

88
Q

names resolving to unexpected IP addresses

A
  • probably caused by entries in the local HOSTS file on the system
  • entries are placed into client DNS cache in memory
  • client checks cache before DNS servers
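To see whether a hosts file is forcing an unexpected address, its contents can be parsed directly. A sketch only: parse_hosts and find_override are hypothetical helpers, the sample text is invented, and the caller is assumed to read the real file (for example /etc/hosts on Linux) and pass its text in.

```python
def parse_hosts(text: str) -> dict:
    """Map each hostname in hosts-file text to the first IP listed for it."""
    mapping = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def find_override(hostname: str, hosts_text: str):
    """Return the IP a hosts file would force for hostname, or None."""
    return parse_hosts(hosts_text).get(hostname.lower())


if __name__ == "__main__":
    sample = "127.0.0.1 localhost\n10.0.0.5 intranet.example.com intranet  # stale\n"
    print(find_override("intranet.example.com", sample))
```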
89
Q

ipconfig /flushdns

A
  • clears client DNS cache
  • recent DNS queries are cached in local client’s memory
  • have time-to-live (TTL) value to determine how long entry is cached
  • use when DNS records have recently changed
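The caching behavior can be modeled in a few lines. This is an illustrative sketch, not how the Windows DNS client is implemented; DnsCache and its methods are hypothetical names, and the injectable clock exists only to make expiry easy to demonstrate.

```python
import time


class DnsCache:
    """Toy client-side DNS cache whose entries expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (ip, expiry time)

    def put(self, name: str, ip: str, ttl_seconds: float) -> None:
        self._entries[name.lower()] = (ip, self._clock() + ttl_seconds)

    def get(self, name: str):
        """Return the cached IP, or None if missing or past its TTL."""
        key = name.lower()
        entry = self._entries.get(key)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return ip

    def flush(self) -> None:
        """Drop every entry, comparable in effect to ipconfig /flushdns."""
        self._entries.clear()
```

Until the TTL runs out, a changed DNS record stays invisible to the client, which is why flushing helps right after a record change.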
90
Q

nslookup

A
  • displays DNS information for local machine or FQDN
  • can also be used to change which DNS server is queried
91
Q

symptoms of malware infection

A
  • excessive/prolonged hardware resource use
  • inability to reach network resources
  • web browser homepage changed/not editable
  • web browser opens pages user didn’t navigate to
  • rogue processes/services running with improper privilege escalation
  • missing log entries (cleared by attacker)
  • encrypted files/messages demanding payment
  • abnormal listening ports on server (backdoor)
92
Q

immediate action when infection is detected

A

isolate server/subnet

93
Q

malware removal

A
  • vendor malware removal tools
  • windows system restore point (client OS only)
  • server reinstall/reimage
  • boot through alternative means to remove infection (Windows safe mode/USB boot/PXE boot)
94
Q

gpupdate/gpresult /r

A
  • gpresult /r shows resultant set of GPOs
  • GPOs may be too restrictive/cause issues
95
Q

security filtering

A
  • enables admins to ensure only specific users/groups get group policy settings
  • Windows management instrumentation (WMI)
  • WMI query language (WQL)
96
Q

SetUID special bit (Linux)

A
  • enables executed script/binary to run as the file owner (not invoker)
  • owner could be root
  • used carefully it can solve issue of user not being able to run script/program
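Because a SetUID file owned by root runs with root's privileges, it is worth auditing which files carry the bit. A minimal sketch using Python's standard stat module; has_setuid and file_has_setuid are hypothetical helper names.

```python
import os
import stat


def has_setuid(mode: int) -> bool:
    """True if the SetUID bit is set in a file mode (as returned by os.stat)."""
    return bool(mode & stat.S_ISUID)


def file_has_setuid(path: str) -> bool:
    """Check the SetUID bit on an existing file."""
    return has_setuid(os.stat(path).st_mode)


if __name__ == "__main__":
    # 0o104755 is a regular file with SetUID set and rwxr-xr-x permissions.
    print(stat.filemode(0o104755), has_setuid(0o104755))
```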
97
Q

icacls (Windows)/getfacl/setfacl (Linux)

A

use to save/restore file system ACLs

98
Q

too much running (performance)

A
  • server OSs barebones by default
  • running too many services can harm performance
  • use port scanning tools periodically
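A periodic port check can be as simple as attempting TCP connections and comparing the result with what is expected to be listening. This is a rough sketch, not a replacement for a dedicated scanner; open_tcp_ports and unexpected_ports are hypothetical names, and it should only be run against hosts you are authorized to test.

```python
import socket


def open_tcp_ports(host: str, ports, timeout: float = 0.5) -> list:
    """Return the ports from `ports` that accept a TCP connection on host."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            if sock.connect_ex((host, port)) == 0:  # 0 means the connect succeeded
                found.append(port)
    return found


def unexpected_ports(found, expected) -> list:
    """Ports that are listening but not on the approved list."""
    return sorted(set(found) - set(expected))


if __name__ == "__main__":
    listening = open_tcp_ports("127.0.0.1", range(20, 1025))
    print("unexpected:", unexpected_ports(listening, expected=[22, 443]))
```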
99
Q

confidentiality

A

provided by encryption

100
Q

cipher

A
  • cryptographic algorithm used to encrypt/decrypt data
  • incorrect cipher configuration can cause issues
101
Q

integrity

A
  • uses hashing algorithm to ensure data has not been tampered with
  • packet/file level verification
  • packet sniffers/checksums (can also detect use of insecure tools)
102
Q

hashing commands

A
  • get-filehash .\name.txt (powershell)
  • sha256sum (Linux)
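
A cross-platform way to compute the same kind of digest, sketched with Python's standard hashlib module; sha256_of_file is a hypothetical helper name. Comparing the result against a known-good value reveals whether a file has changed.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unmodified(path: str, expected_hex: str) -> bool:
    """True if the file's current digest matches the recorded one."""
    return sha256_of_file(path) == expected_hex.lower()
```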

103
Q

sizing

A

selecting number of virtual CPUs/amount of RAM/disk type for VM

104
Q

network optimization

A
  • configure VLANs to group machines that communicate frequently into smaller networks
  • configure NIC teaming
  • network load balancing (NLB) distributes incoming traffic for network services to multiple duplicated servers
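The idea behind NLB can be illustrated with round-robin distribution, one of several possible balancing strategies. A toy sketch with hypothetical names (distribute, web1, web2); real load balancers also account for server health and session affinity.

```python
from itertools import cycle


def distribute(requests, servers) -> dict:
    """Assign each incoming request to the next server in rotation."""
    rotation = cycle(servers)
    assignments = {server: [] for server in servers}
    for request in requests:
        assignments[next(rotation)].append(request)
    return assignments


if __name__ == "__main__":
    print(distribute(range(5), ["web1", "web2"]))
```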
105
Q

horizontal scaling

A

elastically scaling number of servers in cloud in response to increased demand
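
The scaling decision reduces to simple arithmetic. A sketch under stated assumptions: servers_needed is a hypothetical helper, each server is assumed to handle a fixed request rate, and the min/max bounds are arbitrary defaults rather than any cloud provider's settings.

```python
import math


def servers_needed(requests_per_sec: float, capacity_per_server: float,
                   min_servers: int = 1, max_servers: int = 10) -> int:
    """Return how many identical servers the current demand calls for."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    wanted = math.ceil(requests_per_sec / capacity_per_server)
    return max(min_servers, min(max_servers, wanted))


if __name__ == "__main__":
    # 950 requests/s at 200 requests/s per server
    print(servers_needed(950, 200))
```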

106
Q

troubleshooting step involving questioning stakeholders

A

identify the problem

107
Q

troubleshooting step involving reproducing the problem

A

identify the problem

108
Q

troubleshooting step involving making single change at a time

A

implementing the solution

109
Q

machine with IPv4 statically configured can’t get GPO settings/domain controller can’t be found/machine can communicate with other local and remote hosts/GPO settings worked before manual IPv4 configuration

A

incorrect DNS server

110
Q

AD domain admin/nothing happens when trying to run install on domain-joined server

A

UAC issues (run as administrator)

111
Q

Linux command to terminate process

A

kill

112
Q

drawback of heuristic host/network analysis

A

false positives