5.1 How to Troubleshoot Flashcards
Change management
- Change control
- A formal process for managing change
- Avoid downtime, confusion, and mistakes
- Corporate policy and procedures
- Nothing changes without the process
- Plan for a change
- Estimate the risk associated with the change
- Have a recovery plan if the change doesn’t work
- Test before making the change
- Document all of this and get approval
- Make the change
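The change control steps above can be sketched as a small checklist. This is only an illustration: the `ChangeRequest` class and its field names are hypothetical and do not come from any specific change management tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChangeRequest:
    """Hypothetical record of one planned change; field names are illustrative."""
    description: str
    risk_estimate: str = ""   # e.g. "low", "medium", "high"
    recovery_plan: str = ""   # how to back out if the change doesn't work
    tested: bool = False      # was the change tested beforehand?
    approved_by: str = ""     # who signed off on the change

    def missing_steps(self) -> List[str]:
        """Return the change control steps that are still incomplete."""
        missing = []
        if not self.risk_estimate:
            missing.append("estimate the risk")
        if not self.recovery_plan:
            missing.append("write a recovery plan")
        if not self.tested:
            missing.append("test the change")
        if not self.approved_by:
            missing.append("get approval")
        return missing

    def ready(self) -> bool:
        """Nothing changes without the process: every step must be done."""
        return not self.missing_steps()


if __name__ == "__main__":
    change = ChangeRequest("Replace core switch", risk_estimate="high")
    print(change.missing_steps())
```

A request is only `ready()` once every step is filled in, which mirrors "Nothing changes without the process".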
Identify the problem
- Information gathering
- Get as many details as possible
- Duplicate the issue, if possible
- Identify symptoms - May be more than a single symptom
- Question users - Your best source of details
- Determine if anything has changed
- Who’s in the wiring closet?
- Approach multiple problems individually
- Break problems into smaller pieces
- Back up everything
- You’re going to make some changes
- You should always have a rollback plan
- What else has changed?
- The user may not be aware
- Environmental changes
- Infrastructure changes
- There may be some clues - Check OS log files
- Applications may have log information
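Checking log files for clues can be partly automated. The sketch below filters log lines for a few keywords. The keyword list and the sample log lines are assumptions for illustration; real OS and application logs use their own formats and severity labels.

```python
import re
from typing import Iterable, List

# Assumed keywords; adjust to the severity labels your logs actually use.
SUSPECT = re.compile(r"\b(error|fail(ed|ure)?|critical|warning)\b", re.IGNORECASE)


def suspicious_lines(lines: Iterable[str]) -> List[str]:
    """Return the lines that mention a suspect keyword, stripped of whitespace."""
    return [line.strip() for line in lines if SUSPECT.search(line)]


if __name__ == "__main__":
    # Hypothetical log excerpt, not taken from a real system.
    sample = [
        "10:01 service started",
        "10:02 ERROR disk timeout",
        "10:03 login ok",
        "10:04 Warning: fan speed low",
    ]
    for hit in suspicious_lines(sample):
        print(hit)
```

To scan a real file, pass an open file object, for example `suspicious_lines(open(path))`, where `path` is whatever log file you are examining.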
Establish a theory
- Start with the obvious
- Occam’s razor applies
- Consider everything
- Even the not-so-obvious
- Make a list of all possible causes
- Start with the easy theories
- And the least difficult to test
- Research the symptoms
- Internal knowledge base
- Google searches
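One way to turn a list of possible causes into a test order is to rank them. The sketch below puts the most obvious theories first and, among equally likely ones, the easiest to test. The `Theory` type and its 1 to 5 scores are hypothetical; the scores are subjective estimates, not measurements.

```python
from typing import List, NamedTuple


class Theory(NamedTuple):
    cause: str
    likelihood: int  # 1 (unlikely) to 5 (obvious); a subjective estimate
    effort: int      # 1 (trivial to test) to 5 (hard to test)


def order_to_test(theories: List[Theory]) -> List[Theory]:
    """Most likely first; among equally likely theories, easiest to test first."""
    return sorted(theories, key=lambda t: (-t.likelihood, t.effort))


if __name__ == "__main__":
    candidates = [
        Theory("bad motherboard", likelihood=2, effort=5),
        Theory("loose power cable", likelihood=4, effort=1),
        Theory("failed power supply", likelihood=4, effort=3),
    ]
    for theory in order_to_test(candidates):
        print(theory.cause)
```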
Test the theory
- Confirm the theory
- Determine next steps to resolve problem
- Theory didn’t work?
- Re-establish new theory or escalate
- Call an expert
- The theory worked!
- Make a plan…
Create a plan of action
- Build the plan
- Correct the issue with a minimum of impact
- Some issues can’t be resolved during production hours
- Identify potential effects
- Every plan can go bad
- Have a plan B
- And a plan C
Implement the solution
- Fix the issue
- Implement during the change control window
- Escalate as necessary
- You may need help from a third party
Verify full system functionality
- It’s not fixed until it’s really fixed
- The test should be part of your plan
- Have your customer confirm the fix
- Implement preventative measures
- Let’s avoid this issue in the future
Document findings
- It’s not over until you build the knowledge base
- Don’t lose valuable knowledge!
- What action did you take?
- What outcome did it have?
- Consider a formal database
- Help desk case notes
- Searchable database
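A searchable record of actions and outcomes does not have to be elaborate. The sketch below uses Python's built-in `sqlite3` module. The `case_notes` table, its columns, and the helper function names are hypothetical, and a real help desk system would store far more detail.

```python
import sqlite3
from typing import List, Tuple


def open_notes(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the notes database; defaults to an in-memory database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS case_notes ("
        "id INTEGER PRIMARY KEY, symptom TEXT, action TEXT, outcome TEXT)"
    )
    return conn


def add_note(conn: sqlite3.Connection, symptom: str, action: str, outcome: str) -> None:
    """Record what was seen, what action was taken, and what outcome it had."""
    conn.execute(
        "INSERT INTO case_notes (symptom, action, outcome) VALUES (?, ?, ?)",
        (symptom, action, outcome),
    )
    conn.commit()


def search(conn: sqlite3.Connection, term: str) -> List[Tuple[str, str, str]]:
    """Return notes whose symptom, action, or outcome contains the term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        "SELECT symptom, action, outcome FROM case_notes "
        "WHERE symptom LIKE ? OR action LIKE ? OR outcome LIKE ? ORDER BY id",
        (pattern, pattern, pattern),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = open_notes()
    add_note(db, "Unexpected shutdown under load", "Replaced CPU fan",
             "No shutdowns after stress test")
    print(search(db, "fan"))
```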
Unexpected shutdowns
- No warning, black screen
- May have some details in your Event Viewer
- Heat-related issue
- High CPU or graphics, gaming
- Check all fans and heat sinks
- BIOS may show fan status and temperatures
- Failing hardware
- Has anything changed?
- Check Device Manager, run diagnostics
- Could be anything
- Eliminate what’s working
Lockups
- System completely stops
- Completely. Usually not much in the event log
- Similar to unexpected shutdowns
- Check for any activity
- Hard drive, status lights, try Ctrl-Alt-Del
- Update drivers and software patches
- Has this been done recently?
- Low resources
- RAM, storage
- Hardware diagnostics may be helpful
POST (Power On Self Test)
• Test major system components before booting the operating system
• Main systems (CPU, CMOS, etc.)
• Video
• Memory
• Failures are usually noted with beeps and/or codes
• BIOS versions can differ, check your documentation
• Don’t bother memorizing the beep codes
• They’re all different between manufacturers
• Know what to do when you hear them
POST and boot
• Blank screen on boot
• Bad video
• Listen for beeps
• BIOS configuration issue
• BIOS time and setting
• Maintained with the motherboard battery
• Replace the battery
• Attempts to boot to incorrect device
• Set boot order in BIOS configuration
• Confirm that the startup device has a valid operating system
• Check for media in a startup device
Continuous reboots
• How far does the boot go before rebooting?
• BIOS only? OS splash screen?
• Bad driver or configuration
• F8, “Last Known Good Configuration”
• Try F8, Safe Mode
• If system starts, disable automatic restarts in System Properties
• Bad hardware
• Try removing or replacing devices
• Check connections and reseat
No power
- No power
- No power at the source
- No power from the power supply
- Get out your multimeter
- Fans spin - no power to other devices
- Where is your fan power connected?
- No POST - bad motherboard?
- Case fans have lower voltage requirements
- Check the power supply output
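Once you have multimeter readings from the power supply, you can compare them against the nominal rail voltages. The sketch below assumes the commonly cited ATX tolerances of ±5% on the +3.3 V, +5 V, and +12 V rails and ±10% on the -12 V rail; confirm the figures against your power supply's documentation. The `RAILS` table and `rail_ok` function are hypothetical names.

```python
# Nominal rail voltage and allowed fractional tolerance.
# Assumed values: +/-5% on positive rails, +/-10% on -12 V.
RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
}


def rail_ok(rail: str, measured: float) -> bool:
    """Return True if the measured voltage is within tolerance for the rail."""
    nominal, tolerance = RAILS[rail]
    return abs(measured - nominal) <= abs(nominal) * tolerance


if __name__ == "__main__":
    # Hypothetical multimeter readings.
    readings = {"+3.3V": 3.28, "+5V": 5.1, "+12V": 11.2}
    for rail, volts in readings.items():
        status = "ok" if rail_ok(rail, volts) else "out of tolerance"
        print(f"{rail}: {volts} V ({status})")
```

A rail that reads out of tolerance points toward the power supply rather than the motherboard.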
Overheating