Learning to identify the root cause of a technical issue is one of the most important skills a sysadmin can develop. Whether it's a complex problem involving servers and networking equipment, or your kid thought your PC's DVD tray was a chocolate-chip cookie holder, these IT problem solving techniques will help guide you through the problem solving process to a potential solution.
Restart, Reboot, Turn It Off and On Again
Replicate the Issue
Check event logs
Ask For Help
Ask For Help from Online Forums
Think Outside the Box
Document Your Troubleshooting Steps and Eventual Solution
Let's get this one out of the way first, because I would feel guilty if I didn't put our most infamous troubleshooting technique first on the list. Say what you will about this age-old solution, but there's no denying the effectiveness of restarting or rebooting a device.
So what makes this problem solving technique a top-tier solution? Two things: speed and effectiveness.
SSDs (solid-state drives) have taken computers to a whole new level of fast. The SSD has drastically increased the speed of certain operations, such as transferring files and loading applications, and most importantly, SSDs have greatly sped up startup and shutdown operations.
As for effectiveness, the results speak for themselves. I've seen a simple reboot fix the craziest problems you can imagine. I recently noticed my laptop screen seemed to be suffering from burn-in. Burn-in is when an image is permanently burned into your screen, like a ghost image. This is a fairly common issue for OLED screens, especially if they display static images for long periods of time. My screen, however, is not an OLED screen, and it's pretty new, so I ruled out that potential diagnosis. I tried several things to get rid of the ghost image, such as changing the resolution, changing the refresh rate, and updating the display driver, none of which resolved the issue. So I resorted to the classic reboot technique and restarted my laptop. Just 20 seconds later I was back in my account and the ghost image was gone.
While I'm still wholeheartedly in the turn-it-off-and-on-again camp, there are some fundamental issues with this technique, the main one being that you don't learn anything. You may find yourself becoming more dependent on the same solution instead of discovering the root cause of an issue. While I love answering the phone and asking users if they restarted their device, knowing full well the anguish that question causes them, there is a time and a place for it. Here's a guide that should help:
When to restart:
The first time a user is contacting you about a specific problem that hasn't occurred before.
When finding a resolution is time-sensitive, e.g., a live presentation may not be the best time for critical thinking and creative problem solving when a reboot could probably fix the issue.
When you've exhausted most of the obvious troubleshooting steps, and the problem still persists.
When to look for the root cause instead:
When a user is experiencing a persistent problem.
When there is a high probability that a restart won't fix the issue.
When the issue could be a security concern.
When there is a likely chance that the problem could persist or be experienced by other users.
When the device experiencing the issue is a critical system that requires high availability.
When you are training new IT personnel. You don't want them developing your bad habits.
Gathering information may seem less like a technique and more like a gimme, but how thorough are you when diagnosing a technical problem? Are you actively listening to the user as they go on about their potentially career-ending computer problem? Or are you zoned out because you know they aren't likely to give you very much relevant information?
To help uncover as much information as possible, use discovery words (who, what, when, where, why, and how) as you conduct a brief intake interview for any ailing device.
Discovery words aren't just for investigative journalists. These questions get the critical thinking juices flowing, helping sysadmins quickly narrow in on a root cause of an issue.
As funny as some of these questions and answers above may be, they are unfortunately based on real-world experiences. Make your life and job easier by getting in the habit of asking meaningful questions. Keep a cheat sheet next to your phone with the 5 Ws and 1 H until you get in the habit of asking the right questions.
Replicating an issue is one of the most beneficial problem solving techniques in a system administrator’s toolkit. Replicating issues often leads to the possible cause of a problem.
When trying to replicate an issue, information gathering is key (see above). What applications were running? Was the user performing a specific task or process? What time did the issue occur? What has changed since the last time the task or process was completed successfully? Is the issue happening to anyone else, especially users with similar responsibilities and equipment? As you are gathering information, look for patterns, no matter how obscure. Really stretch those analytical skills.
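One way to make the pattern hunt concrete is to write down what you know about each occurrence and let a script show what the reports have in common. The sketch below is only an illustration: the field names (`app`, `os_build`, `dock`) and the sample values are made up, not pulled from any real ticketing system.

```python
from typing import Dict, List


def shared_factors(incidents: List[Dict[str, str]]) -> Dict[str, str]:
    """Return the key/value pairs that appear in every incident report."""
    if not incidents:
        return {}
    # Start with everything from the first report, then keep only the
    # pairs that every later report agrees with.
    common = set(incidents[0].items())
    for report in incidents[1:]:
        common &= set(report.items())
    return dict(common)


# Hypothetical intake notes from three users reporting the same crash.
reports = [
    {"app": "Excel", "os_build": "22H2", "dock": "yes"},
    {"app": "Excel", "os_build": "23H2", "dock": "yes"},
    {"app": "Excel", "os_build": "22H2", "dock": "yes"},
]

print(shared_factors(reports))
```

Anything that survives the intersection (here, the application and the docking station) is a good candidate to reproduce first. Anything that differs between reports, like the OS build, is less likely to be the trigger.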
Unfortunately, while some issues can be replicated easily, others can seem impossibly difficult to replicate because of the sheer number of variables at play. This is especially true when an issue is extremely sporadic, with no identifiable patterns appearing when the problem occurs. If you've spent what feels like an eternity trying to replicate a problem, it may be time to move on to another troubleshooting technique.
Although considered boring, checking event logs is one of the most efficient ways of determining the root cause of a problem. Logs will provide you with an event ID and an obscure message that takes most of the fun out of your IT detective work, but you can't argue with the results.
If you're into logs then you should be familiar with the Event Viewer built into Windows. Event Viewer is where you'll find Windows logs, including but not limited to error messages, system warnings, general system messages, security logs, and application logs.
Windows has made it very easy to access Event Viewer. My preferred method is by right-clicking on the Start button and selecting Event Viewer.
Another way to access Event Viewer is to type event viewer into the Windows search bar, then click on the Event Viewer application.
If you're a macOS user, then you'll want to use the Console app to view your system logs. You can find the Console app by using the Spotlight search function or by navigating to Finder > Applications > Utilities > Console.
If Event Viewer or the macOS Console app doesn't contain the information you're looking for, many applications are capable of recording their own logs. For example, here's how to enable logging in Google Chrome:
Close any running Chrome instances.
Right-click on the Google Chrome desktop shortcut and click Properties.
In the Target field, add the following flags to the end of the target string: --enable-logging --v=1
The entire target field entry will look something like this: "C:\Program Files\Google\Chrome\Application\chrome.exe" --enable-logging --v=1
After applying these changes, you'll notice an interface that looks like a CLI launches in addition to Chrome. Chrome is now saving data to a log file that you can review.
The log file is located in "C:\Users\username\AppData\Local\Google\Chrome\User Data\chrome_debug.log" which you can open and review with a text editor like Notepad.exe. Be prepared for a lot of information. Log files like this can get pretty hefty.
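If you would rather not edit the shortcut, the same launch can be scripted. This is a rough sketch that reuses the executable path, flags, and log location quoted above. The user name `alice` is a placeholder, and your install path may differ.

```python
import subprocess
from pathlib import PureWindowsPath

# Executable path and flags as quoted in the steps above.
CHROME_EXE = r"C:\Program Files\Google\Chrome\Application\chrome.exe"
LOGGING_FLAGS = ["--enable-logging", "--v=1"]


def chrome_logging_command(chrome_exe: str = CHROME_EXE) -> list:
    """Build the argument list for starting Chrome with verbose logging."""
    return [chrome_exe, *LOGGING_FLAGS]


def chrome_debug_log(username: str) -> PureWindowsPath:
    """Return the chrome_debug.log location for the given Windows user."""
    return (
        PureWindowsPath(r"C:\Users") / username / "AppData" / "Local"
        / "Google" / "Chrome" / "User Data" / "chrome_debug.log"
    )


def launch_chrome_with_logging() -> subprocess.Popen:
    """Start Chrome with logging enabled (call this on the affected PC)."""
    return subprocess.Popen(chrome_logging_command())


# Show what would be run and where to look afterwards.
print(chrome_logging_command())
print(chrome_debug_log("alice"))  # "alice" is a placeholder user name
```

As with the shortcut method, close any running Chrome instances before calling `launch_chrome_with_logging()`.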
The key to navigating log files is accurately tracking when events occur. If you know a user got a blue screen of death sometime in the last week, you're going to be searching log files for what feels like an eternity. On the other hand, if a user's computer unexpectedly restarted yesterday at 3:55 PM, you can easily track down the corresponding event log.
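To show why a precise time matters, here is a small sketch that trims a log down to the few minutes around a known event. It assumes a plain-text log where each line starts with an ISO-style timestamp. That format and the sample lines are made up for illustration, so adjust the parsing to match whatever your log actually looks like.

```python
from datetime import datetime, timedelta
from typing import Iterable, List


def lines_near(lines: Iterable[str], when: datetime, minutes: int = 5) -> List[str]:
    """Keep lines whose leading timestamp is within +/- `minutes` of `when`."""
    window = timedelta(minutes=minutes)
    hits = []
    for line in lines:
        stamp = line.split(" ", 1)[0]
        try:
            ts = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines that do not start with a timestamp
        if abs(ts - when) <= window:
            hits.append(line)
    return hits


# Hypothetical log excerpt.
sample_log = [
    "2024-05-14T09:12:40 INFO Service started",
    "2024-05-14T15:53:10 WARNING Disk latency high",
    "2024-05-14T15:55:02 ERROR Unexpected shutdown",
    "2024-05-14T17:30:00 INFO User logged on",
]

# The user says the PC restarted at 3:55 PM.
restart_time = datetime(2024, 5, 14, 15, 55)
for entry in lines_near(sample_log, restart_time):
    print(entry)
```

With an exact time, four lines shrink to the two that matter. With "sometime last week," the window has to be so wide that the filter barely helps.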
Filtering for a specific event ID is another way to navigate log files quickly, but you won't always know the event ID you're looking for beforehand.
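When you don't know the event ID ahead of time, one approach is to tally which IDs show up most among the errors and then filter on the top candidates. The sketch below assumes the events have already been exported into simple records. The record layout and the sample IDs are illustrative, not a real Event Viewer export format.

```python
from collections import Counter
from typing import Dict, List, Tuple

Event = Dict[str, object]


def filter_by_id(events: List[Event], event_id: int) -> List[Event]:
    """Return only the events that carry the given event ID."""
    return [e for e in events if e["id"] == event_id]


def problem_ids(events: List[Event],
                levels: Tuple[str, ...] = ("Error", "Critical")) -> List[Tuple[int, int]]:
    """Count event IDs among error-level events, most frequent first."""
    counts = Counter(e["id"] for e in events if e["level"] in levels)
    return counts.most_common()


# Hypothetical exported events.
events = [
    {"id": 7036, "level": "Information"},
    {"id": 41, "level": "Critical"},
    {"id": 6008, "level": "Error"},
    {"id": 7036, "level": "Information"},
    {"id": 41, "level": "Critical"},
]

print(problem_ids(events))
print(len(filter_by_id(events, 41)))
```

The tally gives you a shortlist of IDs to search for, and `filter_by_id` narrows the log once you have picked one.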
As technology advances, its underlying infrastructure gets more complex. For example, applications went from a few hundred lines of code to thousands and millions of lines of code. Or, if you're Google, over two billion lines of code. With all that complexity, it's easy to understand how errors can work their way into these complex systems. These types of issues are often referred to as bugs. Some bugs are as simple as a graphical user interface issue, while others are more serious and could result in vulnerabilities.
Most developers take bugs pretty seriously, especially vulnerabilities, working diligently to fix them. Once a fix has been created, developers release the fixes as updates, also known as patches. Managing and deploying these updates is known as patch management and is an essential responsibility that is often given to sysadmins.
So how can updates help when you are seeking a solution to a problem? Well, there is a reasonable chance that whatever issue you are encountering is known by the developers, who may have already released a patch to resolve the issue. Applying the patch could be the only step you need to take to resolve a seemingly complex problem.
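Checking whether a machine already has the fix usually comes down to comparing version numbers, and comparing them as plain text is a classic trap ("1.9" sorts after "1.10" as a string). Here is a minimal sketch that assumes purely numeric, dot-separated versions. The version numbers shown are invented, and real products may use schemes this doesn't handle.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.10.0' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_patch(installed: str, fixed_in: str) -> bool:
    """Return True if the installed version is older than the fixed version."""
    a, b = parse_version(installed), parse_version(fixed_in)
    # Pad the shorter version with zeros so '1.10' and '1.10.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


# Hypothetical example: the release notes say the bug was fixed in 1.10.0.
print(needs_patch("1.9.3", "1.10.0"))
print(needs_patch("1.10", "1.10.0"))
```

If `needs_patch` returns True for the affected machine, applying the update is worth trying before any deeper troubleshooting.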
Staying on top of your patching needs can be difficult, but PDQ.com is here to help. Check out our Patch Tuesday articles, where we help you stay informed about the latest updates and vulnerabilities. If you're struggling to get your patches deployed, check out PDQ Deploy and PDQ Inventory, which can automate your patch management needs, leaving you free to focus on other tasks.
So you've worked through the other techniques so far, and you still don't have an effective solution? Then it might be time to try this little-known secret that has been helping sysadmins fix problems for years. Google.
Don't feel ashamed that your problem solving techniques fell short of fixing an issue. It happens to the best of us. As we say around here, life is an open book test — use whatever resources are available, and Google is pretty much the biggest resource of them all.
If you don't think Googling is a true skill, then you have much to learn. The internet contains a massive amount of data, so it's highly likely that somebody has already experienced and resolved the issue you're currently experiencing. Make sure your Google search terms are short and specific. Provide error codes and messages when available. If you end up on YouTube, make sure to check the comments sections if the video didn't resolve your issue. If you end up on a blog article that supports comments, check the comment section for additional information that the article didn't cover.
If you've got coworkers or people available with expertise on the matter you're currently working on, ask them for help. Don't be afraid to not know something. Most jobs require a good amount of humility as we grow and learn. IT is an especially difficult field to master as processes and systems are constantly evolving, making a sysadmin's job more about learning and growing than mastering any one particular aspect of IT.
Okay, so what if you've swallowed your pride and you finally ask your coworker for help, only for them to not have the answer? That's okay. You can still benefit from having someone to bounce ideas off of. This process of talking out a problem can be all you need to get the ball rolling, leading you to a possible solution.
Where do you turn if your Google search results didn't prove to be an effective problem solving strategy and your coworker is just as stumped as you? Online forums, that's where.
Online forums are like a cheat code when it comes to getting answers. Asking hundreds or thousands of other sysadmins in an online forum for help regarding your very specific situation is the ultimate big brain strategy. That kind of collective intelligence is almost guaranteed to provide results.
Once you've decided to seek help in an online forum, the only thing left to figure out is which forum to post to. The answer is — it depends. If you are looking for help for a specific product, see if the developer hosts a support forum. For example, here is our community support forum for PDQ.
If you are seeking help for something more general, try these forums:
Some issues can't be resolved with a traditional problem solving method and require a more lateral thinking approach. Creative ideas can seem far-fetched, but when you've exhausted your other resources, sometimes creative thinking is all you've got left.
I once had a user on my network who never seemed to experience any technical problems. That all changed when the user decided to re-arrange their office. Suddenly, the user was reporting that their computer would randomly shut off once or twice a week. Replicating the issue seemed impossible. There was no discernible pattern to when it would shut off. Then, one day as I was troubleshooting the setup, I pushed against the desk and noticed it wasn't very sturdy. I suddenly got the idea to push against the desk with a little more force. As soon as I did, the computer shut off. I quickly crawled under the desk to discover the surge protector's power button was just close enough to a bundle of cables that when the desk would shake, the wires would push into the power button enough to cut the power, but not enough to fully flip the switch. After moving the surge protector to a safer location, the user went back to working hard, and I went back to hardly working.
I can speak from experience: one of the worst feelings is fixing a technical problem, forgetting the solution, and then having the problem appear again. I'm living proof that always documenting your troubleshooting steps and eventual solution can save you from a massive headache in the future.
Many popular IT ticketing solutions come with documentation tools built-in. However, it's on you to make sure you provide as much detailed information as possible. One way to get in the habit of providing enough information in your documentation is to ask yourself if you would be able to solve the issue with only the documented information. If not, then you may have skipped important steps that you or your coworkers could use in the future. Don't skimp on the details. Your future self will thank you.
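If it helps to make that self-check a habit, it can be reduced to a tiny checklist. The sketch below is a hypothetical note structure, not the schema of any particular ticketing tool. The field names are my own, and the sample note borrows the surge protector story from earlier.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class TroubleshootingNote:
    """Hypothetical minimum set of details worth recording per issue."""
    symptoms: str = ""
    steps_tried: List[str] = field(default_factory=list)
    root_cause: str = ""
    solution: str = ""


def missing_details(note: TroubleshootingNote) -> List[str]:
    """List the fields that were left empty."""
    return [f.name for f in fields(note) if not getattr(note, f.name)]


note = TroubleshootingNote(
    symptoms="PC shuts off randomly once or twice a week",
    steps_tried=["Tried to replicate the shutdown", "Pushed against the desk"],
    solution="Moved the surge protector to a safer location",
)

# The root cause was never written down, so future-you would be guessing.
print(missing_details(note))
```

An empty result doesn't prove the note is good, but a non-empty one is a quick sign that something important got skipped.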
Technology is rapidly advancing, and keeping up with it isn't easy. Luckily, many of the problem solving techniques sysadmins have been using for years are just as relevant today as they were 20 years ago. Go forth and be the problem solver you are meant to be with these tips in your toolbelt.
Born in the '80s and raised by his NES, Brock quickly fell in love with everything tech. With over 15 years of IT experience, Brock now enjoys the life of luxury as a renowned tech blogger and receiver of many Dundie Awards. In his free time, Brock enjoys adventuring with his wife, kids, and dogs, while dreaming of retirement.