Windows Internals Guide

Introduction

Purpose

RPR can be used for a range of issues, but its main purpose is to identify the technical root cause of recurring grey problems in an end-to-end system. Analysis of log data and network traces can be used to quickly narrow the fault domain to a single 'box'. What if that box is a Windows Server, or a user's PC? What else can we do to drill down into the problem?

This guide shows RPR Practitioners how they can probe inside the Windows Server or PC to narrow the problem down to a component that has an owner; someone we can describe our findings to and ask for a fix.

The Windows Internals manual now comprises two books with a total of around 1,300 pages of text. We obviously can't cover all of this material, and there would be little point in duplicating what's already available. Luckily, we don't need to.

Objectives

The objectives of this guide are to develop:

  • An adequate understanding of Windows operating system mechanisms, so that the practitioner can make an informed choice of diagnostic tool and interpret the data it produces
  • Knowledge of the capabilities of a set of diagnostic tools and facilities, and the reasons for their use
  • Knowledge of techniques to correlate the output of differing tools with one another
  • An understanding of marker techniques specific to the Windows environment
  • The ability to determine if an issue is an application or operating system issue, and present diagnostic evidence and a reasoned explanation to the appropriate technical support teams (including the application developer or Microsoft)

Scope

There is no attempt in this guide to cover Windows subsystems, in particular the .NET CLR and IIS. These are large and complex environments in their own right and so are the subject of a separate guide.

Visual Studio has some great debugging tools and facilities. We don't cover Visual Studio in the course as it is not a program that is readily available to everyone in technical support. One of the guiding principles for RPR is that it should work in a production environment, and so we have used other tools that are more readily available.

Similarly, there is a growing list of Application Performance Monitoring tools on the market. Many of these tools can correlate the data from multiple tiers of a system to provide a view of the end-to-end transaction, on a transaction-by-transaction basis. These tools are a great addition to the RPR toolkit but may not be available and so are not covered here. However, it's worth noting that APM systems use techniques that are common to RPR, the most obvious one being time accounting.

This course is based on Windows 7 and Windows Server 2008 R2. Some of the finer detail will vary when looking at Windows 10 and Windows Server 2012. The guide is also based on the 64-bit architecture.

Additional Material

  • Windows Internals Part 1 - ISBN: 978-0-7356-4873-9
  • Windows Internals Part 2 - ISBN: 978-0-7356-6587-3

Caveats

We've tried to ensure that everything presented in this course is factually correct, but several areas have been simplified to make the subject digestible. There are exceptions to some of the details described, but rather than cover every nuance of every point we have reflected the basics here. A full and more detailed explanation can be found in the resources used in the course.