Tuesday, 24 February 2015

Loading SOS in windbg. Why is it never quite as easy as you hope it’ll be?

I started analysing a production crash dump at my desk, where the installed .NET libraries don’t match those on the production server. Predictably, windbg complained:

> !clrstack
The version of SOS does not match the version of CLR you are debugging.  Please
load the matching version of SOS for the version of CLR you are debugging.
CLR Version: 4.0.30319.1026
SOS Version: 4.0.30319.18444
Failed to load data access DLL, 0x80004005
Verify that 1) you have a recent build of the debugger (6.2.14 or newer)
            2) the file mscordacwks.dll that matches your version of clr.dll is 
                in the version directory or on the symbol path
            3) or, if you are debugging a dump file, verify that the file 
                mscordacwks_<arch>_<arch>_<version>.dll is on your symbol path.
            4) you are debugging on supported cross platform architecture as 
                the dump file. For example, an ARM dump file must be debugged
                on an X86 or an ARM machine; an AMD64 dump file must be
                debugged on an AMD64 machine.
 
You can also run the debugger command .cordll to control the debugger's
load of mscordacwks.dll.  .cordll -ve -u -l will do a verbose reload.
If that succeeds, the SOS command should work on retry.
 
If you are debugging a minidump, you need to make sure that your executable
path is pointing to clr.dll as well.

The first step is to get hold of the correct libraries from the server:

cp \\server\c$\windows\microsoft.net\Framework64\v4.0.30319\sos.dll c:\temp
cp \\server\c$\windows\microsoft.net\Framework64\v4.0.30319\mscordacwks.dll c:\temp
cp \\server\c$\windows\microsoft.net\Framework64\v4.0.30319\clr.dll c:\temp
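If this comes up often, the copy step can be scripted. The following is only a sketch in Python: the function name `copy_debug_dlls` and the `DEBUG_DLLS` tuple are my own inventions for illustration, and it assumes the server share is reachable from the machine running it. It checks that all three files exist before copying any of them, so a half-populated c:\temp never ends up in front of the debugger.

```python
import shutil
from pathlib import Path

# File names taken from the copy commands above.
DEBUG_DLLS = ("sos.dll", "mscordacwks.dll", "clr.dll")


def copy_debug_dlls(framework_dir, dest_dir, names=DEBUG_DLLS):
    """Copy the CLR debugging DLLs from framework_dir into dest_dir.

    Returns the list of destination paths. Raises FileNotFoundError,
    before copying anything, if one of the expected files is missing.
    """
    src = Path(framework_dir)
    dest = Path(dest_dir)

    # Check everything first so a failure leaves dest_dir untouched.
    missing = [name for name in names if not (src / name).is_file()]
    if missing:
        raise FileNotFoundError(f"missing in {src}: {', '.join(missing)}")

    dest.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves timestamps, which helps when comparing builds.
    return [Path(shutil.copy2(src / name, dest / name)) for name in names]
```

For the paths in this post the call would be `copy_debug_dlls(r"\\server\c$\windows\microsoft.net\Framework64\v4.0.30319", r"c:\temp")`.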

In theory, at that point I should just be able to run:

> .load c:\temp\sos.dll
> !clrstack

but sadly still no joy. If I run

> .chain

it tells me that the debugger has more than one version of SOS loaded:

…
Extension DLL chain:
    C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos: image 4.0.30319.18444, API 1.0.0, built Wed Oct 30 21:40:20 2013
        [path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll]
    c:\temp\sos.dll: image 4.0.30319.1026, API 1.0.0, built Thu Jul 03 07:58:50 2014
        [path: c:\temp\sos.dll]
…

I can unload the one that doesn’t match the dump’s CLR version (the local 4.0.30319.18444 copy):

> .unload C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos

and then run SOS commands happily.
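
Picking the stale entry out of the .chain listing by eye is easy to get wrong, so here is a rough sketch of the same check in Python. It assumes the .chain output has been pasted into a string and that the entries look like the ones shown above; the function names `stale_sos_entries` and `unload_commands` are hypothetical helpers of mine, not part of windbg or SOS.

```python
import re

# Matches extension entries such as:
#   c:\temp\sos.dll: image 4.0.30319.1026, API 1.0.0, built ...
# The "[path: ...]" lines have no ": image " part and are skipped.
CHAIN_ENTRY = re.compile(r"^\s*(?P<name>\S.*?): image (?P<version>[\d.]+),")


def stale_sos_entries(chain_output, clr_version):
    """Return the names of loaded SOS extensions whose image version
    differs from clr_version (the CLR version reported for the dump)."""
    stale = []
    for line in chain_output.splitlines():
        match = CHAIN_ENTRY.match(line)
        if not match:
            continue
        name = match.group("name")
        # Last path component; .chain may list it with or without ".dll".
        base = re.split(r"[\\/]", name)[-1].lower()
        if base in ("sos", "sos.dll") and match.group("version") != clr_version:
            stale.append(name)
    return stale


def unload_commands(chain_output, clr_version):
    """Build one .unload command per mismatched SOS entry."""
    return [f".unload {name}" for name in stale_sos_entries(chain_output, clr_version)]
```

Fed the listing above and the CLR version from the original error (4.0.30319.1026), `unload_commands` should return the single `.unload` line I typed by hand.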