Hungry Mind, Blog about everything in IT - C#, Java, C++, .NET, Windows, WinAPI, ...

A story of orphaned ReaderWriterLockSlim

Recently I got 2 dumps of a resource-intensive process. The customer complained about hangs in the web UI, so the application had been killed and restarted numerous times. A quick WinDbg analysis spotted thousands of worker threads in the pool:

0:000> !ThreadPool
CPU utilization: 6%
Worker Thread: Total: 6304 Running: 6303 Idle: 1 MaxLimit: 12000 MinLimit: 24
Work Request in Queue: 0
--------------------------------------
Number of Timers: 2
--------------------------------------
Completion Port Thread:Total: 2 Free: 1 MaxFree: 48 CurrentLimit: 1 MaxLimit: 12000 MinLimit: 24

Most of the threads were waiting to take a ReaderWriterLockSlim read lock, blocked on its ManualResetEvent instance:

System.Threading.WaitHandle.WaitOneNative(System.Runtime.InteropServices.SafeHandle, UInt32, Boolean, Boolean)
System.Threading.WaitHandle.InternalWaitOne(System.Runtime.InteropServices.SafeHandle, Int64, Boolean, Boolean)
System.Threading.ReaderWriterLockSlim.WaitOnEvent(System.Threading.EventWaitHandle, UInt32 ByRef, TimeoutTracker)
System.Threading.ReaderWriterLockSlim.TryEnterReadLockCore(TimeoutTracker)
System.Threading.ReaderWriterLockSlim.TryEnterReadLock(TimeoutTracker)

One thread was waiting for the write lock on the same object. No other stacks were observed executing while holding the lock, and all lock usages seemed proper:

s.EnterXXXLock();
try
{
   // Do the job
}
finally
{
   s.ExitXXXLock();
}

Yet the process is fucked up. What the hell is wrong here? Well, sometimes things get very complicated...

Let's take a look at the reader-writer lock instance:

0:3444> !do 0x0000000001affe60
Name:        System.Threading.ReaderWriterLockSlim
MethodTable: 000007f87a91c1a8
EEClass:     000007f87a639448
Size:        96(0x60) bytes
File:        C:\Windows\Microsoft.Net\assembly\GAC_MSIL\System.Core\v4.0_4.0.0.0__b77a5c561934e089\System.Core.dll
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
000007f8802fc7b8  4000755       50       System.Boolean  1 instance                1 fIsReentrant
000007f8802fdc90  4000756       30         System.Int32  1 instance                0 myLock
000007f8802f1ed0  4000757       34        System.UInt32  1 instance                1 numWriteWaiters
000007f8802f1ed0  4000758       38        System.UInt32  1 instance             6293 numReadWaiters
000007f8802f1ed0  4000759       3c        System.UInt32  1 instance                0 numWriteUpgradeWaiters
000007f8802f1ed0  400075a       40        System.UInt32  1 instance                0 numUpgradeWaiters
000007f8802fc7b8  400075b       51       System.Boolean  1 instance                0 fNoWaiters
000007f8802fdc90  400075c       44         System.Int32  1 instance               -1 upgradeLockOwnerId
000007f8802fdc90  400075d       48         System.Int32  1 instance               -1 writeLockOwnerId
000007f8802f8d00  400075e        8 ...g.EventWaitHandle  0 instance 00000000f8e8f9c0 writeEvent
000007f8802f8d00  400075f       10 ...g.EventWaitHandle  0 instance 00000000fa23f040 readEvent
000007f8802f8d00  4000760       18 ...g.EventWaitHandle  0 instance 0000000000000000 upgradeEvent
000007f8802f8d00  4000761       20 ...g.EventWaitHandle  0 instance 0000000000000000 waitUpgradeEvent
000007f88030ff60  4000763       28         System.Int64  1 instance 9 lockID
000007f8802fc7b8  4000765       52       System.Boolean  1 instance                0 fUpgradeThreadHoldingRead
000007f8802f1ed0  4000766       4c        System.UInt32  1 instance       1073741824 owners
000007f8802fc7b8  4000767       53       System.Boolean  1 instance                0 fDisposed
000007f88030ff60  4000762      408         System.Int64  1   static 17381 s_nextLockID
000007f87a9399f0  4000764        8 ...ReaderWriterCount  0 TLstatic  t_rwc
    >> Thread:Value c18:0000000001917410 d18:00000000025a51c8 e54:000000000245d5f0 e90:0000000000000000 e20:00000000f90a6ce8 [>6000 more values]

The most valuable information is the owners field:

0:000> ? 0n1073741824
Evaluate expression: 1073741824 = 00000000`40000000

And here's what it means:

//The uint, that contains info like if the writer lock is held, num of
//readers etc. 
uint owners; 

//Various R/W masks 
//Note:
//The Uint is divided as follows:
//
//Writer-Owned  Waiting-Writers   Waiting Upgraders     Num-REaders 
//    31          30                 29                 28.......0
// 
//Dividing the uint, allows to vastly simplify logic for checking if a 
//reader should go in etc. Setting the writer bit, will automatically
//make the value of the uint much larger than the max num of readers 
//allowed, thus causing the check for max_readers to fail.

private const uint WRITER_HELD = 0x80000000;
private const uint WAITING_WRITERS = 0x40000000; 
private const uint WAITING_UPGRADER = 0x20000000;

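So the WAITING_WRITERS bit is set: the readers are queued behind a pending writer. Hold on, there is no active writer! The writer-owned bit is clear and the reader count is zero, so the lock is not held by anyone, and nobody is left to wake the waiters up. Conclusion: the lock state is corrupted and can never recover. This is called an orphaned lock.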
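
To make the bit layout concrete, here is a minimal sketch that decodes an owners value by hand. The three flag constants are the ones quoted above; the OwnersDecoder class, the Describe method and the READER_BITS mask are my own illustrative names, and the mask simply assumes that every bit below the three flags counts readers, as the layout comment suggests (the real implementation may use a narrower mask).

```csharp
using System;

internal static class OwnersDecoder
{
    // Flag bits quoted from the ReaderWriterLockSlim source.
    private const uint WRITER_HELD = 0x80000000;
    private const uint WAITING_WRITERS = 0x40000000;
    private const uint WAITING_UPGRADER = 0x20000000;

    // Assumption: all bits below the three flags hold the reader count,
    // per the "28.......0" layout comment. Not taken from the real source.
    private const uint READER_BITS = ~(WRITER_HELD | WAITING_WRITERS | WAITING_UPGRADER);

    // Hypothetical helper: renders the owners field in readable form.
    public static string Describe(uint owners)
    {
        bool writerHeld = (owners & WRITER_HELD) != 0;
        bool waitingWriters = (owners & WAITING_WRITERS) != 0;
        bool waitingUpgrader = (owners & WAITING_UPGRADER) != 0;
        uint readers = owners & READER_BITS;

        return string.Format(
            "writerHeld={0}, waitingWriters={1}, waitingUpgrader={2}, readers={3}",
            writerHeld, waitingWriters, waitingUpgrader, readers);
    }

    private static void Main()
    {
        // The value of the owners field from the dump.
        Console.WriteLine(Describe(1073741824));
        // prints: writerHeld=False, waitingWriters=True, waitingUpgrader=False, readers=0
    }
}
```

Nobody owns the lock, yet the waiting-writers flag keeps every new reader out.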

The only thing (that I am aware of) that might have orphaned the lock is an asynchronous thread abort. If a thread is aborted while taking a lock via a [Try]EnterXXXLock method, we can run into the described problem, since those methods are not atomic. In my case the thread aborts are triggered by the WCF runtime (or perhaps the HTTP runtime, it doesn't matter).

Here's a simple program to simulate the situation:

using System;
using System.Threading;
 
namespace CLRInv
{
   internal class Program
   {
      private static readonly ReaderWriterLockSlim rwl = new ReaderWriterLockSlim(LockRecursionPolicy.SupportsRecursion);
 
      private static void Main(string[] args)
      {
         rwl.EnterReadLock();
         do {
            rwl.ExitReadLock();
 
            var reader = new Thread(UseLockForRead);
            var writer = new Thread(UseLockForWrite);
            reader.Start();
            writer.Start();
 
            Thread.Sleep(TimeSpan.FromSeconds(2));
            writer.Abort();
            reader.Abort();
 
            reader.Join();
            writer.Join();
         }
         while (rwl.TryEnterReadLock(TimeSpan.FromSeconds(10)));
 
         Console.WriteLine("Gotcha!");
 
         // Forever young
         rwl.EnterWriteLock();
      }
 
      private static void UseLockForRead()
      {
         try {
            for (;;) {
               rwl.EnterReadLock();
               try {
               }
               finally {
                  rwl.ExitReadLock();
               }
            }
         }
         catch (ThreadAbortException) {
         }
      }
 
      private static void UseLockForWrite()
      {
         try {
            for (;;) {
               rwl.EnterWriteLock();
               try {
               }
               finally {
                  rwl.ExitWriteLock();
               }
            }
         }
         catch (ThreadAbortException) {
         }
      }
   }
}

The conclusion is not very optimistic: you can't use slim locks the way you normally use them if your application experiences timeouts and the thread aborts that follow. Does this mean slim locks should be banned? Well, no. You just need to make sure special constructions are used to take and release the locks.

First of all, we need to prevent async aborts while [Try]EnterXXXLock is executing. How to do that? You must take the lock inside a so-called protected region. The documentation for Thread.Abort mentions "a protected region of code, such as a catch block, finally block, or constrained execution region". This basically means ThreadAbortException can't be thrown asynchronously while the catch and finally blocks of a try statement are executing. So our [Try]EnterXXXLock should be wrapped like this:

try {} finally { rw.EnterXXXLock(); }

Weird? Not if you have read the .NET BCL source code. There are tons of empty try blocks there, each with a comment like this:

// prevent ThreadAbort while updating state
try { } 
finally
{
...
}
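
The effect of a protected region can be observed with a small experiment. This is only a sketch under assumptions: it targets the .NET Framework, where Thread.Abort is supported (newer runtimes throw PlatformNotSupportedException, which the code merely reports), and the ProtectedRegionDemo class and its members are hypothetical names of mine. The worker spends half a second inside a finally block while the main thread asks for an abort.

```csharp
using System;
using System.Threading;

internal static class ProtectedRegionDemo
{
    private static volatile bool finallyCompleted;

    // Returns true if the worker's finally block ran to its last statement.
    public static bool Run()
    {
        finallyCompleted = false;
        var entered = new ManualResetEventSlim(false);

        var worker = new Thread(() =>
        {
            try
            {
                try { }
                finally
                {
                    // Protected region: an abort requested now is deferred.
                    entered.Set();
                    Thread.Sleep(500);
                    finallyCompleted = true;
                }
            }
            catch (ThreadAbortException)
            {
                // The deferred abort arrives only after the finally block.
            }
        });

        worker.Start();
        entered.Wait();          // the worker is inside its finally block now

        try
        {
            worker.Abort();      // requested in the middle of the finally block
        }
        catch (PlatformNotSupportedException)
        {
            Console.WriteLine("Thread.Abort is not supported on this runtime.");
        }

        worker.Join();
        return finallyCompleted;
    }

    private static void Main()
    {
        Console.WriteLine("finally completed: " + Run());
    }
}
```

On the .NET Framework I would expect the finally block to run to completion even though the abort was requested half-way through it, which is exactly the property the empty-try trick relies on.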

Proper slim lock usage turns out to be the following construction:

var lockIsHeld = false;
try {
   try {
   }
   finally {
      rwl.EnterReadLock();
      lockIsHeld = true;
   }
 
   // Do work here
}
finally {
   if (lockIsHeld) {
      rwl.ExitReadLock();
   }
}

An asynchronous ThreadAbortException is now thrown either before the lock is taken or after it is held, and in both cases the outer finally unlocks the object only if it has actually been locked.
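
Typing this construction at every call site gets old quickly, so it can be folded into a pair of helpers. The following is a minimal sketch, not a BCL API: SlimLockExtensions, WithReadLock and WithWriteLock are hypothetical names I made up for illustration.

```csharp
using System;
using System.Threading;

internal static class SlimLockExtensions
{
    // Hypothetical helper: runs 'action' under the read lock, taking the
    // lock inside a finally block so an async abort cannot interrupt it.
    public static void WithReadLock(this ReaderWriterLockSlim rwl, Action action)
    {
        var lockIsHeld = false;
        try
        {
            try { }
            finally
            {
                rwl.EnterReadLock();
                lockIsHeld = true;
            }

            action();
        }
        finally
        {
            if (lockIsHeld)
            {
                rwl.ExitReadLock();
            }
        }
    }

    // Same construction for the write lock.
    public static void WithWriteLock(this ReaderWriterLockSlim rwl, Action action)
    {
        var lockIsHeld = false;
        try
        {
            try { }
            finally
            {
                rwl.EnterWriteLock();
                lockIsHeld = true;
            }

            action();
        }
        finally
        {
            if (lockIsHeld)
            {
                rwl.ExitWriteLock();
            }
        }
    }
}

internal static class Program
{
    private static void Main()
    {
        var rwl = new ReaderWriterLockSlim();
        var counter = 0;

        rwl.WithWriteLock(() => counter++);
        rwl.WithReadLock(() => Console.WriteLine(counter));              // prints 1

        Console.WriteLine(rwl.IsReadLockHeld || rwl.IsWriteLockHeld);    // prints False
    }
}
```

One trade-off to keep in mind: since the wait itself now happens inside a finally block, a thread blocked in EnterReadLock or EnterWriteLock presumably can't be aborted until it gets the lock. Using the TryEnterXXXLock overloads with a timeout is one way to bound that wait.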

There are two things I haven't studied yet. The first: is it possible to observe the following situation?

try {
   // <-- Could it happen here, before finally block is run but after try has opened fault clause region?
   try {
   }
   finally {
      // Lock
   }
 
   // Use resource
}
finally {
   // Unlock
}

That's why I used the lockIsHeld flag: the outer finally releases the lock only if it has really been taken.

And the second one:

try {
}
finally {
   // Lock
}
try {
   // Use resource
}
finally {
   // Unlock
}

Is this one safe? Probably yes.

The bottom line: know your runtime environment, and don't use new features just because they are cool or because Mr. Jeff has fresh stuff in the brand new book you love so much. Or hire a professional like me [:-D].


Copyright 2007-2011 Chabster