[logback-user] Best practices to avoid deadlocks while logging

Sat Jun 13 16:13:49 CEST 2009

Ralph Goers a écrit :
> Create a Jira issue.  If you really want it fixed create a patch and 
> add it to the issue. That isn't a guarantee, but it is more likely to 
> get attention if you have already done most of the work.
Done, here: http://jira.qos.ch/browse/LBCLASSIC-133.

Thanks to those who gave advices.
Olivier

>
> Ralph
>
> On Jun 5, 2009, at 2:56 PM, Olivier Cailloux wrote:
>
>> Chris Miller a écrit :
>>> Ah I see, I hadn't looked at your code closely enough so failed to
>>> realise it was your toString() implementation that was grabbing the
>>> lock, rather than Logback itself. That makes a lot more sense of 
>>> course!
>>>
>>> I don't think there's any solution from your side other than to rework
>>> the locking or call worker.toString() yourself, as you have indicated.
>>>
>>> So yes I agree with your conclusion, Logback should be evaluating the
>>> toString() call before it grabs any internal locks. This should solve
>>> the problem plus potentially improve performance of Logback when
>>> multiple threads are writing, without any obvious downside. After a
>>> quick glance at the Logback code though this may not be a
>>> straightforward change, since the worker object gets treated as an
>>> Object/Object[] until right at the last minute.
>>
>> Do you think I should file a request for improvement about this? It does
>> not seem like Ceki is reading this thread... What's the preferred way to
>> go about suggestions for change in the code for the logback project?
>> Olivier
>>
>>>
>>>> Hi,
>>>>
>>>> Thank you for your reply.
>>>>
>>>> I agree with your comments on synchronization best practices. I simply
>>>> produced a toy example to illustrate the problem I am encountering in
>>>> my application, but the code is not the one I use in my application. I
>>>> now changed the example to lock on a private lock object (hereby
>>>> attached). I also added a logback-test.xml file because the problem
>>>> only happen if logback indeed logs debug statements, so the
>>>> configuration file is needed to reproduce the problem. Also the
>>>> "final" on StringBuilder has indeed nothing to do with
>>>> synchronization, it is simply my habit of always declaring my
>>>> variables final when I don't need to change them after creation, for
>>>> code self-documentation and auto-verification purposes.
>>>>
>>>> I also agree on the fact that the deadlock happens because of two
>>>> different orders in locking. One thread ("WorkerThread") locks the
>>>> worker, then wants to acquire a lock on the logger, while the other
>>>> one ("AccessorThread") first locks the logger, and while holding the
>>>> lock, tries to acquire a lock on the worker object.
>>>>
>>>> But the problem is not solved by replacing the lock on the worker
>>>> object itself by a lock on an internal lock: the same mechanisms still
>>>> take place. The logger calls, in its internal routine,
>>>> worker.getStatus(), which tries to lock the worker internal lock.
>>>> Impossible as an other thread already holds the lock on the worker
>>>> internal lock (because it is being executing the very same method!).
>>>>
>>>> Please also note that logback does not really decide to lock the
>>>> worker object. It just calls worker.toString(), as is expected from
>>>> the logging statement ("s_logger.debug("Current worker status is:
>>>> {}.", worker);"). And this method transparently locks the worker lock.
>>>>
>>>> The more I think about it, the more I feel that the problem comes from
>>>> the fact that, AFAIU, logback does not release the lock on its
>>>> internal (shared) logger before calling a method on a not controlled,
>>>> externally-provided, potentially shared, object (i.e., worker). Would
>>>> logback 1) call worker.toString(), producing a String, and 2) acquire
>>>> the lock and print the string, then the deadlock would not happen. As
>>>> is illustrated by the fact that replacing the above logging statement
>>>> by the apparently very similar "s_logger.debug("Current worker status
>>>> (not deadlocking) is: {}.", worker.toString());" does not trigger the
>>>> deadlock any more.
>>>>
>>>> I would still be very interested by any proposal to change the
>>>> attached demonstration code to avoid triggering the deadlock (other
>>>> than systematically releasing any internal lock before issuing any
>>>> logging statement, which I hope is not the only solution because not
>>>> very convenient).
>>>>
>>>> Once again thank you for your comments.
>>>> Olivier
>>>> Chris Miller a écrit :
>>>>
>>>>> What's happening here, as I guess you're aware, is that multiple
>>>>> locks are not being aquired in the same fixed order globally - a
>>>>> situation prone to deadlocking. But on top of that, Logback is
>>>>> unintentially executing a denial-of-service attack on your app by
>>>>> locking the Worker instance. The reason this can happen is because
>>>>> your Worker code is synchronizing on the Worker object itself. That
>>>>> object is exposed to the outside world and therefore you have no
>>>>> control over who else can lock on it and therefore no way to prevent
>>>>> possible deadlocks or a plain denial-of-service.
>>>>>
>>>>> Best practices is to never use "synchronized(this) { ... }" or
>>>>> synchronize entire methods (which amounts to the same thing) in your
>>>>> code. Instead your Worker class should have a private lock object of
>>>>> its own that you synchronize on. This makes the lock object
>>>>> inaccessible to the outside world and therefore impossible (without
>>>>> malicious reflection at least!) to cause a denial-of-service
>>>>> situation like you are experiencing.
>>>>>
>>>>> eg:
>>>>>
>>>>> public class Worker {
>>>>> private static final Logger logger =
>>>>> LoggerFactory.getLogger(Worker.class);
>>>>> private final Object LOCK = new Object();
>>>>> public void work() {
>>>>> synchronized(LOCK) {
>>>>> ...
>>>>> logger.info("Did some work...");
>>>>> }
>>>>> ...
>>>>> }
>>>>> Note that the 'final' is essential for the LOCK object declaration
>>>>> (and really for the logger variable too). Without final, you risk
>>>>> different threads locking on different objects and visibility issues
>>>>> between threads. In your implementation of getStatus() { ... } you
>>>>> declare a final StringBuffer though - I guess you're aware that
>>>>> doesn't have any impact on the threadsafety of your code?
>>>>>
>>>>>> Dear list,
>>>>>>
>>>>>> I recently had a deadlock problem related to logging and I am
>>>>>> wondering if it comes from a problem in the logging framework (I
>>>>>> must say I doubt it) or from the way I use logging, in which case
>>>>>> I'd like to hear advices about how others avoid this kind of
>>>>>> problems.
>>>>>>
>>>>>> The pattern is the following. (Please see hereby the java classes
>>>>>> which
>>>>>> might be clearer.)
>>>>>> - In a class "Worker", a method "work()" locks the object instance
>>>>>> of
>>>>>> the class Worker to do some work and calls the logging framework to
>>>>>> print the result (while still holding the lock).
>>>>>> - In the same class "Worker", an other method "getStatus()" obtains
>>>>>> the
>>>>>> current status of the object instance of the class Worker and
>>>>>> returns
>>>>>> it
>>>>>> as a String (method used for logging). That method locks the object
>>>>>> at
>>>>>> some point to make sure it reads consistent internal status
>>>>>> information.
>>>>>> - In an other class "Accessor", a method "access()" performs some
>>>>>> work
>>>>>> (not locking anything), then logs the status of an object of the
>>>>>> class
>>>>>> Worker by using its getStatus() method.
>>>>>> Now the deadlock scenario is the following.
>>>>>> - A thread "WorkerThread" is executing the method "work()". It has
>>>>>> acquired the lock for the Worker object and has not reached the
>>>>>> logging
>>>>>> statement yet.
>>>>>> - An other thread "AccessorThread" is executing the method
>>>>>> "access()",
>>>>>> and is executing the logging statement.
>>>>>> AFAIU, logback console logger has some sort of "central" locking
>>>>>> point:
>>>>>> when a thread is logging, an other thread may not log in parallel
>>>>>> (that
>>>>>> seems good as the user does not want his different logging
>>>>>> statements
>>>>>> to
>>>>>> be mixed together)... Thus, the following happens.
>>>>>> - AccessorThread has a lock to the Logger, and waits to acquire the
>>>>>> Worker lock to execute the synchronized block in getStatus().
>>>>>> - WorkerThread is holding that precise lock, and will never release
>>>>>> it,
>>>>>> because arriving at the logging statement in "work()", it starts
>>>>>> waiting
>>>>>> for the Logger lock acquired by AccessorThread. These two threads
>>>>>> are
>>>>>> caught in a deadlock.
>>>>>> I feel the problem comes from not respecting the general rule that a
>>>>>> block of code holding a lock should not call a method on an
>>>>>> externally
>>>>>> controlled object (possibly shared with other threads) whose locking
>>>>>> behavior is unknown. (Note that both my code and the logging
>>>>>> framework
>>>>>> do not respect this rule, AFAIK.) So I should not have issued the
>>>>>> log
>>>>>> statement in my synchronized block in "work()". In my case, I can
>>>>>> remove this logging statement from the synchronized block, but it is
>>>>>> not clear to me whether it is always possible (or easy) to do so,
>>>>>> especially if the thing you want to log depends on an internal state
>>>>>> which might change as soon as you release the lock.
>>>>>> An other way to solve that problem is to not use the parametrized
>>>>>> message system in SLF4J. Indeed, it would disable the possibility of
>>>>>> the AccessorThread to try to acquire the lock on Worker while being
>>>>>> in the SLF4J method.
>>>>>>
>>>>>> I am wondering however if there is not a way that slf4j
>>>>>> implementation (logback) release the lock on its logger, then gets
>>>>>> the Worker lock, then re-acquire the lock on its logger only after
>>>>>> having released the Worker lock, so that it has only one lock at a
>>>>>> time? I tend to think that it would enable a user of the logging
>>>>>> framework to pay no attention of the deadlock possibilities incurred
>>>>>> by the logging framework, and to avoid him introducing subtle bugs
>>>>>> with logging statements... But I may be wrong and I am not an expert
>>>>>> in multithread programming.
>>>>>>
>>>>>> More generally speaking, what would be the advices from experienced
>>>>>> programmers to avoid this problem? How do you log in a
>>>>>> multithreading context?
>>>>>>
>>>>>> Also I am wondering if that (anti-)pattern could be documented on
>>>>>> the logback page, to avoid others to do the same mistake as me? Or
>>>>>> even on the SLF4J page, if it is not logback-specific?
>>>>>>
>>>>>> The attached classes provide a test case showing the deadlock
>>>>>> scenario. The test fails reliably on my box.
>>>>>>
>>>>>> Thank you for any help.
>>>>>> Olivier
> _______________________________________________
> Logback-user mailing list
> Logback-user at qos.ch
> http://qos.ch/mailman/listinfo/logback-user