[logback-dev] [JIRA] Commented: (LBCORE-97) Starvation on AppenderBase.doAppend

Ceki Gulcu ceki at qos.ch
Mon Jun 1 21:24:38 CEST 2009



I would just like to observe that when FileAppender writes an event to
a file, that event must be written out completely as a single
block. We can't have event A with an associated exception
intermingled with event B in the middle of A's stack trace.
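
For illustration only (the class and method names below are invented,
this is not the actual FileAppender code), the constraint amounts to
doing all layout work outside the lock and then writing the fully
formatted event as a single block while the lock is held:

    import java.io.IOException;
    import java.io.Writer;

    class BlockWritingAppender {
        private final Object writeLock = new Object();
        private final Writer writer;

        BlockWritingAppender(Writer writer) {
            this.writer = writer;
        }

        void append(Object event) throws IOException {
            // All layout work (message, stack trace, ...) happens outside the lock.
            String block = format(event);
            // The lock is held only for the single write, so event A's stack
            // trace can never be interleaved with event B.
            synchronized (writeLock) {
                writer.write(block);
                writer.flush();
            }
        }

        private String format(Object event) {
            // Placeholder layout; a real appender would use its Layout here.
            return String.valueOf(event) + "\n";
        }
    }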

I'll look into Joern's example tomorrow.

ralph.goers @dslextreme.com wrote:
> 
> 
> On Mon, Jun 1, 2009 at 6:51 AM, Joern Huxhorn <jhuxhorn at googlemail.com> wrote:
> 
>     But locking will occur at some point and has to be done fairly.
>     Otherwise starvation will happen on multi-core systems.
> 
> 
> Not if the synchronized blocks take a minimal amount of time. My 
> modification to your sample app demonstrated that.
> 
> 
>     I can't imagine an appender that doesn't have to lock at *some*
>     point. Do you have any use case for this?
> 
> 
> Of course they will have to lock, possibly a few times. But if each lock 
> is for a very short duration the system will behave much better.
>  
> 
> 
> 
>     It would be possible to generally use a fixed-size blocking queue
>     (fair!) in case of UnsynchronizedAppenderBase, leaving the async
>     handling of events put into the queue to the actual implementation.
>     Assuming, of course, detached data objects again.
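
To make that idea concrete, a rough sketch of a fair, fixed-size queue
feeding a worker thread (the class name and the String-typed "detached
event" are invented for illustration, not an actual logback class):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    class QueueingAppenderSketch {
        // Fair: producer threads blocked on a full queue are served in FIFO order.
        private final BlockingQueue<String> queue =
                new ArrayBlockingQueue<String>(1024, true);

        private final Thread worker = new Thread(new Runnable() {
            public void run() {
                try {
                    while (true) {
                        write(queue.take()); // async handling left to the implementation
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "appender-worker");

        QueueingAppenderSketch() {
            worker.setDaemon(true);
            worker.start();
        }

        // Called from doAppend(); blocks (fairly) when the queue is full.
        void append(String detachedEvent) throws InterruptedException {
            queue.put(detachedEvent);
        }

        private void write(String event) {
            System.out.print(event);
        }
    }

Shutdown and flush handling are deliberately omitted here; that is
exactly the "events lost when the VM exits" concern raised further down.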
> 
>     What I really want is *one* correct way of handling/implementing
>     async appenders so the possibility of failure is reduced as much as
>     possible. Nothing is worse than a logging system that crashes the
>     logged application.
> 
> 
> We agree on that.
> 
> 
> 
> 
>             I'm simulating the AppenderBase situation here, lots of
>             threads are trying to access the very same resource.
>             With Java 1.6 on multi-core systems, only <number of cores>
>             threads will ever get the chance to obtain the object
>             monitor. All other threads are starving.
>             And this is a valid simplification because it shows exactly
>             what's happening in our webapp right now. We had to disable
>             logging to get it back into a usable state.
> 
>             Using a fair ReentrantLock fixes this. It makes sure that
>             threads are handled fairly concerning the shared resource,
>             i.e. they are served in a "first come, first served" order.
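
For reference, the fair-lock variant of such a contended shared resource
amounts to something like this sketch (the class is invented for
illustration):

    import java.util.concurrent.locks.ReentrantLock;

    class SharedResource {
        // true = fair: the longest-waiting thread acquires the lock next,
        // so no thread starves even with many cores and many contenders.
        private final ReentrantLock lock = new ReentrantLock(true);
        private long counter;

        void access() {
            lock.lock();
            try {
                counter++; // stands in for the work on the shared resource
            } finally {
                lock.unlock();
            }
        }
    }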
> 
> 
>         The issue here isn't really that there is a problem with how
>         synchronized locks work. If there were Tomcat would fall on its
>         face. Take a look at
>         https://svn.apache.org/repos/asf/tomcat/trunk/java/org/apache/catalina/loader/WebappClassLoader.java.
>         I guarantee you this is being called far more often than logging
>         is and it doesn't exhibit these problems.
> 
> 
>     The difference is that WebappClassLoader wouldn't suffer from the
>     starvation effect because threads (at least if they are ordinary
>     threads that don't perform manual classloading magic) will load all
>     classes in a burst and then just execute. In that case, the
>     optimization done in Java 6 will even increase the performance and
>     there is no point in using a fair lock.
> 
> 
> Your assumption here is wrong. Every call to new calls the class loader. 
> In most applications objects are created far more than logging is called.
>  
> 
> 
> 
>     Logging, however, is potentially executed all the time during the
>     life of the thread(s), probably even in a tight loop. This is where
>     starvation kicks in.
> 
> 
> That is no surprise to me.
>  
> 
> 
> 
>         The real culprit here is how much time is being spent inside the
>         synchronized block. When you look at the WebappClassLoader you
>         will see that locks are held for very short periods of time.
>         This is contrasted with Appenders that hold the lock for an
>         extremely long period of time.
> 
> 
>     See *.
> 
> 
> 
> 
>         To prove this I simply added
> 
>         for (int i = 0; i < 100000; ++i) { if (lock == null) { throw new
>         NullPointerException(); } }
> 
>         before the synchronized or locked block (using lockObject
>         instead of lock for the synchronized block). While the counters
>         don't come out equal they are very close. This shows that the
>         problem is really how much time is spent in the synchronized
>         block vs the actual application.
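
A self-contained variant of that experiment could look roughly like the
sketch below (thread count and run time are arbitrary; the busy loop
stands in for application work done outside the lock):

    public class ContentionDemo {
        private static final Object lockObject = new Object();

        public static void main(String[] args) throws InterruptedException {
            final int threads = 8;
            final long[] counts = new long[threads];
            Thread[] workers = new Thread[threads];
            for (int t = 0; t < threads; t++) {
                final int id = t;
                workers[t] = new Thread(new Runnable() {
                    public void run() {
                        long end = System.currentTimeMillis() + 5000;
                        while (System.currentTimeMillis() < end) {
                            // "application work" done outside the lock
                            for (int i = 0; i < 100000; ++i) {
                                if (lockObject == null) {
                                    throw new NullPointerException();
                                }
                            }
                            synchronized (lockObject) {
                                counts[id]++; // very short critical section
                            }
                        }
                    }
                });
                workers[t].start();
            }
            for (int t = 0; t < threads; t++) {
                workers[t].join();
            }
            for (int t = 0; t < threads; t++) {
                System.out.println("thread " + t + ": " + counts[t]);
            }
        }
    }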
> 
> 
>     I don't understand this example. The whole point of the program is
>     to show the starvation effect. The Thread.sleep is only included to
>     be able to interrupt the thread and simulate some work.
> 
> 
> Yes. And my point was to show that if your program logic is even 
> somewhat substantial compared to the synchronized block, the 
> starvation effect goes away. So changing Appenders from a single 
> lock held for a long time to several locks held for short durations 
> would minimize or eliminate this effect.
>  
> 
> 
> 
> 
> 
>         The significance of this is that even when you "fix" the locking
>         so it is fair your application performance is still going to
>         suck because it is spending so much time in the Appender vs
>         doing actual work.
> 
> 
>     See *.
> 
> 
> 
> 
> 
>             I assume you hadn't understood the point of LBCORE-97 until
>             now... and please don't take that as an insult ;)
> 
> 
>         I do understand it. I just don't think it is the right
>         way to fix the problem.
> 
> 
>             I wrote code under the assumption that synchronized would be
>             at least sufficiently semi-fair, Ceki did in both LOG4J and
>             Logback, and reality up until Java 1.6 on multi-core seemed
>             to indicate that it's ok to assume that total fairness isn't
>             necessary... but all this changed because synchronized is
>             absolutely unfair now - and the documentation shows that the
>             fault is on our side. We were simply working with wrong
>             assumptions.
> 
>             This just proves my point that locking should be handled by
>             the logging system in a correct, i.e. fair, way. It is
>             "downright dangerous" to assume that every person who
>             implements an appender is aware of this situation, especially
>             since the problem will only show up on Java 1.6 multi-core
>             systems. This is a *very* big risk.
> 
>             But I agree 100% that the transformation/evaluation of the
>             logging event can and should be handled outside of whatever
>             fair locking mechanism is used. This would solve some issues
>             and seems very logical. We should definitely aim for this in
>             the future... Ideally, AppenderBase would be changed to
>             support this, IMHO.
> 
>             It would probably be a better idea to use a
> 
>             protected final ReentrantReadWriteLock readWriteLock =
>                 new ReentrantReadWriteLock(true);
> 
>             fair ReentrantReadWriteLock in AppenderBase instead.
>             doAppend would then use the write lock while actual
>             implementations would have the freedom of using either read
>             or write lock, as needed. This would remove the need of
>             synchronizing the worker queue you mentioned earlier. I'm
>             not sure if we really need a ReadWriteLock, though, as
>             removal from the queue would be a write, too. But *some*
>             appender implementation could have a use case for
>             read-only-access...
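
Spelled out, that could look roughly like the following sketch
(illustrative names only, not a patch against the real AppenderBase):

    import java.util.concurrent.locks.ReentrantReadWriteLock;

    abstract class FairLockAppenderSketch<E> {
        // fair = true: waiting threads obtain the lock roughly in arrival order
        protected final ReentrantReadWriteLock readWriteLock =
                new ReentrantReadWriteLock(true);

        public void doAppend(E event) {
            readWriteLock.writeLock().lock();
            try {
                append(event);
            } finally {
                readWriteLock.writeLock().unlock();
            }
        }

        // Subclasses are free to take the read or the write lock themselves
        // to guard whatever internal state they have.
        protected abstract void append(E event);
    }

Here doAppend takes the write lock for the whole append, which keeps the
current AppenderBase semantics while making the ordering fair; whether
the read/write distinction buys anything is the open question above.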
> 
>             One way to handle transformation of the event outside the
>             lock is what I suggested in
>             http://jira.qos.ch/browse/LBCLASSIC-45.
>             Just dumb down LoggingEvent enough so that it already is
>             detached from the rest of the system after it's created. But
>             someone, and I honestly can't remember but I think it was
>             you, said that it would be bad if the MDC was already
>             transformed into Map<String,String> at that point...
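
Roughly what such a detached event could look like, sketched with
invented fields; the MDC is copied into a plain Map<String,String> when
the event is constructed:

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;

    final class DetachedLoggingEvent {
        private final long timestamp;
        private final String loggerName;
        private final String formattedMessage;
        private final Map<String, String> mdcCopy;

        DetachedLoggingEvent(String loggerName, String formattedMessage,
                             Map<String, String> mdc) {
            // Everything the appender needs is captured here, so the event
            // no longer references the rest of the logging system.
            this.timestamp = System.currentTimeMillis();
            this.loggerName = loggerName;
            this.formattedMessage = formattedMessage;
            this.mdcCopy = Collections.unmodifiableMap(
                    new HashMap<String, String>(mdc));
        }

        public long getTimestamp() { return timestamp; }
        public String getLoggerName() { return loggerName; }
        public String getFormattedMessage() { return formattedMessage; }
        public Map<String, String> getMdcCopy() { return mdcCopy; }
    }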
> 
>             I've just changed my patch to make lock protected instead of
>             private. Seems to be a much better idea :p
> 
>             And just to get my point straight once more:
>             A logging system MUST NOT be a point of starvation! It will
>             naturally always be a chokepoint but it MUST NEVER be a
>             point of starvation!
> 
>         On this we agree. If switching to a ReentrantReadWriteLock fixes
>         the problem for AppenderBase then I certainly don't have a
>         problem with that. Personally, I see it as sort of a band-aid
>         though. I'd like to go through and deal with all the code that
>         isn't thread-safe and move the locking down to just those areas
>         where it can't be avoided. Every reference you can find on
>         multithreaded programming will tell you to hold locks for as
>         short a duration as possible. Logback and Log4j don't do that
>         which opens them up to these kinds of issues.
> 
> 
>     See * ;)
> 
>     That's correct. And yes, it's some kind of a band-aid, albeit a very
>     important one.
>     Async logging has other downsides, though, as it will be unable to
>     make sure that events are actually appended before the VM
>     exits/crashes, for example. This *can* be a real downside.
> 
> 
> I'm not talking about synchronous vs asynchronous logging. All appenders 
> should be modified to use UnsynchronizedAppenderBase. This is a fair 
> amount of work but will eliminate most, if not all, of these problems.
> 
> 
> 
>     So in some cases, e.g. file appender, you'll *want* synchronous
>     logging, no matter what the costs are. Moving the synchronization to
>     a lower level wouldn't help much, IMO, since it would simply move
>     the congestion, and I don't really care whether it happens while
>     converting events or while actually writing the data to a file, if
>     writing is where the performance impact lies. The application wouldn't
>     be able to handle other logging, anyway.
>     Unless I miss some important point, feel free to prove me wrong.
> 
> 
> In the case of file appenders, the congestion will end up occurring in 
> java.io somewhere. This is a good thing. It avoids 
> deadlocks, and any performance analysis will show the bottleneck as 
> being there instead of in Logback.
>  
> 
> 
> 
>     Take a look at
>     http://apps.sourceforge.net/trac/lilith/wiki/AppenderPitfalls if you
>     haven't already and let me know if you find any mistakes, please.
> 
> 
> Sure, when I get a chance this evening.
> 
> 
> 
>     My big pain-point with UnsynchronizedAppenderBase is also its main
>     feature: it leaves all the locking to the respective implementation.
>     All I'm trying to say is that it's very easy to do that part wrong
>     and that it won't be apparent if it *is* done wrong.
> 
> 
> You'd be surprised how little locking is actually required. Before 
> condemning this approach I would suggest you take a look at what it 
> would actually require.
> 
> 
> Ralph

-- 
Ceki Gülcü
Logback: The reliable, generic, fast and flexible logging framework for Java.
http://logback.qos.ch

