[logback-user] AsyncAppenderBase not flushing queue during JVM shutdown
champion
champion at 163.com
Fri Mar 7 07:42:55 CET 2014
I've writen a faster AsyncAppender which use disruptor,and run stable in my project for years.the source is at https://github.com/cp149/jactor-logger, Maybe it can give you some help.
champion
From: Michael Reinhold
Date: 2014-03-07 02:48
To: logback users list
Subject: Re: [logback-user] AsyncAppenderBase not flushing queue during JVM shutdown
Hi David, Chris,
Thanks for the feedback. In the test case that I ran with the AsyncAppender, LoggerContext.stop did not wait for the async appender to finish flushing before exiting - the stop method returned immediately. I just retested this scenario to be sure. My test should generate exactly 135 log events (also going to a local file appender, so I can confirm that entries are present or missing). Without the async appender I get all of the expected events (at much lower throughput); with the async appender (using the call to LoggerContext.stop prior to exiting) I get a variable number (generally less than 20 events). It's possible that the lower latency of flushing to local files results in the async appender being able to flush completely to disk before the JVM exits, whereas the latency to Loggly is just too high. Queue size probably also has an impact - I allow for a fairly large queue because the penalty of falling back to synchronous logging for Loggly is large and my logging tends to come in bursts.
If stop() did wait for async appenders to flush, that would improve the situation. Although the use-case that involves shutdown hooks that want to log would still need to be addressed. The only way that I can see that does address allowing shutdown hooks to log is via a queue flushing shutdown hook (requiring a call to LoggerContext.stop won't help for shutdown hooks or applications that use JVM implicit shutdowns). My opinion is that a shutdown hook can satisfy both needs with minimal impact to other classes or semantics on how LoggerContext.stop functions. One hook per async appender would be simple to implement and would also allow for parallel flushing of async appender queues.
A default shutdown timeout as you described makes sense to me and fits with what I was thinking for implementation of a shutdown hook.
I could probably throw together a quick and dirty implementation tonight as a proof of concept. Thoughts?
Regards,
Mike Reinhold
On Thu, Mar 6, 2014 at 1:01 PM, Chris Pratt <thechrispratt at gmail.com> wrote:
Another option might be to require calling the LoggerContext.stop when using AsyncAppender's. The stop method could signal to the AsyncAppenders that they should drain their queue's and exit, which would allow the application to shut down naturally (if the AsyncAppenders aren't started as demon's)
(*Chris*)
On Thu, Mar 6, 2014 at 9:55 AM, David Roussel <nabble at diroussel.xsmail.com> wrote:
Mike,
I would expect the shutdown of logback to wait for async appenders to drain their queues before we can consider the shutdown down.
I have used async appenders to provide faster logging when writing to local files. And that's what I expected happens.
Of course there is the problem of waiting for too long. How about an async appender has a default shutdown timeout which can be overridden by config. So it can wait 2 seconds by default, but can be overridden to 30 seconds for the loggly usecase.
The simple solution is to put this all in AsyncAppender(Base), but if we want to allow multiple async appenders to shutdown in parallel, then the fix might involve more changes.
David
On 6 Mar 2014, at 12:51, Michael Reinhold <mike at coursescheduler.io> wrote:
Hi David,
I forgot to mention this in my original note and it is a good thought, but yes I have tried to cleanly shutdown Logback. Even when the LogbackContext stop method is invoked prior to JVM exit, the async queue is not flushed completely. When Logback is stopped cleanly, the async worker thread exits the "waiting for events in the queue" while loop and moves on to the "flush remaining elements from the queue" for loop (confirmed by the info message when debug is enabled and via breakpoint). Ultimately, because the thread is a daemon thread - this process is still interrupted by JVM shutdown. There is no guarantee that the async worker will get enough time to process all events.
From a theoretical standpoint, stopping Logback properly does not solve flushing issues with the AsycAppender in a number of cases (it may be fair to argue that it should not attempt to cover all of these cases):
Queue is relatively full or wrapped appender latency is high - the async worker may not be able to finish processing the queued items between when the LogbackContext.stop method is invoked and when the JVM actually exits
Implicit JVM exit upon end of last user thread - some applications don't explicitly call System.exit but rather rely on implicit JVM exit. In that case, where would you cleanly shutdown Logback? It can probably be done, but would require an application redesign to avoid reliance on implicit exit or some form of thread tracking thread.
Shutdown hooks are used to clean up resources - some applications have components or modules that need to clean up after themselves when the JVM shuts down. Typically, shutdown hooks are used for this purpose, however the AsyncAppender's worker thread is unable to process queued events after shutdown has been initiated (while the shutdown hooks are running). This also prevents shutdown hooks from being able to log events
Signal handling on *nix systems - signals can be sent to the application for a number of reasons, most often to request graceful shutdown or to terminate something that is non-responsive. If the signal corresponds to something that generally means "graceful shutdown" on the host system, normal cleanup routines should execute in the application (typically implemented as shutdown hooks, so maybe this is the same as the previous one).
From a practical standpoint, I am running into the first and the third scenarios. Even shutting down cleanly, the latency and queue depth can cause elements to be missed; additionally, I have shutdown hooks that clean up resources and expect the ability to log events and errors.
My thought is that a modified AsyncAppenderBase implementation could (optionally) install a shutdown hook that continues watching the queue for new events. A configurable "queue idle time" could be used to tell the shutdown hook that it has probably caught all events and can safely shutdown (not foolproof, I'm still thinking about other ways of handling it). Additionally, the hook could have a configurable "max processing time" so that it doesn't keep the JVM alive for an undesirable amount of time (thinking in the context of automation software where shutdown requests typically have a max execution time before they are considered as "failed online").
I hope this helps clarify the scenario I am considering and a proposed solution!
Regards,
Mike Reinhold
On Wed, Mar 5, 2014 at 5:30 PM, David Roussel <nabble at diroussel.xsmail.com> wrote:
Did you try shutting down logback cleanly. Like this http://stackoverflow.com/questions/3678755/do-i-need-to-flush-events-when-shutting-down-using-logback
David
On 5 Mar 2014, at 20:44, Michael Reinhold <mike at coursescheduler.io> wrote:
Hi Ceki,
I am currently using the AsyncAppender in combination with the LogglyAppender from the Logback extensions project. While working on a number of aspects of my application, I found that I was not consistently getting all of the log messages that I expected. In particular, when the application shuts down immediately (or very shortly) after a burst of logging activity, the tail of those log events is often not present in Loggly. From a number of tests, this issue is not restricted to use with the LogglyAppender, but is simply more evident because of the latency involved.
Looking through the source code for the AsyncAppenderBase, I saw the following:
You create the Async sender thread as a Daemon thread
addInfo("Setting discardingThreshold to " + discardingThreshold);
worker.setDaemon(true);
worker.setName("AsyncAppender-Worker-" + worker.getName());
// make sure this instance is marked as "started" before staring the worker Thread
super.start();
worker.start();
In the sender thread, if you determine that the parent thread has stopped or the async sender thread has been interrupted, you allow the worker thread to flush remaining log events in the queue.
while (parent.isStarted()) {
try {
E e = parent.blockingQueue.take();
aai.appendLoopOnAppenders(e);
} catch (InterruptedException ie) {
break;
}
}
addInfo("Worker thread will flush remaining events before exiting. ");
for (E e : parent.blockingQueue) {
aai.appendLoopOnAppenders(e);
}
aai.detachAndStopAllAppenders();
From what I can tell, during JVM shutdown (for a standalone SE instance, the same is probably not true for EE instances with application servers) daemon threads may be terminated without allowing the the AsyncAppenderBase to escape the while loop and proceed onto the queue flush for loop.
From Brian Goetz in Java Concurrency in Practice:
"When a thread exits, the JVM performs an inventory of running threads, and if the only threads that are left are daemon threads, it initiates an orderly shutdown. When the JVM halts, any remaining daemon threads are abandoned finally blocks are not executed, stacks are not unwound the JVM just exits. Daemon threads should be used sparingly few processing activities can be safely abandoned at any time with no cleanup. In particular, it is dangerous to use daemon threads for tasks that might perform any sort of I/O. Daemon threads are best saved for "housekeeping" tasks, such as a background thread that periodically removes expired entries from an in-memory cache."
To test this, I inserted a break point in the AsyncAppenderBase at the call to addInfo and ran a variety of different scenarios. At no point in time was I able to get the breakpoint to stop at the addInfo line.
I don't think there are any clear cut solutions to this. Making the worker thread a user thread instead of daemon thread has its own implications, particularly that if all other user threads have exited the async threads would keep the JVM instance alive (unless System.exit has been called, in which case I believe that you will still have lost log events even if the async processing thread is not a daemon). It might be possible to create a shutdown hook that does the queue flushing for the async worker - though you may need to consider the possibility of other shutdown hooks wanting to log events as well.
I'm currently mocking up a version of the AsyncAppenderBase and AsyncAppender that installs a shutdown hook as described previously. I think a "queue idle time" configuration might be the best way to handle other shutdown hooks adding log events (aka - after processing whatever was in the queue, if no new events are added within x ms then the shutdown hook can assume nothing else will be adding log events and can exit). An alternative might be to have the shutdown hook query the ThreadMXBean API to determine if other shutdown hooks are still executing and possibly adding log events (though the threads that are expected to be running during shutdown may be not only application specific but also JVM implementation specific... I'm not sure).
Let me know what you think. I'll let you know if I feel that my mockup may be viable...
Regards,
Mike Reinhold
_______________________________________________
Logback-user mailing list
Logback-user at qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user
_______________________________________________
Logback-user mailing list
Logback-user at qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user
_______________________________________________
Logback-user mailing list
Logback-user at qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user
_______________________________________________
Logback-user mailing list
Logback-user at qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user
_______________________________________________
Logback-user mailing list
Logback-user at qos.ch
http://mailman.qos.ch/mailman/listinfo/logback-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.qos.ch/pipermail/logback-user/attachments/20140307/1a8bcd3a/attachment-0001.html>
More information about the Logback-user
mailing list