[logback-user] AsyncAppenderBase not flushing queue during JVM shutdown

Thu Mar 6 19:48:01 CET 2014

Hi David, Chris,

Thanks for the feedback. In the test case that I ran with the
AsyncAppender, LoggerContext.stop did not wait for the async appender to
finish flushing before exiting - the stop method returned immediately. I
just retested this scenario to be sure. My test should generate exactly 135
log events (also going to a local file appender, so I can confirm that
entries are present or missing). Without the async appender I get all of
the expected events (at much lower throughput); with the async appender
(using the call to LoggerContext.stop prior to exiting) I get a variable
number (generally less than 20 events). It's possible that the lower
latency of flushing to local files results in the async appender being able
to flush completely to disk before the JVM exits, whereas the latency to
Loggly is just too high. Queue size probably also has an impact - I allow
for a fairly large queue because the penalty of falling back to synchronous
logging for Loggly is large and my logging tends to come in bursts.

If stop() did wait for async appenders to flush, that would improve the
situation. Although the use-case that involves shutdown hooks that want to
log would still need to be addressed. The only way that I can see that does
address allowing shutdown hooks to log is via a queue flushing shutdown
hook (requiring a call to LoggerContext.stop won't help for shutdown hooks
or applications that use JVM implicit shutdowns). My opinion is that a
shutdown hook can satisfy both needs with minimal impact to other classes
or semantics on how LoggerContext.stop functions. One hook per async
appender would be simple to implement and would also allow for parallel
flushing of async appender queues.

A default shutdown timeout as you described makes sense to me and fits with
what I was thinking for implementation of a shutdown hook.

I could probably throw together a quick and dirty implementation tonight as
a proof of concept. Thoughts?

Regards,

Mike Reinhold

On Thu, Mar 6, 2014 at 1:01 PM, Chris Pratt <thechrispratt at gmail.com> wrote:

> Another option might be to require calling the LoggerContext.stop when
> using AsyncAppender's.  The stop method could signal to the AsyncAppenders
> that they should drain their queue's and exit, which would allow the
> application to shut down naturally (if the AsyncAppenders aren't started as
> demon's)
>   (*Chris*)
>
>
> On Thu, Mar 6, 2014 at 9:55 AM, David Roussel <nabble at diroussel.xsmail.com
> > wrote:
>
>> Mike,
>>
>> I would expect the shutdown of logback to wait for async appenders to
>> drain their queues before we can consider the shutdown down.
>>
>> I have used async appenders to provide faster logging when writing to
>> local files. And that's what I expected happens.
>>
>> Of course there is the problem of waiting for too long. How about an
>> async appender has a default shutdown timeout which can be overridden by
>> config. So it can wait 2 seconds by default, but can be overridden to 30
>> seconds for the loggly usecase.
>>
>> The simple solution is to put this all in AsyncAppender(Base), but if we
>> want to allow multiple async appenders to shutdown in parallel, then the
>> fix might involve more changes.
>>
>> David
>>
>> On 6 Mar 2014, at 12:51, Michael Reinhold <mike at coursescheduler.io>
>> wrote:
>>
>> Hi David,
>>
>> I forgot to mention this in my original note and it is a good thought,
>> but yes I have tried to cleanly shutdown Logback. Even when the
>> LogbackContext stop method is invoked prior to JVM exit, the async queue is
>> not flushed completely. When Logback is stopped cleanly, the async worker
>> thread exits the "waiting for events in the queue" while loop and moves on
>> to the "flush remaining elements from the queue" for loop (confirmed by the
>> info message when debug is enabled and via breakpoint). Ultimately, because
>> the thread is a daemon thread - this process is still interrupted by JVM
>> shutdown. There is no guarantee that the async worker will get enough time
>> to process all events.
>>
>> From a theoretical standpoint, stopping Logback properly does not solve
>> flushing issues with the AsycAppender in a number of cases (it may be fair
>> to argue that it should not attempt to cover all of these cases):
>>
>>    - Queue is relatively full or wrapped appender latency is high - the
>>    async worker may not be able to finish processing the queued items between
>>    when the LogbackContext.stop method is invoked and when the JVM actually
>>    exits
>>    - Implicit JVM exit upon end of last user thread - some applications
>>    don't explicitly call System.exit but rather rely on implicit JVM exit. In
>>    that case, where would you cleanly shutdown Logback? It can probably be
>>    done, but would require an application redesign to avoid reliance on
>>    implicit exit or some form of thread tracking thread.
>>    - Shutdown hooks are used to clean up resources - some applications
>>    have components or modules that need to clean up after themselves when the
>>    JVM shuts down. Typically, shutdown hooks are used for this purpose,
>>    however the AsyncAppender's worker thread is unable to process queued
>>    events after shutdown has been initiated (while the shutdown hooks are
>>    running). This also prevents shutdown hooks from being able to log events
>>    - Signal handling on *nix systems - signals can be sent to the
>>    application for a number of reasons, most often to request graceful
>>    shutdown or to terminate something that is non-responsive. If the signal
>>    corresponds to something that generally means "graceful shutdown" on the
>>    host system, normal cleanup routines should execute in the application
>>    (typically implemented as shutdown hooks, so maybe this is the same as the
>>    previous one).
>>
>> From a practical standpoint, I am running into the first and the third
>> scenarios. Even shutting down cleanly, the latency and queue depth can
>> cause elements to be missed; additionally, I have shutdown hooks that clean
>> up resources and expect the ability to log events and errors.
>>
>> My thought is that a modified AsyncAppenderBase implementation could
>> (optionally) install a shutdown hook that continues watching the queue for
>> new events. A configurable "queue idle time" could be used to tell the
>> shutdown hook that it has probably caught all events and can safely
>> shutdown (not foolproof, I'm still thinking about other ways of handling
>> it). Additionally, the hook could have a configurable "max processing time"
>> so that it doesn't keep the JVM alive for an undesirable amount of time
>> (thinking in the context of automation software where shutdown requests
>> typically have a max execution time before they are considered as "failed
>> online").
>>
>> I hope this helps clarify the scenario I am considering and a proposed
>> solution!
>>
>> Regards,
>>
>> Mike Reinhold
>>
>>
>> On Wed, Mar 5, 2014 at 5:30 PM, David Roussel <
>> nabble at diroussel.xsmail.com> wrote:
>>
>>> Did you try shutting down logback cleanly. Like this
>>> http://stackoverflow.com/questions/3678755/do-i-need-to-flush-events-when-shutting-down-using-logback
>>>
>>> David
>>>
>>> On 5 Mar 2014, at 20:44, Michael Reinhold <mike at coursescheduler.io>
>>> wrote:
>>>
>>> Hi Ceki,
>>>
>>> I am currently using the AsyncAppender in combination with the
>>> LogglyAppender from the Logback extensions project. While working on a
>>> number of aspects of my application, I found that I was not consistently
>>> getting all of the log messages that I expected. In particular, when the
>>> application shuts down immediately (or very shortly) after a burst of
>>> logging activity, the tail of those log events is often not present in
>>> Loggly. From a number of tests, this issue is not restricted to use with
>>> the LogglyAppender, but is simply more evident because of the latency
>>> involved.
>>>
>>> Looking through the source code for the AsyncAppenderBase, I saw the
>>> following:
>>>
>>> You create the Async sender thread as a Daemon thread
>>>
>>>
>>> addInfo("Setting discardingThreshold to " + discardingThreshold);
>>>     worker.setDaemon(true);
>>>     worker.setName("AsyncAppender-Worker-" + worker.getName());
>>>     // make sure this instance is marked as "started" before staring the
>>> worker Thread
>>>     super.start();
>>>     worker.start();
>>>
>>>
>>> In the sender thread, if you determine that the parent thread has
>>> stopped or the async sender thread has been interrupted, you allow the
>>> worker thread to flush remaining log events in the queue.
>>>
>>>
>>> while (parent.isStarted()) {
>>>         try {
>>>           E e = parent.blockingQueue.take();
>>>           aai.appendLoopOnAppenders(e);
>>>         } catch (InterruptedException ie) {
>>>           break;
>>>         }
>>>       }
>>>
>>>       addInfo("Worker thread will flush remaining events before exiting.
>>> ");
>>>       for (E e : parent.blockingQueue) {
>>>         aai.appendLoopOnAppenders(e);
>>>       }
>>>
>>>       aai.detachAndStopAllAppenders();
>>>
>>>
>>> From what I can tell, during JVM shutdown (for a standalone SE instance,
>>> the same is probably not true for EE instances with application servers)
>>> daemon threads may be terminated without allowing the the AsyncAppenderBase
>>> to escape the while loop and proceed onto the queue flush for loop.
>>>
>>> From Brian Goetz in Java Concurrency in Practice:
>>> "When a thread exits, the JVM performs an inventory of running threads,
>>> and if the only threads that are left are daemon threads, it initiates an
>>> orderly shutdown. When the JVM halts, any remaining daemon threads are
>>> abandoned finally blocks are not executed, stacks are not unwound the JVM
>>> just exits. Daemon threads should be used sparingly few processing
>>> activities can be safely abandoned at any time with no cleanup. In
>>> particular, it is dangerous to use daemon threads for tasks that might
>>> perform any sort of I/O. Daemon threads are best saved for "housekeeping"
>>> tasks, such as a background thread that periodically removes expired
>>> entries from an in-memory cache."
>>>
>>> To test this, I inserted a break point in the AsyncAppenderBase at the
>>> call to addInfo and ran a variety of different scenarios. At no point in
>>> time was I able to get the breakpoint to stop at the addInfo line.
>>>
>>> I don't think there are any clear cut solutions to this. Making the
>>> worker thread a user thread instead of daemon thread has its own
>>> implications, particularly that if all other user threads have exited the
>>> async threads would keep the JVM instance alive (unless System.exit has
>>> been called, in which case I believe that you will still have lost log
>>> events even if the async processing thread is not a daemon). It might be
>>> possible to create a shutdown hook that does the queue flushing for the
>>> async worker - though you may need to consider the possibility of other
>>> shutdown hooks wanting to log events as well.
>>>
>>> I'm currently mocking up a version of the AsyncAppenderBase and
>>> AsyncAppender that installs a shutdown hook as described previously. I
>>> think a "queue idle time" configuration might be the best way to handle
>>> other shutdown hooks adding log events (aka - after processing whatever was
>>> in the queue, if no new events are added within *x* ms then the
>>> shutdown hook can assume nothing else will be adding log events and can
>>> exit).  An alternative might be to have the shutdown hook query the
>>> ThreadMXBean API to determine if other shutdown hooks are still executing
>>> and possibly adding log events (though the threads that are expected to be
>>> running during shutdown may be not only application specific but also JVM
>>> implementation specific... I'm not sure).
>>>
>>> Let me know what you think. I'll let you know if I feel that my mockup
>>> may be viable...
>>>
>>> Regards,
>>>
>>> Mike Reinhold
>>>
>>> _______________________________________________
>>> Logback-user mailing list
>>> Logback-user at qos.ch
>>> http://mailman.qos.ch/mailman/listinfo/logback-user
>>>
>>>
>>> _______________________________________________
>>> Logback-user mailing list
>>> Logback-user at qos.ch
>>> http://mailman.qos.ch/mailman/listinfo/logback-user
>>>
>>
>> _______________________________________________
>> Logback-user mailing list
>> Logback-user at qos.ch
>> http://mailman.qos.ch/mailman/listinfo/logback-user
>>
>>
>> _______________________________________________
>> Logback-user mailing list
>> Logback-user at qos.ch
>> http://mailman.qos.ch/mailman/listinfo/logback-user
>>
>
>
> _______________________________________________
> Logback-user mailing list
> Logback-user at qos.ch
> http://mailman.qos.ch/mailman/listinfo/logback-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.qos.ch/pipermail/logback-user/attachments/20140306/6a9d6426/attachment-0001.html>