[logback-dev] [JIRA] Issue Comment Edited: (LBCORE-128) Please support implementation of binary log files in RollingFileAppender/FileAppender
Ceki Gulcu (JIRA)
noreply-jira at qos.ch
Fri Feb 19 10:36:33 CET 2010
[ http://jira.qos.ch/browse/LBCORE-128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11541#action_11541 ]
Ceki Gulcu edited comment on LBCORE-128 at 2/19/10 10:36 AM:
-------------------------------------------------------------
Thank you for providing these figures.
If I understand correctly, an event uses up 1.49 KB (kilobytes) of space without compression, which is reduced to 742 bytes after compression. I am also assuming, and this is pretty important, that each event is written in isolation to the output stream. In other words, you create a new ObjectOutputStream, write the event as an object and flush the stream. In compression mode, a new GZIPOutputStream is created for each event, and the ObjectOutputStream writes to the GZIPOutputStream. The finish() method of GZIPOutputStream is called after writing each event.
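Concretely, I am assuming something along these lines (a sketch of my assumption, not your actual code):

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

public class PerEventWriter {
  // Assumed scheme: each event is serialized and compressed in isolation,
  // with a fresh ObjectOutputStream/GZIPOutputStream pair per event and
  // finish() called once the event has been written.
  static void writeEventInIsolation(Object event, OutputStream sink) throws IOException {
    GZIPOutputStream gzos = new GZIPOutputStream(sink);
    ObjectOutputStream oos = new ObjectOutputStream(gzos);
    oos.writeObject(event);
    oos.flush();
    gzos.finish(); // writes the GZIP trailer but leaves 'sink' open
  }
}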
Anyway, in the serialization tests we do in logback, the average footprint of an event is less than 160 bytes, almost ten times smaller than your standard size. However, 160 is already the result of some compression, since ObjectOutputStream keeps track of previously written objects. Instead of writing a whole object again, ObjectOutputStream will write a reference to the previous occurrence, which can result in very substantial gains in space.
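To illustrate the back-reference effect, here is a toy example (mine, not from the logback test suite): the second write of the same string instance emits only a short handle, not the full serialized form.

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class BackReferenceDemo {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    String loggerName = "com.wombat.shop.Delivery";
    oos.writeObject(loggerName);
    oos.flush();
    int first = bos.size(); // stream header plus the full serialized string
    oos.writeObject(loggerName);
    oos.flush();
    // the second write is just a back-reference handle, a few bytes
    System.out.println("first=" + first + " second=" + (bos.size() - first));
  }
}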
But yes, 50%, although not very good for logging (where you aim for at least 90%), is far from my "won't compress at all" assertion. I stand corrected.
Your argument about not wishing to read a whole stream to decode a given event was very convincing. Letting an encoder decorate an OutputStream seems like overkill, because the idea of a long uninterrupted stream does not work well for storage purposes. (You don't want to read 100'000 events before reading the one you are interested in.) Moreover, in my as-yet-unpublished "encoder" branch (which does not even compile), the main method in the Encoder interface, doEncode, takes an event and the OutputStream as a second argument. Here is the Encoder interface as it exists in my "encoder" branch:
public interface Encoder<E> extends ContextAware, LifeCycle {
  void doEncode(E event, OutputStream os) throws IOException;
  void close(OutputStream os) throws IOException;
}
In a nutshell, it is now the responsibility of FileAppender to provide a valid OutputStream to the encoder, whose responsibility is to encode the event and also to write the results (the encoded bytes) onto the stream. The encoder is given total freedom in how it writes the events.
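For illustration, such an encoder might write each event as a self-contained, length-prefixed record, so a reader can skip straight to the event it wants instead of decoding everything before it. A hypothetical sketch (the LengthPrefixedEncoder name is made up, this is not code from the branch, and the ContextAware/LifeCycle plumbing is omitted):

import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;

public class LengthPrefixedEncoder<E extends Serializable> {
  public void doEncode(E event, OutputStream os) throws IOException {
    // serialize the event into a private buffer first
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(buf);
    oos.writeObject(event);
    oos.flush();
    byte[] bytes = buf.toByteArray();
    // prefix each record with its length so readers can skip records cheaply
    DataOutputStream dos = new DataOutputStream(os);
    dos.writeInt(bytes.length);
    dos.write(bytes);
    dos.flush();
  }

  public void close(OutputStream os) throws IOException {
    os.flush(); // FileAppender owns the stream and is responsible for closing it
  }
}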
> Please support implementation of binary log files in RollingFileAppender/FileAppender
> -------------------------------------------------------------------------------------
>
> Key: LBCORE-128
> URL: http://jira.qos.ch/browse/LBCORE-128
> Project: logback-core
> Issue Type: Improvement
> Components: Appender
> Affects Versions: 0.9.17
> Reporter: Joern Huxhorn
> Assignee: Ceki Gulcu
>
> This was discussed briefly at http://marc.info/?l=logback-dev&m=124905434331308&w=2 and I forgot to file a ticket about this.
> Currently, RollingFileAppender => FileAppender => WriterAppender is using the following method in WriterAppender to actually write the data:
> protected void writerWrite(String s, boolean flush) throws IOException
> Please add an additional method like
> protected void writerWrite(byte[] bytes, boolean flush) throws IOException
> to write to the underlying stream directly.
> writerWrite(String, boolean) could call that method after performing the transformation internally, making this change transparent for the rest of the implementation.
> Using a binary format for logfiles could have tremendous performance impact as can be seen here: http://sourceforge.net/apps/trac/lilith/wiki/SerializationPerformance
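For reference, the delegation proposed in the ticket could look roughly like this (a sketch of the proposal, not actual WriterAppender code; 'outputStream' stands for whatever underlying stream the appender holds):

protected void writerWrite(byte[] bytes, boolean flush) throws IOException {
  outputStream.write(bytes); // write to the underlying stream directly
  if (flush) {
    outputStream.flush();
  }
}

protected void writerWrite(String s, boolean flush) throws IOException {
  // the String variant performs the char-to-byte transformation internally
  // and delegates, keeping the change transparent to the rest of the code
  writerWrite(s.getBytes(), flush);
}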