[slf4j-user] Best practice for logging XML and other byte oriented formats with slf4j?

Maarten Bosteels mbosteels.dns at gmail.com
Fri Nov 20 20:17:59 CET 2009


On Fri, Nov 20, 2009 at 7:44 PM, Thorbjoern Ravn Andersen
<ravn at runjva.com> wrote:

> Maarten Bosteels wrote:
>
>
>>
>>    XMLEncoder will have a severe impact on performance; I've tested
>>    this extensively.
>>    Have a look at
>>    http://sourceforge.net/apps/trac/lilith/wiki/SerializationPerformance
>>    In my test cases, XMLEncoder serialized 300 events while a protobuf
>>    serializer managed to handle nearly 10,000!
>>    I'd therefore suggest that you take a mixed approach: use
>>    protobuf to serialize the events to a file, and write an
>>    additional converter that turns those files into whatever XML
>>    output you need.
>>
>>
> I think I didn't catch on to that discussion when you had it. Probably
> because I didn't understand it enough from a brief skim :)
>
>
>>    A discussion about such a topic was started here:
>>    http://marc.info/?l=logback-dev&m=124905434331308&w=2 but I
>>    completely forgot to file an RFE for it.
>>    I've done just that now, thanks for the reminder!
>>    http://jira.qos.ch/browse/LBCORE-128
>>
>>
>> I agree with Joern that XMLEncoder is not really suited when throughput is
>> important to you.
>>
> For our purposes these "log this complex object" events happen rarely enough
> that we are willing to accept a penalty here to get a human-readable
> rendering.
>
>
It's not about logging complex objects. It's about the number of log events
that you can write per second, and the CPU cycles that you waste by
generating XML.


>
>
>>
>>    > My current thought is to use a ByteArrayOutputStream and
>>    generate a String by decoding its bytes as UTF-8. The resulting string
>>    contains a <?xml ... encoding="UTF-8"?> declaration, which is stripped,
>>    resulting in an XML String containing Unicode chars (instead of
>>    encoded bytes).
>>
>>
>> What is the difference between "Unicode chars" and "encoded bytes"?
>>
> I am talking about the internal representation as chars versus the encoded
> version, which is a stream of bytes (usually written raw to a file).
>
>
>
>
>> Every Unicode code point has to be encoded somehow, no? UTF-8 is one way
>> to encode the code point (and IMHO the encoding everyone should use).
>>
>>    This can then be flattened to an ASCII version, by converting all
>>    characters outside 32..127 to their numeric entity (&#1234;), and
>>    THAT can be safely logged. I guess :)
>>    >
>>
>>
>> If you want to use XML, then I really don't see the problem with leaving
>> it in UTF-8?
>>
> There is absolutely no guarantee that the final destination of the log
> string will be able to handle UTF-8 encoded strings. How do UTF-8 encoded
> strings end up looking when written using MacRoman under OS X?


IMHO, you should embrace UTF-8 instead of being afraid of it. Tools that
are not able to handle UTF-8 are simply not worth using.

1)  From http://www.w3.org/TR/xml11/#charencoding: "All XML processors
    *MUST* be able to read entities in both the UTF-8 and UTF-16 encodings."
2)  I don't know MacRoman, but from http://en.wikipedia.org/wiki/Mac_OS_Roman:
    "With the release of Mac OS X, Mac OS Roman was replaced by UTF-8 as
    the standard character encoding for the Macintosh operating system."
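
To make the chars-versus-bytes point concrete, here is a minimal sketch of
what happens when UTF-8 bytes are read back with the right and with the
wrong charset. The class and method names are made up for illustration, and
ISO-8859-1 merely stands in for MacRoman (an assumption on my part, since a
MacRoman charset is not guaranteed to be present in every JRE):

```java
import java.nio.charset.StandardCharsets;

final class CharsVersusBytes {
    private CharsVersusBytes() {}

    /** Encodes the chars of a String into UTF-8 bytes. */
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes UTF-8 bytes back into chars; this round-trips losslessly. */
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Decodes the same bytes with a different single-byte charset (stand-in for MacRoman). */
    static String misdecodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String sigma = "\u03a3";                 // one char: GREEK CAPITAL LETTER SIGMA
        byte[] encoded = encodeUtf8(sigma);      // two bytes: 0xCE 0xA3
        System.out.println("chars: " + sigma.length() + ", bytes: " + encoded.length);
        System.out.println("round trip ok: " + sigma.equals(decodeUtf8(encoded)));
        // The wrong charset turns the two bytes into two unrelated chars.
        System.out.println("misdecoded length: " + misdecodeAsLatin1(encoded).length());
    }
}
```

The bytes themselves are never damaged; only a reader that assumes the wrong
charset shows garbage, which is why naming the encoding explicitly on both
sides matters more than avoiding UTF-8.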

regards,
Maarten



>
>
>> Especially since you state that "a humanly readable transport format will
>> be preferred." I would prefer to see Σ instead of &#931;
>>
>> Of course, it should be possible to tell the XMLEncoder which encoding to
>> use (instead of using the default encoding of the platform).
>>
> XMLEncoder does not expose the encoding publicly. Bah :)
>
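
For comparison, the ASCII flattening discussed in this thread (every
character outside the printable ASCII range becomes a numeric entity such as
&#931;) could be sketched roughly as below. The class name is hypothetical,
and I am assuming the input is an already-decoded XML String. Note that this
simple version also rewrites tabs and newlines as entities, and that it does
not check whether a control character is legal in XML at all:

```java
final class AsciiXmlFlattener {
    private AsciiXmlFlattener() {}

    /**
     * Replaces every code point outside 32..126 with a decimal numeric
     * character reference. Iterates by code point so that characters
     * stored as surrogate pairs become a single reference.
     */
    static String flatten(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); ) {
            int cp = xml.codePointAt(i);
            if (cp >= 32 && cp <= 126) {
                out.append((char) cp);                      // printable ASCII passes through
            } else {
                out.append("&#").append(cp).append(';');    // everything else becomes &#NNN;
            }
            i += Character.charCount(cp);                   // 1 or 2 chars per code point
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <name>Thorbj&#248;rn &#931;</name>
        System.out.println(flatten("<name>Thorbj\u00f8rn \u03a3</name>"));
    }
}
```

The result survives any ASCII-compatible destination, at the cost of the
readability mentioned above.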
>
> --
>  Thorbjørn Ravn Andersen  "...plus... Tubular Bells!"
>