[slf4j-user] Best practice for logging XML and other byte oriented formats with slf4j?
Maarten Bosteels
mbosteels.dns at gmail.com
Fri Nov 20 20:17:59 CET 2009
On Fri, Nov 20, 2009 at 7:44 PM, Thorbjoern Ravn Andersen
<ravn at runjva.com> wrote:
> Maarten Bosteels wrote:
>
>
>>
>> XMLEncoder will have a severe impact on performance, I've tested
>> this extensively.
>> Have a look at
>> http://sourceforge.net/apps/trac/lilith/wiki/SerializationPerformance
>> In my testcases, XMLEncoder serialized 300 events while a protobuf
>> serializer managed to handle nearly 10.000!
>> I'd therefore suggest that you take a mixed approach. Using
>> protobuf to serialize the events to a file and writing an
>> additional converter to convert those files to whatever you'd like
>> as XML-Output as needed
>>
>>
> I think I didn't catch on to that discussion when you had it. Probably
> because I didn't understand it enough from a brief skim :)
>
>
>> A discussion about such a topic was started here:
>> http://marc.info/?l=logback-dev&m=124905434331308&w=2 but I
>> completely forgot to file an RFE for it.
>> I've done just that now, thanks for the reminder!
>> http://jira.qos.ch/browse/LBCORE-128
>>
>>
>> I agree with Joern that XMLEncoder is not really suited when throughput is
>> important to you.
>>
> For our purpose these "log this complex object" happen rarely enough that
> we are willing to accept a penalty here, to get a humanly readable
> rendering.
>
>
It's not about logging complex objects. It's about the number of log events
that you can write per second. And the CPU cycles that you waste by
generating XML.
>
>
>>
>> > My current thought is to use a ByteArrayOutputStream and
>> generate a String using the UTF-8 decoding. The resulting string
>> contains a <?xml ... encoding="UTF-8"?> which is stripped
>> resulting in an XML String containing Unicode chars (instead of
>> encoded bytes).
>>
>>
>> What is the difference between "Unicode chars" and "encoded bytes" ?
>>
> I am talking about internal representation as char's and the encoded
> version which is a stream of bytes (which usually is put raw in a file).
>
>
>
>
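For readers following the ByteArrayOutputStream idea quoted above, here is a minimal sketch of it. The class and method names (XmlEncoderToString, toXmlString) are made up for illustration, and the sketch assumes the one-argument XMLEncoder constructor writes UTF-8 bytes with a leading XML declaration, which is the documented default. It is not necessarily what either poster ended up using.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class XmlEncoderToString {

    /** Serializes an object with XMLEncoder and returns the XML as chars, without the declaration. */
    static String toXmlString(Object bean) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Closing the encoder flushes it and writes the closing </java> tag.
        try (XMLEncoder encoder = new XMLEncoder(bytes)) {
            encoder.writeObject(bean);
        }
        // Bytes -> chars: this is the "UTF-8 decoding" step.
        String xml = new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        // Strip the <?xml ... encoding="UTF-8"?> declaration; it no longer
        // describes anything once the text is a Java String.
        if (xml.startsWith("<?xml")) {
            int end = xml.indexOf("?>");
            if (end >= 0) {
                xml = xml.substring(end + 2);
            }
        }
        return xml.trim();
    }

    public static void main(String[] args) {
        // \u00E9 is e-acute, \u03A3 is capital sigma.
        System.out.println(toXmlString("caf\u00E9 \u03A3"));
    }
}
```

The returned String holds Unicode chars; which bytes end up on disk is decided later by whatever appender writes the log line.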
>> Every Unicode code point has to be encoded somehow, no? UTF-8 is one way
>> to encode the code point (and IMHO the encoding everyone should use)
>>
>> This can then be flattened to an ASCII version, by converting all
>> characters outside 32..127 to their numeric entity (&#1234;), and
>> THAT can be safely logged. I guess :)
>> >
>>
>>
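The "flatten to ASCII" step described above could look roughly like the following. This is only a sketch: AsciiXml and toAscii are hypothetical names, and keeping tab, newline and carriage return unescaped is my assumption (the quoted text only says "outside 32..127"). It also assumes the non-ASCII characters occur in text or attribute values, since character references are not allowed inside element names.

```java
public class AsciiXml {

    /** Replaces every code point outside printable ASCII with a decimal character reference. */
    static String toAscii(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        // Iterate code points, not chars, so surrogate pairs become one reference.
        xml.codePoints().forEach(cp -> {
            boolean keep = (cp >= 32 && cp <= 126) || cp == '\t' || cp == '\n' || cp == '\r';
            if (keep) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u03A3 is capital sigma, code point 931.
        System.out.println(toAscii("<name>\u03A3</name>"));
    }
}
```

The result contains only ASCII bytes in any ASCII-compatible encoding, at the cost of readability for non-ASCII text.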
>> If you want to use XML, then I really don't see the problem with leaving
>> it in UTF-8 ?
>>
> There is absolutely no guarantee that the final destination of the log
> string will be able to handle UTF-8 encoded strings. How do UTF-8 encoded
> strings end up looking when written using MacRoman under OS X?
IMHO, you should embrace UTF-8 instead of being afraid of it. Tools that
are not able to handle UTF-8 are simply not worth using.
1) from http://www.w3.org/TR/xml11/#charencoding: "All XML processors
*MUST* be able to read entities in both the UTF-8 and UTF-16 encodings."
2) I don't know MacRoman, but from http://en.wikipedia.org/wiki/Mac_OS_Roman:
"With the release of Mac OS X, Mac OS Roman was replaced by UTF-8 as
the standard character encoding for the Macintosh operating system."
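
To make the concern about a non-UTF-8 destination concrete, here is a small sketch of what happens when UTF-8 bytes are read back with a single-byte charset. ISO-8859-1 is used as a stand-in because it ships with every JVM; the MacRoman charset is not guaranteed to be installed, so that substitution is an assumption of this example. EncodingMismatch and misread are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {

    /** Encodes text as UTF-8, then decodes the bytes with the wrong (single-byte) charset. */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Capital sigma (\u03A3) is the two bytes CE A3 in UTF-8,
        // which a Latin-1 reader shows as two unrelated characters.
        System.out.println(misread("\u03A3"));
        // Pure ASCII survives unchanged.
        System.out.println(misread("plain ascii"));
    }
}
```

ASCII-only text is unaffected, which is why the entity-escaped form is safe for such destinations while raw UTF-8 is not.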
regards,
Maarten
>
>
>> Especially since you state that "a humanly readable transport format will
>> be preferred." I would prefer to see Σ instead of &#931;
>>
>> Of course, it should be possible to tell the XMLEncoder which encoding to
>> use (instead of using the default encoding of the platform).
>>
> XMLEncoder does not make the encoding public. Bah :)
>
>
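A note for later readers: JDK 7 and newer add an XMLEncoder constructor that takes the charset name, a flag for the XML declaration, and an indentation. It did not exist in the JDK available when this thread was written. A minimal sketch, with the hypothetical names ExplicitCharsetEncoder and encode:

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetEncoder {

    /** Serializes with an explicitly chosen charset and no XML declaration (requires JDK 7+). */
    static String encode(Object bean) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // Arguments: output stream, charset name, write declaration?, indentation in spaces.
        try (XMLEncoder encoder = new XMLEncoder(bytes, "UTF-8", false, 0)) {
            encoder.writeObject(bean);
        }
        // Decode with the same charset that was passed to the encoder.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(encode("\u03A3"));
    }
}
```

With the declaration switched off there is nothing to strip afterwards, and the encoding no longer depends on a default.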
> --
> Thorbjørn Ravn Andersen "...plus... Tubular Bells!"
>