[slf4j-user] Best practice for logging XML and other byte oriented formats with slf4j?
Thorbjoern Ravn Andersen
ravn at runjva.com
Fri Nov 20 19:44:17 CET 2009
Maarten Bosteels wrote:
>
>
> XMLEncoder will have a severe impact on performance, I've tested
> this extensively.
> Have a look at
> http://sourceforge.net/apps/trac/lilith/wiki/SerializationPerformance
> In my testcases, XMLEncoder serialized 300 events while a protobuf
> serializer managed to handle nearly 10,000!
> I'd therefore suggest that you take a mixed approach. Using
> protobuf to serialize the events to a file and writing an
> additional converter to convert those files to whatever you'd like
> as XML-Output as needed
>
I think I didn't catch on to that discussion when you had it. Probably
because I didn't understand it enough from a brief skim :)
> A discussion about such a topic was started here:
> http://marc.info/?l=logback-dev&m=124905434331308&w=2 but I
> completely forgot to file an RFE for it.
> I've done just that now, thanks for the reminder!
> http://jira.qos.ch/browse/LBCORE-128
>
>
> I agree with Joern that XMLEncoder is not really suited when
> throughput is important to you.
For our purposes, these "log this complex object" calls happen rarely
enough that we are willing to accept a penalty here to get a
human-readable rendering.
>
>
> > My current thought is to use a ByteArrayOutputStream and
> generate a String by decoding the bytes as UTF-8. The resulting string
> contains a <?xml ... encoding="UTF-8"?> which is stripped
> resulting in an XML String containing Unicode chars (instead of
> encoded bytes).
>
>
> What is the difference between "Unicode chars" and "encoded bytes" ?
I am talking about the internal representation as chars versus the
encoded version, which is a stream of bytes (usually written raw to a file).
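
A minimal sketch of the round trip quoted above, to show where encoded bytes become chars: XMLEncoder writes bytes into a ByteArrayOutputStream, the bytes are decoded as UTF-8 into a String, and the XML declaration is stripped. The class and method names (XmlLogRendering, toXmlString, stripDeclaration) are made up for illustration, and the sketch assumes Java 7 or later for StandardCharsets and try-with-resources.

```java
import java.beans.XMLEncoder;
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of slf4j or logback.
final class XmlLogRendering {

    /** Renders a bean as XML chars, without the leading XML declaration. */
    static String toXmlString(Object bean) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        // XMLEncoder(OutputStream) emits UTF-8 encoded bytes;
        // closing the encoder writes the closing </java> tag.
        try (XMLEncoder encoder = new XMLEncoder(bytes)) {
            encoder.writeObject(bean);
        }
        // Decoding step: a stream of bytes becomes a String of chars.
        String xml = new String(bytes.toByteArray(), StandardCharsets.UTF_8);
        return stripDeclaration(xml);
    }

    /** Removes a leading <?xml ... ?> declaration, if there is one. */
    static String stripDeclaration(String xml) {
        if (xml.startsWith("<?xml")) {
            int end = xml.indexOf("?>");
            if (end >= 0) {
                return xml.substring(end + 2).trim();
            }
        }
        return xml;
    }
}
```

Once the declaration is gone the String no longer claims any particular encoding, which is why it matters what the final log destination does with non-ASCII chars.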
> Every unicode codepoint has to be encoded somehow, no ? UTF-8 is one
> way to encode the codepoint (and imho the encoding everyone should use)
>
> This can then be flattened to an ASCII version, by converting all
> characters outside 32..127 to their numeric entity (&#1234;), and
> THAT can be safely logged. I guess :)
> >
>
>
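
The flattening step described in the quote above could be sketched roughly as follows. AsciiXml and toAsciiXml are hypothetical names. Two choices here are assumptions rather than anything stated in the thread: tab, line feed and carriage return are passed through unchanged so that whitespace between elements stays as it was, and 127 (DEL) is treated as outside the printable range.

```java
// Hypothetical helper, not part of slf4j or logback.
final class AsciiXml {

    /** Replaces every code point outside printable ASCII with a decimal numeric character reference. */
    static String toAsciiXml(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        for (int i = 0; i < xml.length(); ) {
            // Work on code points so surrogate pairs become a single reference.
            int cp = xml.codePointAt(i);
            i += Character.charCount(cp);
            boolean printableAscii = cp >= 32 && cp < 127;
            boolean plainWhitespace = cp == '\t' || cp == '\n' || cp == '\r';
            if (printableAscii || plainWhitespace) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        }
        return out.toString();
    }
}
```

For example, toAsciiXml("<string>\u03A3</string>") yields <string>&#931;</string>, which an XML parser reads back as the original character.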
> If you want to use XML, then I really don't see the problem with
> leaving it in UTF-8 ?
There is absolutely no guarantee that the final destination of the log
string will be able to handle UTF-8 encoded strings. How do UTF-8
encoded strings end up looking when written using MacRoman under OS X?
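
A small sketch of that failure mode: text is written as UTF-8 bytes and then read back with a different single-byte charset. ISO-8859-1 is used here only as a stand-in, because every JVM is required to support it. MacRoman would produce different wrong characters, but the effect is the same kind of garbling. EncodingMismatch and misread are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of slf4j or logback.
final class EncodingMismatch {

    /** Encodes text as UTF-8, then decodes those bytes as ISO-8859-1 (stand-in for MacRoman). */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }
}
```

Here misread("\u03A3") gives the two characters Î£ instead of Σ, because Σ is the two bytes 0xCE 0xA3 in UTF-8. A pure ASCII string such as "&#931;" passes through unchanged, since ASCII bytes mean the same thing in both charsets.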
> Especially since you state that "a humanly readable transport format
> will be preferred." I would prefer to see Σ instead of &#931;
>
> Of course, it should be possible to tell the XMLEncoder which encoding
> to use (instead of using the default encoding of the platform).
>
XMLEncoder does not expose the encoding publicly. Bah :)
--
Thorbjørn Ravn Andersen "...plus... Tubular Bells!"