[logback-dev] What is the most efficient way - preferably platform agnostic - to submit events from "the outside"?

Joern Huxhorn jhuxhorn at googlemail.com
Sun Mar 1 20:16:59 CET 2009


On 01.03.2009, at 18:39, Thorbjoern Ravn Andersen wrote:

> Joern Huxhorn skrev:
>>
>> On 28.02.2009, at 21:44, Thorbjoern Ravn Andersen wrote:
>>>>> I am still pondering on a language agnostic receiver.   The  
>>>>> reason for the XML being uninteresting was because it was much  
>>>>> more verbose than the plain serialised byte object?
>>>>
>>>> I wouldn't call it uninteresting ;)
>>>> It's just more expensive to create such events so I'd only use it  
>>>> if I have to.
>>> Are they?  How come?     Perhaps if using XMLEncoder instead of  
>>> rolling your own :)
>>
>> I've done some more benchmarking concerning purely serialization/ 
>> deserialization without any disk I/O, i.e. using just byte[].
>>
>> http://apps.sourceforge.net/trac/lilith/ticket/28
>> the last table at the bottom.
>>
>> That's actually quite interesting... I didn't expect that using my  
>> own XML Serializer + compression would actually result in smaller  
>> data size than using java serialization + compression.
>> Even uncompressed, my Serializer doesn't produce much more bytes  
>> than java serialization.... I didn't expect that...
>>
>> However, creation and handling of XML is still much slower than  
>> pure java serialization - which doesn't surprise me at all.
>> My own implementation (using StAX) is a lot faster than the generic  
>> java.beans.XML one, though.
>
> I have seen your data and it is interesting.  If you want  
> compression you need something lighter than the default setting of  
> the gzip compressor, as even the lightest setting gives reasonable  
> results for repetitive data without using much time.
>
> Why use StAX - you need a SAX parser or something more  
> sophisticated?  I am unfamiliar with the project.

Well, I just like the API.
See http://www.developer.com/xml/article.php/3397691 for a comparison.  
While I don't use the skip-feature right now, I like to have that  
option. I dislike everything that's using a DOM because it has to read  
the complete XML and keep it in memory... all in all, it's just a  
matter of personal preference... ;)
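
For readers unfamiliar with StAX, here is a minimal sketch of writing and
pull-parsing a single event with the JDK's javax.xml.stream API. The
element and attribute names ("event", "logger", "level", "message") are
made up for illustration and are not the actual Lilith XML format.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.XMLStreamWriter;

class StaxSketch {
    // Writes one event as XML. Element and attribute names are hypothetical.
    static String write(String logger, String level, String message) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            w.writeStartElement("event");
            w.writeAttribute("logger", logger);
            w.writeAttribute("level", level);
            w.writeStartElement("message");
            w.writeCharacters(message); // escapes '<' and '&'
            w.writeEndElement();
            w.writeEndElement();
            w.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Pulls events from the stream until <message> is found, then returns its text.
    // Nothing is kept in memory beyond the current event.
    static String readMessage(String xml) {
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (r.hasNext()) {
                    if (r.next() == XMLStreamConstants.START_ELEMENT
                            && "message".equals(r.getLocalName())) {
                        return r.getElementText();
                    }
                }
                return null;
            } finally {
                r.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = write("com.example.Foo", "DEBUG", "a < b");
        System.out.println(xml);
        System.out.println(readMessage(xml));
    }
}
```

The reader loop is where the pull model shows: the caller decides which
events to look at and can stop as soon as it has what it needs.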

>
>>>>> Would a sufficiently terse xml-dialect be interesting?  I was  
>>>>> thinking of having one-character node names and attribute names?  
>>>>> (and moving the namespace outside the fragments).
>>>>
>>>> I'm not sure about that. Thinking about a binary format would  
>>>> probably be more worthwhile...
>>> Binary formats are rather painful to extend at a later date.  Just  
>>> see how hard it is even with help from the serialization modules.
>>
>> The main problem with java serialization is IMHO that it's not  
>> possible to *somehow* load older versions of a class. If you change  
>> a class and change its serialVersionUID then you have no chance to  
>> load any previously serialized objects. No chance at all. That's  
>> quite evil...
>> To do something like this one would need to reimplement the class,  
>> with a different name, and implement conversion from old to new.
>>
> Since the serialization mechanism in java relies on the exact  
> ordering and type of each field in order to generate and interpret  
> the byte stream without any additional information in the byte  
> stream but the raw values, it makes sense to me that it will break  
> if the class signature changes.  It would not be hard to use an  
> interface internally and let the multiple versions of the class being  
> serialized implement it.    It is not an evil, it is just a trade  
> off :)

Yes, I understand the reasons for the situation but it doesn't make it  
easy to read serialized data of previous versions of the class.
The easiest way to do this would probably be the use of versioned  
classes, e.g. LoggingEventVO0916.
So if an incompatible change occurred in, let's say, 0.9.18, a new  
LoggingEventVO0918 would be created, without any relationship to the  
previous one. That would have to stay exactly the same.
The deserializer could then do checks using instanceof and convert  
accordingly (i.e. convert "old" to "new"). This would prohibit the  
implementation of an interface, though, because that would mean that  
the old class would need to be changed to support the new data. While  
this would be possible using empty stubs I'm not sure how elegant that  
would be :p
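
A rough sketch of the instanceof idea, assuming two hypothetical value
objects. The fields, and the choice of threadName as the field added in
0.9.18, are invented for illustration and are not the real logback
classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical value object as written by an older release; frozen forever.
class LoggingEventVO0916 implements Serializable {
    private static final long serialVersionUID = 1L;
    String message;
    long timeStamp;
}

// Hypothetical value object after an incompatible change: a new, unrelated class.
class LoggingEventVO0918 implements Serializable {
    private static final long serialVersionUID = 1L;
    String message;
    long timeStamp;
    String threadName; // assumed to be new in this version
}

class VersionedDeserializer {
    static byte[] serialize(Serializable value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    // Always returns the newest version, converting "old" to "new" if necessary.
    static LoggingEventVO0918 deserialize(byte[] data) {
        Object o;
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            o = in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        if (o instanceof LoggingEventVO0918) {
            return (LoggingEventVO0918) o;
        }
        if (o instanceof LoggingEventVO0916) {
            LoggingEventVO0916 old = (LoggingEventVO0916) o;
            LoggingEventVO0918 converted = new LoggingEventVO0918();
            converted.message = old.message;
            converted.timeStamp = old.timeStamp;
            converted.threadName = null; // not available in old data
            return converted;
        }
        throw new IllegalArgumentException("Unsupported type: " + o);
    }

    public static void main(String[] args) {
        LoggingEventVO0916 old = new LoggingEventVO0916();
        old.message = "hello";
        old.timeStamp = 42L;
        System.out.println(deserialize(serialize(old)).message);
    }
}
```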

I'm not sure about the elegance of this whole approach, either. I'm  
just brainstorming a bit...
One *big* downside would be that the value objects, as well as all  
contained value objects, would have to carry such a version in their  
names :p

Ceki, are you still with us? What's your plan to support the  
deserialization of previous versions of the VO class?
As a general comment, it has proven really helpful to use a  
general interface for serialization.
http://apps.sourceforge.net/trac/sulky/browser/trunk/sulky-generics/src/main/java/de/huxhorn/sulky/generics/io

That way, the whole logic of transforming a given object to an  
arbitrary byte array is entirely detached from the class itself.
It would also remove the need for LoggingEvent to be aware of  
persistence at all. The responsibility is entirely in the Serializer/Deserializer  
implementation, i.e. there's no need for getLoggerContextVO or similar  
methods.
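
To illustrate the separation, a minimal sketch follows. The two
interfaces are written from scratch for this example and their
signatures are an assumption; the ones in sulky-generics may differ.
Event and LineEventCodec are hypothetical as well.

```java
import java.nio.charset.StandardCharsets;

// Assumed shape of a generic serialization interface pair.
interface Serializer<E> {
    byte[] serialize(E object);
}

interface Deserializer<E> {
    E deserialize(byte[] bytes);
}

// Plain event class: it knows nothing about how it is persisted.
final class Event {
    final String level;
    final String message;

    Event(String level, String message) {
        this.level = level;
        this.message = message;
    }
}

// One possible wire format, kept entirely outside of Event:
// the level on the first line, the message after it.
// Assumes the level never contains a newline.
class LineEventCodec implements Serializer<Event>, Deserializer<Event> {
    @Override
    public byte[] serialize(Event event) {
        return (event.level + "\n" + event.message).getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public Event deserialize(byte[] bytes) {
        String text = new String(bytes, StandardCharsets.UTF_8);
        int newline = text.indexOf('\n');
        return new Event(text.substring(0, newline), text.substring(newline + 1));
    }

    public static void main(String[] args) {
        LineEventCodec codec = new LineEventCodec();
        Event copy = codec.deserialize(codec.serialize(new Event("DEBUG", "Hello")));
        System.out.println(copy.level + ": " + copy.message);
    }
}
```

Swapping the format (java serialization, XML, compressed or not) then
means supplying another implementation of the two interfaces, without
touching the event class.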

>
>>>
>>> You may remember that I am in a situation where our production  
>>> servers are inaccessible and where I want our logs to be both  
>>> humanly readable as well as reprocessable.    I would be very  
>>> interested in defining a very terse XML dialect for this purpose,  
>>> as Ceki has demonstrated earlier that my needs cannot be  
>>> fulfilled by the log4j dtd.
>>
>> Have you had the time to check my xml format? While it's not  
>> exactly terse ;) it's definitely human-readable.
> I think I have understood what the problem with my use of the  
> "humanly readable" term is.  It is not the actual XML with names and  
> tags, but the data carried I am talking about.  Timestamps are fine  
> but are not really readable to a human.
>
> I can easily live with one character tag and attribute names if the  
> data in the fields carry meaning to the human eye :)

Well, I'm using yyyy-MM-dd'T'HH:mm:ss.SSSZ as the primary time stamp  
format and then apply some magic to change it into a valid xml  
timestamp, i.e. I add a colon between hh and mm of the timezone.
It made me sigh quite a bit when I realized that SimpleDateFormat did  
not support this...
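
A minimal sketch of that "magic", assuming the offset produced by the Z
pattern letter always has the form +HHmm or -HHmm, so that the colon
belongs before the last two characters. The class and method names are
invented for this example.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class XmlTimestamp {
    // Formats a date as yyyy-MM-dd'T'HH:mm:ss.SSS+HH:mm.
    static String format(Date date, TimeZone zone) {
        // SimpleDateFormat is not thread-safe, so create one per call.
        SimpleDateFormat format =
                new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSSZ", Locale.US);
        format.setTimeZone(zone);
        String raw = format.format(date); // ends with e.g. +0100
        int split = raw.length() - 2;     // position between hh and mm of the zone
        return raw.substring(0, split) + ":" + raw.substring(split);
    }

    public static void main(String[] args) {
        System.out.println(format(new Date(0L), TimeZone.getTimeZone("GMT+01:00")));
        // prints 1970-01-01T01:00:00.000+01:00
    }
}
```

On Java 7 and later the pattern letters XXX emit the colon directly, so
the string surgery is only needed on older runtimes.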

>
>>> I see a good need for a production strength "xml receiver -> slf4j  
>>> events" added to the slf4j-ext package (or so) - would it  
>>> licensewise be possible to adapt your work into this?
>>
>> My code is licensed under LGPL v3 so this shouldn't be a problem.  
>> Since I'm the only developer I could grant any license, anyway.
>> I'm not sure what you mean by "slf4j events", though. You mean  
>> "logback" instead of "slf4j", right?
> No, in the above I actually mean slf4j events.    A receiver which  
> accepts incoming events and throws them straight into a slf4j  
> log.debug(....) statement (or whatever level was used) - which to me  
> would be a generic way to glue two separate platforms together.

But slf4j doesn't really define the events, does it?

Joern.

