[slf4j-dev] Benefits of Message objects?

Joern Huxhorn jhuxhorn at googlemail.com
Fri Sep 9 22:52:29 CEST 2011


On 09.09.2011, at 19:11, John Vasileff wrote:

> 
> On Sep 9, 2011, at 6:23 AM, Joern Huxhorn wrote:
> 
>> 
>> On 08.09.2011, at 21:11, John Vasileff wrote:
>> 
>>> I put together the following examples to help net out benefits of supporting "Map supplementalData" arguments in Logger methods in addition to Message object support.
>>> 
>>> Support for supplementalData does not replace Message objects, but is perhaps more of a shortcut for common usage scenarios.  In the end, a Message object is always created by either the application writer or SLF4J.
>>> 
>>> 
>>> Scenario #1: Working with a domain object "posting" that implements java.util.Map.  A 1.6.2 error log may look like:
>>> 
>>> logger.error("error parsing markdown '{}'", badString, e);
>>> 
>>> but we want to add all details of the "posting" domain object without including them in the message.  Using the supplementalData log methods:
>>> 
>>> // just add the data
>>> Logger logger = LoggerFactory.getLogger(MyClass.class);
>>> logger.error(posting, "error parsing markdown '{}'", badString, e);
>>> 
>>> // add the data, but also allow named parameter formatting
>>> Logger logger = LoggerFactory.getLogger(MyClass.class).formatWith(NamedParamFormatter.getInstance());
>>> logger.error(posting, "error parsing markdown for posting {id} '{}'", badString, e);
>>> 
>>> // just add the data, and format using java.util.formatter
>>> Logger logger = LoggerFactory.getLogger(MyClass.class).formatWith(JavaUtilFormatter.getInstance());
>>> logger.error(posting, "error parsing markdown '%s'", badString, e);
>> 
>> One huge problem with this approach is that Logger, for a given name/class, has to be a singleton. You absolutely must not change its state, otherwise chaos will happen.
>> 
>> I'll give you just one example:
>> Suppose ThreadA is changing the Logger with formatWith. Then, before ThreadA executes its log statement, ThreadB is changing the same instance with a different formatWith call. This will cause ThreadA to log with a wrong formatting. A mutable Logger would automatically make it non-threadsafe.
>> 
>> I firmly believe that default formatting must not be configurable. Doing so would just introduce a big pile of problems.
>> 
> 
> I agree 100% that formatting should never be changed once the logger is setup.  And I also see that the same instance of Logger is returned for a given name in, for example, Log4jLoggerFactory.  But if in the end the idea of formatWith() makes for a nice application facing API, I would hate to rule it out prematurely based on what may be implementation details.

I'm not saying that this would be impossible to implement. It just comes with an enormous cost that is actually threefold:
- on the SLF4J implementation side, getLogger() would have to return a new instance for every call. That instance would need to contain an optional formatter that must be configurable on a per Thread base (there's a nasty classloader memory leak lurking in the dark at this point, btw). And what should happen in case of multiple calls to formatWith()? An Exception?
- all of the above would have a serious impact on the performance.
- you'd need to educate the users about all this. This task is actually entirely futile. ;)

> 
> Having said that, I don't really have a strong need for formatWith(...) - java.util.Formatter seems unnecessary to me, so it would just be an added convenience for NamedParameter style formatting.  But I do feel that whether in 2.0, 3.0, or whenever, the ability to add this type of feature without breaking binary compatibility is very valuable.  So I think it is worth netting out, even if the consensus is that the particular feature of overriding the formatter isn't necessary or desired for 2.0.
> 
>>> 
>>> If we used Message objects:
>>> 
>>> Logger logger = LoggerFactory.getLogger(MyClass.class);
>>> 
>>> // just add the data
>>> logger.error(new DefaultMessage(posting, "error parsing markdown '{}'", badString, e));
>>> 
>>> // add the data, but also allow named parameter formatting
>>> logger.error(new NamedParamFormattingMessage(posting, "error parsing markdown for posting {id} {}", badString, e));
>>> 
>>> // just add the data, and format using java.util.formatter
>>> logger.error(new JavaUtilFormatterMessage(posting, "error parsing markdown for posting %s", badString, e));
>>> 
>>> 
>>> Scenario #2: Support for StructuredData.  The StructuredData object may augment a log message, or both augment the message and provide the message with toString().
>>> 
>>> Using SupplementalData methods, where SD implements Map
>>> 
>>> StructuredData sd = new StructuredData(...);
>>> logger.error(sd);
>>> logger.error(sd, "some log message"); 
>>> 
>>> Using Message object methods, where SD implements Message
>>> 
>>> StructuredData sd = new StructuredDataMessage(...);
>>> logger.error(sd);
>>> StructuredData sd = new StructuredDataMessage(..., "some log message");
>>> logger.error(sd);
>>> 
>> 
>> The idea was to put both the internal structure and the formatting in the respective Message implementation, with ParameterizedMessage being the implementation for the default SLF4J formatting. A shortcut exists for this type of messages, namely the foo(String pattern, Object... args) methods, since this is the message type encouraged by the logging framework.
>> 
>> It is encouraged since it is a perfect mix of features (placeholder replacement) and speed (it is *much* faster than JUL and likely faster than most other implementation). It is also the type of formatting already documented by the API and accepted by its users.
>> 
>> The 99% of the users will just keep on using the "normal" logging methods and ignore the newly defined ones. This is the main target we are developing for.
>> 
>> The foo(Message message) type of methods, however, enable the 1% "pro users" to implement specific features they need - and the logging framework will be able to (optionally, you could filter on the Message class) cope with those arbitrary Message types in a default way or implement specialized handling if that is required.
>> 
>> Take the StructuredDataMessage ( http://bugzilla.slf4j.org/show_bug.cgi?id=148 ) as an example. It does not just define a Message with key-value pairs in a Map<String, String>. Instead, it is more specific. It is used to create messages that conform to ( http://tools.ietf.org/html/rfc5424 ). This involves making sure that the keys must have a maximum length of 32 characters. Such a requirement can't be met by the abstract logging interface. It is an implementation detail of RFC5424 Syslog messages.
>> In addition to the Map<String, String>, such a StructuredDataMessage could implement the whole range of the specification, e.g. it could bypass the normal logging levels and make use of the Syslog severities instead.
>> 
> 
> In the 99% group, there are some (me!) that would really like to keep everything as is, but also be able to quickly and conveniently add some key/value pair data to the log entry.

But you already have this ability using the MDC. What Ralph requested in #148 was specific support for the Syslog format as specified by RFC5424

> 
> I was trying to show that this also supports bug 148.  StructuredData could implement Map<String,Object>.  Loggers that understand only Map<String,Object> would at least handle the parts that are exposed by the Map interface.  Loggers that understand StructuredData could provide 100% custom handling, such as creating a Log4j2 StructuredDataMessage from the StructuredData object.  In either case, key length validation, etc. would be handled by the StructuredData class.

But when would it be handled and what happens if the key is too long? Would the Logger throw an exception in case of Log4J2 but work in case of Logback? Or would the key-length be truncated to 32 chars? This could result in duplicates that overwrite each other.
Using Syslog with RFC5424 is a rather specific requirement. I never heard of it before Ralph issued that bug report. I have serious doubt that lots of users know about it. And what happens if the RFC is superseded by a new one that suddenly allows 64 chars? How would you implement this supporting both the 32 and the 64 version?
If all of this is logic is contained in specific Message implementations then it would be a no-brainer to support either one. The documentation of the requirements could reside in the Javadoc of the respective Message and everyone is happy.
It wouldn't be a very good idea to directly implement the Map interface as this interface does not impose any of those restrictions. It would be technically possible but semantically not entirely correct.

> I think it would be valuable to introduce supplementalData/namedParameters to the masses, while also providing things like StructuredData to advanced users.

Having messages with placeholders like {variableName} come at a cost.
Would you really like to use it regularly even if it imposes worse performance and also forces you to write bigger logging statements? Are you really sure? ;)
I know that it *sounds* nice, initially, but is there really a use-case beside "Oh, that would be neat."?

The statement
logger.info("User {}", user);
would (minimally, using an absolutely awfully hacky interface (we'd never implement it that way)) look like this:
logger.info("User {user}", "user", user);

A more reasonable implementation would even look like this:
Map<String, Object> parameters = new HashMap<String, Object>();
parameters.add("user", user);
logger.info("User {user}", parameters);

It just creates more noise in the code, IMHO.

> To be clear: I am not suggesting we avoid Message/Entry in favor of supplementalData/namedParameters.  I think the Message concept is great and think we should have both.  Use supplementalData/namedParameters as a quick way to add data to an entry, or use Message/Entry to "take over" all data and formatting of the log entry.
> 
>> One should not forget that someone implementing his own message type could also define a wrapper around the SLF4J Logger to support efficient creation of his handcrafted Messages. A special SyslogLogger implementation could automatically match the Syslog severity to a proper SLF4J level, for example, while still using the whole logging backend without any wheel-reinvention.
>> 
>> Another example for specific needs would be end-user-facing localized messages. Why should I reinvent my own logging layer if I could just use SLF4J/<insert backend here> for that?
>> 
>> Audit logging is another use case.
>> See http://audit.qos.ch/manual/index.html and http://audit.qos.ch/apidocs/ch/qos/logback/audit/AuditEvent.html 
>> This could be implemented using a AuditMessage instead while still being able to use the full range of Logback-Classic appenders but with the ability to add specific appenders handcrafted for specific tasks like writing them into a special audit database.
>> 
>> One small, additional thing is the possibility of immutable Message instances that could be kept and reused if there are no variables involved.
>> 
> 
> I think all of these would be supported - I am trying to build on your proposal.
> 
> For localized messages specifically, there may be a really nice way to handle this with "formatWith(...)" - I'll have to give it some thought.  But in any case, Message/Entry objects could always be used.  In any case, I agree 100% that SLF4J should handle localization, otherwise we would be breaking pluggability of back end loggers.

It wouldn't have to, at least not initially. Putting that logic into other Message implementations would mean that we don't have to implement it in SLF4J, at least not initially.
This could be added conveniently in, say, SLF4J 2.1 or 2.2. Whenever we have an implementation ready that we deem good, stable and mature enough to add it to the general API.

>> Concerning "COMPATIBILITY #1" in your previous message:
>> It is absolutely crucial that the original interface is left unchanged so that both use and implementations of that interface will keep on working as expected. It also needs to stay 1.4 compatible and logging frameworks that do not require 1.5 should still just implement the old SLF4J interface. The new interfaces will be available automatically via wrapper in that case.
>> 
>> If the logging implementation requires Java >= 1.5 anyway (e.g. Logback) then it would be a good idea to implement the new interface instead, providing backwards-compatibility by means of a wrapper in the other direction.
> 
> I think we are in agreement here, and we definitely cannot break < 2.0 Logger implementations.  In 2.0, I think we can enhance org.slf4j.Logger, maintain compatibility, and even provide many of the new features to 2.0 api users that choose to use a < 2.0 logger.

But you can not enhance org.slf4j.Logger without breaking compatibility in at least the implementations of that interface.
Further, if you enhance org.slf4j.Logger with JDK 1.5 features then it would be impossible to use in applications that are still using something else (not only JDK 1.4 but probably also Android, I'm not sure).

Leaving the original interface untouched and redefining a new interface in a different package is the only way that is really safe, allowing both implementations to exist side by side.

Apache enhanced their commons-lang module the same way. See http://java.dzone.com/articles/first-look-at-commons-lang-3
This is the only chance of guaranteeing compatibility in both directions. And this is crucial if you take into consideration the amount of third-party libraries that depend on the current SLF4J. They would simply eat us alive if we fail on this.
(both scenarios are already tested in my redesign-branch)

>  If I am right, I think this is a huge advantage, both in improving the feature set for older Logger users, and in sticking with only one interface rather than making users move to a different one (and for new users, avoid confusion about two Loggers, etc.)

Yes, a new package is an annoyance.
People are accustomed to this, though. JUnit did it. commons-lang did it, lots of others did it, too.
In contrast to some of the mentioned frameworks, our change could *really* be done by a global search & destroy...err....replace over the whole source tree. The old API calls will translate perfectly to the new ones. This is not binary compatibility *but* semi-source-compatibility (i.e. the package must be replaced).

> I'll provide more detail on this soon.
> 
>> 
>> Concerning your Entry interface:
>> The main idea of the very minimal Message interface is the absolute freedom it gives to the developer. A message does not have to have a pattern and parameters. Why should it? And why should I create an abstract base-class that returns null for all methods that are not applicable to that specific message/entry type? Isn't that enough evidence that the interface would be too specific?
>> I did consider to add Throwable to the Message interface but ultimately decided against it for exactly that reason. It just doesn't make sense for every type of message imaginable.
> 
> My idea is to encapsulate all data and formatting for a log entry, and simplify all interfaces.  The Entry interface basically defines the fundamental data components we are providing to Logger implementations.  There aren't that many methods, and having them all here provides the ability for Entry classes to handle all data aspects of an entry, and clarity that this can be done.  I'll add more thoughts on this later.

There simply isn't necessarily anything in common for such entries. An AccessLogger is logging an entirely different type of event than a classic LoggingEvent. The same is the case for AuditLogging.

> 
>> I'm also not a big fan of including either MDC or NDC in the Message since those belong to the LoggingEvent, similarly to the call-stack or the timestamp of the event. Neither of them are visible at the SLF4J interface level.
>> 
> 
> I haven't spent much time looking at MDC/NDC, but if (somewhere) there is support for named parameters, it might be nice to do:
> 
> ("user {mdcUserId} had a problem with {}", problemString)
> 
> A problem I have had with NDC before is that I have disabled logging NDC info to cut down on noise.  This would provide a mechanism to force a particular value into a message when it is relevant.  Just a thought.

Yes. It would be kind of nice, I agree.

But it would also result in a significantly slower message formatting simply because the formatter would have to be a lot more complex. The default formatter is deliberately "dumb". It does not support reordering of arguments as JUL does, for example. Neither are localized messages supported. This results in a way better performance - and we are not talking about 1 ns.
(Perhaps Ceki can share some  numbers on this, if he is still with us ;))

We don't want to degrade the performance of the whole system for everyone simply because there is a use-case for messages like this.
If someone wants this functionality then he can implement a Message supporting this style of formatting.

We could (and probably should) implement a Message that provides this functionality in SLF4J at some point. But we should do so with great care.
I'd imagine an implementation with a fluent interface, i.e. something like
new MappedMessage("User {user} just entered {method}.").add("user", "Foouser").add("method","Barmethod")

Keep in mind that those messages won't be localizable, either.

Adding localized messages is not trivial at all if you want to do it properly.
Simply using the system locale would not be sufficient, for example, since different locales for different users of a web-application isn't a very far-fetched scenario.

We actually really considered building a special (Logback) appender that puts logging events (e.g. the last ten) into the session for consumption in the view. Right now those would all have the same locale but a special Message implementation that would also contain the requested output locale could provide such a functionality. This implementation would use the JUL-Messagestyle that is referring to arguments using {0} {1:format} and so on since the functionality is already available in the JDK and translations must be able to reorder and replace arguments..

>> To finish this wall of text:
>> I can not stress enough that the Message interface would enable us to *filter* on the specific type of Message in the logging configuration and/or appender implementation. Granted, a better place for this argument would be the Logback mailing list but it IS a major feature and does indeed open a huge door of possibilities for future enhancements, all without touching the logging interfaces again.
>> 
>> Joern
>> _______________________________________________
>> slf4j-dev mailing list
>> slf4j-dev at qos.ch
>> http://qos.ch/mailman/listinfo/slf4j-dev
> 
> Additional thoughts on StructuredData:  mapping to Syslog severities raises an interesting question.  Perhaps Event should also encapsulate Level and Marker.  I know you are suggesting a more simple Message object, and this pulls in the opposite direction, but it would really provide a clean design whereby an application could specify _only_ the syslog severity, and the StructuredData class could handle mapping to trace/debug/etc. levels.  The interface would simply be:
> 
> public log(Entry entry);
> 
> Entry would provide full power and encapsulation over individual log events.  If I had time I would try to rework some of my above comments with this in mind, but we are just sharing ideas, right?

Yes, I'm indeed pulling into the opposite direction. ;)
The LoggingEvent is indeed as much implementation detail as it can get in a logging framework.
And they indeed differ significantly in the different implementations (JUL vs. Log4J vs. Logback vs. Lilith that is supporting a union of all information contained in any of them).

The purpose of SLF4J is to shield the user from all of this by means of a common interface that is as simple as possible but not simpler.
This is also one of the reasons why I defined a BaseLogger interface that is even simpler than the Logger interface we are accustomed to. One could work with just that simple interface, but at the cost of convenience.

You could indeed implement a separate SyslogLogger interface that would specifically support exactly this. But the implementation of that interface should wrap a SLF4J-Logger, not implement its interface. Otherwise you'd still be able to create Syslog-Messages that are logged with the wrong level (compared to the severity) by mistake/user-error. And if we can prevent user-errors, we should.



I really hope that all this does not come across arrogantly or that you feel bashed by me.

I appreciate this discussion very much!

It's just that I have given that whole issue a rather big amount of thought in the last years. I stumbled into several dead-ends in the process (extending the original Logger interface was one of those) but I'm also pretty confident that my latest proposal was pretty win-win concerning the requirements & features. The fact that I still like it more than a year later is also a good sign. :D

Cheers,
Joern.


More information about the slf4j-dev mailing list