Up to and including version 1.2, log4j used a class called
LogLog
to output internal logging information,
LogLog
being a very thin wrapper around
System.out
. In other words, while log4j offered a
wide-range of features to logging statements generated by its
users, it offered little control over the logging output
generated by log4j itself.
Modifying log4j so that it becomes self-reliant for its own
logging is no trivial task. However, from an architectural
perspective, it would mean a significant jump in the maturity of
the design. From a more practical point-of-view, self-logging
would allow a given appender to log information about itself
through other appenders. For example, a failing
DBAppender
could log information about its troubles
which would be reported through say SMTPAppender
.
Self-logging would also help solving the logging separation problem. For example, imagine an Internet service provider running multiple applications belonging to different clients. In this scenario, compartmentalizing client resources, including access to the console, may be an important requirement.
When self-logging, log4j components will follow the same paradigm as regular log4j users. It goes something like:
Logger logger = Logger.getLogger(this.getClass().getName()); logger.debug("Hello from XYZAppender".);
Since log4j never sets up a configuration without explicit input from the user, log4j internal logging may occur before the log4j environment is set up. In particular, internal logging may occur while a configurator is processing a configuration file.
We could have simplified things by ignoring logging events generated during the configuration phase. However, the events generated during the configuration phase contain information useful in debugging the log4j configuration file. Under many circumstances, this information is considered more useful than all the subsequent logging events put together.
In order to capture the logs generated during configuration phase, log4j simply collects logging events in a temporary appender. At the end of the configuration phase, these recorded events are replayed within the context of the new log4j environment, (the one which was just configured). The temporary appender is then closed and detached from the log4j environment.
In case of errors during configuration, for example due to misplaced or misnamed configuration files, or syntactical errors in the file itself, replaying the events recorded during configuration may not work properly. However, one of the goals of log4j self-logging is to report error conditions related to log4j itself. There are two log4j features that help to overcome this chicken-and-egg situation.
First, configurators can be directly queried about the errors that occurred during configuration. Here is how:
JoranConfigurator jc = new JoranConfigurator(); jc.doConfigure("some/configuration/file.xml", LogManager.getLoggerRepository()); List errorList = jc.getErrorList(); for(int i = 0; i < errorList.size(); i++) { System.out.println(errorList.get(i)); }
You can also ask the configurator to explicitly configure a
temporary ConsoleAppender
which will be used to
output logging events while the configuration file is
parsed. This can be easily accomplished by setting the debug flag
to true within the configuration file.
For example,
<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE log4j:configuration> <log4j:configuration debug="true"> .... </log4j:configuration>
Obviously, the debug flag will only have an effect if the configuration file is found.
In order to prevent infinite loops, the
AppenderSkeleton.doAppend()
method has a guard. If
the guard is set, then we are in the presence of recursion and
the AppenderSkeleton.doAppend()
method will return
immediately without any further processing.
In order to prevent synchronization deadlocks critical log4j
components now use ReadWriteLocks
instead of simple
synchronization.
Given the recursive nature of log4j internal logging, care must be exercised before resorting to internal logging. In other words, not all log4j components can safely use internal logging for error reporting.
However, it is still possible to formulate several guidelines which I hope Blinky, Springfield's 3-eyed fish, will find helpful.
Guideline 1: Avoid logging if you can, log only if you must.
Even when disabled, logging statements have a cost. Thus,
log4j components should not contain logging statement which may
be invoked frequently. As a rule of thumb, any log statement
which may be invoked more than 10 times during the lifecycle of
an application is considered "frequent". A logging statement of
level DEBUG within the append()
method of an
Appender
, a method which will be invoked for each
enable logging statement, would definitely be extremely wasteful.
Put more succinctly, log4j components should contain logging statements of level ERROR or WARN, seldom informational messages (of level INFO), and very rarely logging statements of level DEBUG.
Guideline 2: Don't assume a single
LoggerRepository
(LR)
Log4j is a prime example of an API which may be used by
multiple applications within the same JVM. For example, it is
more than likely we will encounter multiple web-applications
living within the same Servlet Container, such as Jetty, Resin or
Tomcat, all relying on log4j for their logging. Come to think of
it, the Servlet Container itself may be using log4j for its
logging purposes. Consequently, log4j components may not assume
that they will be used within the context of a single
LoggerRepository
. Please refer to the logging separation document for a more
detailed discussion of this topic. For the remainder of this
document, we may abbreviate the term
LoggerRepository
as LR or just "repository".
It follows that while thousands of applications may advantageously rely on a static logger, log4j components may not use the classical logger retrieval pattern:
static final Logger logger = Logger.getLogger(XYZ.class);
Instead, regular log4j components such as appenders,
receivers, or layouts, must have a Logger
field as
an instance variable. The usage pattern becomes:
package org.apache.log4j; class XYZAppender implements Appender { Logger logger; XYZComponent(...) { } .... }
LogManager.getLogger()
to return correct resultsIf you look at the code of Logger.getLogger()
it
should be obvious that it is just alias for
LogManager.getLogger()
. If you dig a little more, you
will see that LogManager.getLogger() delegates the choice of the
repository to use to its repository selector.
class LogManager { ... public static Logger getLogger(String name) { // Delegate the actual manufacturing of the logger to // a selected logger repository. return repositorySelector.getLoggerRepository().getLogger(name); } }
The LogManager.getLogger()
method will yield
correct results under the assumption that the
RepositorySelector
will always return the correct
repository. During the configuration phase of a repository, the
correct repository is by definition the repository being
configured.
One notable case where LogManager.getLogger() uses an
incorrect repositoy occurs during the initialization of a
web-application using log4j and
ContextJNDISelector
. Let us call this web-application
myWebApp. Further assume that the web.xml file in
myWebApp specifies that a repository named
myRepository should be used for myWebApp.
During the initialization phase of myWebApp, log4j classes will be loaded into memory. This in turn will trigger the configuration of the default logger repository. If the application server has a default log4j configuration file in the classpath, then appropriate configuration steps will be taken for configuring the default repository. While configuring the default repository, inevetiably log4j components will need to be initialized. If any of these components invoke LogManager.getLogger or Logger.getLogger, the retuned logger will incorrectly belong to myRepository. Not only that, simply beginning to use myRepository will trigger the configuration of myRepository interrupting the ongoing configuration of the default repository.
Interestingly enough, both configuration steps will complete. However, the log4j components attached to the default repository will be using loggers attached to myRepository. You just have been introduced to "evil weevil", the kind bug that mandates architectural changes.
The point to retain from this example is that log4j components
may not safely rely on
LogManager.getLoggerRepository()
and by ricochet on
LogManager.getLogger()
to find the appropriate LR
instance. The only perfectly safe way to ensure that a log4j
component will log using its owning LR is by injecting the LR
directly into the component.
Component
and ComponentBase
Internal logging is not the only reason for log4j components to
hold a reference to their owning LR. An LR acts a bag for objects
shared by one or more components. For example, the user can define
new conversion words to be shared by all instances of
PatternLayout
owned by the repository. The same goes
for filter chain references and objects of other types.
As such most log4j components are expected to implement the
Component
interface:
package org.apache.log4j.spi; public interface Component { /** * Set owning logger repository for this component. This * operation can only be performed once. Once set, the * owning repository cannot be changed. **/ public void setLoggerRepository(LoggerRepository repository); }
Concrete classes will extend ComponentBase
:
package org.apache.log4j.spi; import org.apache.log4j.LogManager; import org.apache.ugli.ULogger; public class ComponentBase implements Component { protected LoggerRepository repository; private ULogger logger; public void setLoggerRepository(LoggerRepository repository) { if(this.repository == null) { this.repository = repository; } else { throw new IllegalStateException("Repository has been already set"); } } /** * Return an instance specific logger to be used only by the * component itself, hence the protected keyword. * * In case the repository for this components is not set, * this implementations returns a SimpleLogger instance. */ protected ULogger getLogger() { if(logger == null) { if(repository != null) { logger = repository.getLogger(this.getClass().getName()); } else { logger = LogManager.SIMPLE_LOGGER_FA.getLogger(this.getClass().getName()); } } return logger; } }
For components belonging to a class hierarchy with many members,
the base class in the hierarchy will extend
ComponentBase
, inheriting a LR regerence, a logger
member field and a getLogger()
method. For example,
lIn the case of the Appender
class hierarchy, we have
enhanced AppenderSkeleton
to derive from
ComponentBase
.
Note that the getLogger()
method initialized the
logger variable lazily. Indeed, given Guideline 1 above, the
appender may never need log a message during its lifetime. This may
be considered as a minor optimization. Much more importantly, the
getLogger()
method will return a logger named after
the exact type of the appender.
Since the owning LR is injected directly into the component, we
do not have to rely on LogMaganer
. In case a reference
to the owning repository was not set, then the above implementation
will fallback on a SimpleLogger
part of the
org.apache.ugli
package.
The case of utility classes with only static methods turns out to be even more delicate. Unless the static utility method is explicitly handed a logger repository, it should not log using log4j loggers.
If internal logging is a must, then consider modifying the utility methods to become instance methods instead of static.
OLD and BAD:
class Util { public static int convertToInt(String str) throws NumberFormatException { try { Integer.parseInt(str) } catch(NumberFormatException nfe) { LogManager.getLogger(Util.class).error("Failed conversion for ["+str"].", nfe); throw nfe; } } // methods without log statements can remain static public static SQLDialect getDialectFromCode(int dialectCode) { SQLDialect sqlDialect = null; switch (dialectCode) { case POSTGRES_DIALECT: sqlDialect = new PostgreSQLDialect(); break; case MYSQL_DIALECT: sqlDialect = new MySQLDialect(); break; } return sqlDialect; } }
NEW and GOOD:
class Util extends ComponentBase { // now an instance method public int convertToInt(String str) throws NumberFormatException { try { Integer.parseInt(str) } catch(NumberFormatException nfe) { LogManager.getLogger(Util.class).error("Failed conversion for ["+str"].", nfe); throw nfe; } } // methods without log statements can remain static public static SQLDialect getDialectFromCode(int dialectCode) { SQLDialect sqlDialect = null; switch (dialectCode) { case POSTGRES_DIALECT: sqlDialect = new PostgreSQLDialect(); break; case MYSQL_DIALECT: sqlDialect = new MySQLDialect(); break; } return sqlDialect; } }
Users of the convertToInt
method in class
Util
will now have to create an instance of Util and
set its repository.
BEFORE:
void foo(String str) { int i = Util.convertToInt(str); SQLDialect dialect = Util.getDialectFromCode(i); }
AFTER:
void foo(String str) { Util u = new Util(); u.setLoggerRepository(repository); int i = u.convertToInt(str); // can remain unchanged SQLDialect dialect = Util.getDialectFromCode(i); }
Guideline 3: For core components where logging is impossible, use error lists.
In case of core components such as Logger
,
RootLogger
or Hierarchy
, where the
recursive implications of internal logging can be mind boggling,
add an ErrorItem
to the error list of the containing
repository. In extremely rare situations (corresponding to a
programming errors) where the logger repository has not been set
yet, throw an IllegalStateException
.
Guideline 4: If you can't be sure, choose between the lesser of two weevils, otherwise ask for help
Deciding on the correct error reporting pattern can sometimes be a daunting task. In case of doubt, ask for comment from the rest of the bunch who usually hang around at Moe's bar@.