Internal logging in log4j 1.3

by Ceki Gülcü, © All rights reserved
created 2004-11-15, last updated on 2004-12-30

The problem: logging from within log4j

Up to and including version 1.2, log4j used a class called LogLog to output internal logging information, LogLog being a very thin wrapper around System.out. In other words, while log4j offered a wide-range of features to logging statements generated by its users, it offered little control over the logging output generated by log4j itself.

Modifying log4j so that it becomes self-reliant for its own logging is no trivial task. However, from an architectural perspective, it would mean a significant jump in the maturity of the design. From a more practical point-of-view, self-logging would allow a given appender to log information about itself through other appenders. For example, a failing DBAppender could log information about its troubles which would be reported through say SMTPAppender.

Self-logging would also help solving the logging separation problem. For example, imagine an Internet service provider running multiple applications belonging to different clients. In this scenario, compartmentalizing client resources, including access to the console, may be an important requirement.

Internal logging usage pattern

When self-logging, log4j components will follow the same paradigm as regular log4j users. It goes something like:

Logger logger = Logger.getLogger(this.getClass().getName()); logger.debug("Hello from XYZAppender".);

Boot-strapping log4j internal-logging

Since log4j never sets up a configuration without explicit input from the user, log4j internal logging may occur before the log4j environment is set up. In particular, internal logging may occur while a configurator is processing a configuration file.

We could have simplified things by ignoring logging events generated during the configuration phase. However, the events generated during the configuration phase contain information useful in debugging the log4j configuration file. Under many circumstances, this information is considered more useful than all the subsequent logging events put together.

In order to capture the logs generated during configuration phase, log4j simply collects logging events in a temporary appender. At the end of the configuration phase, these recorded events are replayed within the context of the new log4j environment, (the one which was just configured). The temporary appender is then closed and detached from the log4j environment.

In case of errors during configuration

In case of errors during configuration, for example due to misplaced or misnamed configuration files, or syntactical errors in the file itself, replaying the events recorded during configuration may not work properly. However, one of the goals of log4j self-logging is to report error conditions related to log4j itself. There are two log4j features that help to overcome this chicken-and-egg situation.

First, configurators can be directly queried about the errors that occurred during configuration. Here is how:

JoranConfigurator jc = new JoranConfigurator(); jc.doConfigure("some/configuration/file.xml", LogManager.getLoggerRepository()); List errorList = jc.getErrorList(); for(int i = 0; i < errorList.size(); i++) { System.out.println(errorList.get(i)); }

You can also ask the configurator to explicitly configure a temporary ConsoleAppender which will be used to output logging events while the configuration file is parsed. This can be easily accomplished by setting the debug flag to true within the configuration file.

For example,

<?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE log4j:configuration> <log4j:configuration debug="true"> .... </log4j:configuration>

Obviously, the debug flag will only have an effect if the configuration file is found.

How does it work?

In order to prevent infinite loops, the AppenderSkeleton.doAppend() method has a guard. If the guard is set, then we are in the presence of recursion and the AppenderSkeleton.doAppend() method will return immediately without any further processing.

In order to prevent synchronization deadlocks critical log4j components now use ReadWriteLocks instead of simple synchronization.

Guidelines for log4j developers

Given the recursive nature of log4j internal logging, care must be exercised before resorting to internal logging. In other words, not all log4j components can safely use internal logging for error reporting.

However, it is still possible to formulate several guidelines which I hope Blinky, Springfield's 3-eyed fish, will find helpful.

Guideline 1: Avoid logging if you can, log only if you must.

Even when disabled, logging statements have a cost. Thus, log4j components should not contain logging statement which may be invoked frequently. As a rule of thumb, any log statement which may be invoked more than 10 times during the lifecycle of an application is considered "frequent". A logging statement of level DEBUG within the append() method of an Appender, a method which will be invoked for each enable logging statement, would definitely be extremely wasteful.

Put more succinctly, log4j components should contain logging statements of level ERROR or WARN, seldom informational messages (of level INFO), and very rarely logging statements of level DEBUG.

Guideline 2: Don't assume a single LoggerRepository (LR)

Log4j is a prime example of an API which may be used by multiple applications within the same JVM. For example, it is more than likely we will encounter multiple web-applications living within the same Servlet Container, such as Jetty, Resin or Tomcat, all relying on log4j for their logging. Come to think of it, the Servlet Container itself may be using log4j for its logging purposes. Consequently, log4j components may not assume that they will be used within the context of a single LoggerRepository. Please refer to the logging separation document for a more detailed discussion of this topic. For the remainder of this document, we may abbreviate the term LoggerRepository as LR or just "repository".

It follows that while thousands of applications may advantageously rely on a static logger, log4j components may not use the classical logger retrieval pattern:

static final Logger logger = Logger.getLogger(XYZ.class);

Instead, regular log4j components such as appenders, receivers, or layouts, must have a Logger field as an instance variable. The usage pattern becomes:

package org.apache.log4j; class XYZAppender implements Appender { Logger logger; XYZComponent(...) { } .... }

Don't count on `LogManager.getLogger()` to return correct results

If you look at the code of Logger.getLogger() it should be obvious that it is just alias for LogManager.getLogger(). If you dig a little more, you will see that LogManager.getLogger() delegates the choice of the repository to use to its repository selector.

class LogManager { ... public static Logger getLogger(String name) { // Delegate the actual manufacturing of the logger to // a selected logger repository. return repositorySelector.getLoggerRepository().getLogger(name); } }

The LogManager.getLogger() method will yield correct results under the assumption that the RepositorySelector will always return the correct repository. During the configuration phase of a repository, the correct repository is by definition the repository being configured.

One notable case where LogManager.getLogger() uses an incorrect repositoy occurs during the initialization of a web-application using log4j and ContextJNDISelector. Let us call this web-application myWebApp. Further assume that the web.xml file in myWebApp specifies that a repository named myRepository should be used for myWebApp.

During the initialization phase of myWebApp, log4j classes will be loaded into memory. This in turn will trigger the configuration of the default logger repository. If the application server has a default log4j configuration file in the classpath, then appropriate configuration steps will be taken for configuring the default repository. While configuring the default repository, inevetiably log4j components will need to be initialized. If any of these components invoke LogManager.getLogger or Logger.getLogger, the retuned logger will incorrectly belong to myRepository. Not only that, simply beginning to use myRepository will trigger the configuration of myRepository interrupting the ongoing configuration of the default repository.

Interestingly enough, both configuration steps will complete. However, the log4j components attached to the default repository will be using loggers attached to myRepository. You just have been introduced to "evil weevil", the kind bug that mandates architectural changes.

The point to retain from this example is that log4j components may not safely rely on LogManager.getLoggerRepository() and by ricochet on LogManager.getLogger() to find the appropriate LR instance. The only perfectly safe way to ensure that a log4j component will log using its owning LR is by injecting the LR directly into the component.

Enter `Component` and `ComponentBase`

Internal logging is not the only reason for log4j components to hold a reference to their owning LR. An LR acts a bag for objects shared by one or more components. For example, the user can define new conversion words to be shared by all instances of PatternLayout owned by the repository. The same goes for filter chain references and objects of other types.

As such most log4j components are expected to implement the Component interface:

package org.apache.log4j.spi; public interface Component { /** * Set owning logger repository for this component. This * operation can only be performed once. Once set, the * owning repository cannot be changed. **/ public void setLoggerRepository(LoggerRepository repository); }

Concrete classes will extend ComponentBase:

package org.apache.log4j.spi; import org.apache.log4j.LogManager; import org.apache.ugli.ULogger; public class ComponentBase implements Component { protected LoggerRepository repository; private ULogger logger; public void setLoggerRepository(LoggerRepository repository) { if(this.repository == null) { this.repository = repository; } else { throw new IllegalStateException("Repository has been already set"); } } /** * Return an instance specific logger to be used only by the * component itself, hence the protected keyword. * * In case the repository for this components is not set, * this implementations returns a SimpleLogger instance. */ protected ULogger getLogger() { if(logger == null) { if(repository != null) { logger = repository.getLogger(this.getClass().getName()); } else { logger = LogManager.SIMPLE_LOGGER_FA.getLogger(this.getClass().getName()); } } return logger; } }

For components belonging to a class hierarchy with many members, the base class in the hierarchy will extend ComponentBase, inheriting a LR regerence, a logger member field and a getLogger() method. For example, lIn the case of the Appender class hierarchy, we have enhanced AppenderSkeleton to derive from ComponentBase.

Note that the getLogger() method initialized the logger variable lazily. Indeed, given Guideline 1 above, the appender may never need log a message during its lifetime. This may be considered as a minor optimization. Much more importantly, the getLogger() method will return a logger named after the exact type of the appender.

Since the owning LR is injected directly into the component, we do not have to rely on LogMaganer. In case a reference to the owning repository was not set, then the above implementation will fallback on a SimpleLogger part of the org.apache.ugli package.

The case of utility classes with static methods

The case of utility classes with only static methods turns out to be even more delicate. Unless the static utility method is explicitly handed a logger repository, it should not log using log4j loggers.

If internal logging is a must, then consider modifying the utility methods to become instance methods instead of static.

OLD and BAD:

class Util { public static int convertToInt(String str) throws NumberFormatException { try { Integer.parseInt(str) } catch(NumberFormatException nfe) { LogManager.getLogger(Util.class).error("Failed conversion for ["+str"].", nfe); throw nfe; } } // methods without log statements can remain static public static SQLDialect getDialectFromCode(int dialectCode) { SQLDialect sqlDialect = null; switch (dialectCode) { case POSTGRES_DIALECT: sqlDialect = new PostgreSQLDialect(); break; case MYSQL_DIALECT: sqlDialect = new MySQLDialect(); break; } return sqlDialect; } }

NEW and GOOD:

class Util extends ComponentBase { // now an instance method public int convertToInt(String str) throws NumberFormatException { try { Integer.parseInt(str) } catch(NumberFormatException nfe) { LogManager.getLogger(Util.class).error("Failed conversion for ["+str"].", nfe); throw nfe; } } // methods without log statements can remain static public static SQLDialect getDialectFromCode(int dialectCode) { SQLDialect sqlDialect = null; switch (dialectCode) { case POSTGRES_DIALECT: sqlDialect = new PostgreSQLDialect(); break; case MYSQL_DIALECT: sqlDialect = new MySQLDialect(); break; } return sqlDialect; } }

Users of the convertToInt method in class Util will now have to create an instance of Util and set its repository.

BEFORE:

void foo(String str) { int i = Util.convertToInt(str); SQLDialect dialect = Util.getDialectFromCode(i); }

AFTER:

void foo(String str) { Util u = new Util(); u.setLoggerRepository(repository); int i = u.convertToInt(str); // can remain unchanged SQLDialect dialect = Util.getDialectFromCode(i); }

Guideline 3: For core components where logging is impossible, use error lists.

In case of core components such as Logger, RootLogger or Hierarchy, where the recursive implications of internal logging can be mind boggling, add an ErrorItem to the error list of the containing repository. In extremely rare situations (corresponding to a programming errors) where the logger repository has not been set yet, throw an IllegalStateException.

Guideline 4: If you can't be sure, choose between the lesser of two weevils, otherwise ask for help

Deciding on the correct error reporting pattern can sometimes be a daunting task. In case of doubt, ask for comment from the rest of the bunch who usually hang around at Moe's bar@.