Supporting the log4j `RepositorySelector` in Servlet Containers

The problem: Logging Separation

Since time immemorial users have struggled to control the logging configuration of multiple web-applications deployed on the same Servlet Container (e.g. Tomcat). If you think that in your environment you will always run one and only one web-application, then do yourself a favor, stop reading. This technical document is likely to be a waste of your time.

What does separation of logging mean? In a separated logging environment, each web-application can configure log4j in different ways such that the settings of one web-application do not interfere with the settings of another. A variant of this problem is the separation of web-application logging and the logging of the container itself.

Goals

The goal of this document is to achieve logging separation for web-applications. The following cases must be taken into consideration:

Servlet classes that are used in a single web-application (unshared servlets). More generally, libraries or classes that are used by one and only one web-application (unshared libraries).
Servlet classes that are used in multiple web-applications (shared servlets). More generally, libraries or classes that are shared between multiple web-applications (shared libraries).
Loggers which are class static variables.
Loggers which are instance variables of the containing class.
Loggers which are local variables of the containing class method.

In case logging separation cannot be achieved for a particular case, this must be well documented such that users can be aware of the potential problems and possibly avoid the troublesome case altogether.
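
The three ways of holding a logger listed above differ in when the logger lookup takes place, which matters most for classes shared between web-applications. The sketch below contains no log4j code: currentContext and lookup are hypothetical stand-ins for "the web-application executing right now" and for Logger.getLogger, and they only illustrate the timing.

```java
/** Stand-in demo (no log4j): when does a "logger" get bound to a logging context? */
final class LoggerLifetimeDemo {

    // Hypothetical stand-in for "whichever web-application is executing right now".
    static String currentContext = "app-A";

    // Hypothetical stand-in for Logger.getLogger(name): the result is bound
    // to the context in effect at the moment of the call.
    static String lookup(String name) {
        return currentContext + ":" + name;
    }

    /** A class shared by several web-applications. */
    static final class SharedService {
        // Resolved once, when the class is initialized.
        static final String STATIC_LOGGER = lookup("SharedService");

        // Resolved once per object, when the instance is constructed.
        final String instanceLogger = lookup("SharedService");

        // Resolved on every call.
        String localLogger() {
            return lookup("SharedService");
        }
    }

    public static void main(String[] args) {
        currentContext = "app-A";
        SharedService first = new SharedService();   // triggers class initialization
        currentContext = "app-B";
        SharedService second = new SharedService();

        System.out.println(SharedService.STATIC_LOGGER); // app-A:SharedService
        System.out.println(second.instanceLogger);       // app-B:SharedService
        System.out.println(first.localLogger());         // app-B:SharedService
    }
}
```

In this model a static logger in a shared class stays bound to whichever context initialized the class first, while a local variable follows the caller on every invocation.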

Another important goal of this document is to get Servlet Container developers to tackle the logging separation problem. Log4j provides support but the problem will not be solved unless container developers also participate in the discussion. If you are a Container developer and you think that this proposal does not address your needs, lacks detail or just plain sucks, then please do not hesitate to come forward. The log4j community is likely to listen carefully to what you have to say. We are reachable at log4j-dev@logging.apache.org.

Non Goals

This technical specification assumes that the user has adopted log4j as her logging API. It does not deal with the following:

Interactions with other logging APIs.
Interactions with the commons-logging API.

First Solution

Assuming each web-application is loaded by a distinct classloader, then placing a copy of log4j.jar under the WEB-INF/lib/ directory of each web-application will automatically result in distinct log4j-logging universes. Simply put, each web-application will load its own distinct copy of log4j classes into memory. All such copies are invisible and inaccessible to each other.

This solution is not too complicated to set up but has drawbacks:

Multiple copies of log4j.jar take more disk and memory space. On today's computers with huge disk spaces and memory, the waste of a few hundred kilobytes is hardly a serious issue.

The Java classloader delegation model gives precedence to parent classloaders. This means that if log4j.jar is available on the CLASSPATH, or under JAVA_HOME/jre/lib/ext or to any classloader which is a parent of the web-application's classloader, then that copy of log4j will be loaded into memory and shared by all web-applications.

Thus, this solution is brittle: its success depends on external factors. If your environment is not set up properly then the solution won't work. If the container itself uses log4j and makes it visible to web-applications, it won't work. In general, solutions depending on classloader tricks don't work very well. They are complicated and fragile. Most Java developers do not understand classloaders. They just don't. Dealing with classloader related problems requires that the developer understand classloaders as well as the classloader hierarchy of the particular container she is using. Different containers exhibit different class loading behaviors. In some cases, different versions of the same container behave differently.

Assuming you are lucky and you successfully set up a different log4j-logging environment for each web-application, then since the copies of the log4j classes are invisible to each other, they will also be invisible to any management entity. In other words, you will not be able to manage the different log4j instances from a single management console.
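
Whether a web-application is really using its own copy of a class can be checked by listing the classloader that defined the class together with all of its parents. The following is a minimal sketch using only the standard library; the class name LoaderChain is made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChain {

    /** Lists the defining classloader of the given class and all its parents, child first. */
    static List<String> of(Class<?> type) {
        List<String> chain = new ArrayList<>();
        ClassLoader loader = type.getClassLoader();
        while (loader != null) {
            chain.add(loader.getClass().getName());
            loader = loader.getParent();
        }
        // The bootstrap loader is represented by null in the API.
        chain.add("<bootstrap>");
        return chain;
    }

    public static void main(String[] args) {
        System.out.println(of(LoaderChain.class));
        System.out.println(of(String.class)); // [<bootstrap>]
    }
}
```

Called from inside a web-application as LoaderChain.of(org.apache.log4j.Logger.class), the first entry names the loader that actually supplied log4j. If that is not the web-application's own classloader, the copy under WEB-INF/lib/ is not the one in use.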

Second solution

Let me thank you for reading this far without falling asleep. Thank you. At this stage, give yourself a break, get a cup of coffee, check out today's Dilbert.

Nice to have you back.

In a nutshell, the second solution relies on the Servlet Container to keep track of the execution context and provide a different logging environment for each context. Put differently, the Servlet Container provides a separate hierarchy instance for each web-application. As you probably know, each logger object, a.k.a. category, that log4j creates is attached to a hierarchy. The Hierarchy class implements the LoggerRepository interface by arranging logger objects in a tree according to their name.

The Logger.getLogger() method is actually implemented as follows:

  static public Logger getLogger(String name) {
    return LogManager.getLogger(name);
  }

In other words, the Logger class simply calls the class static getLogger method in the LogManager class. The LogManager.getLogger() method is implemented as follows:

  public static Logger getLogger(String name) {
     // Delegate the actual manufacturing of the logger to the logger repository.
    return repositorySelector.getLoggerRepository().getLogger(name);
  }

The repositorySelector variable is a private class static variable of type RepositorySelector. The RepositorySelector interface boasts only one method: getLoggerRepository. The RepositorySelector interface is reproduced (in its entirety) below:

  package org.apache.log4j.spi;
  public interface RepositorySelector {
    public LoggerRepository getLoggerRepository();
  }

By default, the class static repositorySelector variable of the LogManager class is set to a trivial RepositorySelector implementation which always returns the same logger repository, which also happens to be a Hierarchy instance. What a coincidence, no? In other words, by default log4j will use one hierarchy, the default hierarchy. Obviously, you can override this behavior. The LogManager class has a setter method, namely the setRepositorySelector method, that can cause the LogManager class to use a different RepositorySelector implementation. Let us call this implementation, CRS, for Contextual Repository Selector.

CRS, or Contextual Repository Selector, is such that depending on the current execution context, it returns a different LoggerRepository instance. But since the getLoggerRepository() method takes no parameters how can it know the current execution context? The answer to this question depends on the Servlet Container. In Apache Tomcat for example, each web-application has its own classloader and Tomcat sets the Thread Context Classloader, or TCL, to be the classloader of the currently executing web-application.

Under this assumption our CRS can return a Hierarchy instance depending on the TCL. Below is a possible implementation of the CRS specifically designed for Tomcat.


package org.apache.tomcat.wombat;

import org.apache.log4j.spi.RepositorySelector;
import org.apache.log4j.spi.LoggerRepository;
import org.apache.log4j.spi.RootCategory;
import org.apache.log4j.Hierarchy;
import org.apache.log4j.Level;
import java.util.Hashtable;

public class CRS implements RepositorySelector {
  
  // key: current thread's ContextClassLoader, 
  // value: Hierarchy instance
  private Hashtable ht;
  
  public CRS() {
   ht = new Hashtable(); 
  }

  // The returned value is guaranteed to be non-null. The method is
  // synchronized so that two threads entering with the same TCL cannot
  // each create a Hierarchy for it.
  public synchronized LoggerRepository getLoggerRepository() {
    ClassLoader cl = Thread.currentThread().getContextClassLoader();
    Hierarchy hierarchy = (Hierarchy) ht.get(cl);

    if(hierarchy == null) {
      hierarchy = new Hierarchy(new RootCategory(Level.DEBUG));
      ht.put(cl, hierarchy);
    }
    return hierarchy;
  }

  /** 
   * The Container should remove the entry when the web-application
   * is removed or restarted.
   * */
  public void remove(ClassLoader cl) {
    ht.remove(cl); 
  } 
}
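
The effect of keying on the TCL can be tried out without a container and without log4j. In the sketch below, Repository is a hypothetical stand-in for Hierarchy, and two empty URLClassLoader instances play the role of two web-application classloaders. It illustrates the same idea as CRS.getLoggerRepository() but is not a drop-in replacement for it.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class TclSelectorDemo {

    /** Hypothetical stand-in for a log4j Hierarchy. */
    static final class Repository { }

    private static final Map<ClassLoader, Repository> REPOSITORIES =
            new ConcurrentHashMap<>();

    /** One repository per thread context classloader; assumes the TCL is non-null. */
    static Repository current() {
        ClassLoader tcl = Thread.currentThread().getContextClassLoader();
        // computeIfAbsent creates the repository at most once per key.
        return REPOSITORIES.computeIfAbsent(tcl, key -> new Repository());
    }

    /** Performs the lookup with the given TCL installed, then restores the previous one. */
    static Repository lookupAs(ClassLoader webappLoader) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(webappLoader);
        try {
            return current();
        } finally {
            thread.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) {
        ClassLoader tiger = new URLClassLoader(new URL[0]);
        ClassLoader lion = new URLClassLoader(new URL[0]);
        System.out.println(lookupAs(tiger) == lookupAs(tiger)); // true
        System.out.println(lookupAs(tiger) == lookupAs(lion));  // false
    }
}
```

As with CRS, the map holds strong references to the classloaders, so a container using this approach would still need to remove the entry when a web-application is stopped.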

The Servlet Container will set the repository selector to a CRS instance when it starts up. This is as simple as calling:

	Object guard = new Object();
	LogManager.setRepositorySelector(new CRS(), guard);

Thereafter, the repository selector can only be changed by supplying the guard. Those who do not know it cannot change the repository selector. Note that the CRS implementation is Container specific i.e. it is part of the Container, not log4j.
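
The guard behaviour described above can be modelled in a few lines. This is a sketch of the idea only, not a copy of the log4j source: the class GuardedSelectorHolder is hypothetical, the selector is represented by a plain Object, and the assumption is that the guard is compared by identity.

```java
final class GuardedSelectorHolder {

    private Object guard;     // null until a guard has been installed
    private Object selector;  // stand-in for a RepositorySelector

    /** Replaces the selector; once a guard is installed, only its holder may do so. */
    synchronized void setSelector(Object newSelector, Object callerGuard) {
        if (guard != null && guard != callerGuard) {
            throw new IllegalArgumentException("caller does not hold the guard");
        }
        if (newSelector == null) {
            throw new IllegalArgumentException("selector must be non-null");
        }
        guard = callerGuard;
        selector = newSelector;
    }

    synchronized Object getSelector() {
        return selector;
    }

    public static void main(String[] args) {
        GuardedSelectorHolder holder = new GuardedSelectorHolder();
        Object guard = new Object();
        holder.setSelector("container selector", guard);
        try {
            holder.setSelector("intruder", new Object());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(holder.getSelector()); // container selector
    }
}
```

The protection is only as good as the secrecy of the guard object, so the container should keep the reference private and never hand it to web-application code.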

JNDI variant I

A variant of the above solution relies on the structure of the JNDI name space. In J2EE environments, each web-application is guaranteed to have its own JNDI context relative to the java:comp/env context. In EJBs, each enterprise bean (albeit not each application) has its own context relative to the java:comp/env context.

For example, a web-application could configure its deployment descriptor by adding an env-entry specifying its logging context. As in,

    <web-app>
      <description>The deployment descriptor for the Tiger web-application</description>
      ... 
         
      <env-entry>
        <description>Sets the logging context for the Tiger web-app</description>
        <env-entry-name>logging-context</env-entry-name>
        <env-entry-value>TigerLoggingContext</env-entry-value>
        <env-entry-type>java.lang.String</env-entry-type>
      </env-entry>

      ....
    </web-app>

Once the env-entry is set, a repository selector can query the JNDI application context (the java:comp/env context) to look up the value of logging-context. The logging context of the web-application will depend on the value of logging-context.

Below is a simplified implementation of a JNDI-based repository selector.

package org.apache.X;

import org.apache.log4j.spi.RepositorySelector;
import org.apache.log4j.spi.LoggerRepository;
import org.apache.log4j.spi.RootCategory;
import org.apache.log4j.Hierarchy;
import org.apache.log4j.Level;
import java.util.Hashtable;

import javax.naming.InitialContext;
import javax.naming.Context;
import javax.naming.NameNotFoundException;
import javax.naming.NamingException;

/** JNDI based Repository selector */
public class JNDIRS implements RepositorySelector {
  
  // key: name of logging context, 
  // value: Hierarchy instance
  private Hashtable ht;
  private Hierarchy defaultHierarchy;
  
  public JNDIRS() {
   ht = new Hashtable(); 
   defaultHierarchy = new Hierarchy(new RootCategory(Level.DEBUG));
  }

  // the returned value is guaranteed to be non-null
  public LoggerRepository getLoggerRepository() {
    String loggingContextName = null;    

    try {
      Context ctx = new InitialContext();      
      loggingContextName = (String) ctx.lookup("java:comp/env/logging-context");
    } catch(NamingException ne) {      
      // we can't log here
    } 

    if(loggingContextName == null) {
      return defaultHierarchy;
    } else {
      Hierarchy h = (Hierarchy) ht.get(loggingContextName);
      if(h == null) {
        h = new Hierarchy(new RootCategory(Level.DEBUG));
        ht.put(loggingContextName, h);
      }
      return h;
    } 
  }
}
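
The lookup step of JNDIRS can be exercised on its own. The helper below is a minimal sketch (the class name LoggingContextLookup is made up) that returns null whenever the name cannot be resolved, which is the case in which JNDIRS falls back to its default hierarchy. It also guards against an entry that is not a String, an assumption about misconfigured deployments that the listing above does not handle.

```java
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;

final class LoggingContextLookup {

    static final String JNDI_NAME = "java:comp/env/logging-context";

    /** Returns the configured logging context name, or null if none can be resolved. */
    static String lookupOrNull() {
        try {
            Context ctx = new InitialContext();
            Object value = ctx.lookup(JNDI_NAME);
            // Ignore entries of an unexpected type instead of throwing ClassCastException.
            return (value instanceof String) ? (String) value : null;
        } catch (NamingException ne) {
            // No JNDI provider available, or no such entry: treat both as "not configured".
            return null;
        }
    }

    public static void main(String[] args) {
        // Outside a container there is normally no provider for the java: namespace.
        System.out.println(lookupOrNull());
    }
}
```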

JNDIRS is container independent. JNDIRS relies on a standard technology, namely JNDI. Servlet and EJB containers are obliged to support JNDI because the administrative resources of most J2EE applications depend on it. In other words, JNDIRS merely leverages existing infrastructure to provide separation of logging. Just as importantly, the JNDI space is shared by JSP, Servlets and EJBs belonging to the same application. JNDIRS will work under all application servers (e.g. JBoss, Weblogic, Websphere) or from within Servlet containers (e.g. Jetty, Resin, Tomcat) and even when Servlet containers are embedded within application servers.

JNDI variant II

Costin Manolache observed that the previous solution allowed a malevolent application to spoof the logging environment of another application by setting the same string value for java:comp/env/logging-context.

In other words, JNDIRS solves the voluntary logging separation problem but not the mandatory separation problem.

A container can prevent spoofing by malevolent applications by prefixing the name of the repository with either the application name or host name (in case multiple hosts live under the same container). Thus, if a given application desires to have a separate unspoofable logger repository, it will ask the container to do so in its deployment descriptor.

Here is a pseudo-implementation:

public class JNDIRS2 implements RepositorySelector {
   
  ... same as JNDIRS

  public LoggerRepository getLoggerRepository() {
    ... same as JNDIRS
   
    if(loggingContextName == null) {
      return defaultHierarchy;
    } else {
      if(mandatory separation for this application) {
        String applicationName = getApplicationNameThroughContainerMagic();
        loggingContextName = applicationName + loggingContextName;     
      }
      Hierarchy h = (Hierarchy) ht.get(loggingContextName);
      if(h == null) {
        h = new Hierarchy(new RootCategory(Level.DEBUG));
        ht.put(loggingContextName, h);
      }
      return h;
    } 
  }
}

Contrary to the previous case, JNDIRS2 requires support from the container.
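
One detail of the prefixing step deserves care: plain string concatenation maps the pairs ("ab", "c") and ("a", "bc") to the same key "abc". A possible refinement, which is an assumption of this sketch and not part of the original proposal, is to use the pair itself as the map key. Repository is again a hypothetical stand-in for Hierarchy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class ScopedRepositories {

    /** Hypothetical stand-in for a log4j Hierarchy. */
    static final class Repository { }

    // Key: [application name, logging context name]; lists compare element by element.
    private final Map<List<String>, Repository> byKey = new ConcurrentHashMap<>();

    /** Returns the repository for this application and context, creating it on first use. */
    Repository get(String applicationName, String loggingContextName) {
        List<String> key = Arrays.asList(applicationName, loggingContextName);
        return byKey.computeIfAbsent(key, k -> new Repository());
    }

    public static void main(String[] args) {
        ScopedRepositories repositories = new ScopedRepositories();
        // Both pairs would concatenate to "abc", yet they stay separate here.
        System.out.println(repositories.get("ab", "c") == repositories.get("a", "bc")); // false
        System.out.println(repositories.get("tiger", "ctx") == repositories.get("tiger", "ctx")); // true
    }
}
```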

Advantages of context-based repository selectors

One advantage of context-based repository selectors is that log4j users will continue to call the Logger.getLogger method in their code as usual, but their web-applications will use different hierarchy instances, which effectively achieves separation of logging per web-application. It does not matter if the log4j.jar file is on the CLASSPATH, in JAVA_HOME/jre/lib/ext/ or in the Container's "common" classloader. Moreover, web-applications will no longer need to add log4j.jar to their WEB-INF/lib directory.

There is another extremely important advantage. By controlling the logger repository the Servlet Container can also safely control the Logger implementation returned by the repository. The Container's particular Logger implementation can possess different characteristics:

it can impose stricter security,
it can be a NullLogger implementation in case logging is disabled for a given web-application,
it can transparently interact with the web-application's Container specific logging settings.

These implementations result respectively in higher security, better performance and better control.
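
The NullLogger case can be pictured with a small model. Everything in the sketch below is hypothetical: SimpleLogger is a deliberately tiny interface (log4j's own Logger is a class with a much larger API), and Repository stands in for a container-owned logger repository that decides which implementation an application receives.

```java
import java.util.ArrayList;
import java.util.List;

final class ContainerLoggers {

    /** Hypothetical minimal logger type used only for this illustration. */
    interface SimpleLogger {
        void info(String message);
    }

    /** Discards every message: the "NullLogger" case. */
    static final SimpleLogger NULL_LOGGER = message -> { };

    /** Stand-in for a per-application repository owned by the container. */
    static final class Repository {
        private final boolean loggingEnabled;
        private final List<String> recorded = new ArrayList<>();

        Repository(boolean loggingEnabled) {
            this.loggingEnabled = loggingEnabled;
        }

        /** The container, not the application, chooses the implementation. */
        SimpleLogger getLogger(String name) {
            if (!loggingEnabled) {
                return NULL_LOGGER;
            }
            return message -> recorded.add(name + ": " + message);
        }

        List<String> recorded() {
            return recorded;
        }
    }

    public static void main(String[] args) {
        Repository enabled = new Repository(true);
        Repository disabled = new Repository(false);
        enabled.getLogger("tiger").info("started");
        disabled.getLogger("lion").info("started");
        System.out.println(enabled.recorded());  // [tiger: started]
        System.out.println(disabled.recorded()); // []
    }
}
```

The application code is identical in both cases; only the repository handed out by the container differs.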
