Wednesday, September 13, 2006

Splunk Integration with Log4j

Hey, I'm back with some new tips on Splunk integration. This time, I'm documenting how to hook up an existing log4j-based logging system with Splunk. This post details some of the work I've been doing in my development environment lately and includes a great quote from a co-worker who was impressed with my demo of the new setup.

Let's get to it.

First, what is log4j? Log4j is an open source logging library from the Apache Software Foundation.

It is used by the majority of Java application servers these days, including Tomcat and JBoss. I covered Splunk integration with JBoss in an earlier post.

Now I'm on to Tomcat, a slightly lighter weight web application server. Another difference from the JBoss example, and the reason for this post, is that the Tomcat server does not live on the same machine as Splunk. In this case, getting the logs from Tomcat into Splunk is a bit trickier.

Luckily, the good folks at Splunk have solved this problem for me already. They have provided a log4j TCP appender: a bit of client code, in the form of a jar file, that gets installed on the Tomcat side. Combined with a configuration element, this code sends log4j traffic in real time to the Splunk host over a TCP connection, by default on port 9995.
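Conceptually, the appender just opens a socket to the Splunk host and writes one formatted event per line. Here's a minimal sketch of that mechanism in plain Java. This is illustrative only, not the code inside Splunk's jar; the class name and event text are made up, and a local ServerSocket stands in for the Splunk listener:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class TcpLogSketch {
    // A formatted event, in the same shape the ConversionPattern below produces.
    static final String EVENT = "[2006-09-13 12:00:00] coreservices INFO : demo event";

    // Ship one formatted event over TCP, one line per event, the way a
    // line-oriented log4j TCP appender would.
    static void send(String host, int port, String line) throws Exception {
        try (Socket s = new Socket(host, port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            out.println(line); // autoflush pushes the event out immediately
        }
    }

    // Stand in for the Splunk listener with a local ServerSocket and
    // return the single line it receives.
    static String roundTrip() throws Exception {
        try (ServerSocket listener = new ServerSocket(0)) {
            int port = listener.getLocalPort();
            Thread sender = new Thread(() -> {
                try {
                    send("localhost", port, EVENT);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            sender.start();
            try (Socket client = listener.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()))) {
                String received = in.readLine();
                sender.join();
                return received;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip());
    }
}
```

The real appender keeps a long-lived connection and handles reconnects; the sketch opens one connection per event just to keep it short.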

Here's the configuration stanza I added on my Tomcat server:

log4j.rootLogger=INFO, SplunkAppender

#SplunkAppender - Sends events to port on Splunk box
log4j.appender.SplunkAppender.layout.ConversionPattern=[%d{yyyy-MM-dd HH:mm:ss}] coreservices %-5p: %m%n
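That stanza is abbreviated; the lines that bind the appender to its class and point it at the Splunk host aren't shown. A complete version would look something like the sketch below. Note that `com.splunk.logging.SplunkSocketAppender` is a placeholder class name and `splunk.example.com` a placeholder host; use the actual class shipped in the jar Splunk provides:

```properties
log4j.rootLogger=INFO, SplunkAppender

# SplunkAppender - sends events to the TCP listener on the Splunk box.
# The appender class below is a placeholder; substitute the class from
# the Splunk-provided jar.
log4j.appender.SplunkAppender=com.splunk.logging.SplunkSocketAppender
log4j.appender.SplunkAppender.RemoteHost=splunk.example.com
log4j.appender.SplunkAppender.Port=9995
log4j.appender.SplunkAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.SplunkAppender.layout.ConversionPattern=[%d{yyyy-MM-dd HH:mm:ss}] coreservices %-5p: %m%n
```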

Thanks, nhauser, for the config snippet.

After setting up the client side, move on to the Splunk server, where you'll need to find and enable the listener input module in the config.xml file.

A few firewall ACL changes later, I was in business. A new source showed up on my Splunk server, source::tcp:9995, corresponding to traffic on port 9995 from all my Tomcat servers. I can see the data broken out by host; in my current configuration, I have five hosts using this data path.

Another cool feature: the host portion of the Splunk event metadata is populated by reverse DNS lookup. Good thing I run the DNS, so this works flawlessly. I noticed one small configuration element that I needed to tweak, namely adding some detail about which web application under Tomcat was generating the log entry. This way, I have a searchable way to distinguish my various web apps.
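One way to do that tagging is log4j's mapped diagnostic context (MDC): each web app puts its own name into the MDC, and the pattern layout pulls it into every event with %X. The key name `webapp` and the app name below are my own choices, not anything Splunk requires:

```properties
# Each web app calls org.apache.log4j.MDC.put("webapp", "storefront")
# on its request threads; %X{webapp} then emits that value per event.
log4j.appender.SplunkAppender.layout.ConversionPattern=[%d{yyyy-MM-dd HH:mm:ss}] coreservices %X{webapp} %-5p: %m%n
```

With that in place, the app name becomes a searchable token in every event.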

In all, I'm very happy with the new setup. It beats the pants off the old approach of local files tailed by logminion and sent via syslog. By using a native log4j logging appender, I have the ability to keep multiline Java logs intact, which is a huge improvement. Also, this solution feels cleaner. I don't have the duplicate timestamps I was seeing before, and I'm not dependent on a third piece of software to complete the flow of data.

To get the configurations just right, I brought in a friend and co-worker who specializes in Java development. When I showed him the configuration I had going, he replied:

"This is really cool. You just did the work of five ops guys in my previous company. They tried for years to do something like this and got nowhere near as cool as this."


jrodman said...

This is a pretty interesting solution, but unfortunately hasn't really been maintained to work with modern versions of Splunk.

For modern Splunk users arriving here, I'd recommend either the syslog appender or logging to files that feed Splunk forwarders.