By default, YARN does not aggregate logs until an application finishes, which is a problem for streaming jobs that never terminate.
A workaround is to use rsyslog, which is available on most Linux machines.
First, allow incoming UDP requests by uncommenting the following lines in /etc/rsyslog.conf:
$ModLoad imudp
$UDPServerRun 514
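Before touching the log4j configuration, it can help to check that the UDP listener actually receives messages. The following is a minimal sketch, not part of the original setup: the function names `build_syslog_packet` and `send_test_message` are hypothetical, and the payload assumes the classic BSD syslog form `<PRI>tag: message`, where PRI is facility * 8 + severity.

```python
import socket


def build_syslog_packet(tag, message, facility=1, severity=6):
    """Build a BSD-style syslog payload of the form <PRI>tag: message."""
    # PRI = facility * 8 + severity; facility 1 is user-level, severity 6 is informational.
    priority = facility * 8 + severity
    return f"<{priority}>{tag}: {message}".encode("utf-8")


def send_test_message(host, port=514, tag="myTag", message="rsyslog UDP test"):
    """Send a single datagram to the rsyslog UDP listener and return the payload."""
    packet = build_syslog_packet(tag, message)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))
    return packet
```

Calling `send_test_message("10.10.10.102")` from another machine should make the test line appear in the syslog output of the receiving host. UDP is fire-and-forget, so the call does not report an error when nothing is listening; the only confirmation is the entry in the log.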
Edit your log4j.properties (see the other examples on this page) to use SyslogAppender:
log4j.rootLogger=INFO, file
# TODO: change package logtest to your package
log4j.logger.logtest=INFO, SYSLOG
# Log everything at INFO level and above to the given file
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.file=${log.file}
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=bbdata: %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
# suppress the irrelevant (wrong) warnings from the netty channel handler
log4j.logger.org.jboss.netty.channel.DefaultChannelPipeline=ERROR, file
# rsyslog
# configure Syslog facility SYSLOG appender
# TODO: replace host and myTag by your own
log4j.appender.SYSLOG=org.apache.log4j.net.SyslogAppender
log4j.appender.SYSLOG.syslogHost=10.10.10.102
log4j.appender.SYSLOG.port=514
#log4j.appender.SYSLOG.appName=bbdata
log4j.appender.SYSLOG.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.SYSLOG.layout.conversionPattern=myTag: [%p] %c:%L - %m %throwable %n
The layout is important because rsyslog treats a newline as the start of a new log entry. With the pattern above, newlines (in stack traces, for example) are skipped. If you really want multiline or tabbed logs to work "normally", edit rsyslog.conf and add:
$EscapeControlCharactersOnReceive off
The myTag: prefix at the beginning of the conversionPattern is useful if you want to redirect all your logs into a specific file. To do that, edit rsyslog.conf and add the following rule:
if $programname == 'myTag' then /var/log/my-app.log
& stop
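To illustrate why the prefix matters, here is a simplified model of the rule above. It is only a sketch: it assumes the program name is the leading part of the message up to the first `:`, `[` or whitespace, which approximates but does not reproduce rsyslog's own parser. The helper names `program_name` and `route` are hypothetical.

```python
import re


def program_name(message):
    """Return the leading tag, up to the first ':', '[' or whitespace."""
    match = re.match(r"[^:\[\s]+", message)
    return match.group(0) if match else ""


def route(message, tag="myTag", target="/var/log/my-app.log"):
    """Mimic the rule: matching messages go to target, all others fall through."""
    return target if program_name(message) == tag else None
```

With this model, `route("myTag: [INFO] logtest.Main:42 - started")` yields the dedicated file, while a line from another program such as `sshd[1234]: ...` falls through to the remaining rules. This is why the tag in the conversionPattern and the tag in the rsyslog rule have to match exactly.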