EMR - Use custom logging appender in Hadoop (and YARN)

2015-12-03T16:40:53

In our EMR clusters, we use custom log4j appenders and a custom log4j.properties so that we can forward logs to Splunk and do some magic that the provided libraries and configurations don't know how to do.

In EMR 3.x we did this with a bootstrap action that did the following:

  1. Download from S3 our custom log4j appender jar, plus the log4j.properties and container-log4j.properties files that we customized.
  2. Put our custom log4j appender jar into the YARN lib directory at /home/hadoop/share/hadoop/yarn/lib/.
  3. Update the Hadoop classpath to include our custom log4j appender.
  4. Push our modified container-log4j.properties into hadoop-yarn-server-nodemanager.jar at /home/hadoop/share/hadoop/yarn/.

All this worked and allowed us to use our appender across all the Hadoop processes.
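
For reference, the four steps above can be sketched as a shell function. This is only an illustration under stated assumptions, not our actual script: the S3 location, the appender jar file name, the scratch directory, and the use of hadoop-env.sh to extend the classpath are all hypothetical. Only the YARN paths and the jar uf call come from the description above.

```shell
#!/bin/sh
# Sketch of the EMR 3.x bootstrap steps listed above; not the original script.
set -eu

# Hypothetical values: the question gives neither the bucket nor the jar name.
S3_PREFIX="${S3_PREFIX:-s3://example-bucket/logging}"
APPENDER_JAR="${APPENDER_JAR:-custom-log4j-appender.jar}"

# Path taken from the question (EMR 3.x layout).
YARN_HOME="${YARN_HOME:-/home/hadoop/share/hadoop/yarn}"

# Assumptions: a scratch directory and the location of hadoop-env.sh.
WORK_DIR="${WORK_DIR:-/tmp/custom-logging}"
HADOOP_ENV="${HADOOP_ENV:-/home/hadoop/conf/hadoop-env.sh}"

install_custom_logging() {
  mkdir -p "$WORK_DIR"

  # 1. Download the appender jar and both properties files from S3.
  #    The question does not say where log4j.properties was installed,
  #    so it is simply left in WORK_DIR here.
  for f in "$APPENDER_JAR" log4j.properties container-log4j.properties; do
    aws s3 cp "$S3_PREFIX/$f" "$WORK_DIR/$f"
  done

  # 2. Put the appender jar into the YARN lib directory.
  cp "$WORK_DIR/$APPENDER_JAR" "$YARN_HOME/lib/"

  # 3. Add the appender to the Hadoop classpath (assumed mechanism:
  #    appending an export line to hadoop-env.sh).
  echo "export HADOOP_CLASSPATH=\"\$HADOOP_CLASSPATH:$YARN_HOME/lib/$APPENDER_JAR\"" >> "$HADOOP_ENV"

  # 4. Replace container-log4j.properties inside the NodeManager jar.
  #    Running from WORK_DIR keeps the entry at the root of the jar.
  (cd "$WORK_DIR" && jar uf "$YARN_HOME/hadoop-yarn-server-nodemanager.jar" container-log4j.properties)
}
```

A bootstrap action built on this sketch would call install_custom_logging as its last line. Every path can be overridden through the environment, which makes it possible to try the function against a scratch directory first.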

We tried to upgrade to EMR release 4.x, but we noticed a major change: bootstrap actions are now executed before hadoop-yarn is installed, so the path /usr/lib/hadoop-yarn/ doesn't exist yet. As a result, there is no hadoop-yarn-server-nodemanager.jar to modify (we modify the jar using the command: jar uf /usr/lib/hadoop-yarn/hadoop-yarn-server-nodemanager.jar container-log4j.properties) and no lib folder in which to place our custom log4j appender.
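
The missing installation can be confirmed from inside a bootstrap action with a small check like the one below. It is a minimal sketch: the function names are made up for illustration, and the lib subdirectory under /usr/lib/hadoop-yarn is an assumption about the 4.x layout. It only reports the state and does not work around the problem.

```shell
#!/bin/sh
# Sketch: report whether hadoop-yarn is already installed when this runs.

# Path and jar name taken from the question; override YARN_DIR to test elsewhere.
YARN_DIR="${YARN_DIR:-/usr/lib/hadoop-yarn}"
NM_JAR="hadoop-yarn-server-nodemanager.jar"

# Succeeds only if both the NodeManager jar and a lib directory are present.
# The lib subdirectory is an assumed location, not confirmed by the question.
yarn_ready() {
  [ -d "$YARN_DIR/lib" ] && [ -f "$YARN_DIR/$NM_JAR" ]
}

# Prints "present" or "missing" so the result shows up in the bootstrap log.
report_yarn_state() {
  if yarn_ready; then
    echo "present"
  else
    echo "missing"
  fi
}
```

Calling report_yarn_state at the start of a bootstrap action makes it visible in the bootstrap logs whether the jar and lib directory existed at that point.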

How can we make these changes in EMR 4.x, to allow our custom logging?

Copyright License:
Author: summerbulb. Reproduced under the CC BY-SA 4.0 license with a link to the original source and disclaimer.
Link to: https://stackoverflow.com/questions/34061287/emr-use-custom-logging-appender-in-hadoop-and-yarn
