How to change YARN scheduler configuration on AWS EMR?

2017-04-14T10:04:41

Unlike Hortonworks or Cloudera, AWS EMR does not seem to provide any GUI for changing the XML configurations of the various Hadoop ecosystem frameworks.

I logged into my EMR namenode and ran a quick

find / -iname yarn-site.xml

which showed yarn-site.xml at /etc/hadoop/conf.empty/yarn-site.xml and capacity-scheduler.xml at /etc/hadoop/conf.empty/capacity-scheduler.xml.

But note that these are under conf.empty, so I suspect they might not be the actual locations of the yarn-site and capacity-scheduler XML files.
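
One way to check this suspicion is to see whether /etc/hadoop/conf is a symlink and where it finally points. This is a minimal sketch, not a confirmed answer. It assumes the node has GNU readlink and that /etc/hadoop/conf is the directory the Hadoop daemons actually read; resolve_conf_dir is only an illustrative helper name.

```shell
# Hypothetical helper: print the real directory behind a Hadoop conf path.
# The default /etc/hadoop/conf is an assumption about where the active config lives.
resolve_conf_dir() {
  conf_path="${1:-/etc/hadoop/conf}"
  # readlink -f follows every symlink in the chain and prints the final target
  readlink -f "$conf_path"
}

# Only run the check when the assumed path exists on this machine
if [ -e /etc/hadoop/conf ]; then
  resolve_conf_dir /etc/hadoop/conf
fi
```

If this prints /etc/hadoop/conf.empty, the files found above are probably the ones in use despite the directory name.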

I understand that I can set these configurations when creating a cluster, but what I need to know is how to change them without tearing down the cluster.

I just want to play around with scheduling properties and try out different schedulers to identify what might work well with my Spark applications.

Thanks in advance!

Copyright License:
Author: Kumar Vaibhav. Reproduced under the CC BY-SA 4.0 copyright license with link to original source & disclaimer.
Link to: https://stackoverflow.com/questions/43404236/how-to-change-yarn-scheduler-configuration-on-aws-emr

