Yarn queue capacity not working as expected for CORE nodes on EMR (emr-5.26.0)

2020-03-13T17:07:02

Use case: create two YARN queues, Q1 and Q2, with the configuration below.

[
  {
    "Classification": "capacity-scheduler",
    "Properties": {
      "yarn.scheduler.capacity.root.queues": "Q1,Q2",
      "yarn.scheduler.capacity.root.Q1.capacity": "60",
      "yarn.scheduler.capacity.root.Q2.capacity": "40",
      "yarn.scheduler.capacity.root.Q1.accessible-node-labels": "*",
      "yarn.scheduler.capacity.root.Q2.accessible-node-labels": "*",
      "yarn.scheduler.capacity.root.Q1.accessible-node-labels.CORE.capacity": "60",
      "yarn.scheduler.capacity.root.Q2.accessible-node-labels.CORE.capacity": "40",
      "yarn.scheduler.capacity.root.Q1.accessible-node-labels.CORE.maximum-capacity": "60"
    }
  },
  {
    "Classification": "yarn-site",
    "Properties": {
      "yarn.acl.enable": "true",
      "yarn.resourcemanager.scheduler.class": "org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler"
    }
  }
]
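As a sanity check on the configuration above, here is a minimal Python sketch that sums the configured capacity of the queues under root, once for the default partition and once for each node label. The Capacity Scheduler expects sibling capacities to add up to 100 in every partition. The helper name `capacity_sums` is made up for this illustration, and the property values are copied from the config in this question.

```python
from collections import defaultdict

PREFIX = "yarn.scheduler.capacity."
LABEL_PREFIX = "accessible-node-labels."

# Properties copied from the capacity-scheduler classification in the question.
PROPS = {
    "yarn.scheduler.capacity.root.queues": "Q1,Q2",
    "yarn.scheduler.capacity.root.Q1.capacity": "60",
    "yarn.scheduler.capacity.root.Q2.capacity": "40",
    "yarn.scheduler.capacity.root.Q1.accessible-node-labels": "*",
    "yarn.scheduler.capacity.root.Q2.accessible-node-labels": "*",
    "yarn.scheduler.capacity.root.Q1.accessible-node-labels.CORE.capacity": "60",
    "yarn.scheduler.capacity.root.Q2.accessible-node-labels.CORE.capacity": "40",
    "yarn.scheduler.capacity.root.Q1.accessible-node-labels.CORE.maximum-capacity": "60",
}


def capacity_sums(props):
    """Sum the configured capacity of root's direct children per partition.

    The key "" stands for the default (unlabelled) partition; every other
    key is a node label such as "CORE". Hypothetical helper, not a YARN API.
    """
    sums = defaultdict(float)
    for queue in props[PREFIX + "root.queues"].split(","):
        base = f"{PREFIX}root.{queue.strip()}."
        for key, value in props.items():
            if not key.startswith(base):
                continue
            suffix = key[len(base):]
            if suffix == "capacity":
                sums[""] += float(value)
            elif suffix.startswith(LABEL_PREFIX) and suffix.endswith(".capacity"):
                # Matches "<label>.capacity" but not "<label>.maximum-capacity".
                label = suffix[len(LABEL_PREFIX):-len(".capacity")]
                sums[label] += float(value)
    return dict(sums)


if __name__ == "__main__":
    for partition, total in capacity_sums(PROPS).items():
        print(partition or "<default>", total)
```

For the properties in this question both partitions sum to 100, so the capacity split itself looks consistent and the surprise lies in how the maximum is enforced.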

Expected behavior: Q1 should not use more than 60% of the CORE nodes, so that 40% is always available for Q2. See the YARN documentation for the queue properties; a book I consulted also covers the maximum-capacity setting.

Actual behavior: Q1 uses more than 60%, i.e. "Absolute Used Capacity" for the queue "Q1" is greater than the "Absolute Configured Max Capacity".
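To check this outside the web UI, here is a minimal Python sketch that compares each queue's used capacity against its maximum in a response from the ResourceManager REST endpoint `/ws/v1/cluster/scheduler`. The helper name `queues_over_max` and the numbers in `SAMPLE` are made up for illustration; they are not measurements from my cluster. It reads only the top-level `absoluteUsedCapacity` and `absoluteMaxCapacity` fields of each queue. Whether per-label figures are exposed depends on the Hadoop version, so that part is not covered here.

```python
def queues_over_max(scheduler_response):
    """Return (queueName, used, max) for every queue above its maximum.

    `scheduler_response` is the parsed JSON body of
    GET http://<resourcemanager>:8088/ws/v1/cluster/scheduler
    Hypothetical helper for illustration; only top-level queues are checked.
    """
    info = scheduler_response["scheduler"]["schedulerInfo"]
    over = []
    for queue in info.get("queues", {}).get("queue", []):
        used = queue["absoluteUsedCapacity"]
        limit = queue["absoluteMaxCapacity"]
        if used > limit:
            over.append((queue["queueName"], used, limit))
    return over


# Made-up payload shaped like the scheduler response (values are illustrative).
SAMPLE = {
    "scheduler": {
        "schedulerInfo": {
            "type": "capacityScheduler",
            "queueName": "root",
            "queues": {
                "queue": [
                    {
                        "queueName": "Q1",
                        "absoluteCapacity": 60.0,
                        "absoluteMaxCapacity": 60.0,
                        "absoluteUsedCapacity": 72.5,
                    },
                    {
                        "queueName": "Q2",
                        "absoluteCapacity": 40.0,
                        "absoluteMaxCapacity": 100.0,
                        "absoluteUsedCapacity": 10.0,
                    },
                ]
            },
        }
    }
}

if __name__ == "__main__":
    for name, used, limit in queues_over_max(SAMPLE):
        print(f"{name}: used {used}% exceeds max {limit}%")
```

Against a live cluster, the same function could be fed the JSON fetched from the ResourceManager instead of `SAMPLE`.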

This does not match the YARN documentation. I would like to understand the cause of this behavior and any alternative solutions.

Update 1: This problem seems to affect only the CORE nodes. If I specify the property below, it works as expected for task nodes:
yarn.scheduler.capacity.root.Q1.maximum-capacity: 60

EMR places CORE nodes under a YARN node label named CORE, which it creates by default. See the documentation on YARN node labels and on EMR 5.19.0's use of the node label feature. IMHO, when the YARN node label feature is used for CORE nodes, EMR is either overriding this configuration or has broken it for CORE nodes.

Copyright License:
Author: san. Reproduced under the CC BY-SA 4.0 license with a link to the original source and disclaimer.
Link to: https://stackoverflow.com/questions/60667585/yarn-queue-capacity-not-working-as-expected-for-core-nodes-on-emr-emr-5-26-0
