Canceling a YARN step in EMR

2020-12-31T07:38:17

I have a long-running YARN application on an EMR cluster. According to Canceling EMR Steps, a running step can be canceled with the command aws emr cancel-steps as long as Amazon EMR version 5.28.0 or later is used (which is the case for me). However, when I issue that command against my running step, it never kills the actual YARN application. I can see the step change its status to Canceled in the UI, but if I SSH into the cluster and execute yarn application -list, I can still see my application alive and well :) In the logs I see:

INFO waitProcessCompletion ended with exit code 137 : hadoop jar /var/lib/aws/emr/step-runner/hadoop-...
INFO total process run time: 344 seconds
2020-12-30T23:13:42.362Z INFO Step created jobs: 
2020-12-30T23:13:42.362Z WARN Step failed with exitCode 137 and took 344 seconds

Based on my understanding, this means that the container did receive a SIGKILL signal. Can someone advise why the application is still not being killed?
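As a quick check of that reading of the exit code: shells conventionally report a process killed by a signal as 128 plus the signal number, so 137 corresponds to signal 9. The sketch below decodes it; the function name signal_from_exit_code is hypothetical and only for illustration, and it assumes the 128+N convention applies to the exit code in the step log.

```shell
# Decode a shell-style exit status into a signal name.
# Assumes the common "128 + signal number" convention.
signal_from_exit_code() {
  code=$1
  if [ "$code" -gt 128 ]; then
    # kill -l <number> prints the name of that signal
    kill -l "$((code - 128))"
  else
    echo "not a signal exit"
  fi
}

signal_from_exit_code 137
```

Note that this only says which signal ended the process whose exit code was logged (here the hadoop jar command); it says nothing about the state of the YARN application itself.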

P.S. I am using the TERMINATE_PROCESS cancellation option when executing the cancel-steps command.
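A minimal sketch of the sequence described above, for anyone trying to reproduce it. The cluster and step IDs are placeholders, and the helper names cancel_step and extract_app_ids are hypothetical; only the aws emr cancel-steps and yarn application -list commands come from the question.

```shell
# Placeholders (hypothetical): substitute your own cluster and step IDs.
CLUSTER_ID="j-XXXXXXXXXXXXX"
STEP_ID="s-XXXXXXXXXXXXX"

# Cancel the step, asking EMR to terminate the step process.
cancel_step() {
  aws emr cancel-steps \
    --cluster-id "$CLUSTER_ID" \
    --step-ids "$STEP_ID" \
    --step-cancellation-option TERMINATE_PROCESS
}

# Pull YARN application IDs (application_<timestamp>_<sequence>)
# out of `yarn application -list` output read from stdin.
extract_app_ids() {
  grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | sort -u
}

# Usage, from a machine with AWS credentials and then on the cluster:
#   cancel_step
#   yarn application -list | extract_app_ids
```

If the second command still prints an application ID after the step shows as Canceled, that matches the behavior described in this question.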

Thank you!

Copyright License:
Author:「user3693309」,Reproduced under the CC 4.0 BY-SA copyright license with link to original source & disclaimer.
Link to:https://stackoverflow.com/questions/65514902/canceling-a-yarn-step-in-emr
