Mailing list archives

From Harish JP <...@hortonworks.com>
Subject Re: Tez GC issues perhaps? not sure.
Date Wed, 14 Dec 2016 02:37:13 GMT
AFAIK, HS2 uses a pool of AMs and submits each query to any free AM. There should be configs which
control the number of free AMs, the timeout and so on for the pool used by HS2.
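
A rough sketch (not from the original message) of how one might list those pool settings from a hive-site.xml. It assumes the relevant properties share the `hive.server2.tez.` prefix, for example `hive.server2.tez.default.queues` and `hive.server2.tez.sessions.per.default.queue`; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the HS2 Tez session-pool settings live under this prefix.
POOL_PREFIX = "hive.server2.tez."


def hs2_tez_pool_settings(xml_text):
    """Return {name: value} for every property whose name starts with POOL_PREFIX."""
    settings = {}
    root = ET.fromstring(xml_text)
    # Hadoop-style config: <configuration><property><name/><value/></property>...
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name.startswith(POOL_PREFIX):
            settings[name] = (prop.findtext("value") or "").strip()
    return settings
```

Pass it the text of the hive-site.xml that HS2 actually loads; properties left at their defaults will not appear in the file, so an empty result does not mean the pool is disabled.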


On 14-Dec-2016, at 7:54 AM, Stephen Sprague <spragues@gmail.com> wrote:

interesting.... thank you.   pretty sure they are being submitted through the HS2 service.

On Tue, Dec 13, 2016 at 5:21 PM, Harish JP <hjp@hortonworks.com> wrote:
Hi Stephen,

How are you starting these jobs: beeline, hive-cli, ...?  It looks like they are being started
in session mode, which means the AM waits for 5 minutes (the default value) for a new DAG/query
to be submitted; if it does not receive one, it will time out and shut down. The config for
this is tez.session.am.dag.submit.timeout.secs.
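
To check which value is set, here is a minimal sketch (not part of the original message) that reads the property from a Hadoop-style tez-site.xml. The function name is hypothetical, and the 300-second fallback simply mirrors the 5-minute default mentioned above.

```python
import xml.etree.ElementTree as ET

PROP = "tez.session.am.dag.submit.timeout.secs"
DEFAULT_SECS = 300  # the 5-minute default mentioned above


def session_timeout_secs(xml_text):
    """Return the timeout set in Hadoop-style config XML, or the default if unset."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == PROP:
            value = (prop.findtext("value") or "").strip()
            if value:
                return int(value)
    return DEFAULT_SECS
```

Feed it the contents of whichever tez-site.xml the job picks up. A value overridden elsewhere (for instance in the Hive configuration or per session) would not show up in this file.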

—
Thanks,
Harish

On 14-Dec-2016, at 6:13 AM, Stephen Sprague <spragues@gmail.com> wrote:

hey guys,
got a slightly weird issue here.   Tez runs great. :)  client completes in a short amount
of time (5 minutes) but - and here's the gotcha -  the tez server side process takes upwards
of an hour to clear out of the RM.

This is a problem for us since the queue it's in has maxRunning set to 15 and these jobs are
just squatting holding slots.

The thing is... why?  i'm wondering if it isn't some kind of GC going on but i'm not sure how to
diagnose it.  i can log on to a DN and cat stderr but it's not particularly useful to me; i
can pass it along if desired.

Here's a screenshot of the "squatters":


<image.png>

all have one container that the histogram shows at 100%. And the client completed an hour
ago! that's the part i don't get.

Any other output and/or configs to pass along?  Tez v0.8.4, hive v2.1.0.

Much appreciated,
Stephen


