Mailing list archives

From Stephen Sprague <sprag...@gmail.com>
Subject Re: Tez GC issues perhaps? not sure.
Date Wed, 14 Dec 2016 02:24:58 GMT
interesting.... thank you.   pretty sure they are being submitted through
the HS2 service.

On Tue, Dec 13, 2016 at 5:21 PM, Harish JP <hjp@hortonworks.com> wrote:

> Hi Stephen,
>
> How are you starting these jobs: beeline, hive-cli, ...?  It looks like
> they are being started in session mode, which means the AM waits 5
> minutes (the default) for a new DAG/query to be submitted; if it does not
> receive one, it times out and shuts down. The config for this is
> tez.session.am.dag.submit.timeout.secs.
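>
> As a rough sketch only, assuming the property is set cluster-wide in
> tez-site.xml (the file placement and the value of 60 seconds are
> illustrative assumptions, not settings taken from your cluster), a
> shorter idle timeout might look like this:
>
> ```xml
> <!-- tez-site.xml: hypothetical override, value chosen only for illustration -->
> <property>
>   <name>tez.session.am.dag.submit.timeout.secs</name>
>   <!-- seconds the session AM waits for a new DAG before shutting down -->
>   <value>60</value>
> </property>
> ```
>
> Whether a lower value is appropriate depends on how the sessions are
> created and reused, so treat the number as a placeholder.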
>
> —
> Thanks,
> Harish
>
> On 14-Dec-2016, at 6:13 AM, Stephen Sprague <spragues@gmail.com> wrote:
>
> hey guys,
> got a slightly weird issue here.   Tez runs great. :)  client completes in
> a short amount of time (5 minutes) but - and here's the gotcha -  the tez
> server side process takes upwards of an hour to clear out of the RM.
>
> This is a problem for us since the queue it's in has maxRunning set to 15
> and these jobs are just squatting holding slots.
>
> The thing is... why?  i'm wondering if it isn't some kind of GC going on
> but not sure how to diagnose.  i can log on to a DN and cat stderr but it's
> not particularly useful to me; i can pass it along if desired.
>
> Here's a screenshot of the "squatters":
>
>
> <image.png>
>
> all have one container that the histogram shows at 100%. And the client
> completed an hour ago! that's the part i don't get.
>
> Any other output and/or configs to pass along?  Tez v0.8.4, hive v2.1.0.
>
> Much appreciated,
> Stephen
>
>
