From Stephen Sprague <>
Subject Re: Tez GC issues perhaps? not sure.
Date Wed, 14 Dec 2016 02:34:38 GMT
i didn't mean to hit send just yet.

well we are seeing these sessions sitting around for over an hour - yet i
don't see that config set so perhaps the default 5 minutes might not be in
play in my case.  settings i do see are:

set                      : hive.cli.tez.session.async=true
set                      : hive.convert.join.bucket.mapjoin.tez=false
set                      :
set                      : hive.merge.tezfiles=false
set                      :
set                      : hive.server2.tez.session.lifetime=162h
set                      : hive.server2.tez.session.lifetime.jitter=3h
set                      : hive.server2.tez.sessions.init.threads=16
set                      : hive.server2.tez.sessions.per.default.queue=1
set                      :
set                      : hive.tez.bucket.pruning=false
set                      : hive.tez.bucket.pruning.compat=true
set                      : hive.tez.container.size=-1
set                      : hive.tez.cpu.vcores=-1
set                      : hive.tez.dynamic.partition.pruning=true
set                      :
set                      :
set                      : hive.tez.enable.memory.manager=true
set                      : hive.tez.exec.inplace.progress=true
set                      : hive.tez.exec.print.summary=false
set                      :
set                      : hive.tez.input.generate.consistent.splits=true
set                      : hive.tez.log.level=INFO
set                      : hive.tez.max.partition.factor=2.0
set                      : hive.tez.min.partition.factor=0.25

anything dumb set above?  the first one, hive.cli.tez.session.async=true,
strikes me as why the client finished yet the "server-side" session was
still alive.
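
to narrow a dump like the one above down to just the session-related settings, something like the following might help. This is only a sketch: `parse_settings` and `session_settings` are hypothetical helper names, the input is assumed to be the plain `set : key=value` text shown above, and lines with nothing after the colon are skipped rather than guessed at.

```python
import re

# Matches lines such as "set      : hive.foo.bar=value".
# Lines with nothing after the colon do not match and are ignored.
SET_LINE = re.compile(r"^set\s*:\s*(?P<key>[\w.]+)=(?P<value>.*)$")


def parse_settings(dump):
    """Turn a 'set : key=value' dump into a dict (hypothetical helper)."""
    settings = {}
    for line in dump.splitlines():
        match = SET_LINE.match(line.strip())
        if match:
            settings[match.group("key")] = match.group("value").strip()
    return settings


def session_settings(settings):
    """Keep only the keys whose name mentions 'session'."""
    return {k: v for k, v in settings.items() if "session" in k}


# A few lines copied from the dump above, including one empty entry.
dump = """\
set                      : hive.cli.tez.session.async=true
set                      :
set                      : hive.server2.tez.session.lifetime=162h
set                      : hive.server2.tez.session.lifetime.jitter=3h
set                      : hive.server2.tez.sessions.init.threads=16
set                      : hive.server2.tez.sessions.per.default.queue=1
set                      : hive.tez.container.size=-1
"""

for key, value in sorted(session_settings(parse_settings(dump)).items()):
    print(f"{key} = {value}")
```

this only filters by name; it says nothing about which of these settings, if any, is actually responsible for the lingering sessions.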

i'll dig deeper into this "session mode" as well, since the client i'm using
is home grown. it sends the sql over the wire on port 10001, but does so in
an "async" fashion so that it can query the HS2 logs while the query is
running. maybe something in that logic is getting tweaked out.
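
one thing worth checking in a client like that is whether the polling loop always releases its handle, even when log fetching fails partway through. The sketch below shows the general shape only. `FakeOperation` is a made-up stand-in and not any real HS2 client API, and `run_and_tail` is a hypothetical name. Whether closing the handle has any effect on the Tez AM lifetime is an open question here, not something this thread has established.

```python
import time


class FakeOperation:
    """Hypothetical stand-in for an async query handle; not a real client API."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done
        self.closed = False

    def is_running(self):
        # Pretend the query finishes after a fixed number of polls.
        self._remaining -= 1
        return self._remaining > 0

    def fetch_logs(self):
        return ["log line"]

    def close(self):
        self.closed = True


def run_and_tail(operation, on_log, poll_interval=0.0):
    """Poll an async operation for logs and always close it afterwards."""
    try:
        while operation.is_running():
            for line in operation.fetch_logs():
                on_log(line)
            time.sleep(poll_interval)
    finally:
        # Runs on normal completion and on any exception raised above.
        operation.close()


op = FakeOperation(polls_until_done=3)
run_and_tail(op, print)
```

the `try`/`finally` is the point of the example: the handle is closed whether the loop ends normally or an exception escapes from the log callback.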

i'll keep you posted.


On Tue, Dec 13, 2016 at 6:24 PM, Stephen Sprague <> wrote:

> interesting.... thank you.   pretty sure they are being submitted through
> the HS2 service.
> On Tue, Dec 13, 2016 at 5:21 PM, Harish JP <> wrote:
>> Hi Stephen,
>> How are you starting these jobs, beeline, hive-cli, ...?  It looks like
>> they are being started in session mode, which means the AM waits for 5
>> minutes (default value) for a new DAG/query to be submitted, if it does not
>> receive a query it will timeout and shutdown. The config for this
>> —
>> Thanks,
>> Harish
>> On 14-Dec-2016, at 6:13 AM, Stephen Sprague <> wrote:
>> hey guys,
>> gotta slightly weird issue here.   Tez runs great. :)  client completes
>> in a short amount of time (5 minutes) but - and here's the gotcha -  the tez
>> server side process takes upwards of an hour to clear out of the RM.
>> This is a problem for us since the queue it's in has maxRunning set to 15
>> and these jobs are just squatting holding slots.
>> The thing is... why?  i'm wondering if it isn't some kind of GC going on
>> but not sure how to diagnose.  i can logon to a DN and cat stderr but it's not
>> particularly useful to me but i can pass it along if desired.
>> Here's a screenshot of the "squatters":
>> <image.png>
>> all have one container that the histogram shows 100%. And the client has
>> completed an hour ago! that's the part i don't get.
>> Any other output and/or configs to pass along?  Tez v0.8.4, hive v2.1.0.
>> Much appreciated,
>> Stephen
