From Jose Rozanec <jose.roza...@mercadolibre.com>
Subject Re: Tez 0.8.3 on EMR hanging with Hive task
Date Wed, 15 Jun 2016 18:49:26 GMT
Thanks! We could access the logs the way you pointed out.
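
For readers who prefer a script to clicking through the RM UI, the same application master log link can usually be read from the ResourceManager REST API. The following is a minimal sketch, not what was done in this thread: it assumes the RM web service listens on the stock port 8088 and that the application record exposes an `amContainerLogs` field. The host names in the usage note are hypothetical.

```python
import json
from urllib.request import Request, urlopen


def app_info_url(rm_host, app_id, port=8088):
    """URL of the ResourceManager REST endpoint describing one application.

    Port 8088 is the stock RM web port (assumption; a cluster may override it).
    """
    return "http://{}:{}/ws/v1/cluster/apps/{}".format(rm_host, port, app_id)


def am_logs_link(app_json):
    """Extract the AM container log link from the REST response body.

    Returns None if the response carries no amContainerLogs field.
    """
    return json.loads(app_json)["app"].get("amContainerLogs")


def fetch_am_logs_link(rm_host, app_id, port=8088):
    """Query the ResourceManager; needs network access to the cluster."""
    request = Request(app_info_url(rm_host, app_id, port),
                      headers={"Accept": "application/json"})
    with urlopen(request) as response:
        return am_logs_link(response.read().decode("utf-8"))
```

Calling `fetch_am_logs_link("rm.example.internal", "application_1465996511770_0001")` from a machine that can reach the cluster should return the URL of the AM container logs, which can then be opened or downloaded as with the link shown in the RM UI.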

2016-06-15 12:33 GMT-03:00 Hitesh Shah <hitesh@apache.org>:

> If log aggregation is not enabled, the next best thing would be to
> download the application master logs from the RM UI for the apps in
> question. Those would provide a good starting point for figuring out what
> is going on.
>
> thanks
> -- Hitesh
>
>
> > On Jun 15, 2016, at 8:29 AM, Jose Rozanec <jose.rozanec@mercadolibre.com>
> wrote:
> >
> > Hello,
> >
> > Here is an update. It seems we misunderstood something: Hive returned
> us an error for the query while the Tez job was still running without
> reporting progress. We did not cancel it, since it seemed to have hung. After
> two hours it was reported as finished on the UI, although YARN still listed it
> as running for some time longer before it finally finished.
> > We have log aggregation enabled, but after the job finished, we still
> get the same message as reported in the previous email.
> >
> > We will now investigate why Hive detached from Tez while the job was still
> running, and whether we can improve query accept times, since complex queries
> take a while to start executing.
> >
> > Thanks,
> >
> >
> >
> >
> > 2016-06-15 12:09 GMT-03:00 Jose Rozanec <jose.rozanec@mercadolibre.com>:
> > Hello,
> >
> > I ran the command, and got the following message:
> > 16/06/15 15:07:35 INFO impl.TimelineClientImpl: Timeline service
> address: http://ip-10-64-23-215.ec2.internal:8188/ws/v1/timeline/
> > 16/06/15 15:07:35 INFO client.RMProxy: Connecting to ResourceManager at
> ip-10-64-23-215.ec2.internal/10.64.23.215:8032
> > /var/log/hadoop-yarn/apps/hadoop/logs/application_1465996511770_0001
> does not exist.
> > Log aggregation has not completed or is not enabled.
> >
> > I think we are missing some configuration that would help us get more
> insight?
> >
> > Thanks!
> >
> > Joze.
> >
> > 2016-06-15 12:03 GMT-03:00 Hitesh Shah <hitesh@apache.org>:
> > Hello Joze,
> >
> > Would it be possible for you to provide the YARN application logs
> obtained via "bin/yarn logs -applicationId <appId>" for both of the cases
> you have seen? Feel free to file JIRAs and attach logs to each of them.
> >
> > thanks
> > -- Hitesh
> >
> > > On Jun 15, 2016, at 7:38 AM, Jose Rozanec <
> jose.rozanec@mercadolibre.com> wrote:
> > >
> > > Hello,
> > >
> > > We are experiencing some issues with Tez 0.8.3 when we issue heavy
> queries from Hive. Some jobs seem to hang on Tez and never return. Those jobs
> show up in the DAG web UI, but no progress is reported in the UI or in the Hive
> logs. Any ideas why this could happen? We see this with certain
> memory configurations; without them, the job dies soon (we guess due to
> OOM).
> > >
> > > Most probably unrelated to this, at some point we also got the
> following error: "org.apache.tez.dag.api.SessionNotRunning: TezSession has
> already shutdown. Application xxxxx failed 2 times due to AM Container". We
> are not sure whether it is related to TEZ-2663, which should be fixed from
> version 0.7.1 onwards.
> > >
> > > Thanks in advance,
> > >
> > > Joze.
> >
> >
> >
>
>
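
The "Log aggregation has not completed or is not enabled" message quoted above can be checked against the cluster configuration. The sketch below reads `yarn-site.xml` content and reports whether `yarn.log-aggregation-enable` is set, then builds the directory where `yarn logs` would be expected to look. It assumes the stock Hadoop 2.x layout `<remote-app-log-dir>/<user>/<suffix>/<appId>` and stock default values, which an EMR cluster may override. The sample XML is hypothetical and chosen only to reproduce the path from the thread.

```python
import xml.etree.ElementTree as ET

# Standard YARN property names; the values are the stock Hadoop 2.x
# defaults and may differ on a given cluster (assumption).
DEFAULTS = {
    "yarn.log-aggregation-enable": "false",
    "yarn.nodemanager.remote-app-log-dir": "/tmp/logs",
    "yarn.nodemanager.remote-app-log-dir-suffix": "logs",
}


def read_yarn_properties(xml_text):
    """Return the settings in DEFAULTS, overridden by yarn-site.xml content."""
    props = dict(DEFAULTS)
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in props:
            props[name] = value
    return props


def aggregated_log_dir(props, user, app_id):
    """Build <remote-app-log-dir>/<user>/<suffix>/<app_id>."""
    root_dir = props["yarn.nodemanager.remote-app-log-dir"].rstrip("/")
    suffix = props["yarn.nodemanager.remote-app-log-dir-suffix"].strip("/")
    return "/".join([root_dir, user, suffix, app_id])


if __name__ == "__main__":
    # Hypothetical yarn-site.xml content, not taken from the thread.
    sample = """<configuration>
      <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/var/log/hadoop-yarn/apps</value>
      </property>
    </configuration>"""
    props = read_yarn_properties(sample)
    print("aggregation enabled:", props["yarn.log-aggregation-enable"] == "true")
    print(aggregated_log_dir(props, "hadoop", "application_1465996511770_0001"))
```

If aggregation is enabled and the directory is still missing on the cluster's remote filesystem after the application has finished, the upload may not have completed yet; in that case the application master logs from the RM UI, as suggested in the thread, remain the fallback.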
