From Furkan KAMACI <furkankam...@gmail.com>
Subject Re: nutch 1.12 and Solr 5.4.1
Date Thu, 22 Dec 2016 19:57:51 GMT
Hi Michael,

It's great that your problem is resolved ;) Don't hesitate to ask if you
have any other questions.

Kind Regards,
Furkan KAMACI

On Thu, Dec 22, 2016 at 9:44 PM, Michael Coffey <mcoffey@yahoo.com.invalid>
wrote:

> Thank you very much for replying. I know it's holiday season and you
> probably have a million things to do!
> OMG, it is working now that I am using the version of SolrUtils you
> pointed to. I had previously focused on a version where it uses
> SystemDefaultHttpClient but not as a static. It seems that making it static
> made a critical difference. So this is awesome.
> For the record, I would say I am using solrj 5.4.1, based on the presence
> of the following files in my Nutch directories.
> ./apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
> ./apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
>
> For httpclient, I have a lot of jars within the nutch 1.12 directories:
> ./apache-nutch-1.12/runtime/local/lib/httpclient-4.3.5.jar
> ./apache-nutch-1.12/runtime/local/lib/commons-httpclient-3.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/runtime/local/plugins/indexer-solr/httpclient-4.4.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/lib-htmlunit/httpclient-4.3.4.jar
> ./apache-nutch-1.12/runtime/local/plugins/lib-selenium/httpclient-4.5.1.jar
> ./apache-nutch-1.12/runtime/local/plugins/indexer-cloudsearch/httpclient-4.3.6.jar
> ./apache-nutch-1.12/build/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/build/lib/httpclient-4.3.5.jar
> ./apache-nutch-1.12/build/lib/commons-httpclient-3.1.jar
> ./apache-nutch-1.12/build/plugins/protocol-httpclient/protocol-httpclient.jar
> ./apache-nutch-1.12/build/plugins/indexer-solr/httpclient-4.4.1.jar
> ./apache-nutch-1.12/build/plugins/lib-htmlunit/httpclient-4.3.4.jar
> ./apache-nutch-1.12/build/plugins/lib-selenium/httpclient-4.5.1.jar
> ./apache-nutch-1.12/build/plugins/indexer-cloudsearch/httpclient-4.3.6.jar
> The hadoop directory has the following httpclient-related jars:
> /posix/hadoop-2.7.2/share/hadoop/kms/tomcat/webapps/kms/WEB-INF/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/httpfs/tomcat/webapps/webhdfs/WEB-INF/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/tools/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/tools/lib/commons-httpclient-3.1.jar
> /posix/hadoop-2.7.2/share/hadoop/common/lib/httpclient-4.2.5.jar
> /posix/hadoop-2.7.2/share/hadoop/common/lib/commons-httpclient-3.1.jar
>
> Over on the Solr5 machine, we have:
> ./solr-5.4.1/dist/solrj-lib/httpclient-4.4.1.jar
> ./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/httpclient-4.4.1.jar
>
> thanks again
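
With this many httpclient jars spread across the Nutch and Hadoop directories, it can help to see which jar a given class is actually loaded from at run time. The following is only a minimal sketch using nothing but the JDK; the class name WhichJar and the helper locate are made up for illustration and are not part of Nutch, Solr, or Hadoop. It assumes you run it with the same classpath the failing job uses, which you would have to work out for your own setup.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Nutch or Solr.
class WhichJar {

    /** Returns the location a class was loaded from, or a short note if that is unknown. */
    static String locate(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, WhichJar.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // Classes that belong to the JDK itself usually report no code source
            if (src == null || src.getLocation() == null) {
                return "(no code source; likely a JDK class)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // Default to the classes named in the error message from this thread
        String[] names = args.length > 0 ? args : new String[] {
            "org.apache.http.impl.client.DefaultHttpClient",
            "org.apache.http.impl.client.CloseableHttpClient",
            "org.apache.solr.client.solrj.impl.HttpClientUtil"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the two httpclient classes report different jars (for example one of the 4.2.5 jars from the hadoop directory and a 4.4.1 jar from indexer-solr), that would be consistent with the jar mismatch suspected in this thread, though it would not prove it.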
>       From: Furkan KAMACI <furkankamaci@gmail.com>
>  To: Michael Coffey <mcoffey@yahoo.com>
> Cc: "user@nutch.apache.org" <user@nutch.apache.org>
>  Sent: Thursday, December 22, 2016 10:29 AM
>  Subject: Re: nutch 1.12 and Solr 5.4.1
>
> Hi Michael,
>
> The dependencies you sent are from the ivy cache. I need to know the versions
> of Solr and HTTP Client. Your problem is probably a jar mismatch between
> Hadoop and Solr. Nutch 1.12 should work with Solr 5.4.1, as you can check
> here:
> https://github.com/apache/nutch/blob/release-1.12/src/plugin/indexer-solr/ivy.xml
>
> So, there may be a bug in Nutch. There is a workaround in the issue you
> mentioned: https://issues.apache.org/jira/browse/NUTCH-2267 Could you apply it to
> SolrUtils.java (
> https://github.com/sjwoodard/nutch/blob/master/src/plugin/indexer-solr/src/java/org/apache/nutch/indexwriter/solr/SolrUtils.java)
> and check again? If you still get that error, I can try to fix it.
>
> Kind Regards,
> Furkan KAMACI
>
> On Thu, Dec 22, 2016 at 6:26 PM, Michael Coffey <mcoffey@yahoo.com> wrote:
>
> > Is it possible to get around this problem by using an older version of
> > Solr or Nutch or both?
> >
> >
> > ------------------------------
> > *From:* Michael Coffey <mcoffey@yahoo.com.INVALID>
> > *To:* "user@nutch.apache.org" <user@nutch.apache.org>; Furkan KAMACI <
> > furkankamaci@gmail.com>; Michael Coffey <mcoffey@yahoo.com>
> > *Sent:* Tuesday, December 20, 2016 8:41 PM
> > *Subject:* Re: nutch 1.12 and Solr 5.4.1
> >
> > This should work, shouldn't it? But it is not working. I am using Nutch
> > 1.12 with the recommended version of Solr (5.4.1) and Hadoop 2.7.2. I
> > haven't changed any Java code, but I get a low-level Java error when trying
> > to write to the index. Is this not a tested configuration? Based on web
> > searching, I know that others have had similar problems, going back several
> > months, but I haven't seen any solutions. I did try a couple of variations
> > on the patch posted for NUTCH-2267 (a slightly different manifestation) and
> > that did not help. I notice that the 2267 patch has been reverted in the
> > master branch.
> > I am willing to work on some Java code, if necessary, to help resolve
> > this. At this point, I don't know what to try next, other than switching to
> > ElasticSearch.
> >
> >      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
> >
> > To: "user@nutch.apache.org" <user@nutch.apache.org>; Furkan KAMACI <
> > furkankamaci@gmail.com>; Michael Coffey <mcoffey@yahoo.com>
> > Sent: Monday, December 19, 2016 7:13 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > Some additional info: I am using solr.server.type=http, not cloud. I have
> > tried plugin.includes with protocol-http and also with protocol-httpclient.
> > My current settings are listed below. Also, I am using hadoop 2.7.2, in
> > case that matters.
> > <property>
> >  <name>plugin.includes</name>
> >  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
> > </property>
> >
> > <property>
> >  <name>solr.server.type</name>
> >  <value>http</value>
> >  <description>
> >    Specifies the SolrServer implementation to use. This is a string value
> >    of one of the following 'cloud', 'concurrent', 'http' or 'lb'.
> >    The values represent CloudSolrServer, ConcurrentUpdateSolrServer,
> >    HttpSolrServer or LBHttpSolrServer respectively.
> >  </description>
> > </property>
> >
> > <property>
> >  <name>solr.server.url</name>
> >  <value>http://solr5-00:8983/solr/nutch-0</value>
> >  <description>
> >      Defines the Solr URL into which data should be indexed using the
> >      indexer-solr plugin.
> >  </description>
> > </property>
> >
> >      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
> > To: Furkan KAMACI <furkankamaci@gmail.com>; "user@nutch.apache.org" <
> > user@nutch.apache.org>
> > Sent: Monday, December 19, 2016 5:10 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > I'm not sure how to do that. According to a find command, I have more than
> > one solrj on the nutch machine.
> > ./hadass/apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
> > ./hadass/apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
> > ./.ivy2/cache/org.apache.solr/solr-solrj
> > ./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-5.4.1.jar
> > ./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-4.6.0.jar
> > On the solr machine, I have
> > ./solr-5.4.1/dist/solrj-lib
> > ./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/solr-solrj-5.4.1.jar
> > ./solr-5.4.1/docs/solr-solrj
> > ./solr-5.4.1/docs/solr-solrj/org/apache/solr/client/solrj
> > ./solr-5.4.1/docs/solr-core/org/apache/solr/client/solrj
> >
> > Should I make the change to SolrUtils.java mentioned in
> > https://issues.apache.org/jira/browse/NUTCH-2267 ?
> > Lewis and Stephen might know about this.
> >
> >      From: Furkan KAMACI <furkankamaci@gmail.com>
> > To: Michael Coffey <mcoffey@yahoo.com>; user@nutch.apache.org
> > Sent: Monday, December 19, 2016 4:13 PM
> > Subject: Re: nutch 1.12 and Solr 5.4.1
> >
> > Hi Michael,
> > Could you check the version of solrj in your Nutch installation and compare
> > it with the version of Solr on your server?
> > Kind Regards,
> > Furkan KAMACI
> > On Dec 20, 2016 1:01 AM, "Michael Coffey" <mcoffey@yahoo.com.invalid>
> > wrote:
> >
> > What is the recommended fix (or workaround) for the "bad return type"
> > error related to "Type 'org/apache/http/impl/client/DefaultHttpClient'
> > (current frame, stack[0]) is not assignable to
> > 'org/apache/http/impl/client/CloseableHttpClient'"?
> > It seems that switching to different versions of Solr has not helped
> > (6.3.0, 5.5.3, 5.4.1). FWIW, I have the same version of Java on both machines.
> >
> > OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.14.04.1)
> > OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)
> >
> >
> >
> >      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
> >  To: "user@nutch.apache.org" <user@nutch.apache.org>; Michael Coffey <
> > mcoffey@yahoo.com>
> >  Sent: Saturday, November 19, 2016 8:05 AM
> >  Subject: Re: nutch 1.12 and Solr 6.3.0
> >
> > I think this is what Lewis and Furkan know as NUTCH-2267. I get the same
> > problem with Solr 5.5.3.
> >
> > I really would like to know which versions of Nutch and Solr work together
> > "out of the box".
> >
> >      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
> >  To: "user@nutch.apache.org" <user@nutch.apache.org>
> >  Sent: Friday, November 18, 2016 2:04 PM
> >  Subject: nutch 1.12 and Solr 6.3.0
> >
> > I decided to plunge ahead with Solr indexing, but so far it doesn't work.
> > The first error I got is listed below. Could it be because I am running JDK 7
> > on the nutch server and JDK 8 on the Solr server? As far as I know, Nutch
> > 1.x won't work with JDK 8 and Solr 6.3 won't work with a JDK older than 8. Any
> > suggestions or advice?
> >
> > 16/11/18 13:59:52 INFO mapreduce.Job: Task Id : attempt_1479499237600_0021_r_000000_0, Status : FAILED
> > Error: Bad return type
> > Exception Details:
> >  Location:
> >    org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
> >  Reason:
> >    Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
> >  Current Frame:
> >    bci: @58
> >    flags: { }
> >    locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
> >    stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
> >  Bytecode:
> >    0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
> >    0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
> >    0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
> >    0000030: b800 104e 2d2c b800 0f2d b0
> >  Stackmap Table:
> >    append_frame(@47,Object[#143])
> >
> > Container killed by the ApplicationMaster.
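
The "Bad return type" message above says the verifier does not consider DefaultHttpClient assignable to CloseableHttpClient. As far as I know, CloseableHttpClient was introduced in HttpClient 4.3, where DefaultHttpClient became one of its subclasses, so the complaint would fit a classpath that mixes a pre-4.3 httpclient with a newer one. That is an inference, not something this thread confirms. Here is a rough, JDK-only sketch for checking the relationship on whatever classpath you run it with; AssignableCheck and check are hypothetical names used only for this example.

```java
// Hypothetical diagnostic helper, not part of Nutch, Solr, or HttpClient.
class AssignableCheck {

    /**
     * Reports whether a value of type sub could be used where sup is expected,
     * as seen from the classpath this program is started with.
     */
    static String check(String sup, String sub) {
        ClassLoader cl = AssignableCheck.class.getClassLoader();
        try {
            // initialize=false: only load the classes, do not run static initializers
            Class<?> supClass = Class.forName(sup, false, cl);
            Class<?> subClass = Class.forName(sub, false, cl);
            return supClass.isAssignableFrom(subClass) ? "assignable" : "NOT assignable";
        } catch (ClassNotFoundException | LinkageError e) {
            // One of the classes is absent (or broken) on this classpath
            return "missing: " + e;
        }
    }

    public static void main(String[] args) {
        // The two types named in the verifier message
        System.out.println(check(
            "org.apache.http.impl.client.CloseableHttpClient",
            "org.apache.http.impl.client.DefaultHttpClient"));
    }
}
```

A result of "NOT assignable" or "missing" on the job's classpath would point toward an httpclient version conflict; "assignable" would suggest looking elsewhere.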
