From Michael Coffey <mcof...@yahoo.com.INVALID>
Subject Re: nutch 1.12 and Solr 5.4.1
Date Wed, 21 Dec 2016 04:41:37 GMT
This should work, shouldn't it? But it is not working. I am using Nutch 1.12 with the recommended
version of Solr (5.4.1) and Hadoop 2.7.2. I haven't changed any Java code, but I get a low-level
Java error when trying to write to the index. Is this not a tested configuration? Based on
web searching, I know that others have had similar problems, going back several months, but
I haven't seen any solutions. I did try a couple of variations on the patch posted for NUTCH-2267
(a slightly different manifestation) and that did not help. I notice that the 2267 patch has
been reverted in the master branch.
I am willing to work on some Java code, if necessary, to help resolve this. At this point,
I don't know what to try next, other than switching to ElasticSearch.

      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
 To: "user@nutch.apache.org" <user@nutch.apache.org>; Furkan KAMACI <furkankamaci@gmail.com>;
Michael Coffey <mcoffey@yahoo.com> 
 Sent: Monday, December 19, 2016 7:13 PM
 Subject: Re: nutch 1.12 and Solr 5.4.1
   
Some additional info: I am using solr.server.type=http, not cloud. I have tried plugin.includes
with protocol-http and also with protocol-httpclient. My current settings are listed below.
Also, I am using Hadoop 2.7.2, in case that matters.
<property>
  <name>plugin.includes</name>
  <value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>

<property>
  <name>solr.server.type</name>
  <value>http</value>
  <description>
    Specifies the SolrServer implementation to use. This is a string value
    of one of the following 'cloud', 'concurrent', 'http' or 'lb'.
    The values represent CloudSolrServer, ConcurrentUpdateSolrServer, 
    HttpSolrServer or LBHttpSolrServer respectively.
  </description>
</property>

<property>
  <name>solr.server.url</name>
  <value>http://solr5-00:8983/solr/nutch-0</value>
  <description>
      Defines the Solr URL into which data should be indexed using the
      indexer-solr plugin.
  </description>
</property>

      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
 To: Furkan KAMACI <furkankamaci@gmail.com>; "user@nutch.apache.org" <user@nutch.apache.org>

 Sent: Monday, December 19, 2016 5:10 PM
 Subject: Re: nutch 1.12 and Solr 5.4.1
  
I'm not sure how to do that. According to a find command, I have more than one solrj on the
Nutch machine:
./hadass/apache-nutch-1.12/runtime/local/plugins/indexer-solr/solr-solrj-5.4.1.jar
./hadass/apache-nutch-1.12/build/plugins/indexer-solr/solr-solrj-5.4.1.jar
./.ivy2/cache/org.apache.solr/solr-solrj
./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-5.4.1.jar
./.ivy2/cache/org.apache.solr/solr-solrj/jars/solr-solrj-4.6.0.jar

On the Solr machine, I have:
./solr-5.4.1/dist/solrj-lib
./solr-5.4.1/server/solr-webapp/webapp/WEB-INF/lib/solr-solrj-5.4.1.jar
./solr-5.4.1/docs/solr-solrj
./solr-5.4.1/docs/solr-solrj/org/apache/solr/client/solrj
./solr-5.4.1/docs/solr-core/org/apache/solr/client/solrj

Should I make the change to SolrUtils.java mentioned in NUTCH-2267 (https://issues.apache.org/jira/browse/NUTCH-2267)?
Lewis and Stephen might know about this.
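
[Editor's note] The jar inventory above was gathered with find. A minimal Java sketch of the same check is given here, assuming you want every solr-solrj and httpclient jar under one directory tree. The class name JarFinder and the two prefixes are illustrative and not part of Nutch; httpclient is included only because the error reported in this thread names HttpClient classes.

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of Nutch): lists jar files whose name starts
// with a given prefix, searching a directory tree recursively.
public class JarFinder {

    public static List<String> findJars(Path root, final String prefix) throws IOException {
        final List<String> found = new ArrayList<String>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                String name = file.getFileName().toString();
                if (name.startsWith(prefix) && name.endsWith(".jar")) {
                    found.add(file.toString());
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // Skip entries that cannot be read instead of aborting the walk.
                return FileVisitResult.CONTINUE;
            }
        });
        Collections.sort(found);
        return found;
    }

    public static void main(String[] args) throws IOException {
        // Root directory to search; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (String prefix : new String[] {"solr-solrj-", "httpclient-"}) {
            for (String jar : findJars(root, prefix)) {
                System.out.println(jar);
            }
        }
    }
}
```

Running it once against the Nutch directory and once against the Hadoop installation directory would show whether more than one version of either library is present. It does not tell you which copy is actually loaded at run time.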

      From: Furkan KAMACI <furkankamaci@gmail.com>
 To: Michael Coffey <mcoffey@yahoo.com>; user@nutch.apache.org 
 Sent: Monday, December 19, 2016 4:13 PM
 Subject: Re: nutch 1.12 and Solr 5.4.1
  
Hi Michael,
Could you check the version of SolrJ in your Nutch installation and compare it with the version
of Solr on your server?
Kind Regards,
Furkan KAMACI
On Dec 20, 2016 1:01 AM, "Michael Coffey" <mcoffey@yahoo.com.invalid> wrote:

What is the recommended fix (or workaround) for the "bad return type" error related to "Type
'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable
to 'org/apache/http/impl/client/CloseableHttpClient'"?
It seems that switching to different versions of Solr has not helped (6.3.0, 5.5.3, 5.4.1).
FWIW, I have the same version of Java on both machines.

OpenJDK Runtime Environment (IcedTea 2.6.8) (7u121-2.6.8-1ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.121-b00, mixed mode)
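
[Editor's note] CloseableHttpClient was introduced in Apache HttpClient 4.3, so a DefaultHttpClient that is "not assignable" to it suggests, though this thread does not confirm it, that an older HttpClient jar is being loaded ahead of the one SolrJ 5.4.1 was compiled against. The sketch below is one way to see which jar supplies each class. It uses only reflection, so it compiles without HttpClient or SolrJ. The class name ClassOriginProbe is illustrative, and the result is only meaningful when it runs with the same classpath as the failing task, which this sketch does not arrange.

```java
import java.security.CodeSource;

// Hypothetical diagnostic (not part of Nutch or Solr): reports where classes
// are loaded from and whether one is assignable to another at run time.
public class ClassOriginProbe {

    /** Returns the jar or directory the named class was loaded from. */
    public static String locate(String className) {
        try {
            Class<?> c = Class.forName(className, false,
                    ClassOriginProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "(bootstrap or unknown)";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    /** Returns "true"/"false", or "unknown" if either class cannot be loaded. */
    public static String assignable(String superName, String subName) {
        try {
            ClassLoader cl = ClassOriginProbe.class.getClassLoader();
            Class<?> sup = Class.forName(superName, false, cl);
            Class<?> sub = Class.forName(subName, false, cl);
            return String.valueOf(sup.isAssignableFrom(sub));
        } catch (ClassNotFoundException | LinkageError e) {
            return "unknown";
        }
    }

    public static void main(String[] args) {
        String util = "org.apache.solr.client.solrj.impl.HttpClientUtil";
        String dflt = "org.apache.http.impl.client.DefaultHttpClient";
        String closeable = "org.apache.http.impl.client.CloseableHttpClient";
        for (String name : new String[] {util, dflt, closeable}) {
            System.out.println(name + " <- " + locate(name));
        }
        // With an HttpClient older than 4.3, CloseableHttpClient is absent,
        // so this line prints "unknown" rather than "true".
        System.out.println("DefaultHttpClient assignable to CloseableHttpClient: "
                + assignable(closeable, dflt));
    }
}
```

If DefaultHttpClient turns out to come from a jar outside the Nutch job, that jar is the first thing to look at. How to change the classpath order depends on the Hadoop setup and is not covered here.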



      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
 To: "user@nutch.apache.org" <user@nutch.apache.org>; Michael Coffey <mcoffey@yahoo.com>
 Sent: Saturday, November 19, 2016 8:05 AM
 Subject: Re: nutch 1.12 and Solr 6.3.0

I think this is what Lewis and Furkan know as NUTCH-2267. I get the same problem with Solr
5.5.3.

I really would like to know which versions of Nutch and Solr work together "out of the box".

      From: Michael Coffey <mcoffey@yahoo.com.INVALID>
 To: "user@nutch.apache.org" <user@nutch.apache.org>
 Sent: Friday, November 18, 2016 2:04 PM
 Subject: nutch 1.12 and Solr 6.3.0
 
I decided to plunge ahead with Solr indexing, but so far it doesn't work. The first error
I got is listed below. Could it be that I am running JDK 7 on the Nutch server and JDK 8 on
the Solr server? As far as I know, Nutch 1.x won't work with JDK 8 and Solr 6.3 won't work with
a JDK older than 8. Any suggestions or advice?

16/11/18 13:59:52 INFO mapreduce.Job: Task Id : attempt_1479499237600_0021_r_000000_0, Status : FAILED
Error: Bad return type
Exception Details:
  Location:
    org/apache/solr/client/solrj/impl/HttpClientUtil.createClient(Lorg/apache/solr/common/params/SolrParams;Lorg/apache/http/conn/ClientConnectionManager;)Lorg/apache/http/impl/client/CloseableHttpClient; @58: areturn
  Reason:
    Type 'org/apache/http/impl/client/DefaultHttpClient' (current frame, stack[0]) is not assignable to 'org/apache/http/impl/client/CloseableHttpClient' (from method signature)
  Current Frame:
    bci: @58
    flags: { }
    locals: { 'org/apache/solr/common/params/SolrParams', 'org/apache/http/conn/ClientConnectionManager', 'org/apache/solr/common/params/ModifiableSolrParams', 'org/apache/http/impl/client/DefaultHttpClient' }
    stack: { 'org/apache/http/impl/client/DefaultHttpClient' }
  Bytecode:
    0000000: bb00 0359 2ab7 0004 4db2 0005 b900 0601
    0000010: 0099 001e b200 05bb 0007 59b7 0008 1209
    0000020: b600 0a2c b600 0bb6 000c b900 0d02 002b
    0000030: b800 104e 2d2c b800 0f2d b0
  Stackmap Table:
    append_frame(@47,Object[#143])

Container killed by the ApplicationMaster.


 

   


  

  

   