Mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Dynamic Crawling, URL with query parameters.
Date Wed, 04 Jan 2017 20:25:44 GMT
Hello Vicky - I think I know what you mean, but I am not entirely sure. Can you give
examples of which URLs you want and which you don't?

Markus 
 
-----Original message-----
> From:vickyk <vickykak@gmail.com>
> Sent: Wednesday 4th January 2017 18:20
> To: user@nutch.apache.org
> Subject: Dynamic Crawling, URL with query parameters.
> 
> Hey Guys,
> 
> I am crawling a URL that contains a few query parameters, e.g.
> myurl.com?q=key&q1=key1
> 
> I got crawling working in Nutch with various combinations of the query
> parameters: I simply injected a new URL each time a parameter value
> changed. There are many possible combinations of the query parameters,
> so the number of injected URLs could explode.
> Is there a way to avoid entering multiple URLs with different query
> parameters? Is this available out of the box?
> 
> It would be great if anyone who has had a similar use case could share
> their experience in handling such a scenario. I am particularly concerned
> about scale, as we anticipate the query parameters to increase over time.
> 
> Thanks,
> Vicky
> 
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Dynamic-Crawling-URL-with-query-parameters-tp4312316.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
> 
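
To make the growth described in the question concrete, here is a minimal Python sketch of how the number of seed URLs multiplies when every combination of parameter values is injected as its own URL. It only illustrates the problem and is not a Nutch feature; the `seed_urls` helper, the `http://` scheme, and the extra parameter values (`key2`, `key3`, `key4`) are hypothetical and do not come from the original message.

```python
from itertools import product
from math import prod
from urllib.parse import urlencode


def seed_urls(base_url, param_values):
    """Yield one URL per combination of query parameter values.

    param_values maps each parameter name to the list of values it can take.
    """
    names = list(param_values)
    for combo in product(*(param_values[name] for name in names)):
        yield f"{base_url}?{urlencode(dict(zip(names, combo)))}"


# Hypothetical values: only q=key and q1=key1 appear in the original message.
params = {"q": ["key", "key2", "key3"], "q1": ["key1", "key4"]}

urls = list(seed_urls("http://myurl.com", params))

# The seed count is the product of the per-parameter value counts (3 * 2 here),
# so each added parameter or value multiplies the size of the seed list.
expected_count = prod(len(values) for values in params.values())

print(len(urls), expected_count)
print(urls[0])
```

With three values for `q` and two for `q1` this already produces six seed URLs; a third parameter with ten values would raise that to sixty, which is the scaling concern raised above.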
