Mailing list archives

From Sebastian Nagel
Subject Re: Need help on getting HTML content
Date Fri, 16 Dec 2016 15:57:34 GMT

The only way is to serialize the DOM subtree below the <math> element
back to HTML, save the resulting HTML string in the parse metadata, and
write it to the index as an extra field via an indexing filter.

See, e.g., o.a.n.util.DomUtil.saveDom(OutputStream, Element)
for how to "serialize" a DOM subtree.
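
To illustrate the serialization step on its own, here is a minimal
sketch that uses only the JDK's javax.xml.transform API rather than
Nutch's DomUtil. The class name SubtreeSerializer and the method toHtml
are hypothetical, and the sample markup in main() is made up for the
example. It assumes the "xml" output method is acceptable, so empty
elements come out self-closed (e.g. <br/>).

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of Nutch.
public class SubtreeSerializer {

  /** Serializes the element and all of its descendants to a markup string. */
  public static String toHtml(Element element) throws Exception {
    // A Transformer created without a stylesheet copies its input unchanged.
    Transformer transformer = TransformerFactory.newInstance().newTransformer();
    transformer.setOutputProperty(OutputKeys.METHOD, "xml");
    // Suppress the leading <?xml ...?> declaration so only the subtree is emitted.
    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
    StringWriter out = new StringWriter();
    transformer.transform(new DOMSource(element), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    // Made-up sample page; in a parse filter the DOM would come from the parser.
    String page = "<html><body><p>Area:</p>"
        + "<math><mi>a</mi><mo>=</mo><mi>b</mi></math></body></html>";
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new ByteArrayInputStream(page.getBytes(StandardCharsets.UTF_8)));
    Element math = (Element) doc.getElementsByTagName("math").item(0);
    System.out.println(toHtml(math));
  }
}
```

The returned string is what would then be stored in the parse metadata
and copied into an index field by the indexing filter.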


On 12/16/2016 07:27 AM, wrote:
> Hi,
> For a particular tag (<math>), I need to save the entire HTML of the tag.
> Now I am able to save only the text content in getText() called in 
> But there is no way to store the HTML content.
> Please share your thoughts on this.
> [math tag.png]
> Thanks in advance,
> -Ashok.
