BARBARA: Content Extraction of PDF Files in Solr Using Apache Tika

Thursday, 12 September 2013

I am trying to index PDF files in Solr by following this tutorial:
http://wiki.apache.org/solr/ExtractingRequestHandler
But every time I run the command
java -jar post.jar *.pdf
it fails with org.apache.solr.common.SolrException: Invalid UTF-8 middle
byte 0xe3. Please help me index PDFs into the Solr server. Is there any
integration other than Tika that could help?
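
One possible explanation, offered as a guess rather than a confirmed diagnosis: when run without extra options, post.jar sends files to the plain /update handler as XML, so the binary PDF bytes are rejected by the XML parser before Tika ever sees them. The sketch below instead uploads the file to the /update/extract handler with curl, in the style of the tutorial linked above. The base URL, the form field name myfile, and the helper names extract_url and post_pdf are assumptions for illustration; it also assumes the ExtractingRequestHandler is enabled in solrconfig.xml.

```shell
#!/bin/sh
# Assumed base URL of the Solr instance; override it for your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Build the extract-handler URL for a given document id (hypothetical helper).
# literal.id sets the unique key; commit=true makes the document searchable
# right away. The id is not URL-encoded here, so keep it simple (e.g. doc1).
extract_url() {
    printf '%s/update/extract?literal.id=%s&commit=true' "$SOLR_URL" "$1"
}

# Upload one PDF as multipart form data (hypothetical helper), so that Solr
# passes the raw bytes to Tika instead of trying to parse them as XML.
# Arguments: $1 = path to the PDF, $2 = document id.
post_pdf() {
    curl -sS "$(extract_url "$2")" -F "myfile=@$1"
}

# Usage (not run automatically):
#   post_pdf tutorial.pdf doc1
```

To index several files, post_pdf can be called in a loop with a distinct id per file; reusing an id would overwrite the earlier document.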
