converting pdf files...

converting pdf files...

Jeff Brown
Hi. I have been using OpenOffice for quite some time, and I am wondering if there is an Apache product that would enable me to convert .pdf files to .odt. Thanks! Jeff

Re: converting pdf files...

Rory O'Farrell
On Tue, 23 Jul 2019 15:11:44 +0000
Jeff Brown <[hidden email]> wrote:

> Hi. I have been using OpenOffice for quite some time, and I am wondering if there is an Apache product that would enable me to convert .pdf files to .odt. Thanks! Jeff

If these are PDF files of any size, the best method is to use an OCR (Optical Character Recognition) application, which will convert them into a text format. You will then have access to the text and can reformat and edit it as you require. Many scanners come with OCR applications, and there are also free applications downloadable from the Internet. I have had good success with Tesseract, using gImageReader as a front end, running on Linux; it is also available for Windows.

It is important to be aware that the accuracy of the OCR process varies according to the quality of the scan and of the original; typically there will be a small number of recognition errors, so careful proofreading is essential, particularly if the information is numeric.
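
For anyone who prefers to script the OCR step rather than use a graphical front end, here is a minimal sketch in Python. It is an illustration under assumptions, not the setup described above: it assumes the `tesseract` command-line program and Poppler's `pdftoppm` are both installed and on the PATH, and the function names and the `work_dir` argument are hypothetical. Tesseract works on images, so each PDF page is first rendered to a PNG.

```python
import shutil
import subprocess
from pathlib import Path


def build_rasterize_cmd(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm command that renders each PDF page to a PNG."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(out_prefix)]


def build_ocr_cmd(image_path, lang="eng"):
    """Build the Tesseract command that prints one image's text to stdout."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def pdf_to_text(pdf_path, work_dir, lang="eng"):
    """OCR every page of a scanned PDF and return the text as one string."""
    # Assumption: both external tools are installed; fail early if not.
    for tool in ("pdftoppm", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")

    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Render pages to work_dir/page-<n>.png (page numbers are zero-padded,
    # so sorting the file names keeps the pages in order).
    subprocess.run(build_rasterize_cmd(pdf_path, work_dir / "page"), check=True)

    pages = []
    for image in sorted(work_dir.glob("page-*.png")):
        result = subprocess.run(
            build_ocr_cmd(image, lang),
            check=True,
            capture_output=True,
            text=True,
        )
        pages.append(result.stdout)
    return "\n".join(pages)
```

Calling `pdf_to_text("scan.pdf", "ocr_work")` (both names are placeholders) should return the recognised text, which can then be pasted into a Writer document and saved as .odt. The 300 dpi default is only a common starting point for OCR; the proofreading advice above still applies to whatever comes out.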


--
Rory O'Farrell <[hidden email]>
