I have received a lot of PDF documents that I wish to convert to text formats such as docx/doc/odt.

I know there are some online tools that will do it for you, but some content may be sensitive with people’s names and addresses and I’m not sure I can trust these websites.

Is there software that will convert a PDF to odt?

Things I know and tried:

  1. Asked a friend to open the PDF in Microsoft Word: Their license expired last month, so it doesn’t let you save the file!

  2. Tried to do the same in LibreOffice Writer: It doesn’t support that format.

  3. Tried to open it in LibreOffice Draw: impractical, as I want to type more text into the document.

P.S.: I use Linux, but I reckon solutions for other platforms would be fine.

  • Walking Coffin@lemmy.dbzer0.com
    4 months ago

    If the PDF files are properly formatted (no compression, all text selectable), you should be able to open a terminal and run the command below. I know LibreOffice can convert in the other direction (to PDF); I’m not sure it can actually do the reverse, but it doesn’t hurt to try:

    libreoffice --headless --convert-to docx *.pdf
    

    Just know that since docx is Microsoft’s format, the results may be flawed. As a last resort, I guess you could run a Windows VM and convert your files with any of the big programs known to handle such files.
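A slightly more defensive take on the command above, as a rough sketch rather than a tested recipe. It assumes LibreOffice is installed and reachable as `libreoffice` (some systems call it `soffice`), and it targets odt since that is what the question asks for. The function name `pdf_to_odt` and the `SOFFICE` override variable are made up for this example. The `writer_pdf_import` input filter asks LibreOffice to load the PDF into Writer instead of Draw; how well the layout survives will vary from file to file.

```shell
#!/bin/sh
# Hypothetical helper: convert each PDF given as an argument to ODT,
# writing the results into the directory named by the first argument.
# Set SOFFICE if the LibreOffice binary has a different name on your system.
pdf_to_odt() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for pdf in "$@"; do
        # Skip anything that is not an existing file (e.g. an unmatched *.pdf glob).
        [ -f "$pdf" ] || continue
        # writer_pdf_import loads the PDF into Writer rather than Draw,
        # so the output is an editable text document.
        "${SOFFICE:-libreoffice}" --headless \
            --infilter="writer_pdf_import" \
            --convert-to odt \
            --outdir "$outdir" "$pdf"
    done
}
```

Something like `pdf_to_odt converted *.pdf` would then process a whole folder. Everything runs locally, so the names and addresses in the documents never leave the machine.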

  • INeedMana@lemmy.world
    4 months ago

    I think it will depend on what exactly is in the PDF. If it’s text, you can in a pinch just copy and paste it, but I’d expect LibreOffice to be able to open it. If it’s images, you’ll have to use some OCR.
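One way to tell the two cases apart before converting, sketched under the assumption that `pdftotext` (from poppler-utils) is installed. The function names `has_text_layer` and `classify_pdfs` and the `PDFTOTEXT` override variable are invented for this example. The idea is simply that a PDF with no extractable letters or digits is probably a scan.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the PDF contains extractable text,
# fail if none is found (which suggests scanned images that need OCR).
has_text_layer() {
    # "-" makes pdftotext write to stdout; grep -q looks for at least
    # one letter or digit in that output.
    "${PDFTOTEXT:-pdftotext}" "$1" - 2>/dev/null | grep -q '[[:alnum:]]'
}

# Hypothetical helper: label each PDF given as an argument.
classify_pdfs() {
    for pdf in "$@"; do
        if has_text_layer "$pdf"; then
            printf 'text\t%s\n' "$pdf"
        else
            printf 'needs-ocr\t%s\n' "$pdf"
        fi
    done
}
```

Running `classify_pdfs *.pdf` prints one tab-separated line per file. Files marked `needs-ocr` would have to go through a local OCR tool (OCRmyPDF and Tesseract are common choices on Linux) before any conversion to odt can produce editable text. Note that a file is also marked `needs-ocr` if `pdftotext` itself is missing or fails on it.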