Java - Convert PDF To Word, Excel, Or Image

Overview

 
The main benefit of converting a PDF to a Word document is that you can edit the text directly in the file, which is especially helpful when the PDF needs significant changes. If most of the data in your PDF is tabular, you can convert it to an Excel spreadsheet instead. In the following sections, I will show how to convert a searchable PDF to Word and Excel, and how to convert a PDF to images, using Spire.PDF for Java.
 

Installing Spire.Pdf.jar

 
If you use Maven, you can import the jar into your application with the following configuration. For non-Maven projects, download the jar file from this link and add it to your application manually as a dependency.
    <repositories>
        <repository>
            <id>com.e-iceblue</id>
            <name>e-iceblue</name>
            <url>http://repo.e-iceblue.com/nexus/content/groups/public/</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency>
            <groupId>e-iceblue</groupId>
            <artifactId>spire.pdf</artifactId>
            <version>4.1.2</version>
        </dependency>
    </dependencies>

Convert PDF to DOC or DOCX

 
Converting a PDF to Word or Excel is straightforward with this library. Create a PdfDocument object, load the original PDF document into it, and then call the saveToFile() method to save the PDF in .doc, .docx, .xls, or .xlsx format.
    import com.spire.pdf.FileFormat;
    import com.spire.pdf.PdfDocument;

    public class ConvertPdfToWord {
        public static void main(String[] args) {
            // Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();
            // Load a PDF file
            pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");
            // Save to .docx file
            pdf.saveToFile("ToWord.docx", FileFormat.DOCX);
            pdf.close();
        }
    }

Convert PDF to XLS or XLSX

    import com.spire.pdf.FileFormat;
    import com.spire.pdf.PdfDocument;

    public class ConvertPdfToExcel {
        public static void main(String[] args) {
            // Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();
            // Load a PDF file
            pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");
            // Save to .xlsx file
            pdf.saveToFile("ToExcel.xlsx", FileFormat.XLSX);
            pdf.close();
        }
    }
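
The Word and Excel examples differ only in the output file name and the format passed to saveToFile(). As a minimal sketch, assuming you want to derive the target format from the output file name, the helper below maps the four extensions mentioned earlier to a small enum. The class OutputFormatChooser and the enum Target are hypothetical names used for illustration and are not part of Spire.PDF; you would still pass the matching FileFormat constant to saveToFile() yourself.

```java
import java.util.Locale;

public class OutputFormatChooser {

    // Hypothetical enum: one entry per output extension named in this article.
    public enum Target { DOC, DOCX, XLS, XLSX }

    // Returns the target that matches the extension of the given file name.
    public static Target fromFileName(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            throw new IllegalArgumentException("No file extension in: " + fileName);
        }
        // Compare case-insensitively so "Report.DOCX" and "report.docx" both work.
        String ext = fileName.substring(dot + 1).toUpperCase(Locale.ROOT);
        try {
            return Target.valueOf(ext);
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unsupported output extension: " + ext);
        }
    }

    public static void main(String[] args) {
        System.out.println(fromFileName("ToWord.docx"));  // prints DOCX
        System.out.println(fromFileName("ToExcel.xlsx")); // prints XLSX
    }
}
```

Failing early on an unknown extension keeps a typo in the file name from silently producing a file in the wrong format.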

Convert PDF to PNG

 
Converting a PDF to images requires a little more code, but it is not complicated. After the PDF file is loaded, call the saveAsImage() method to render a specific page as image data. Then write that data to a .png file with the ImageIO.write() method.
    import com.spire.pdf.PdfDocument;
    import javax.imageio.ImageIO;
    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.io.IOException;

    public class ConvertPdfToImage {

        public static void main(String[] args) throws IOException {

            // Create a PdfDocument instance
            PdfDocument pdf = new PdfDocument();

            // Load a PDF file
            pdf.loadFromFile("C:\\Users\\Administrator\\Desktop\\original.pdf");

            // Loop through the pages
            for (int i = 0; i < pdf.getPages().getCount(); i++) {

                // Save the specific page as image data
                BufferedImage image = pdf.saveAsImage(i);

                // Make sure the output folder exists, then write the image data to a png file
                File file = new File(String.format("out/ToImage-%d.png", i));
                file.getParentFile().mkdirs();
                ImageIO.write(image, "PNG", file);
            }
            pdf.close();
        }
    }
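
The write step can also be tried on its own, without the library. The sketch below uses only the standard Java API and makes a few assumptions: PageImageWriter and writePage are hypothetical names, and a blank BufferedImage stands in for the page image that saveAsImage() would return. It creates the output folder first, because ImageIO.write() does not create missing directories, and it checks the boolean result, which is false when no writer for the requested format is found.

```java
import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PageImageWriter {

    // Hypothetical helper: writes one page image to <outDir>/ToImage-<pageIndex>.png.
    public static Path writePage(BufferedImage image, Path outDir, int pageIndex)
            throws IOException {
        // Create the output folder (and any missing parents) before writing.
        Files.createDirectories(outDir);
        Path target = outDir.resolve(String.format("ToImage-%d.png", pageIndex));
        // ImageIO.write returns false if no suitable writer exists for the format.
        if (!ImageIO.write(image, "PNG", target.toFile())) {
            throw new IOException("No PNG writer available for " + target);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the BufferedImage returned by pdf.saveAsImage(i).
        BufferedImage blank = new BufferedImage(200, 100, BufferedImage.TYPE_INT_RGB);
        Path written = writePage(blank, Paths.get("out"), 0);
        System.out.println("Wrote " + written);
    }
}
```

In the conversion loop, the same helper could be called once per page with the image returned for that page index.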

Conclusion

 
There are many solutions available online for converting file formats programmatically. In my experience, this one has proven reliable: the converted document retains the layout and almost everything else from the original file. Apart from the formats mentioned above, Spire.PDF also supports converting PDF to HTML, SVG, PDF/A, and other formats.