How to Convert PDF to Image Using Python

Converting a large number of PDFs to images is a tedious job. Doing it by hand, or writing the conversion code from scratch, takes a lot of time. With Python you can do it in an easy way.

Install the package below using the pip command:

pip install pdf2image
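
One thing worth checking before running the script: pdf2image is a wrapper around the Poppler command-line tools (pdftoppm and pdftocairo), so the pip install alone is usually not enough. The small sketch below only checks whether pdftoppm can be found on your PATH. The function name poppler_available is my own, and on Windows you may prefer to point pdf2image at your Poppler folder instead of changing PATH, so treat this as a quick sanity check and not a full setup guide.

```python
import shutil


def poppler_available():
    """Return True if Poppler's pdftoppm tool is found on PATH."""
    return shutil.which("pdftoppm") is not None


if __name__ == "__main__":
    print("Poppler found:", poppler_available())
```

If this prints False, install Poppler with your system's package manager (or download a build for Windows) before converting anything.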

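from multiprocessing import Pool
from pdf2image import convert_from_path
import os

INPUT_DIR = "../pdfFiles/Old Server Reports/"
OUTPUT_DIR = "../output/"

# get all the files of a directory
files = os.listdir(INPUT_DIR)

def processPDF(file):
    # strip only the last extension, so "report.v2.pdf" becomes "report.v2"
    name = os.path.splitext(file)[0]
    # convert_from_path returns one image per page of the PDF
    pages = convert_from_path(os.path.join(INPUT_DIR, file))
    # number the pages starting from 1 and save each one as a PNG
    for counter, page in enumerate(pages, start=1):
        page.save(os.path.join(OUTPUT_DIR, name + str(counter) + ".png"), "PNG")

def main():
    # make sure the output folder exists before the workers write to it
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    # process files in parallel using a pool of 4 worker processes
    with Pool(4) as pool:
        pool.map(processPDF, files)

if __name__ == '__main__':
    main()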

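The code snippet above goes through every page of each PDF and saves it as an image named after the PDF file plus the page number. In this example we converted the PDFs to PNG format. We can also do it in JPG format by saving with the 'JPEG' format and a .jpg extension. FYI, we used a process pool (Pool) with 4 workers so that several PDFs are converted in parallel.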
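
If you want JPG output instead, here is a minimal sketch of how a single PDF could be handled. The function names output_path and pdf_to_jpg, the default output folder, and the dpi value of 200 are my own choices for illustration and are not part of the script above. It assumes pdf2image and Poppler are installed.

```python
import os


def output_path(pdf_file, page_number, out_dir="../output/", ext="jpg"):
    """Build the image path for one page, e.g. ../output/report1.jpg."""
    # splitext removes only the last extension from the file name
    name = os.path.splitext(os.path.basename(pdf_file))[0]
    return os.path.join(out_dir, f"{name}{page_number}.{ext}")


def pdf_to_jpg(pdf_file, out_dir="../output/", dpi=200):
    """Save every page of pdf_file as a JPEG and return the paths written."""
    # imported here so output_path can be used without pdf2image installed
    from pdf2image import convert_from_path

    os.makedirs(out_dir, exist_ok=True)
    written = []
    pages = convert_from_path(pdf_file, dpi=dpi)
    for number, page in enumerate(pages, start=1):
        target = output_path(pdf_file, number, out_dir, "jpg")
        # JPEG has no alpha channel, so convert to RGB before saving
        page.convert("RGB").save(target, "JPEG")
        written.append(target)
    return written
```

Since pdf_to_jpg takes a single file path, it can be passed to pool.map in the same way as processPDF, as long as you give it full paths and not just file names. A higher dpi gives sharper images but bigger files and slower conversion.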

Happy Coding!!!