Convert PDF To Image and Vice Versa Using Python

We often find ourselves wanting an image instead of an existing PDF, whether for easier viewing or for editing with a basic tool like Paint or Photoshop. People also frequently send pictures over WhatsApp as PDFs to preserve the picture quality, but those pictures need to be in JPG or PNG format before they can be edited in one of the editors above.

In this article, we will see how to convert a PDF to images and images back to a PDF.

For this post, I will be using Google Colab, since it has an easy-to-use interface and comes with many libraries pre-installed. You can use any editor of your choice.

PDF TO IMAGE: To convert a PDF to images, I’ve used the pdf2image package. To install it, run the command below:

pip install pdf2image

Once the package is installed, import the convert_from_path function:

from pdf2image import convert_from_path

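Next, upload the PDF to the notebook. In this example it is saved as /content/sample.pdf.

pdf2image relies on Poppler, a PDF rendering toolkit, and calls its command-line tools to do the actual rendering. Poppler is not a Python module, so it has to be installed separately for pdf2image to work. In Colab, install the poppler-utils package: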

! apt-get install poppler-utils
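
If you are working outside Colab, it can help to confirm that Poppler is actually available before calling pdf2image. The snippet below is a minimal sketch of such a check. It assumes that pdftoppm and pdfinfo are the Poppler tools that need to be on your PATH, and missing_poppler_tools is a hypothetical helper name, not part of pdf2image.

```python
import shutil


def missing_poppler_tools(tools=("pdftoppm", "pdfinfo")):
    """Return the names of the given Poppler tools that are not found on PATH."""
    # shutil.which returns None when an executable cannot be located.
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_poppler_tools()
if missing:
    print("Poppler tools not found:", ", ".join(missing))
else:
    print("Poppler looks installed.")
```

If anything is reported as missing, install Poppler with your system's package manager and run the check again.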

Next, the convert_from_path function does the conversion. It returns a list with one PIL image per page, which we store in the variable “images”:

images = convert_from_path('/content/sample.pdf')

Finally, a for loop saves each page of the PDF as an image in JPG format:

for i in range(len(images)):
    images[i].save('page'+ str(i+1) +'.jpg')
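
The same steps can be wrapped in a small reusable function. This is only a sketch: pdf_to_images and page_filename are hypothetical names chosen for illustration, the zero-padded file names are my own choice so that pages sort correctly in a file browser, and dpi=200 simply spells out pdf2image's default resolution.

```python
from pathlib import Path


def page_filename(index, stem="page", ext="jpg", width=3):
    """Build a zero-padded name such as 'page_001.jpg' from a 0-based page index."""
    return f"{stem}_{index + 1:0{width}d}.{ext}"


def pdf_to_images(pdf_path, out_dir=".", dpi=200):
    """Render every page of pdf_path to a JPEG file and return the saved paths."""
    # Imported here so the naming helper above works even without pdf2image.
    from pdf2image import convert_from_path

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    saved = []
    # convert_from_path returns one PIL image per page, in page order.
    for i, page in enumerate(convert_from_path(str(pdf_path), dpi=dpi)):
        target = out_dir / page_filename(i)
        page.save(target, "JPEG")
        saved.append(target)
    return saved
```

Calling pdf_to_images('/content/sample.pdf', out_dir='pages') would write one JPEG per page into a pages folder. A higher dpi value gives sharper but larger images.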

Once you run the above code, the pages of the PDF are converted and saved in the file section.

page1.jpg and page2.jpg are the pages of the PDF as images.

Moving on, we all know someone who submitted their assignments and projects as PDFs during the tough COVID times. Those who didn’t own a scanner had to take photos with their phones and then compile them into a PDF. Python makes this task easy. With just a few lines of code, one or more images can be converted into a PDF:

IMAGE TO PDF: Begin by installing the Pillow package with pip (it is pre-installed in Colab), then import its Image module.

from PIL import Image

I’ll convert the two images created in the PDF to image section back into a PDF. First, open the images and store each one in its own variable.

image1 = Image.open('/content/page1.jpg')
image2 = Image.open('/content/page2.jpg')

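Next, convert the images to RGB mode, which Pillow can write to PDF. JPEG files are usually RGB already, so this is mainly a safeguard: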

image1_con = image1.convert('RGB')
image2_con = image2.convert('RGB')

Collect the remaining images, meaning every page after the first, in a list:

image_list = [image2_con]

Finally, call save on the first image. It becomes the first page of the PDF, and the images passed in append_images follow it in order:

image1_con.save('new_pdf.pdf', save_all=True, append_images=image_list)
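
With more than two images, opening each one by hand gets tedious. The sketch below generalises the steps above to any number of files. images_to_pdf and natural_key are hypothetical helper names, and sorting the files in natural order (so that page2.jpg comes before page10.jpg) is my own assumption about the page order you want.

```python
import re


def natural_key(name):
    """Sort key that places 'page2.jpg' before 'page10.jpg'."""
    # Split into text and digit runs, comparing the digit runs as integers.
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", str(name))]


def images_to_pdf(image_paths, pdf_path):
    """Combine the given images, in natural filename order, into a single PDF."""
    # Imported here so natural_key can be used without Pillow installed.
    from PIL import Image

    ordered = sorted(image_paths, key=natural_key)
    if not ordered:
        raise ValueError("no images given")

    # Convert every image to RGB before writing it to the PDF.
    pages = [Image.open(path).convert("RGB") for path in ordered]
    first, rest = pages[0], pages[1:]
    # The first image becomes page one; the rest are appended in order.
    first.save(pdf_path, "PDF", save_all=True, append_images=rest)
    return ordered
```

For the two pages from earlier, images_to_pdf(['/content/page1.jpg', '/content/page2.jpg'], 'new_pdf.pdf') should produce the same result as the step-by-step version.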

You will see the new PDF in the file section.
