Image processing using Python & OpenCV, part 1

Hello, and welcome to the “Image processing with Python & OpenCV” tutorial series. This series aims to provide an overview of the OpenCV library, its functions, applications and capabilities, while getting you hands-on with it.

At first glance, “image processing” might sound like wizardry. As the name suggests, image processing means performing operations on an image with the help of software. Applying processes like smoothing, sharpening, contrast adjustment or stretching can increase an image's readability, enhance its quality or even transform it entirely. Image processing is itself a part of computer vision, an extraordinarily powerful field of artificial intelligence with an enormous range of real-world applications: detecting license plates, scanning whiteboard contents, detecting text in still images, re-scaling images, template matching, image recognition, image retrieval, image restoration and many more.

OpenCV is an open-source computer vision library developed by Intel for real-time image and video analysis and processing. Primarily written in C++, the library has bindings for Python, Java, MATLAB, Octave and others. OpenCV combined with Python makes image and video analysis astonishingly simple, and for many it is also the first step into the world of computer vision.

This part of the tutorial covers the basics of using OpenCV with Python and shows just how simple it is to get started with image processing. Let's get our hands dirty.

The first program converts a video, captured from the webcam, to its gray-scaled form.

#BASIC GRAY-SCALING
import cv2

cap = cv2.VideoCapture(0)

while True:
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    cv2.imshow('frame', frame)
    cv2.imshow('gray_frame', gray)

    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

Breaking down the code step by step:

  1. We import the necessary library.
  2. Then 'cv2.VideoCapture(0)' starts capturing from the default camera (0 refers to the default camera).
  3. Next we read frames in an infinite loop. After that, we convert each captured frame to gray scale via the 'cv2.COLOR_BGR2GRAY' flag (OpenCV captures in Blue-Green-Red order, as opposed to Red-Green-Blue).
  4. We display both frames (the original video and the gray-scaled video) with the 'cv2.imshow' command.
  5. To quit, we need to define a key. In our case, "q" is the key we have defined to break the loop.
  6. Lastly, we release the camera and destroy all the windows opened in the background, so that no background processes linger after quitting.

This stage is crucial to understand, as most of the programs from here onward will be built upon this particular setup. The logic behind all video-analysis applications is pretty similar: first we capture the video or images, then we apply various processes to them, and lastly we manipulate the result into the desired output.

The second program re-scales the video to a certain percentage of the original frame. Re-scaling is not done on the video capture itself but on the output frame.

#RE-SCALING
import cv2

cap = cv2.VideoCapture(0)

def rescale_frame(frame, percent=75):
    width = int(frame.shape[1] * percent / 100)
    height = int(frame.shape[0] * percent / 100)
    dim = (width, height)
    return cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)

while True:
    ret, frame = cap.read()
    frame75 = rescale_frame(frame, percent=75)
    cv2.imshow('frame75', frame75)
    frame150 = rescale_frame(frame, percent=150)
    cv2.imshow('frame150', frame150)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

As we can see, the code is pretty similar to the first gray-scaling program, the only difference being the rescale_frame function. All this function does is take the dimensions of the frame and scale them to the percentage value we desire.

After re-scaling, we move towards actually changing the resolution of the video, rather than just scaling the frame by a percentage.

#CHANGING RESOLUTION
import cv2

cap = cv2.VideoCapture(0)

# Capture property 3 is the frame width and 4 the frame height;
# the named constants make that explicit
def make_1080():
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)

def make_720():
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1280)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 720)

def make_480():
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)

def change_resolution(width, height):
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

make_720()

while True:
    ret, frame = cap.read()
    cv2.imshow('frame', frame)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

This code is fairly simple and straightforward as well. We write a function that sets the width and height of the video capture 'cap' to the desired resolution. For 480p, the height is 480 and the width 640; similarly, 720p is 1280 × 720 and 1080p is 1920 × 1080. One thing to note is that as the resolution of the video increases, the size of the video increases as well. In some cases you can request a resolution bigger than the camera hardware supports, but this will only result in a laggy, choppy video.

Smoothing and blurring techniques help us eliminate noise from an image. OpenCV offers various smoothing and blurring techniques out of the box. Let's explore some of them. Keep in mind that every technique has its own advantages as well as disadvantages.

#SMOOTHING AND BLURRING
import numpy as np
import cv2

cap = cv2.VideoCapture(0)

while True:
    _, frame = cap.read()

    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)

    color_value_1 = np.array([0, 0, 0])
    color_value_2 = np.array([255, 255, 255])

    mask = cv2.inRange(hsv, color_value_1, color_value_2)

    result = cv2.bitwise_and(frame, frame, mask=mask)

    # 10x10 averaging kernel: each output pixel is the mean of 100 neighbours
    kernel = np.ones((10, 10), np.float32) / 100
    smooth_result = cv2.filter2D(result, -1, kernel)

    # Gaussian kernel sides and the median aperture must be odd numbers
    gaussian_blur = cv2.GaussianBlur(result, (15, 15), 0)
    median_blur = cv2.medianBlur(result, 15)
    bilateral_blur = cv2.bilateralFilter(result, 15, 75, 75)

    cv2.imshow('frame', frame)
    cv2.imshow('mask', mask)
    cv2.imshow('result', result)
    cv2.imshow('smooth_result', smooth_result)
    cv2.imshow('median_blur', median_blur)
    cv2.imshow('gaussian_blur', gaussian_blur)
    cv2.imshow('bilateral_blur', bilateral_blur)

    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

  1. As always, we start by capturing video frames from the primary camera, reading in each frame and converting it from the BGR (Blue, Green, Red) colour format to HSV (Hue, Saturation, Value).
  2. We then define a pair of numpy arrays as the lower and upper bounds for a new mask frame.
  3. We perform a bitwise AND between the original frame and the mask.
  4. Next we create a kernel, a 10 × 10 block (100 pixels) that averages over its neighbourhood, and apply it across the frame.
  5. Lastly, we compare the results of different blurring techniques such as Gaussian blur, median blur and bilateral filtering, so we can choose whichever best suits our specific needs.

With a clear idea of smoothing and blurring, we move to our next topic. Canny edge detection does exactly what it sounds like: it detects the edges in any given image. Here we are going to detect edges in the live video input from the primary webcam.

#CANNY EDGE DETECTION
import cv2
import numpy as np

cap = cv2.VideoCapture(0)

while True:
    _, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    # Derive the thresholds automatically from the median pixel intensity
    median = np.median(gray)
    lower_threshold = int(max(0, (1.0 - 0.33) * median))
    upper_threshold = int(min(255, (1.0 + 0.33) * median))

    edges = cv2.Canny(gray, lower_threshold, upper_threshold)

    cv2.imshow('Original', frame)
    cv2.imshow('Edges', edges)

    if cv2.waitKey(20) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

As with all the code above, this one is quite straightforward.

  1. We first capture the video input from the primary camera.
  2. We convert each captured BGR (Blue, Green, Red) frame to gray scale, since edge detection operates on a single channel.
  3. We compute the median pixel intensity of the frame, derive the lower and upper thresholds from it, and pass those into the cv2.Canny() function.
  4. We then show the original image as well as the processed image.

We will conclude this part here. This introductory part of the OpenCV series should be enough to give you a basic understanding of how the OpenCV library works with Python, and it also lays the groundwork for understanding the field of computer vision itself. I have a GitHub repository which contains all of the above code in a well-commented structure, along with all of the resources that I have used.

Stay tuned. Until next time…!