Hello, and welcome to the “Image processing with Python & Open-CV” tutorial series. This series gives an overview of the OpenCV library, its functions, applications and capabilities, and aims to get you comfortable using it hands-on.
At first glance, the term “image processing” might sound like wizardry. As the name suggests, image processing simply means performing operations on an image with the help of software. The goal of applying operations such as smoothing, sharpening, contrast adjustment or stretching can be to increase the image's readability, to enhance its quality, or to transform it altogether. Image processing is a part of computer vision, an extraordinarily powerful field within artificial intelligence with an enormous number of real-world applications. These include detecting license plates, scanning whiteboard contents, detecting text in still images, re-scaling images, detecting templates in images, image recognition, image retrieval, image restoration and many more.
OpenCV is an open source computer vision library, originally developed by Intel, for real-time image and video analysis and processing. It is written primarily in C++ and has bindings for Python, Java, MATLAB, Octave and other languages. OpenCV combined with Python makes image and video analysis and processing astonishingly simple, and for many it can be the first step into the world of computer vision.
This part of the tutorial covers the basics of using OpenCV with Python and shows just how simple it is to get started with image processing. Let’s get started and get our hands dirty.
The first program converts a video to grayscale. The video is captured from the webcam.
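# BASIC GREY-SCALING
import cv2
# Open the default webcam (device index 0)
cap = cv2.VideoCapture(0)
while True:
    # cap.read() returns a success flag first, then the frame
    ret, frame = cap.read()
    if not ret:
        break
    # OpenCV stores colour frames in BGR order
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    cv2.imshow('frame', frame)
    cv2.imshow('gray_frame', gray)
    # Wait 20 ms for a key press and quit when 'q' is pressed
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()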
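Breaking down the code step by step:
cv2.VideoCapture(0) opens the default webcam. Inside the loop, cap.read() returns a success flag and the current frame, cv2.cvtColor converts that frame from BGR to grayscale, and cv2.imshow displays both versions in their own windows. cv2.waitKey(20) waits 20 milliseconds for a key press, and the loop ends when 'q' is pressed. Finally, cap.release() frees the webcam and cv2.destroyAllWindows() closes the windows.
This stage is crucial to understand because most of the programs from here onward build on this setup, and the logic behind most video analysis applications is very similar. First we capture the video or images, then we apply various processes to them, and lastly we turn the result into the desired output.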
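To make the processing step less of a black box, here is a small plain-Python sketch of what a grayscale conversion computes for a single pixel: a weighted sum of its blue, green and red values. The weights 0.299, 0.587 and 0.114 are the standard luma weights that OpenCV documents for its colour-to-gray conversion. The function name bgr_to_gray is made up for this illustration, and OpenCV's own implementation may round slightly differently.

```python
def bgr_to_gray(pixel):
    """Approximate the gray level of one (blue, green, red) pixel.

    Illustrative helper, not part of OpenCV.
    """
    blue, green, red = pixel
    # Green contributes the most because the eye is most sensitive to it
    return round(0.114 * blue + 0.587 * green + 0.299 * red)


print(bgr_to_gray((255, 255, 255)))  # white
print(bgr_to_gray((0, 0, 255)))      # pure red (remember the BGR order)
print(bgr_to_gray((0, 255, 0)))      # pure green
```

Pure green comes out much brighter than pure red, which is why a grayscale frame is not simply the average of the three channels.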
The second program re-scales the video to a certain percentage of the original frame size. Re-scaling is not applied to the video capture itself, but to each frame that is read from it.
# RE-SCALING
import cv2
cap = cv2.VideoCapture(0)
def rescale_frame(frame, percentage=75):
    # frame.shape is (height, width, channels)
    width = int(frame.shape[1] * percentage / 100)
    height = int(frame.shape[0] * percentage / 100)
    dim = (width, height)
    return cv2.resize(frame, dim, interpolation=cv2.INTER_AREA)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    frame75 = rescale_frame(frame, percentage=75)
    cv2.imshow('frame75', frame75)
    frame150 = rescale_frame(frame, percentage=150)
    cv2.imshow('frame150', frame150)
    # waitKey(0) would block until a key is pressed, so poll every 20 ms
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
As we can see, the code is very similar to the first grey-scaling program; the only difference is the rescale_frame function. It takes the dimensions of the frame and scales them to the percentage we want, and cv2.resize then resizes the frame to those dimensions.
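The arithmetic behind the re-scaling can be checked without a webcam. The sketch below repeats the same width and height calculation in plain Python; the helper name scaled_dimensions and the 640x480 starting size are assumptions chosen for the example.

```python
def scaled_dimensions(width, height, percentage=75):
    """Return the (width, height) of a frame scaled to the given percentage.

    Illustrative helper mirroring the arithmetic of rescale_frame.
    """
    return int(width * percentage / 100), int(height * percentage / 100)


# A 640x480 frame shrunk to 75% and enlarged to 150%
print(scaled_dimensions(640, 480, 75))
print(scaled_dimensions(640, 480, 150))
```

Because both sides are multiplied by the same percentage, the aspect ratio of the frame is preserved.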
After re-scaling, we move on to actually changing the resolution of the video capture, rather than just scaling each frame to a certain percentage.
# CHANGING RESOLUTION
import cv2
cap = cv2.VideoCapture(0)
# Property ids 3 and 4 are cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT
def make_1080():
    cap.set(3, 1920)
    cap.set(4, 1080)
def make_720():
    cap.set(3, 1280)
    cap.set(4, 720)
def make_480():
    cap.set(3, 640)
    cap.set(4, 480)
def change_resolution(width, height):
    cap.set(3, width)
    cap.set(4, height)
make_720()
while True:
    ret, frame = cap.read()
    if not ret:
        break
    cv2.imshow('frame', frame)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
This code is simple and straightforward as well. Each function sets the width and height of the video capture 'cap' to the desired resolution. For 480p the width is 640 and the height is 480; for 720p they are 1280 and 720; and for 1080p they are 1920 and 1080. Note that as the resolution increases, each frame carries more data, so the video takes more memory and more time to process. In some cases you can also request a resolution higher than the camera hardware supports, but this gains nothing and typically results in laggy, choppy video.
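Since a camera may not honour every requested resolution, it can help to read the values back after setting them. The sketch below keeps the presets in a lookup table and returns what the capture reports. FakeCapture is a hypothetical stand-in that only mimics the set and get methods of cv2.VideoCapture so the example runs without a webcam; how a real camera reacts to an unsupported size depends on the driver.

```python
# Property ids, equal to cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT
FRAME_WIDTH = 3
FRAME_HEIGHT = 4

RESOLUTIONS = {"480p": (640, 480), "720p": (1280, 720), "1080p": (1920, 1080)}


def set_resolution(cap, name):
    """Request a preset and return the (width, height) the capture reports back."""
    width, height = RESOLUTIONS[name]
    cap.set(FRAME_WIDTH, width)
    cap.set(FRAME_HEIGHT, height)
    return int(cap.get(FRAME_WIDTH)), int(cap.get(FRAME_HEIGHT))


class FakeCapture:
    """Hypothetical camera that supports at most 1280x720 (assumption for the demo)."""

    def __init__(self):
        self.props = {FRAME_WIDTH: 640, FRAME_HEIGHT: 480}
        self.limits = {FRAME_WIDTH: 1280, FRAME_HEIGHT: 720}

    def set(self, prop, value):
        # Clamp the request to what this pretend hardware can deliver
        self.props[prop] = min(value, self.limits[prop])
        return True

    def get(self, prop):
        return float(self.props[prop])


cap = FakeCapture()
print(set_resolution(cap, "720p"))
print(set_resolution(cap, "1080p"))  # request exceeds the fake camera's limit
```

With a real webcam, passing the cv2.VideoCapture object instead of FakeCapture should work the same way, and comparing the returned size with the requested one shows whether the request was accepted.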
Smoothing and blurring techniques help us reduce noise in an image. OpenCV provides several of them out of the box. Let’s explore a few to get familiar with them, keeping in mind that every technique has its own advantages and disadvantages.
# SMOOTHING AND BLURRING
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # This HSV range covers every pixel, so the mask keeps the whole frame;
    # narrow the bounds to isolate a single colour
    color_value_1 = np.array([0, 0, 0])
    color_value_2 = np.array([255, 255, 255])
    mask = cv2.inRange(hsv, color_value_1, color_value_2)
    result = cv2.bitwise_and(frame, frame, mask=mask)
    # Averaging: a 10x10 kernel whose 100 weights sum to 1
    kernel = np.ones((10, 10), np.float32) / 100
    smooth_result = cv2.filter2D(result, -1, kernel)
    # GaussianBlur and medianBlur require odd kernel sizes, hence 11 rather than 10
    gaussian_blur = cv2.GaussianBlur(result, (11, 11), 0)
    median_blur = cv2.medianBlur(result, 11)
    bilateral_blur = cv2.bilateralFilter(result, 10, 75, 75)
    cv2.imshow('frame', frame)
    cv2.imshow('mask', mask)
    cv2.imshow('result', result)
    cv2.imshow('smooth_result', smooth_result)
    cv2.imshow('median_blur', median_blur)
    cv2.imshow('gaussian_blur', gaussian_blur)
    cv2.imshow('bilateral_blur', bilateral_blur)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
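To see why the techniques differ, it helps to try two of them on a tiny one-dimensional "image" containing a single noise spike. The plain-Python sketch below is a simplification: the function names are made up for the illustration, the window is only 3 samples wide, and the borders are handled by repeating the edge value, which is not necessarily how OpenCV treats borders.

```python
from statistics import median


def _window(signal, index, size):
    """Samples centred on index; positions past the ends reuse the edge value."""
    half = size // 2
    last = len(signal) - 1
    return [signal[min(max(j, 0), last)] for j in range(index - half, index + half + 1)]


def box_filter(signal, size=3):
    """Averaging: each sample becomes the mean of its neighbourhood."""
    return [sum(_window(signal, i, size)) / size for i in range(len(signal))]


def median_filter(signal, size=3):
    """Median blur: each sample becomes the median of its neighbourhood."""
    return [median(_window(signal, i, size)) for i in range(len(signal))]


noisy = [10, 10, 10, 100, 10, 10, 10]  # one bright noise spike in a flat region
print(box_filter(noisy))     # the spike is spread over its neighbours
print(median_filter(noisy))  # the spike is removed entirely
```

Averaging smears the spike across its neighbours, while the median discards it, which is one reason median blurring tends to suit isolated "salt and pepper" noise.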
With smoothing and blurring covered, we move on to our next topic. Canny edge detection does exactly what its name says: it detects the edges in a given image. Here we detect edges in the live video input from the primary webcam.
# CANNY EDGE DETECTION
import numpy as np
import cv2
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    # Median pixel intensity of the grayscale frame drives the thresholds
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    median = np.median(gray)
    lower_threshold = int(max(0, (1.0 - 0.33) * median))
    upper_threshold = int(min(255, (1.0 + 0.33) * median))
    edges = cv2.Canny(frame, lower_threshold, upper_threshold)
    cv2.imshow('Original', frame)
    cv2.imshow('Edges', edges)
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
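Like all the programs above, this one is quite straightforward. The two thresholds passed to cv2.Canny are derived from the median pixel intensity: the lower one sits 33% below the median and the upper one 33% above it, clamped to the valid range of 0 to 255.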
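The threshold calculation can be tried on its own with a few numbers. The sketch below wraps the same formula in a helper; the name canny_thresholds, the parameter name sigma and the sample pixel values are assumptions made for the illustration.

```python
from statistics import median


def canny_thresholds(median_value, sigma=0.33):
    """Return (lower, upper) thresholds placed sigma below and above the median."""
    lower = int(max(0, (1.0 - sigma) * median_value))
    upper = int(min(255, (1.0 + sigma) * median_value))
    return lower, upper


pixels = [12, 40, 128, 200, 250]         # made-up intensities
print(canny_thresholds(median(pixels)))  # mid-bright image
print(canny_thresholds(250))             # very bright image: upper is clamped to 255
```

Tying the thresholds to the median lets them adapt to how bright or dark each frame is, instead of relying on one fixed pair of values.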
We will conclude this part here. This introductory part of the OpenCV series should be enough to give a basic understanding of how the OpenCV library works in combination with Python, and to lay the groundwork for understanding the field of computer vision itself. I have a GitHub repository that contains all of the above code, thoroughly commented, along with all of the resources I have used.
Stay tuned. Until next time…!