Hello and welcome back to another part of the OpenCV with Python tutorial series. In the previous part, we saw some applications of computer vision, how to get started with Intel’s OpenCV library, and how to perform some basic operations on image frames.
In this part of the tutorial, we are going to learn some of the more challenging functionality that OpenCV has to offer.
Let’s start with corner detection. As the name suggests, this technique detects the corners inside any given image, duh! The code below uses the Shi-Tomasi detector, which OpenCV exposes as ‘goodFeaturesToTrack()’, and keeps at most the 100 strongest corners.
#CORNER DETECTION
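import numpy as np
import cv2
# Load the image and convert it to a single-channel float32 grayscale image
img = cv2.imread('heregoesyouramazingimage')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = np.float32(gray)
corners_to_detect = 100
minimum_quality_score = 0.01
minimum_distance = 10
# Shi-Tomasi corner detection: returns up to corners_to_detect (x, y) points
ST_corners = cv2.goodFeaturesToTrack(gray, corners_to_detect, minimum_quality_score, minimum_distance)
# Convert the float coordinates to integer pixel positions
ST_corners = ST_corners.astype(int)
# Draw a small filled circle on every detected corner
for corner in ST_corners:
    x, y = corner.ravel()
    cv2.circle(img, (int(x), int(y)), 3, 255, -1)
# Show the annotated image until a key is pressed
cv2.imshow('ST_corners', img)
cv2.waitKey(0)
cv2.destroyAllWindows()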
This is not the only method available in the OpenCV library to detect corners. Another method that achieves similar results is Harris corner detection, available as ‘cornerHarris()’.
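To give a feel for what the Harris detector measures, here is a minimal NumPy-only sketch of the Harris response on a toy image. It is an illustration under stated assumptions, not OpenCV's implementation: the helper names `box_sum` and `harris_response`, the 3 x 3 summing window, and the constant `k = 0.04` are choices made for this example. In OpenCV itself you would call `cv2.cornerHarris(gray, blockSize, ksize, k)` instead.

```python
import numpy as np

def box_sum(a, r=1):
    """Sum every pixel's (2r+1) x (2r+1) neighbourhood, padding with zeros."""
    padded = np.pad(a, r, mode="constant")
    out = np.zeros(a.shape, dtype=np.float64)
    size = 2 * r + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(gray, k=0.04):
    """Hypothetical helper: Harris score det(M) - k * trace(M)^2 per pixel."""
    gray = gray.astype(np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    iy, ix = np.gradient(gray)
    # Entries of the structure matrix M, summed over a small window
    sxx = box_sum(ix * ix)
    syy = box_sum(iy * iy)
    sxy = box_sum(ix * iy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

# Toy image: a white square on a black background
toy = np.zeros((20, 20))
toy[5:15, 5:15] = 255.0
response = harris_response(toy)

print(response[5, 5] > 0)    # corner of the square: positive score
print(response[5, 10] < 0)   # middle of an edge: negative score
print(response[0, 0] == 0)   # flat region: zero score
```

The sign of the score is what matters here: corners score positive, plain edges score negative, and flat areas score zero, which is why thresholding the response picks out corners.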
Now that we understand corner detection, let’s move on to foreground extraction. The idea is exactly what it sounds like: find the foreground and remove the background. This is much like what a green screen does, except that here we do not need a green screen!
#FOREGROUND EXTRACTION
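import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('heregoesyouramazingimage')
# Label mask with the same height and width as the image, filled in by grabCut
mask = np.zeros(img.shape[:2], np.uint8)
# Scratch arrays that grabCut uses internally for its two models
background_model = np.zeros((1,65), np.float64)
foreground_model = np.zeros((1,65), np.float64)
# (x, y, width, height) of a rectangle that encloses the foreground object
rectangle = (161,79,150,150)
# Run 5 iterations of grabCut, initialised from the rectangle
cv2.grabCut(img, mask, rectangle, background_model, foreground_model, 5, cv2.GC_INIT_WITH_RECT)
# Labels 0 and 2 mean background, labels 1 and 3 mean foreground
mask2 = np.where((mask==2)|(mask==0), 0, 1).astype('uint8')
# Black out every background pixel
img = img*mask2[:,:,np.newaxis]
# OpenCV stores images as BGR, matplotlib expects RGB
plt.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
plt.colorbar()
plt.show()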
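Breaking down the code bit by bit: the mask starts out as all zeros at the image’s height and width, the two (1,65) float64 arrays are working space that ‘grabCut()’ uses internally for its background and foreground models, and the rectangle is given as (x, y, width, height) and should enclose the object you want to keep. After 5 iterations, every pixel in the mask is labelled 0 or 2 for background and 1 or 3 for foreground, and multiplying the image by the resulting 0/1 mask blacks out the background.
Note: Find the proper rectangle coordinates for your image.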
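The final masking step is easy to try on a toy array without running grabCut at all. The sketch below is only an illustration: the 3 x 4 label mask and the constant-coloured image are made up for this example, while the four label values are the ones OpenCV defines as `cv2.GC_BGD`, `cv2.GC_FGD`, `cv2.GC_PR_BGD` and `cv2.GC_PR_FGD`.

```python
import numpy as np

# grabCut label values (same numbers as the cv2.GC_* constants)
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Made-up 3 x 4 label mask, standing in for what grabCut would write
mask = np.array([
    [0, 0, 2, 2],
    [0, 3, 1, 2],
    [2, 3, 3, 0],
], dtype=np.uint8)

# Made-up 3 x 4 BGR image where every pixel is (10, 20, 30)
img = np.full((3, 4, 3), (10, 20, 30), dtype=np.uint8)

# Definite and probable background become 0, everything else becomes 1
mask2 = np.where((mask == GC_PR_BGD) | (mask == GC_BGD), 0, 1).astype('uint8')

# np.newaxis turns the (3, 4) mask into (3, 4, 1) so it applies to all channels
foreground = img * mask2[:, :, np.newaxis]

print(mask2)
print(foreground[1, 1], foreground[0, 0])
```

Only the four pixels labelled 1 or 3 keep their colour; every other pixel is multiplied by zero and turns black.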
Foreground extraction was cool, right? Something similar is background reduction, which reduces the background to a minimum by detecting motion. This requires either a video or two images: one without the objects you want to track, and another with those same objects present.
#BACKGROUND REDUCTION
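import numpy as np
import cv2
cap = cv2.VideoCapture(0)
# The subtractor learns a model of the background as frames come in
foreground_model = cv2.createBackgroundSubtractorMOG2()
while True:
    ret, frame = cap.read()
    # Stop if the camera did not return a frame
    if not ret:
        break
    # Foreground mask for this frame
    foregroundmask = foreground_model.apply(frame)
    cv2.imshow('frame', frame)
    cv2.imshow('foreground', foregroundmask)
    # Press 'q' to quit (note the bitwise &, not the keyword 'and')
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()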
The code to do this is pretty straightforward. We create a background subtractor with the built-in ‘createBackgroundSubtractorMOG2()’ function and then, for every frame, call its ‘apply()’ method, which returns the foreground mask for that frame.
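For the two-image case mentioned earlier, the simplest form of the idea is plain frame differencing. The following NumPy sketch is not what MOG2 does internally (MOG2 maintains a statistical model of each pixel over time); it only shows the basic principle. The function name `foreground_mask`, the threshold of 30, and the toy frames are assumptions made for this example.

```python
import numpy as np

def foreground_mask(background, frame, threshold=30):
    """Hypothetical helper: white (255) where frame differs from background."""
    # Convert to a signed type first so the subtraction cannot wrap around
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    return np.where(diff > threshold, 255, 0).astype(np.uint8)

# Toy grayscale "empty scene"
background = np.full((4, 6), 100, dtype=np.uint8)

# Same scene with a bright 2 x 2 object and one slightly noisy pixel
frame = background.copy()
frame[1:3, 2:4] = 200
frame[0, 0] = 105

mask = foreground_mask(background, frame)
print(mask)
```

The object shows up as white in the mask, while the small brightness change at the noisy pixel stays below the threshold and is ignored.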
Morphological transformations are simple operations performed on an image based on its shape. They need two inputs: the original image, and a kernel (also called a structuring element) that decides the nature of the operation. The two basic morphological operators are erosion and dilation, and variants such as opening and closing build on them. We will see them one by one.
#MORPHOLOGICAL TRANSFORMATION
import cv2
import numpy as np
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Placeholder range that matches everything: replace it with the
    # HSV range of the color you want to isolate
    color_value_1 = np.array([0, 0, 0])
    color_value_2 = np.array([255, 255, 255])
    mask = cv2.inRange(hsv, color_value_1, color_value_2)
    result = cv2.bitwise_and(frame, frame, mask = mask)
    # 5 x 5 window that slides over the mask
    kernel = np.ones((5,5), np.uint8)
    eroded_img = cv2.erode(mask, kernel, iterations = 1)
    dilated_img = cv2.dilate(mask, kernel, iterations = 1)
    opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    closing = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    cv2.imshow('frame', frame)
    cv2.imshow('result', result)
    cv2.imshow('eroded_img', eroded_img)
    cv2.imshow('dilated_img', dilated_img)
    cv2.imshow('opening', opening)
    cv2.imshow('closing', closing)
    # Press 'q' to quit
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cv2.destroyAllWindows()
cap.release()
Erosion works by “eroding” the edges of the white regions. We slide a kernel (here a 5 x 5 pixel window) over the image; at each position, the output pixel is white only if every pixel under the window is white, otherwise it is black. This helps eliminate small specks of white noise in the image. Dilation does the opposite: the output pixel is white if at least one pixel under the window is white, so white regions grow.
Next come “opening” and “closing”. Opening is erosion followed by dilation, and it removes “false positives”, i.e. small white specks in the background. Closing is the reverse, dilation followed by erosion, and it removes “false negatives”, i.e. small black holes inside the detected object.
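The four operations can be reproduced on a tiny 0/1 grid with NumPy alone, which makes the sliding-window rule easy to inspect. This is a sketch under stated assumptions rather than OpenCV's implementation: the helpers `erode` and `dilate` are written for this example, they use a 3 x 3 square window, and they treat everything outside the image as black, which can differ from OpenCV's default handling at the image border.

```python
import numpy as np

def erode(mask, size=3):
    """Output 1 only where every pixel under the window is 1."""
    r = size // 2
    padded = np.pad(mask, r, mode="constant", constant_values=0)
    out = np.ones_like(mask)
    for dy in range(size):
        for dx in range(size):
            out &= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def dilate(mask, size=3):
    """Output 1 where at least one pixel under the window is 1."""
    r = size // 2
    padded = np.pad(mask, r, mode="constant", constant_values=0)
    out = np.zeros_like(mask)
    for dy in range(size):
        for dx in range(size):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

# Opening demo: a 3 x 3 object plus one isolated white speck (false positive)
clean_block = np.zeros((7, 7), dtype=np.uint8)
clean_block[2:5, 2:5] = 1
noisy = clean_block.copy()
noisy[0, 6] = 1
opened = dilate(erode(noisy))      # erosion first, then dilation

# Closing demo: a 5 x 5 object with one black hole (false negative)
solid_block = np.zeros((7, 7), dtype=np.uint8)
solid_block[1:6, 1:6] = 1
holed = solid_block.copy()
holed[3, 3] = 0
closed = erode(dilate(holed))      # dilation first, then erosion

print(np.array_equal(opened, clean_block))
print(np.array_equal(closed, solid_block))
```

In the opening demo the speck does not survive the erosion, so the dilation has nothing to grow back there; in the closing demo the hole is filled by the dilation and stays filled after the erosion.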
Well, well, well! Congratulations on making it this far. You now have a good overview of how OpenCV works. The next topic (and the last topic of this post) is recording videos. Recording videos through OpenCV might seem like a trivial task at first but, unfortunately, it isn’t. So… let’s dive in.
#RECORDING VIDEOS
import os
import numpy as np
import cv2
filename = 'video.avi'
frames_per_seconds = 24.0
my_res = '720p'
cap = cv2.VideoCapture(0)
# Ask the capture device for a frame size (the device may not honor it)
def change_resolution(cap, width, height):
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
# Standard video dimensions as (width, height)
STD_DIM = {
    "480p" : (640, 480),
    "720p" : (1280, 720),
    "1080p" : (1920, 1080),
    "4k" : (3840, 2160),
}
# Look up the requested resolution, falling back to 480p for unknown names
def set_dimensions(cap, res = '1080p'):
    width, height = STD_DIM['480p']
    if res in STD_DIM:
        width, height = STD_DIM[res]
    change_resolution(cap, width, height)
    return width, height
'''
Video Encoding, might require additional installs
Types of Codecs: http://www.fourcc.org/codecs.php
also OpenCV 2 doesn't support cv2.VideoWriter_fourcc. Instead use cv2.cv.CV_FOURCC(*'XVID')
'''
VIDEO_TYPE = {
    'avi' : cv2.VideoWriter_fourcc(*'XVID'),
    'mp4' : cv2.VideoWriter_fourcc(*'XVID'),
    # 'mp4' : cv2.VideoWriter_fourcc(*'H264'),  # alternative, if your build has an H.264 encoder
}
# Pick the codec from the file extension, falling back to the 'avi' entry
def set_videotype(filename):
    # splitext returns the extension with its leading dot, e.g. '.avi'
    _, ext = os.path.splitext(filename)
    ext = ext.lstrip('.').lower()
    if ext in VIDEO_TYPE:
        return VIDEO_TYPE[ext]
    return VIDEO_TYPE['avi']
dims = set_dimensions(cap, res = my_res)
video_type_cv2 = set_videotype(filename)
out = cv2.VideoWriter(filename, video_type_cv2, frames_per_seconds, dims)
while True:
    ret, frame = cap.read()
    # Stop if the camera did not return a frame
    if not ret:
        break
    out.write(frame)
    cv2.imshow('frame', frame)
    # Press 'q' to stop recording
    if cv2.waitKey(20) & 0xFF == ord('q'):
        break
cap.release()
out.release()
cv2.destroyAllWindows()
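Breaking down the code piece by piece: ‘set_dimensions()’ looks up the requested resolution in STD_DIM (falling back to 480p) and asks the camera to use it, ‘set_videotype()’ picks a codec from the file extension (falling back to the one for avi), and ‘cv2.VideoWriter()’ is then created with the filename, the codec, the frame rate and the frame size. Inside the loop, every frame read from the camera is written to the file and also shown on screen until ‘q’ is pressed.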
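One detail that is easy to get wrong is the codec lookup, because `os.path.splitext` returns the extension with its leading dot. The sketch below isolates that step in pure Python. The table `CODEC_BY_EXT` and the helpers `codec_for` and `fourcc` are hypothetical names for this example; `fourcc` packs four characters into one integer, least significant byte first, to show the kind of value that `cv2.VideoWriter_fourcc` hands to the writer.

```python
import os

# Hypothetical table: file extension -> four-character codec code
CODEC_BY_EXT = {
    'avi': 'XVID',
    'mp4': 'H264',
}

def codec_for(filename, default='avi'):
    """Pick a codec code from the extension, falling back to the default."""
    _, ext = os.path.splitext(filename)   # 'video.avi' -> ('video', '.avi')
    ext = ext.lstrip('.').lower()          # '.avi' -> 'avi'
    return CODEC_BY_EXT.get(ext, CODEC_BY_EXT[default])

def fourcc(code):
    """Pack a four-character code into an integer, first character lowest."""
    c1, c2, c3, c4 = (ord(ch) for ch in code)
    return c1 | (c2 << 8) | (c3 << 16) | (c4 << 24)

print(os.path.splitext('video.avi'))
print(codec_for('video.avi'), codec_for('clip.MP4'), codec_for('notes.txt'))
print(fourcc(codec_for('video.avi')))
```

Stripping the dot and lowercasing the extension before the lookup means that `video.avi`, `VIDEO.AVI` and an unknown extension all resolve to a usable codec instead of silently relying on the fallback.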
That’ll be it for this blog post, folks! Congratulations on making it this far.
I have a GitHub repository which contains all of the above code in a well-commented form. The repository also contains all of the resources that I have used to learn OpenCV.
Stay tuned. Until next time…!