Lane detection with NumPy
Intro
I can’t wait to start Udacity’s Self Driving Engineer course, but fortunately there is already plenty to do in advance. Sebastian Thrun created the brilliant AI for Robotics course, and there is also an exciting Introduction to CV course to attend.
The folks from previous cohorts are also active — Mehdi Squalli shows us how to solve the “Detect Lane Lines project”.
Plan
I’m going to solve a similar task here: detecting lanes on video frames, using NumPy and SciPy. My goal is not to achieve better performance or speed than with OpenCV. Rather, I’m going to implement some techniques I learned in the Computer Vision course. This is the plan for the articles:
- Image filtering
- Finding lines on an image
- Detect and show the lanes in a video
Filtering
In order to understand a picture, we need to convert it to some convenient form. When looking for lanes, we try to get rid of every detail that is not relevant.
Blur filters
First of all, we need to do some blurring. It is a good way to get rid of small annoying details, but keep the larger ones (like lanes).
As images are stored in computers as matrices, we can filter them using matrices, too. A blur filter works like this:
- Take a small matrix (a filter or a kernel), like this:
my_filter = np.array([[1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1]])
- Loop through all the pixels of your image; at each position i,j, multiply the 5x5 piece of the image (starting from i,j) element by element with the filter matrix and sum the products
- Insert the result at the i,j position of your resulting image
Basically, this is nothing more than a moving average in 2 dimensions (strictly speaking a moving sum, until you divide the filter by the number of its elements, 25 here). Why does it make sense? The pixels with high values (the peaks) will get lower, and the extreme lows will get a bit higher, so you’ll end up with a lovely blurred image.
How intense is your blur? It depends on the size of your filter (the ‘window’). The larger the window, the more intense the blur.
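The steps above can be written out as plain loops. This is only a minimal sketch to make the procedure concrete: the function name box_blur_loops and the size parameter are made up for illustration, the kernel is divided by its number of elements so the result is a true average, and the output is smaller than the input because the window is never allowed to hang over the image border.

```python
import numpy as np

def box_blur_loops(img, size=5):
    """Blur a 2-D grayscale image with a size x size averaging window.

    Hypothetical helper for illustration; `img` is assumed to be a 2-D array.
    """
    kernel = np.ones((size, size)) / size**2  # weights sum to 1 -> a real average
    out_h = img.shape[0] - size + 1           # the window must fit inside the image
    out_w = img.shape[1] - size + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = img[i:i + size, j:j + size]  # the piece starting at i,j
            out[i, j] = np.sum(window * kernel)   # multiply element-wise, then sum
    return out

# Tiny synthetic demo: one bright pixel on a dark background
demo = np.zeros((9, 9))
demo[4, 4] = 25.0
blurred = box_blur_loops(demo)
```

Every 5x5 window in this demo contains the bright pixel, so its value gets spread evenly over the whole (smaller) output.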
And here comes the cool thing with SciPy: you don’t need to implement the multiplication loops yourself. Just use the handy signal.correlate function.
from scipy import signal
import numpy as np

# the 5x5 blur filter
my_filter = np.array([[1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1],
                      [1,1,1,1,1]])

# the multiplication which will yield you a blurred image from the original image ("img")
img_blurred = signal.correlate(img, my_filter)
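Two details are worth knowing when you try this: signal.correlate uses mode="full" by default, which returns an image larger than the input, and a filter of plain ones scales the brightness up by 25. The sketch below uses a made-up constant image as a stand-in for a real frame and a normalized kernel, just to show how the mode argument changes the output shape.

```python
import numpy as np
from scipy import signal

# Hypothetical stand-in for a grayscale frame: a constant 20x30 image
img = np.full((20, 30), 100.0)

# Normalized 5x5 box filter: the weights sum to 1, so brightness is preserved
box = np.ones((5, 5)) / 25.0

full = signal.correlate(img, box)                  # default mode="full": (24, 34)
same = signal.correlate(img, box, mode="same")     # same shape as img: (20, 30)
valid = signal.correlate(img, box, mode="valid")   # only full overlaps: (16, 26)

print(full.shape, same.shape, valid.shape)
```

With mode="same" the border pixels come out darker, because the part of the window outside the image is treated as zeros; mode="valid" avoids that at the price of a smaller result.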
Gaussian Blur
Using a matrix of only ones for blurring works, but it has a drawback: the blur is not very smooth, because every pixel in the window contributes to the final (averaged) pixel value with the same weight. The blur gets nicer and smoother if the closer pixels count with a higher weight. So, instead of only ones, the values in your filter matrix will be higher in the center and lower towards the edges. Something like this:
[[ 0.0005855 0.0008519 0.00096532 0.0008519 0.0005855 ]
[ 0.0008519 0.0012395 0.00140454 0.0012395 0.0008519 ]
[ 0.00096532 0.00140454 0.00159155 0.00140454 0.00096532]
[ 0.0008519 0.0012395 0.00140454 0.0012395 0.0008519 ]
[ 0.0005855 0.0008519 0.00096532 0.0008519 0.0005855 ]]
This makes much more sense once plotted:
As you have probably guessed, it’s called Gaussian Blur because the filter is a 2d Gaussian distribution. The intensity of your blur depends on 2 things: the size of the filter and the sigma value of the Gaussian. There are a lot of ways to create this filter; this is how I do it:
import numpy as np
from scipy import stats
# This is the Gaussian function: you are setting the mu and the sigma.
# A mu of 0 makes sense, as we want a centered blur filter
pdf = stats.norm(0, 10).pdf
# 5 sample points give the 5x5 filter printed above, with its peak in the middle
lin = np.linspace(-10, 10, 5)
gaussian_filter = np.array([[pdf(x) * pdf(y) for x in lin] for y in lin])
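One possible variation, sketched here under my own assumptions: wrap the construction in a function and normalize the weights so they sum to 1, which keeps the overall brightness of the blurred image unchanged. The name gaussian_kernel and the parameters size and sigma (measured in pixels) are illustrative and not part of the code above.

```python
import numpy as np
from scipy import stats

def gaussian_kernel(size=5, sigma=1.0):
    """Return a size x size Gaussian kernel whose weights sum to 1.

    Hypothetical helper: `size` should be odd so the kernel has a center pixel,
    `sigma` is the standard deviation in pixels.
    """
    half = (size - 1) / 2
    lin = np.linspace(-half, half, size)   # pixel offsets from the center
    pdf_1d = stats.norm(0, sigma).pdf(lin) # 1-D Gaussian sampled at those offsets
    kernel = np.outer(pdf_1d, pdf_1d)      # 2-D Gaussian = product of two 1-D ones
    return kernel / kernel.sum()           # normalize so the weights sum to 1

kernel = gaussian_kernel(5, 1.0)
```

The outer product does the same job as the nested list comprehension, since the 2d Gaussian is just pdf(x) * pdf(y).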
Edge Detection
What is an edge? It can be the boundary of an object; a change in some property (like color, texture, light and shadow); or some information about position in 3d space (like the end of the visible part of a ball).
On an image, these edges are usually marked by a sudden change in pixel value (an edge in color is the most trivial example of this). Fortunately, there is a mathematical operation to find them: differentiation. The derivative of a function is another function which shows the slope of the original function at every position; so the derivative of a 2-d image shows us how intensely the pixel value changes at a given position, compared to its neighbouring pixels.
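To see the idea in one dimension first, here is a small sketch with a made-up row of pixel values. For discrete pixels the derivative becomes a difference between neighbours, which np.diff computes directly.

```python
import numpy as np

# Hypothetical single row of pixels: dark on the left, bright on the right
row = np.array([10, 10, 10, 200, 200, 200])

# Discrete derivative: difference between each pixel and its left neighbour
slope = np.diff(row)

# The only non-zero entry sits at the jump from 10 to 200
edge_position = int(np.argmax(np.abs(slope)))
print(slope, edge_position)
```

The slope is zero wherever the row is flat and large exactly where the value jumps, which is what we want from an edge detector.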
Again, we don’t need to implement this operation manually (although it would not be a big deal), because we can use filters. This filter takes the derivative in the horizontal direction, so it responds to vertical edges:
my_filter = np.array([
[-1, 0, 1],
[-1, 0, 1],
[-1, 0, 1]
])
This filter takes the left-hand side pixel values and subtracts them from the right-hand side values. This way, we end up with the difference (or the rate of change, i.e. the slope) at the given position.
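A quick sanity check on two made-up 3x3 patches shows what a single step of the filtering yields: a flat patch gives zero, while a patch with a dark-to-bright jump gives a large value.

```python
import numpy as np

my_filter = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]])

# Hypothetical patches: one flat, one with a jump between the middle and right columns
flat_patch = np.full((3, 3), 80)
edge_patch = np.array([[20, 20, 90],
                       [20, 20, 90],
                       [20, 20, 90]])

# One step of the correlation: multiply element-wise and sum,
# i.e. right column minus left column
flat_response = np.sum(flat_patch * my_filter)
edge_response = np.sum(edge_patch * my_filter)
print(flat_response, edge_response)
```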
There is one caveat. We can do the correlation in 2 ways:
vertical_edges = signal.correlate(img, my_filter)
horizontal_edges = signal.correlate(img, my_filter.T)
The first filter will show us the vertical edges and the second the horizontal ones. So in order to get all the edges on one image, we need to combine them:
edges = np.sqrt(signal.correlate(img, my_filter)**2 + signal.correlate(img, my_filter.T)**2)
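To try the combination without a real frame, here is a minimal sketch on a synthetic image: a bright square on a dark background. The image and the names gx and gy are my own assumptions for the demo, and mode="same" is used so the edge map has the same shape as the input.

```python
import numpy as np
from scipy import signal

my_filter = np.array([[-1, 0, 1],
                      [-1, 0, 1],
                      [-1, 0, 1]])

# Hypothetical test image: a bright 10x10 square on a dark 20x20 background
img = np.zeros((20, 20))
img[5:15, 5:15] = 1.0

gx = signal.correlate(img, my_filter, mode="same")    # responds to vertical edges
gy = signal.correlate(img, my_filter.T, mode="same")  # responds to horizontal edges
edges = np.sqrt(gx**2 + gy**2)                        # combined edge strength

print(edges[10, 10], edges[10, 5])  # inside the square vs. on its left border
```

The result is close to zero inside the square and on the background, and clearly positive along the outline of the square.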
What’s next
In the next post I’ll try to figure out how to tell whether an edge is a lane. Stay tuned ;)