Computer Vision Model: Vehicle Detection in Images with Python PART 1

Greetings, brave souls! Today, we embark on a noble quest to teach our computers how to see. Prepare yourselves for an epic adventure as we delve into the realms of computer vision and unlock the secrets of creating our own computer vision models. Now, I must warn you, this tutorial won't be a mere leisurely stroll in the park. My friends, we shall tread through the murky swamps of complexity. But fear not, for I have faith in your intrepid spirit, forged through countless coding challenges.

For those of you well-versed in the ways of Python, consider yourselves equipped with a mighty, fully-built rig, ready to conquer any trail. However, if you're new to programming, think of yourself as cruising along in a trusty Crosstrek. While you may lack the clearance for this particular trail, fear not! You still possess the power of a dependable 4WD. This tutorial aims to guide you through the fascinating inner workings of computer vision models and how these marvelous creations learn to "see."


Now, let us discuss the requirements for this grand journey. There are two modes you can choose from. In the first mode, we have the comfortable 2WD option. Join us on this mode to learn about the development of simple computer vision models. Alternatively, for the daring souls seeking an exhilarating challenge, we have the 4X4 Low mode. Brace yourselves as we delve into the nitty-gritty of coding and craft your very own computer vision model.

Requirements for 2WD:

  • Courage: Prepare your heart for the thrilling ride ahead.
  • Curiosity: Let your insatiable curiosity fuel your quest for knowledge.
  • A little bit of patience: Remember, great things come to those who wait (and debug).

Requirements for 4X4 Low:

  • Python knowledge: Familiarize yourself with the Python programming language. It shall be your trusty companion throughout this expedition.
  • Python environment ready to go: Ensure your coding vehicle is well-prepared by setting up a suitable Python environment. Check out these resources to help you get started:

Perfect Your Python Development Setup
Python Environment Setup Video


The project shall be divided into several tasks, but remember, dear adventurers, the most crucial lesson to carry with you is the supreme importance of data. Yes, data shall reign supreme! Without the right quality data, your model shall crumble like a sandcastle in the face of a mighty wave. So, let's outline the overarching steps we shall undertake together:

  1. Acquire the data.
  2. Set up your Python environment.
  3. Prepare your data for the model.
  4. Train your model.
  5. Evaluate your model.

Acquire The Data:

The enigmatic entity known as "data." It seems to be the talk of the town, doesn't it? But fear not, for I shall unravel its mysteries for you. Data, my dear adventurers, is simply a collection of files that shall train our model. The format of these files depends on the type of model we aim to create. For instance, a language model requires a grand corpus of textual data, while a gaming model needs input from the game itself. In our case, we seek to build a computer vision model capable of detecting our magnificent rigs within images. Hence, we shall need images of our beloved machines.

Training a computer vision model from scratch is no small feat. It demands an army of thousands of images and a formidable amount of computing power. Unless you have a supercomputer hidden away in your garage (and if you do, kudos to you), this might pose a problem. But fear not, for benevolent researchers have graced us with their computational prowess, sharing pre-trained models capable of detecting general items in images. We shall harness the power of one such model, fine-tuning it to recognize our own rigs. Therefore, my fellow adventurers, you won't need a supercomputer in your garage. Any modern computer shall suffice for our noble cause.

Our Images:

In our case, we shall start by capturing images of your rigs. We need images from various angles and in different environments. The more variation, the better! This diversity shall grant our model the ability to recognize your rigs wherever they roam. For our purposes, we used 30 images for training (with a portion held out as a test split). Additionally, we set aside four images exclusively for validation, to assess the model's performance. Of course, the more images you can gather, the merrier! However, fear not if you can only muster 30. It should suffice to develop a decent computer vision model. As we mentioned before, we are taking advantage of a pre-trained model (this means the model we are using has already been trained to detect certain objects in other images).

Prepping the Images:

When it comes to building a computer vision model, 80% of your time shall be spent on preparing the data. Alas, it's not all about wild keyboard shenanigans. No, my friends, you must dedicate some time to ensure you're feeding your model the right kind of data. Our images require some preparation, and here's where the real magic begins. The first task shall be creating training masks for all your collected images. These masks shall highlight the precise object our model shall "look" for. This can be accomplished with most image editing applications. Load your image, seize the power of the color white, and paint over your rig with a stroke of brilliance. Thus, you shall end up with two images—a vibrant full-color image and a mask, with your rig triumphantly shining in glorious white. You should end up with two sets of the following images:

 Wrangler in the snow
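Before painting dozens of masks, it can help to sanity-check that a finished mask really contains only pure black and pure white pixels, since anti-aliased brushes love to sneak grey values in along the edges. Here's a minimal sketch using NumPy; the `is_valid_mask` helper is our own invention for this check, not part of any library:

```python
import numpy as np

def is_valid_mask(mask):
    """Return True if every pixel is pure black (0) or pure white (255)."""
    return bool(np.isin(np.unique(mask), [0, 255]).all())

# A clean mask: nothing but 0s and 255s.
clean = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# A sloppy mask: a grey anti-aliased pixel snuck in along the edge.
sloppy = np.array([[0, 128], [255, 0]], dtype=np.uint8)

print(is_valid_mask(clean))   # True
print(is_valid_mask(sloppy))  # False
```

In practice you would load each painted mask with cv2.imread in grayscale mode and run it through this check before training.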

Alright, brave adventurer! Prepare yourself for the next step in your heroic journey: setting up your Python environment. It's time to unleash the power of code and conquer the realm of image processing! But fear not, for I shall guide you with wit and a boatload of caffeine along the way.

First things first, we need to gather all your images and training masks. But wait! We can't just scatter them randomly on your computer like a chaotic treasure hunt. No, no! We shall create a sacred directory structure, a labyrinth of folders within folders, to keep everything organized and under control. It's like building our own secret fortress, but for data!

If you're using PyCharm as your trusty IDE companion, create a new project with a bold name like "vehicle_detection" (I suggest avoiding spaces in folder names, as computers can be grumpy about spaces in paths). Imagine your folders standing tall, like a majestic castle, ready to hold your precious data. Oh, the sight!

Python directory

Create a Data folder with the following folders inside: images_png, masks, and validation. Then create another two folders, images and masks, within validation.
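If clicking through your file manager feels tedious, the same fortress of folders can be raised with a few lines of Python. This is just a convenience sketch, assuming you run it from the project root:

```python
import os

# The folder layout described above, relative to the project root.
folders = [
    "Data/images_png",
    "Data/masks",
    "Data/validation/images",
    "Data/validation/masks",
]

for folder in folders:
    # exist_ok=True makes the script safe to re-run.
    os.makedirs(folder, exist_ok=True)
```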

Now, let's bring order to this digital kingdom. Move your training images and validation images into their rightful folders, with the precision of a skilled archer hitting the bullseye. Feel the satisfaction as each file finds its place, like a puzzle piece completing the grand picture.

But hold your horses, intrepid coder! Before we proceed, we need to gather our arsenal of libraries, the mystical artifacts that grant us the power to weave magic with code. Prepare yourself for the installation quest, where we shall conquer these essential libraries one by one!

First on our list is TensorFlow, a powerful ally in the land of machine learning. Installing it can be a bit of a challenge, but fear not! Arm yourself with determination and head to the official tutorial here: [link to tensorflow installation]. Follow the steps with unwavering focus, and soon TensorFlow shall be tamed!

Next, we seek the aid of Keras, a loyal companion in our quest for neural network greatness. Open a terminal, dear adventurer, and type the sacred incantation:

python terminal
pip install keras

Watch as the command executes, summoning Keras to join our forces. Ah, the wonders of modern sorcery!

But our journey is not yet complete. We require the wisdom of NumPy, a versatile tool for array manipulation. In the same terminal, after the successful installation of Keras, chant the following words:

python terminal 
pip install numpy

Feel the power of arrays flowing through your fingertips, as NumPy becomes a loyal servant in your Python kingdom.

Lastly, we call upon OpenCV, the mystical library of computer vision. With its aid, we shall unravel the secrets hidden within images. Speak the final incantation:

python terminal
pip install opencv-python

Behold as OpenCV answers your call, ready to assist you in your vision-related endeavors. The stage is set, and our Python environment stands fortified with the mightiest of libraries!
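Before marching on, a tiny scroll of Python can confirm each library answered the summons. This is just a convenience check; the `required` list simply names the modules we installed above:

```python
import importlib

# Modules we expect to be importable after the installs above.
required = ["tensorflow", "keras", "numpy", "cv2"]

for name in required:
    try:
        module = importlib.import_module(name)
        version = getattr(module, "__version__", "unknown")
        print(f"{name}: ready (version {version})")
    except ImportError:
        print(f"{name}: missing - revisit its installation step")
```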

Brave coder, take a moment to appreciate your progress. You have prepared the battlefield, armed yourself with essential libraries, and created order amidst the chaos of data. Now, onward to the next chapter of our adventure, where we shall tame the wild beasts of image processing!


On to the image pre-processing step. Right-click on your main project folder. In our case, it's "vehicle_detection". Select New, then create a Python file. We will go through each line of code shortly; in the meantime, our first Python script will look like this:

Python script 

Lovely to see, right? Let's dig into it.

# imports
import cv2
import numpy as np
import glob as gb
import os

We begin by importing the necessary libraries and modules for our adventure. cv2 (OpenCV) provides computer vision functionalities, numpy offers powerful array operations, glob helps us search for file paths, and os aids in handling file paths and directories.

images = (os.path.dirname(__file__) + "/Data/images_png/*.png")
masks = (os.path.dirname(__file__) + "/Data/masks/*.png")

We create two variables, images and masks, which hold the file paths for our image and mask files. These paths are obtained by joining the directory of the current script (__file__) with the specific file patterns we're interested in. The "*" tells our trusty computer to grab all the files in the directory, and the ".png" ending further tells it to grab all the files that end in ".png".
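Gluing paths together with `+` works, but `os.path.join` is the sturdier steed, since it handles path separators for you on every operating system. A small sketch with a hypothetical script location:

```python
import os

# Hypothetical project location; in the real script you would use
# os.path.dirname(__file__) instead of a hard-coded path.
script_dir = os.path.join("home", "user", "vehicle_detection")

images_pattern = os.path.join(script_dir, "Data", "images_png", "*.png")
masks_pattern = os.path.join(script_dir, "Data", "masks", "*.png")

print(images_pattern)
```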

image_names = sorted(gb.glob(images))
image_dataset = []
print(f"Loading images: {image_names}")

Here, we use glob to search for all file paths that match the images pattern. We sort the obtained file paths for consistency. Then, we create an empty list, image_dataset, to store our loaded images. We print the image names to the console.

for idx, img in enumerate(image_names):
    img = cv2.imread(img, 1)
    img = cv2.resize(img, (320, 480))
    image_dataset.append(img)
train_images = np.array(image_dataset)

In this loop, we iterate over the image_names list using the enumerate function to get both the index and the image path. We read each image using cv2.imread with a flag of 1 to load it in color. Then, we resize each image to 320x480 using cv2.resize and append it to the image_dataset list. Finally, we convert image_dataset to a NumPy array (train_images). Because the images are loaded in color, the array already carries a channel axis, giving it a shape of (number of images, 480, 320, 3).

mask_names = sorted(gb.glob(masks))
print(f"Loading masks: {mask_names}")
mask_dataset = []
for idx, mask in enumerate(mask_names):
    mask = cv2.imread(mask, 0)
    mask = cv2.resize(mask, (320, 480))
    for row in range(0, 480, 1):
        for col in range(0, 320, 1):
            if mask[row, col] < 255:
                mask[row, col] = 0
            else:
                mask[row, col] = 255
    cv2.imwrite(mask_names[idx], mask)
    mask_dataset.append(mask)
train_masks = np.array(mask_dataset)

Similar to the image loading process, we use glob to find file paths matching the masks pattern. The obtained paths are sorted, and the names are printed to the console. We create an empty list, mask_dataset, to store the processed masks.

In a loop, we iterate over each mask file path and load it using cv2.imread with a flag of 0 to load it in grayscale format. Then, we resize the mask to dimensions of 320x480.

Following that, we loop over each pixel of the mask image using nested for loops. If a pixel's value is less than 255 (any shade other than pure white), we set it to 0, making it black; otherwise we leave it at 255, preserving its whiteness. This will turn all of our masks into the following:

Jeep training mask
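A quick aside: looping over 480 x 320 pixels in pure Python is slow. NumPy can do the same black-or-white thresholding in a single vectorized call, shown here on a tiny hypothetical patch:

```python
import numpy as np

# A hypothetical 2x3 grayscale patch: greys, black, and white mixed together.
mask = np.array([[10, 255, 200],
                 [0, 128, 255]], dtype=np.uint8)

# Everything below 255 becomes 0; pure white stays 255.
# One vectorized call replaces the nested row/col loops.
binary = np.where(mask < 255, 0, 255).astype(np.uint8)

print(binary)
```

Either approach produces the same masks; the vectorized version simply spares your CPU the pixel-by-pixel trudge.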

After processing the mask, we save it back to disk using cv2.imwrite with the original mask file path, and we append it to the mask_dataset list. Finally, we convert mask_dataset to a NumPy array (train_masks); we will normalize the pixel values shortly.

print(f"Image data shape is: {train_images.shape}") print(f"Mask data shape is: {train_masks.shape}") print(f"Max pixel value: {train_images.max()}") print(f"Labels in the mask are: {np.unique(train_masks)}")

We reach a point of enlightenment! Here, we print some insightful information about our data. We display the shape of train_images and train_masks, which reveals the dimensions and sizes of our image and mask datasets. We also showcase the maximum pixel value found in the images using train_images.max(). Additionally, we enlighten ourselves with the unique labels present in the mask dataset using np.unique(train_masks).

image_dataset = train_images / 255
mask_images = train_masks / 255

Our journey concludes with normalization! We divide our image dataset (train_images) and mask dataset (train_masks) by 255, ensuring that all pixel values fall within the range of 0 to 1. This normalization step helps in achieving consistency and compatibility across our data.
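One pitfall worth guarding against: dividing by 255 twice. Do it once and your pixels land neatly in [0, 1]; do it twice and everything shrivels toward zero, leaving the model squinting at near-black images. A quick sketch on some hypothetical pixel values:

```python
import numpy as np

# Hypothetical 8-bit pixel values.
pixels = np.array([0, 64, 128, 255], dtype=np.uint8)

once = pixels / 255    # values in [0, 1] - what we want
twice = once / 255     # values in [0, ~0.0039] - a silent bug

print(once.max())      # 1.0
print(twice.max())
```

So make sure each dataset takes exactly one trip through the division.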

And just like any great adventure, we must set camp and rest. This marks the end of part 1 of our project. In part 2, we will finish training our model now that our data is ready to go. Then we will deal the killer blow: evaluating our model. Stay tuned, and until the next adventure!

If you need any help or have any questions, please feel free to reach out through the contact page. We would be more than happy to help.


-Kora Nexus
