Recognition of green apples using HSV color model and region-based image processing

October 26, 2025  opencv  computervision 

This blog post is an adapted version of a Jupyter notebook I used for teaching at NTNU some years ago. It demonstrates using the HSV color model for recognition of objects in an image (in this case, green apples on a table). Additionally, it shows an application of some basic region-based image processing techniques (erosion, dilation, and detection of connected components) to facilitate a better segmentation.

We start by importing some modules for data representation, processing and visualization. Then we define some helper functions:

import cv2
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from matplotlib.colors import LinearSegmentedColormap


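def open_image(fname, read_flag=cv2.IMREAD_ANYCOLOR, color_transform=None):
    # cv2.imread returns None instead of raising when the file cannot be read
    im = cv2.imread(fname, read_flag)
    if im is None:
        raise FileNotFoundError(f'Could not read image: {fname}')
    if color_transform is None:
        return im

    return cv2.cvtColor(im, color_transform)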


def show_single_image(im):
    _, ax = plt.subplots()
    ax.imshow(im)
    ax.axis('off')
    plt.show()


def show_channels(im, channel_names, cmaps):
    # One subplot per channel, each drawn with its own colormap
    _, axes = plt.subplots(1, im.shape[2])

    for ch, (ax, cmap, channel_name) in enumerate(zip(axes, cmaps, channel_names)):
        ax.imshow(im[:, :, ch], cmap=cmap, vmin=0, vmax=255)
        ax.axis('off')
        ax.set_title(channel_name)

    plt.show()

We open the image with five green apples on a wooden table:

im_apples = open_image('apples_top.jpg', cv2.IMREAD_COLOR, cv2.COLOR_BGR2RGB)
show_single_image(im_apples)
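The cv2.COLOR_BGR2RGB conversion is needed because OpenCV loads color images in BGR channel order, while matplotlib expects RGB. As a minimal sketch (the toy array below is made up for illustration and is not part of the original notebook), the conversion amounts to reversing the last axis:

```python
import numpy as np

# Two pixels in BGR order: a pure blue one and a pure green one
toy_bgr = np.array([[[255, 0, 0], [0, 255, 0]]], dtype=np.uint8)

# Reversing the channel axis turns BGR into RGB
toy_rgb = toy_bgr[:, :, ::-1]

print(toy_rgb[0, 0])  # the blue pixel now has its 255 in the last (B) position
print(toy_rgb[0, 1])  # the green pixel is unchanged
```

Without this step, the apples would still be displayed as green, but the wooden table would look bluish.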

The most basic way to look at the data of this color image is by separating the three color channels, namely Red, Green, and Blue:

black_red = LinearSegmentedColormap.from_list('black_red', ['black', 'red'])
black_green = LinearSegmentedColormap.from_list('black_green', ['black', 'green'])
black_blue = LinearSegmentedColormap.from_list('black_blue', ['black', 'blue'])

show_channels(im_apples, channel_names='RGB', cmaps=(black_red, black_green, black_blue))

From the visualization above, one can notice that the green channel is potentially interesting for automatically detecting the apples, as they appear lighter (with higher value) in it.

Let’s see if a simple thresholding of the green channel will work:

def threshold_binary(im, t):
    _, im_t = cv2.threshold(im, t, 255, cv2.THRESH_BINARY)
    return im_t


im_apples_green = im_apples[:, :, 1]
green_mask_from_rgb = threshold_binary(im_apples_green, 200)

show_single_image(green_mask_from_rgb)

As you can see, the more illuminated parts of the apples are indeed segmentable using this simple method. However, the masked regions don’t cover the apples entirely, and a lot of noise is picked up on the table surface.
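For reference, here is a rough NumPy equivalent of what binary thresholding does to a single channel. The helper name threshold_binary_np and the toy values are my own and only serve as illustration:

```python
import numpy as np


def threshold_binary_np(channel, t, max_value=255):
    """Pixels strictly greater than t become max_value, all others become 0."""
    return np.where(channel > t, max_value, 0).astype(np.uint8)


toy_green = np.array([[10, 200, 201],
                      [255, 199, 230]], dtype=np.uint8)
toy_mask = threshold_binary_np(toy_green, 200)
print(toy_mask)
```

Note that a pixel with a value exactly equal to the threshold (200 here) is not included in the mask.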

Let’s do better by transforming the image from RGB to HSV, with the channels of Hue, Saturation, and Value. The neat property of HSV is that the hue channel represents the color irrespective of the illumination, so e.g. lighter greens and darker greens will result in a similar hue. This can clearly be seen with our image:

im_hsv = cv2.cvtColor(im_apples, cv2.COLOR_RGB2HSV)

show_channels(im_hsv, channel_names='HSV', cmaps=('hsv', 'viridis', 'viridis'))
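To see the illumination invariance in isolation, here is a small sketch using Python's standard colorsys module rather than OpenCV. The helper opencv_hue and the three sample colors are assumptions made for this illustration. The helper rescales the hue to the 0-179 range that OpenCV uses for 8-bit images (degrees divided by two):

```python
import colorsys


def opencv_hue(r, g, b):
    """Hue of an 8-bit RGB color on OpenCV's 0-179 scale (degrees / 2)."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return round(h * 180) % 180


light_green = opencv_hue(144, 238, 144)
dark_green = opencv_hue(0, 100, 0)
wood_brown = opencv_hue(139, 90, 43)

print(light_green, dark_green, wood_brown)  # 60 60 15
```

A light and a dark green end up with the same hue, while a brownish color similar to the table lands far away from it.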

Plotting the histogram of the hue values in the image suggests that the range between 35 and 75 is what we are looking for (OpenCV stores the hue of 8-bit images as degrees divided by two, so the scale runs from 0 to 179):

green_from = 35
green_to = 75

_, ax = plt.subplots()
ax.hist(im_hsv[:, :, 0].ravel(), bins=100)
ax.axvline(green_from, color='tab:green')
ax.axvline(green_to, color='tab:green')
plt.show()
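Reading a range off a plot can be complemented with a quick numerical check of how many pixels fall into it. The sketch below uses a tiny synthetic hue array (not the apple image) and the same half-open range of 35 to 75:

```python
import numpy as np

# Synthetic hue values on OpenCV's 0-179 scale, for illustration only
toy_hue = np.array([15, 15, 16, 60, 60, 61, 62, 15, 14, 58], dtype=np.uint8)

# One bin per possible hue value
counts = np.bincount(toy_hue, minlength=180)

# Fraction of pixels with 35 <= hue < 75
green_share = counts[35:75].sum() / toy_hue.size
print(green_share)  # 0.5
```

On a real image, the same computation could be run on im_hsv[:, :, 0].ravel() to compare candidate ranges.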

Let’s now threshold the hue channel to include pixels within the region defined above:

def mask_threshold_range(im, thresh_min, thresh_max):
    binary_output = (im >= thresh_min) & (im < thresh_max)
    return np.uint8(binary_output)

green_mask_from_hsv = mask_threshold_range(im_hsv[:, :, 0], green_from, green_to)

show_single_image(green_mask_from_hsv)

Now the segmentation is much more complete, and with less noise.
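One possible refinement, which is not part of the original notebook, is to also require a minimum saturation, since the hue of nearly gray pixels is unstable. The function mask_hue_range and its min_saturation parameter are hypothetical names used only in this sketch:

```python
import numpy as np


def mask_hue_range(hsv, hue_min, hue_max, min_saturation=0):
    """1 where hue_min <= H < hue_max and S >= min_saturation, else 0."""
    hue = hsv[:, :, 0]
    sat = hsv[:, :, 1]
    in_range = (hue >= hue_min) & (hue < hue_max)
    return np.uint8(in_range & (sat >= min_saturation))


# Toy 2x2 HSV image: saturated green, washed-out green, brown, hue at the upper bound
toy_hsv = np.array([[[60, 200, 180], [60, 10, 180]],
                    [[20, 200, 180], [75, 200, 180]]], dtype=np.uint8)

hue_only = mask_hue_range(toy_hsv, 35, 75)
hue_and_sat = mask_hue_range(toy_hsv, 35, 75, min_saturation=50)
print(hue_only)
print(hue_and_sat)
```

The range is half-open, as in the function above, so a hue of exactly 75 is excluded. Whether a saturation floor helps depends on the image; for the apples on the table the hue alone already works well.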

To make the result even better, we can apply the following sequence of image processing operations:

  1. Erosion: removing boundary pixels of the objects (in our case, removing a lot of the remaining noise)
  2. Dilation: enlarging the existing objects (leading to the apples’ regions getting closer to their original size)

def erode(im, kernel_size, n_iter=1):
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    return cv2.erode(im, kernel, iterations=n_iter)


def dilate(im, kernel_size, n_iter=1):
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    return cv2.dilate(im, kernel, iterations=n_iter)


im_eroded = erode(green_mask_from_hsv, kernel_size=7)
im_dilated = dilate(im_eroded, kernel_size=11, n_iter=2)

Here is the visualization of the two steps above:

_, (ax_eroded, ax_dilated) = plt.subplots(1, 2)
ax_eroded.imshow(im_eroded)
ax_eroded.axis('off')
ax_dilated.imshow(im_dilated)
ax_dilated.axis('off')
plt.show()
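To get a feeling for what the two operations do pixel by pixel, here is a simplified NumPy sketch with a square kernel on a toy mask. The helper names are my own, and the border handling (treating pixels outside the image as set for erosion and as unset for dilation) is an assumption chosen so that the image border does not influence the result:

```python
import numpy as np


def _shifted_stack(mask, k, pad_value):
    """Stack the k*k shifted copies of mask that a k x k square kernel sees."""
    pad = k // 2
    padded = np.pad(mask, pad, constant_values=pad_value)
    h, w = mask.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(k) for dx in range(k)])


def erode_np(mask, k):
    # A pixel survives only if every pixel under the kernel is set
    return _shifted_stack(mask, k, pad_value=1).min(axis=0)


def dilate_np(mask, k):
    # A pixel is set if any pixel under the kernel is set
    return _shifted_stack(mask, k, pad_value=0).max(axis=0)


toy_mask = np.zeros((7, 7), dtype=np.uint8)
toy_mask[2:5, 2:5] = 1   # a 3x3 "apple"
toy_mask[0, 6] = 1       # a single noisy pixel

toy_eroded = erode_np(toy_mask, 3)
toy_restored = dilate_np(toy_eroded, 3)
print(toy_eroded.sum(), toy_restored.sum())  # 1 9
```

Erosion removes the isolated pixel and shrinks the blob to its center, and dilation with the same kernel grows it back to 3x3. In the post, the dilation uses a larger kernel and two iterations, so the apple regions end up somewhat larger than after a plain reversal of the erosion.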

The final binary image looks pretty good. Let’s apply it as a mask to the original image:

def apply_mask(im, mask):
    return cv2.bitwise_and(im, im, mask=mask)

im_masked = apply_mask(im_apples, im_dilated)

show_single_image(im_masked)
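In effect, the masking keeps the pixels where the mask is non-zero and sets all others to black. A rough NumPy equivalent on a toy image (the function name apply_mask_np is hypothetical) could look like this:

```python
import numpy as np


def apply_mask_np(im, mask):
    """Keep pixels where mask is non-zero, zero out the rest."""
    # mask[:, :, None] broadcasts the 2D mask over the color channels
    return np.where(mask[:, :, None] != 0, im, 0).astype(im.dtype)


toy_rgb = np.arange(12, dtype=np.uint8).reshape(2, 2, 3) + 1
toy_mask = np.array([[1, 0],
                     [0, 1]], dtype=np.uint8)

toy_masked = apply_mask_np(toy_rgb, toy_mask)
print(toy_masked)
```

This also shows why a mask with values 0 and 1 is sufficient here: only the distinction between zero and non-zero matters.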

Finally, let’s use the same binary image to detect the connected components in the image:

def find_ccomp(im, *args, **kwargs):
    num, labels, stats, centroids = cv2.connectedComponentsWithStats(im, *args, **kwargs)
    
    stats_df = pd.DataFrame(stats, columns=['LeftX', 'TopY', 'Width', 'Height', 'Area'])
    stats_df['CenterX'] = centroids[:, 0]
    stats_df['CenterY'] = centroids[:, 1]
    
    return labels, stats_df


ccomp_labels, ccomp_stats = find_ccomp(im_dilated)

show_single_image(ccomp_labels)
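For intuition, connected component labeling can be sketched in plain Python as a breadth-first flood fill. This is a simplified illustration and not how OpenCV implements it. It uses 8-connectivity (which, as far as I know, is also the default in OpenCV), and the function name and toy mask are my own:

```python
from collections import deque


def label_components(mask):
    """Label 8-connected regions of non-zero cells; label 0 is the background."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    stats = {}
    next_label = 1
    for y0 in range(h):
        for x0 in range(w):
            if not mask[y0][x0] or labels[y0][x0]:
                continue
            # Start a new component and flood-fill it
            labels[y0][x0] = next_label
            queue = deque([(y0, x0)])
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for ny in range(max(y - 1, 0), min(y + 2, h)):
                    for nx in range(max(x - 1, 0), min(x + 2, w)):
                        if mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            area = len(cells)
            stats[next_label] = {
                'Area': area,
                'CenterX': sum(x for _, x in cells) / area,
                'CenterY': sum(y for y, _ in cells) / area,
            }
            next_label += 1
    return labels, stats


toy = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
toy_labels, toy_stats = label_components(toy)
print(toy_stats)
```

The toy mask contains two components, a 2x2 square and an L-shaped group of three pixels, each with its own area and centroid.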

The first connected component (label 0) represents the background, while the rest, in our case, correspond to the individual apples:

pd.options.display.float_format = '{:.3f}'.format
print(ccomp_stats)

   LeftX  TopY  Width  Height     Area   CenterX  CenterY
0      0     0   1280     945  1010446   646.503  466.151
1    591   184    218     206    36198   698.483  287.159
2    159   209    230     223    41703   272.268  318.972
3    933   484    229     212    39383  1047.202  589.960
4    218   506    237     225    42742   333.928  617.393
5    610   569    221     219    39128   718.932  679.581
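Counting the apples then comes down to dropping the background row. As a small sketch in plain Python, using the labels and areas from the table above, one could additionally discard very small components in case some noise survives the erosion. The MIN_AREA cutoff is a hypothetical value that would need tuning per image:

```python
# (label, area) pairs copied from the table above
component_areas = [(0, 1010446), (1, 36198), (2, 41703),
                   (3, 39383), (4, 42742), (5, 39128)]

MIN_AREA = 1000  # hypothetical cutoff for leftover noise blobs

# Skip label 0 (the background) and anything smaller than the cutoff
apples = [(label, area) for label, area in component_areas
          if label != 0 and area >= MIN_AREA]
print(len(apples))  # 5
```

With the DataFrame from above, the same filter could be expressed on ccomp_stats directly, for example by skipping its first row and comparing the Area column against the cutoff.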

Some related resources:

OpenCV: Eroding and Dilating

OpenCV Connected Component Labeling and Analysis