Day 1: Linear Algebra & OpenCV Fundamentals

A Complete Guide to Vectors, Matrices, and Image Processing


What is Linear Algebra and Why Does it Matter for Computer Vision?

Linear algebra is the branch of mathematics dealing with vectors, matrices, and linear transformations. It is the mathematical foundation of computer vision and image processing: images are stored as arrays of numbers, and most operations on them reduce to vector and matrix arithmetic.

In image processing, linear algebra helps us:

→ Represent images as mathematical objects (matrices)
→ Transform images (rotation, scaling, translation)
→ Extract features and patterns
→ Apply filters and effects
→ Compress and analyze visual data


What are Vectors and How Do They Work in Image Processing?

A vector is a mathematical object that has both magnitude (length) and direction. In image processing, vectors represent:

→ Pixel coordinates (x, y) - specify the location of a pixel in the image grid, essential for identifying and manipulating image regions.
→ Color values (R, G, B) - represent the intensity of Red, Green, and Blue channels for each pixel, forming the color information in images.
→ Feature descriptors - numerical representations that capture important patterns or characteristics (like edges, corners, or textures) for tasks such as object detection and recognition.
→ Motion vectors - indicate the direction and magnitude of movement of pixels or objects between consecutive frames, used in video analysis and tracking.

What are Scalar and Dot Products?

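Scalar multiplication means multiplying a vector by a single number (a scalar). A positive scalar changes the vector's length but not its direction; a negative scalar also reverses the direction, and zero collapses it to the zero vector. This is useful for scaling pixel values, for example to adjust brightness or contrast.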

Dot product is an operation between two vectors that results in a single number. It measures how much two vectors point in the same direction. In image processing, the dot product is fundamental for comparing patterns, applying filters, and measuring similarity.
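
To make the "same direction" idea concrete, the dot product can be divided by the two vector lengths, which gives the cosine of the angle between them. The sketch below assumes a helper named cosine_similarity (a name chosen here for illustration, not part of NumPy):

```python
import numpy as np

def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = np.array([3, 2])

same_direction = cosine_similarity(v1, 2 * v1)      # parallel vectors
perpendicular = cosine_similarity([1, 0], [0, 1])   # orthogonal vectors
opposite = cosine_similarity(v1, -v1)               # reversed direction

print(same_direction, perpendicular, opposite)
```

Values close to 1 mean the vectors point the same way, 0 means they are perpendicular, and values close to -1 mean they point in opposite directions. The same score can be used to compare two flattened image patches.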

What are i, j, and k in Vectors?

In mathematics, i, j, and k are unit vectors along the x, y, and z axes, respectively. For 2D image processing, i and j represent directions along the horizontal and vertical axes. In 3D (such as color or spatial data), k is used for the third dimension.
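
As a small illustration, a 2D vector can be built from the unit vectors and its components recovered with dot products. The names i_hat and j_hat are chosen here for the example:

```python
import numpy as np

# Unit vectors along the horizontal (x) and vertical (y) axes
i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# The vector [3, 2] written as 3i + 2j
v = 3 * i_hat + 2 * j_hat
print(v)  # [3 2]

# Dotting with a unit vector recovers the component along that axis
x_component = np.dot(v, i_hat)
y_component = np.dot(v, j_hat)
print(x_component, y_component)  # 3 2
```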

Vector Operations Example

Vector Addition:

v_1 = [3, 2], \quad v_2 = [1, 4]

v_1 + v_2 = [3+1, 2+4] = [4, 6]

Scalar Multiplication:

v = [3, 2], \quad 2 \times v = [6, 4]

Dot Product:

v_1 \cdot v_2 = (3 \times 1) + (2 \times 4) = 3 + 8 = 11

import numpy as np

# Create vectors
v1 = np.array([3, 2])
v2 = np.array([1, 4])

# Vector operations
addition = v1 + v2
scalar_mult = 2 * v1
dot_product = np.dot(v1, v2)

print(f"Addition: {addition}")
print(f"Scalar multiplication: {scalar_mult}")
print(f"Dot product: {dot_product}")

Interactive Vector Visualization

Visualizing Vector Operations: Here's an interactive plot showing the vectors and their operations:


This interactive plot shows vector v₁ (red), v₂ (blue), their sum v₁+v₂ (green), and scalar multiplication 2v₁ (purple). You can zoom, pan, and hover over the vectors for more details.

What are Convolution Operations, Correlation Matching, and Feature Detection?

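Convolution operations apply a small matrix (a kernel) to an image to produce effects such as edge highlighting, blurring, or sharpening. The kernel slides over the image, and at each position the output pixel is the dot product of the kernel with the image patch beneath it. Strictly speaking, convolution flips the kernel before sliding it; for symmetric kernels the result is the same either way.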
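
The sliding dot product can be written out by hand. This is a minimal sketch, assuming no padding and a helper named slide_kernel (an illustrative name); it does not flip the kernel, which makes no difference for the symmetric edge kernel used here:

```python
import numpy as np

def slide_kernel(image, kernel):
    """Slide the kernel over the image and take a dot product at each position.

    No padding is used, so the output is smaller than the input.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w), dtype=float)
    for r in range(out_h):
        for c in range(out_w):
            window = image[r:r + kh, c:c + kw]
            # Element-wise multiply and sum = dot product of the flattened arrays
            output[r, c] = np.sum(window * kernel)
    return output

# Tiny test image: dark left half, bright right half (a vertical edge)
image = np.zeros((5, 6), dtype=float)
image[:, 3:] = 10.0

edge_kernel = np.array([[-1, -1, -1],
                        [-1,  8, -1],
                        [-1, -1, -1]], dtype=float)

response = slide_kernel(image, edge_kernel)
print(response)
```

The response is zero over the flat regions and non-zero only in the two columns next to the edge, which is why this kernel is used for edge detection. For a true convolution with a non-symmetric kernel, np.flip(kernel) can be passed instead.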

Correlation matching compares patterns in images by measuring similarity between regions, often using dot products or other metrics. It's used for template matching and finding objects.
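
A rough sketch of template matching with plain NumPy follows. It scores every window by the normalized dot product with the template; the function name match_template is chosen for this example, and OpenCV's own cv2.matchTemplate offers several scoring methods that are much faster in practice:

```python
import numpy as np

def match_template(image, template):
    """Return ((row, col), score) of the window most similar to the template.

    Similarity is the normalized dot product of the flattened patches.
    """
    th, tw = template.shape
    t = template.astype(float).ravel()
    t_norm = np.linalg.norm(t)
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw].astype(float).ravel()
            denom = np.linalg.norm(w) * t_norm
            if denom == 0:
                continue  # skip all-black windows (or an all-black template)
            score = np.dot(w, t) / denom
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# A plus sign hidden in an otherwise black 8x8 image
template = np.array([[0, 255, 0],
                     [255, 255, 255],
                     [0, 255, 0]], dtype=np.uint8)
image = np.zeros((8, 8), dtype=np.uint8)
image[2:5, 3:6] = template

position, score = match_template(image, template)
print(position, round(score, 3))
```

The best match is the window whose top-left corner is at row 2, column 3, where the patch and the template are identical.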

Feature detection identifies important points or regions in an image, such as corners, edges, or textures. These features are crucial for tasks like object recognition, tracking, and image alignment.
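
As a deliberately crude illustration of the idea, the sketch below marks a pixel as a corner candidate when the intensity changes in both the horizontal and the vertical direction. This is an assumption-laden toy, not how production detectors work; OpenCV functions such as cv2.cornerHarris or cv2.goodFeaturesToTrack are far more robust:

```python
import numpy as np

# Synthetic image: a bright block whose top-left corner sits at row 2, col 2
image = np.zeros((6, 6), dtype=float)
image[2:, 2:] = 1.0

# Intensity gradients along rows (gy) and along columns (gx)
gy, gx = np.gradient(image)

# Edge strength: large wherever intensity changes in any direction
magnitude = np.hypot(gx, gy)
edge_pixels = np.argwhere(magnitude > 0)

# Toy corner test: intensity changes in both directions at once
corners = np.argwhere((gx != 0) & (gy != 0))
print(corners)
```

Only the block's corner pixel passes the test, while the pixels along its straight edges change in one direction only.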

What are Matrices and How Do They Represent Images?

A matrix is a rectangular array of numbers arranged in rows and columns. In image processing:

→ Grayscale images are matrices where each element represents a pixel's intensity
→ Color images are 3D arrays (height × width × channels), one matrix per color channel
→ Transformations such as rotation and scaling are themselves expressed as matrices

Matrix Structure:

A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}

Matrix Addition:

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}

A + B = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix}

How Does Matrix Multiplication Work and Where is it Used?

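Matrix multiplication is crucial in image processing. For matrices A (m×n) and B (n×p), the product C = A × B is an m×p matrix in which each entry C_ij is the dot product of row i of A with column j of B. The number of columns of A must equal the number of rows of B. For example: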

A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix}

A \times B = \begin{bmatrix} 19 & 22 \\ 43 & 50 \end{bmatrix}

Step-by-step calculation:

C_{11} = (1 \times 5) + (2 \times 7) = 5 + 14 = 19

C_{12} = (1 \times 6) + (2 \times 8) = 6 + 16 = 22

C_{21} = (3 \times 5) + (4 \times 7) = 15 + 28 = 43

C_{22} = (3 \times 6) + (4 \times 8) = 18 + 32 = 50
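
The step-by-step calculation can be checked in code. The sketch below uses an explicit triple loop (matmul_loops is a name chosen for this example) and compares it with NumPy's @ operator; it also shows that * is element-wise and that the order of the factors matters:

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A (m x n) by B (n x p) with explicit loops."""
    m, n = A.shape
    n2, p = B.shape
    if n != n2:
        raise ValueError("columns of A must equal rows of B")
    C = np.zeros((m, p), dtype=np.result_type(A, B))
    for i in range(m):
        for j in range(p):
            for k in range(n):
                C[i, j] += A[i, k] * B[k, j]  # row i of A dotted with column j of B
    return C

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

C = matmul_loops(A, B)
print(C)        # same values as the worked example: 19, 22, 43, 50
print(A @ B)    # NumPy's matrix product gives the same result
print(A * B)    # element-wise product, NOT matrix multiplication
print(B @ A)    # different from A @ B: matrix multiplication is not commutative
```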

Matrix multiplication and the closely related multiply-and-sum pattern are used in:

1. Image Transformations

# Rotation matrix (45 degrees)
angle = np.pi/4
rotation_matrix = np.array([
    [np.cos(angle), -np.sin(angle)],
    [np.sin(angle),  np.cos(angle)]
])

# Apply to point [x, y]
point = np.array([100, 50])
rotated_point = rotation_matrix @ point

2. Convolution Operations

# Edge detection kernel
kernel = np.array([[-1, -1, -1],
                   [-1,  8, -1],
                   [-1, -1, -1]])

3. Color Space Transformations

# RGB to YUV conversion using matrix multiplication
rgb_to_yuv = np.array([
    [0.299,  0.587,  0.114],
    [-0.147, -0.289,  0.436],
    [0.615,  -0.515, -0.100]
])
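
The matrix above is only defined, so here is one possible way to apply it. This sketch assumes RGB values scaled to the range 0 to 1 and uses a small random array in place of a real image:

```python
import numpy as np

rgb_to_yuv = np.array([
    [0.299,  0.587,  0.114],
    [-0.147, -0.289,  0.436],
    [0.615,  -0.515, -0.100]
])

# Single pixel: matrix times column vector
white = np.array([1.0, 1.0, 1.0])
yuv_white = rgb_to_yuv @ white
print(yuv_white)  # full luminance (Y close to 1), no color offset (U and V close to 0)

# Whole image at once: (H, W, 3) @ (3, 3) applies the matrix to every pixel.
# The transpose is needed because each pixel is a row vector here.
rng = np.random.default_rng(0)
image_rgb = rng.random((4, 5, 3))   # stand-in for a real image
image_yuv = image_rgb @ rgb_to_yuv.T
print(image_yuv.shape)
```

Writing the conversion as one matrix product avoids looping over pixels in Python.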

How Do I Set Up OpenCV and Handle Basic Image Operations?

Installation Methods:

Method 1: pip install

pip install opencv-python
pip install numpy
pip install matplotlib

Method 2: conda install

conda install -c conda-forge opencv
conda install numpy matplotlib

Loading and Displaying Images:

import cv2
import numpy as np
import matplotlib.pyplot as plt

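# Load an image (cv2.imread returns None if the file cannot be read)
image = cv2.imread('image.jpg')
if image is None:
    raise FileNotFoundError("Could not read 'image.jpg'")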

# OpenCV loads in BGR format, convert to RGB for matplotlib
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Display image
plt.figure(figsize=(10, 6))
plt.imshow(image_rgb)
plt.axis('off')
plt.title('Original Image')
plt.show()

# Print image properties
print(f"Image shape: {image.shape}")
print(f"Image dtype: {image.dtype}")
print(f"Image size (total number of values): {image.size}")
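
The BGR to RGB conversion is simply a reversal of the channel axis, which can be seen on a tiny synthetic array without loading a file. This is a sketch of the idea; for real images the cv2.cvtColor call used in the loading code is the usual choice:

```python
import numpy as np

# A 2x2 image that is pure red when interpreted in BGR order
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 2] = 255        # in BGR order, index 2 is the red channel

# Reverse the last axis to get RGB order
rgb = bgr[..., ::-1]
print(rgb[0, 0])  # [255   0   0]

# Slicing returns a view; copy it if a contiguous array is needed later
rgb_contiguous = np.ascontiguousarray(rgb)
```

Skipping this step and passing a BGR array straight to matplotlib shows red and blue swapped.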

How Do Images Connect to Mathematical Matrices?

Grayscale Images: 2D matrix where each element represents pixel intensity (0-255)

# Create a gradient image using matrix operations
height, width = 100, 200
gradient = np.zeros((height, width), dtype=np.uint8)

for i in range(height):
    for j in range(width):
        gradient[i, j] = int(255 * j / width)  # Horizontal gradient

plt.imshow(gradient, cmap='gray')
plt.title('Gradient Image Created with Matrix')
plt.show()
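
The nested loops make the pixel-by-pixel idea explicit, but the same matrix can be built without Python loops. A possible vectorized version, compared against the loop result:

```python
import numpy as np

height, width = 100, 200

# One row of the gradient, then repeat it for every image row
row = (255 * np.arange(width) / width).astype(np.uint8)
gradient_fast = np.tile(row, (height, 1))

# Loop version from the example above, for comparison
gradient_loop = np.zeros((height, width), dtype=np.uint8)
for i in range(height):
    for j in range(width):
        gradient_loop[i, j] = int(255 * j / width)

print(np.array_equal(gradient_fast, gradient_loop))  # True
```

For large images the vectorized form is typically much faster, because the work happens inside NumPy rather than in the Python interpreter.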

Color Images: 3D matrix (Height × Width × Channels)

# Create RGB color image
height, width = 100, 100
color_image = np.zeros((height, width, 3), dtype=np.uint8)

# Red channel
color_image[:, :, 0] = np.linspace(0, 255, width).astype(np.uint8)
# Green channel
color_image[:, :, 1] = np.linspace(0, 255, height).reshape(-1, 1).astype(np.uint8)
# Blue channel constant
color_image[:, :, 2] = 128

plt.imshow(color_image)
plt.title('Color Image Created with 3D Matrix')
plt.show()

What are Practical Applications of Matrix Operations in Images?

1. Image Brightening (Scalar Addition):

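# Load image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
if img is None:
    raise FileNotFoundError("Could not read 'image.jpg'")

# Brighten by adding 50 to every pixel; cv2.add saturates at 255 instead of wrapping around
brightened = cv2.add(img, np.full_like(img, 50))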

# Display results
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
axes[0].imshow(img, cmap='gray')
axes[0].set_title('Original')
axes[1].imshow(brightened, cmap='gray')
axes[1].set_title('Brightened (+50)')
plt.show()
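
The reason for using cv2.add rather than a plain + becomes clear on a few sample values. With 8-bit pixels, NumPy addition wraps around past 255, while a saturating addition clips at 255. The sketch below reproduces both behaviours with NumPy only:

```python
import numpy as np

img = np.array([[10, 200, 250]], dtype=np.uint8)

# Plain uint8 arithmetic wraps modulo 256: 250 + 50 becomes 44
wrapped = img + np.uint8(50)

# Saturating addition: compute in a wider type, then clip to the valid range
saturated = np.clip(img.astype(np.int16) + 50, 0, 255).astype(np.uint8)

print(wrapped)    # [[ 60 250  44]]
print(saturated)  # [[ 60 250 255]]
```

Wraparound turns the brightest pixels dark, which shows up as obvious artifacts in a brightened image.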

2. Image Blending (Matrix Addition):

# Create two simple images
img1 = np.zeros((100, 100), dtype=np.uint8)
img1[20:80, 20:80] = 255  # White square

img2 = np.zeros((100, 100), dtype=np.uint8)
cv2.circle(img2, (50, 50), 30, 255, -1)  # White circle

# Blend images
alpha = 0.5
blended = cv2.addWeighted(img1, alpha, img2, 1-alpha, 0)
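
Blending is a weighted sum of two matrices: alpha × img1 + (1 − alpha) × img2. The following NumPy-only sketch shows the arithmetic directly; the circle is drawn with a distance mask instead of cv2.circle, so its outline may differ by a pixel from the OpenCV version:

```python
import numpy as np

# White square
img1 = np.zeros((100, 100), dtype=np.uint8)
img1[20:80, 20:80] = 255

# White filled circle of radius 30 centered at (50, 50), built from a distance mask
yy, xx = np.mgrid[0:100, 0:100]
inside = (xx - 50) ** 2 + (yy - 50) ** 2 <= 30 ** 2
img2 = np.where(inside, 255, 0).astype(np.uint8)

# Weighted sum computed in floating point, then rounded back to 8 bits
alpha = 0.5
blended = np.rint(alpha * img1 + (1 - alpha) * img2).astype(np.uint8)

print(blended[50, 50])  # 255: both shapes overlap here
print(blended[25, 25])  # 128: only the square covers this pixel
print(blended[5, 5])    # 0: background in both images
```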

3. Convolution (Sliding Dot Products with a Kernel):

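# Define different kernels (explicit float32 so every kernel has the same dtype)
kernels = {
    'Identity': np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=np.float32),
    'Blur': np.ones((3, 3), dtype=np.float32) / 9,
    'Sharpen': np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32),
    'Edge': np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=np.float32)
}

# Apply kernels to image (ddepth=-1 below keeps the input depth, so results are clipped to 0-255)
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
if img is None:
    raise FileNotFoundError("Could not read 'image.jpg'")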

for name, kernel in kernels.items():
    filtered = cv2.filter2D(img, -1, kernel)
    plt.imshow(filtered, cmap='gray')
    plt.title(f'{name} Kernel')
    plt.show()

How Can I Create and Visualize Different Types of Matrices?

import numpy as np
import matplotlib.pyplot as plt

# Create different types of matrices
size = 100

# 1. Checkerboard pattern (10x10-pixel squares alternating between 0 and 1)
rows, cols = np.indices((size, size))
checkerboard = ((rows // 10 + cols // 10) % 2).astype(float)

# 2. Radial pattern: each pixel stores its distance from the image center
x, y = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
circle = np.sqrt(x**2 + y**2)

# 3. Linear gradient
gradient = np.linspace(0, 1, size).reshape(1, -1).repeat(size, axis=0)

# 4. Random matrix
random_matrix = np.random.random((size, size))

# Display all matrices
matrices = {
    'Checkerboard': checkerboard,
    'Circle': circle, 
    'Gradient': gradient,
    'Random': random_matrix
}

fig, axes = plt.subplots(2, 2, figsize=(12, 12))
axes = axes.ravel()

for i, (name, matrix) in enumerate(matrices.items()):
    axes[i].imshow(matrix, cmap='gray')
    axes[i].set_title(f'{name} Matrix')
    axes[i].axis('off')

plt.tight_layout()
plt.show()

How Do Matrix Transformations Work in Practice?

# Create original points (a square)
points = np.array([
    [1, 1], [2, 1], [2, 2], [1, 2], [1, 1]
]).T

# Define transformation matrices
transformations = {
    'Original': np.eye(2),
    'Scale 2x': np.array([[2, 0], [0, 2]]),
    'Rotate 45°': np.array([[np.cos(np.pi/4), -np.sin(np.pi/4)],
                           [np.sin(np.pi/4), np.cos(np.pi/4)]]),
    'Shear': np.array([[1, 0.5], [0, 1]])
}

# Apply transformations and plot
fig, axes = plt.subplots(2, 2, figsize=(12, 12))
axes = axes.ravel()

for i, (name, transform) in enumerate(transformations.items()):
    transformed = transform @ points
    
    axes[i].plot(transformed[0], transformed[1], 'b-o', linewidth=2, markersize=8)
    axes[i].grid(True)
    axes[i].set_xlim(-3, 4)
    axes[i].set_ylim(-3, 4)
    axes[i].set_title(f'{name}')
    axes[i].set_aspect('equal')

plt.tight_layout()
plt.show()
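
Several transformations can be combined into a single matrix by multiplying them, and the order of multiplication matters. A short sketch with a 90° rotation and the shear from the example above:

```python
import numpy as np

theta = np.pi / 2  # 90 degrees
rotate = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
shear = np.array([[1, 0.5], [0, 1]])

point = np.array([1.0, 0.0])

# The rightmost matrix is applied first
rotate_then_shear = shear @ rotate
shear_then_rotate = rotate @ shear

a = rotate_then_shear @ point   # rotate to (0, 1), then shear to (0.5, 1)
b = shear_then_rotate @ point   # shear leaves (1, 0) unchanged, then rotate to (0, 1)

print(np.round(a, 3), np.round(b, 3))
```

Combining the matrices first is useful when the same chain of transformations has to be applied to many points, since only one matrix product per point is needed.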

Interactive Matrix Transformation Visualization

Matrix Transformations in Action: See how different matrices transform a square shape:


This interactive plot shows how different matrix transformations affect a unit square. The original square (blue) is transformed through scaling (red), rotation (green), and shearing (orange). Each transformation demonstrates how matrices manipulate geometric shapes in image processing.

How Can I Build a Vector Visualization Tool?

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import FancyArrowPatch

class VectorVisualizer:
    def __init__(self, figsize=(10, 8)):
        self.fig, self.ax = plt.subplots(figsize=figsize)
        self.vectors = {}
        
    def add_vector(self, vector, name, color='blue', origin=(0, 0)):
        """Add a vector to visualize"""
        self.vectors[name] = {
            'vector': np.array(vector),
            'color': color,
            'origin': np.array(origin)
        }
        
    def plot_vector(self, vector_data, name):
        """Plot a single vector"""
        vec = vector_data['vector']
        origin = vector_data['origin']
        color = vector_data['color']
        
        # Draw arrow
        arrow = FancyArrowPatch(
            origin, origin + vec,
            arrowstyle='->', mutation_scale=20,
            color=color, linewidth=2
        )
        self.ax.add_patch(arrow)
        
        # Add label
        label_pos = origin + vec + np.array([0.1, 0.1])
        self.ax.text(label_pos[0], label_pos[1], name, 
                    fontsize=12, color=color, fontweight='bold')
        
    def visualize_operations(self):
        """Visualize vector operations"""
        # Clear previous plot
        self.ax.clear()
        
        # Set up the plot
        self.ax.set_xlim(-8, 8)
        self.ax.set_ylim(-8, 8)
        self.ax.grid(True, alpha=0.3)
        self.ax.axhline(y=0, color='k', linewidth=0.5)
        self.ax.axvline(x=0, color='k', linewidth=0.5)
        self.ax.set_aspect('equal')
        
        # Plot all vectors
        for name, vector_data in self.vectors.items():
            self.plot_vector(vector_data, name)
            
        self.ax.set_title('Vector Visualization', fontsize=16, fontweight='bold')
        self.ax.set_xlabel('X', fontsize=12)
        self.ax.set_ylabel('Y', fontsize=12)
        
        plt.tight_layout()
        plt.show()

# Example usage
visualizer = VectorVisualizer()

# Add vectors
v1 = [3, 2]
v2 = [1, 4]
v_sum = np.array(v1) + np.array(v2)
v_scaled = 2 * np.array(v1)

visualizer.add_vector(v1, 'v₁', 'red')
visualizer.add_vector(v2, 'v₂', 'blue')
visualizer.add_vector(v_sum, 'v₁ + v₂', 'green')
# Draw 2v₁ from a shifted origin so it does not sit on top of v₁
visualizer.add_vector(v_scaled, '2v₁', 'purple', origin=(0, -3))

# Visualize
visualizer.visualize_operations()

What Are the Key Takeaways from Day 1?

Mathematical Concepts:

→ Vectors represent quantities with magnitude and direction
→ Matrices are 2D arrays of numbers used for transformations
→ Matrix multiplication enables complex transformations and operations
→ Linear algebra provides the mathematical foundation for image processing

OpenCV Applications:

→ Images are matrices: pixel values are stored in NumPy arrays
→ Color images are stored as 3D arrays (height × width × channels)
→ Geometric transformations are applied using matrix multiplication
→ Filtering is achieved through convolution (sliding dot products with a kernel)

Practical Skills Gained:

→ Setting up OpenCV environment
→ Creating and manipulating matrices with NumPy
→ Converting between mathematical concepts and image operations
→ Visualizing vectors and transformations
→ Understanding the connection between linear algebra and computer vision

Next Steps for Day 2:

→ Eigenvalues and eigenvectors
→ Singular Value Decomposition (SVD)
→ Principal Component Analysis (PCA)
→ Advanced image transformations
→ Feature detection algorithms

This completes your Day 1 foundation in linear algebra and OpenCV. Practice these concepts with different images and experiment with the provided code examples to solidify your understanding.
