Window Vision Functions

class autocv.core.vision.Vision(hwnd: int = -1)

Bases: WindowCapture

A class for image processing and optical character recognition (OCR).

Extends the WindowCapture class to provide methods for capturing a window, processing the image, and extracting text and color information.

set_backbuffer(image: ndarray[tuple[Any, ...], dtype[uint8]] | Image) → None

Sets the image buffer to the provided NumPy array or PIL Image.

Parameters: image – The image to set as the backbuffer. Can be a NumPy array (in OpenCV format) or a PIL Image.

refresh(*, set_backbuffer: bool = True) → ndarray[tuple[Any, ...], dtype[uint8]] | None

Captures the current window image and converts it to an OpenCV-compatible format.

Parameters: set_backbuffer – If True, sets the captured image as the window’s backbuffer. If False, returns the captured image.
Raises: InvalidHandleError – If the window handle is not valid.
Returns: The captured image as a NumPy array (height, width, 3) if set_backbuffer is False; otherwise, None.

save_backbuffer_to_file(file_name: str) → None

Saves the backbuffer image to a file.

Parameters: file_name – The name (and path) of the file to save the image to.

get_pixel_change(area: tuple[int, int, int, int] | None = None) → int

Calculates the number of pixels that have changed between the current image and a newly captured image.

Parameters: area – A tuple (x, y, w, h) specifying the region of the image to consider. If None, the entire image is used.
Raises: InvalidImageError – If the image data is invalid.
Returns: The number of pixels that have changed between the two images.
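
To make the counting rule concrete, here is a minimal pure-Python sketch of comparing two frames inside an (x, y, w, h) area. It assumes a pixel counts as changed when any of its channels differs; the library's actual comparison may work differently. The `count_changed_pixels` helper and the nested-list frames are hypothetical and not part of autocv.

```python
def count_changed_pixels(before, after, area=None):
    """Count pixels that differ between two equally sized frames.

    Frames are nested lists indexed as frame[y][x] -> (R, G, B).
    `area` is (x, y, w, h); None means the whole frame.
    Assumption: a pixel has changed if any channel differs.
    """
    height, width = len(before), len(before[0])
    x, y, w, h = area if area is not None else (0, 0, width, height)
    changed = 0
    for row in range(y, y + h):
        for col in range(x, x + w):
            if before[row][col] != after[row][col]:
                changed += 1
    return changed


black, white = (0, 0, 0), (255, 255, 255)
frame_a = [[black, black, black], [black, black, black]]
frame_b = [[black, white, black], [white, black, black]]

total_changed = count_changed_pixels(frame_a, frame_b)
top_left_changed = count_changed_pixels(frame_a, frame_b, area=(0, 0, 1, 1))
```

Restricting the area to a small region of interest keeps unrelated screen activity from inflating the count.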

Extracts text from the backbuffer using Tesseract OCR.

Only text with confidence greater than or equal to the provided threshold is returned.

Parameters:

rect – A tuple (x, y, w, h) specifying the search region.
colors – An RGB tuple or list of RGB tuples to filter the image.
tolerance – Maximum allowed color difference for filtering.
confidence – Minimum confidence (0 to 1) for text to be included.

Returns:

A list of dictionaries, each containing text, its bounding rectangle, and confidence.
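
The confidence filter described above can be sketched as a simple list comprehension. The dictionary keys (`text`, `rect`, `confidence`) and the sample results are assumptions made for illustration; the source only states that each dictionary contains the text, its bounding rectangle, and a confidence value.

```python
def filter_by_confidence(results, confidence):
    """Keep OCR results whose confidence is >= the threshold (0 to 1)."""
    return [r for r in results if r["confidence"] >= confidence]


# Made-up sample results; key names are hypothetical.
ocr_results = [
    {"text": "Start", "rect": (10, 10, 40, 12), "confidence": 0.93},
    {"text": "5tart", "rect": (10, 30, 40, 12), "confidence": 0.41},
    {"text": "Stop", "rect": (60, 10, 35, 12), "confidence": 0.80},
]

kept = filter_by_confidence(ocr_results, 0.8)
kept_text = [r["text"] for r in kept]
```

Because the comparison is inclusive, a result whose confidence equals the threshold is retained.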

get_color(point: tuple[int, int]) → tuple[int, int, int]

Returns the color of the pixel at the specified coordinates.

Parameters:

point – A tuple (x, y) specifying the pixel coordinates.

Returns:

A tuple (R, G, B) representing the pixel color.

Raises:

InvalidImageError – If the image is invalid.
IndexError – If the coordinates are out of bounds.

find_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0) → list[tuple[int, int]]

Finds all pixel coordinates matching the given color within the specified tolerance.

Parameters:

color – The target RGB color.
rect – A tuple (x, y, w, h) defining the region to search. If None, the whole image is searched.
tolerance – Maximum allowed difference per channel.

Returns:

A list of (x, y) tuples for pixels matching the color.
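
As an illustration of per-channel tolerance, the following pure-Python sketch treats a pixel as a match when every channel lies within `tolerance` of the target. It is not the library's implementation: `find_color_sketch` is a hypothetical helper, the image is a nested list indexed as image[y][x], and returning coordinates relative to the full image (rather than to `rect`) is an assumption.

```python
def matches(pixel, color, tolerance):
    """True if every channel differs from the target by at most `tolerance`."""
    return all(abs(p - c) <= tolerance for p, c in zip(pixel, color))


def find_color_sketch(image, color, rect=None, tolerance=0):
    """Return (x, y) coordinates of matching pixels inside rect (x, y, w, h)."""
    height, width = len(image), len(image[0])
    x, y, w, h = rect if rect is not None else (0, 0, width, height)
    return [
        (col, row)
        for row in range(y, y + h)
        for col in range(x, x + w)
        if matches(image[row][col], color, tolerance)
    ]


image = [
    [(255, 0, 0), (250, 4, 0), (0, 0, 255)],
    [(0, 255, 0), (255, 0, 0), (240, 0, 0)],
]

exact = find_color_sketch(image, (255, 0, 0))
loose = find_color_sketch(image, (255, 0, 0), tolerance=5)
```

With a tolerance of 0 only exact matches are returned; raising it to 5 also admits the near-red pixel (250, 4, 0) but still rejects (240, 0, 0).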

get_average_color(rect: tuple[int, int, int, int] | None = None) → tuple[int, int, int]

Calculates the average color within a specified region.

Parameters: rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.
Returns: A tuple (R, G, B) representing the average color.
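
The (x, y, w, h) convention used throughout this class maps to rows y to y + h and columns x to x + w. The sketch below shows that cropping step followed by a per-channel mean. The helpers `crop` and `average_color` are hypothetical, and truncating the mean to an integer is an assumption about rounding.

```python
def crop(image, rect=None):
    """Return the sub-image for rect (x, y, w, h); None keeps everything."""
    if rect is None:
        return [list(row) for row in image]
    x, y, w, h = rect
    return [row[x:x + w] for row in image[y:y + h]]


def average_color(image, rect=None):
    """Per-channel mean of the region, truncated to integers (assumption)."""
    pixels = [p for row in crop(image, rect) for p in row]
    count = len(pixels)
    return tuple(sum(p[c] for p in pixels) // count for c in range(3))


image = [
    [(10, 20, 30), (30, 40, 50)],
    [(0, 0, 0), (200, 100, 40)],
]

whole = average_color(image)
top_row = average_color(image, rect=(0, 0, 2, 1))
```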

get_most_common_color(rect: tuple[int, int, int, int] | None = None, index: int = 1, ignore_colors: tuple[int, int, int] | list[tuple[int, int, int]] | None = None) → tuple[int, int, int]

Determines the most common color in the specified region.

Parameters:

rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.
index – The rank of the common color (1 for most common, 2 for second most common, etc.).
ignore_colors – A color or list of colors to ignore.

Returns:

A tuple (R, G, B) of the color at the requested rank (the most common color when index is 1).
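
The interaction between `index` and `ignore_colors` can be sketched with `collections.Counter`. This is an illustrative stand-in rather than the library's code: `most_common_color` is a hypothetical helper operating on a flat list of pixels, and it assumes ignored colors are removed before ranking.

```python
from collections import Counter


def most_common_color(pixels, index=1, ignore_colors=None):
    """Return the color ranked `index` by frequency (1 = most common)."""
    if ignore_colors is None:
        ignored = set()
    elif isinstance(ignore_colors, tuple):
        ignored = {ignore_colors}  # a single (R, G, B) color
    else:
        ignored = set(ignore_colors)  # a list of colors
    counts = Counter(p for p in pixels if p not in ignored)
    ranked = counts.most_common()  # sorted by count, highest first
    return ranked[index - 1][0]


black, red, blue = (0, 0, 0), (255, 0, 0), (0, 0, 255)
pixels = [black] * 5 + [red] * 3 + [blue]

first = most_common_color(pixels)
second = most_common_color(pixels, index=2)
without_black = most_common_color(pixels, ignore_colors=black)
```

Ignoring a dominant background color is often simpler than guessing which rank the color of interest will occupy.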

get_count_of_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int | None = 0) → int

Counts the number of pixels matching a given color within a tolerance.

Parameters:

color – The target RGB color.
rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.
tolerance – Allowed difference per channel (default is 0 for exact match).

Returns:

The count of pixels matching the specified color.

get_all_colors_with_counts(rect: tuple[int, int, int, int] | None = None) → list[tuple[tuple[int, int, int], int]]

Retrieves all colors in the specified region along with their pixel counts.

Parameters: rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.
Returns: A list of tuples where each tuple contains an (R, G, B) color and its count.

get_median_color(rect: tuple[int, int, int, int] | None = None) → tuple[int, int, int]

Calculates the median color of the specified region.

Parameters: rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.
Returns: A tuple (R, G, B) representing the median color.
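
A median is less sensitive to a few outlier pixels than a mean. The sketch below assumes the median is taken independently per channel, which the source does not state explicitly; `median_color` is a hypothetical helper working on a flat list of pixels.

```python
import statistics


def median_color(pixels):
    """Per-channel median of a list of (R, G, B) pixels (assumption)."""
    return tuple(
        int(statistics.median(p[channel] for p in pixels))
        for channel in range(3)
    )


# Four similar pixels plus one bright outlier.
pixels = [
    (10, 20, 30),
    (12, 22, 31),
    (11, 21, 29),
    (250, 0, 255),
    (9, 23, 32),
]

median = median_color(pixels)
mean_red = sum(p[0] for p in pixels) / len(pixels)
```

Here the single outlier pulls the red mean far above the typical value, while the median stays with the majority of the pixels.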

maximize_color_match(rect: tuple[int, int, int, int], initial_tolerance: int = 100, tolerance_step: int = 1) → tuple[tuple[int, int, int], int]

Finds the color and tolerance that best match the region’s dominant color.

Parameters:

rect – A tuple (x, y, w, h) specifying the region to analyze.
initial_tolerance – The starting tolerance value.
tolerance_step – The decrement step for tolerance during search.

Returns:

A tuple containing the best matching RGB color and the tolerance value used.

erode_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) → None

Applies morphological erosion to the backbuffer image.

Parameters:

iterations – Number of erosion iterations to apply. Defaults to 1.
kernel – The structuring element to use; defaults to a 3x3 matrix of ones.

dilate_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) → None

Applies morphological dilation to the backbuffer image.

Parameters:

iterations – Number of dilation iterations to apply. Defaults to 1.
kernel – The structuring element to use; defaults to a 3x3 matrix of ones.
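
For readers unfamiliar with morphological operations, here is a minimal pure-Python sketch of erosion and dilation with a 3x3 kernel of ones on a binary mask. Erosion takes the minimum over each pixel's neighborhood and dilation takes the maximum. The helper names are hypothetical, the library presumably delegates to OpenCV instead, and ignoring neighbors that fall outside the image is an assumption about border handling.

```python
def _morph(mask, combine):
    """Apply `combine` (min or max) over each pixel's 3x3 neighborhood."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [
                mask[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            out[y][x] = combine(window)
    return out


def erode(mask, iterations=1):
    """Shrink bright regions: a pixel survives only if all neighbors are set."""
    for _ in range(iterations):
        mask = _morph(mask, min)
    return mask


def dilate(mask, iterations=1):
    """Grow bright regions: a pixel is set if any neighbor is set."""
    for _ in range(iterations):
        mask = _morph(mask, max)
    return mask


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

eroded = erode(mask)
reopened = dilate(eroded)
```

Eroding and then dilating (an "opening") removes specks smaller than the kernel while roughly preserving larger shapes, which is a common cleanup step before OCR or contour detection.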

find_image(sub_image: ndarray[tuple[Any, ...], dtype[uint8]] | Image, rect: tuple[int, int, int, int] | None = None, confidence: float = 0.95, median_tolerance: int | None = None) → list[tuple[int, int, int, int]]

Finds occurrences of a subimage within the main image using template matching.

Parameters:

sub_image – The subimage to search for (as a NumPy array or PIL Image).
rect – A tuple (x, y, w, h) specifying the search region. If None, the entire image is used.
confidence – The matching confidence threshold (default 0.95).
median_tolerance – Maximum color difference allowed between the subimage and matched region (optional).

Returns:

A list of rectangles (x, y, w, h) where the subimage was found.
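
The sliding-window idea behind template matching can be sketched in pure Python. Note the swapped-in scoring: this example uses the fraction of exactly equal pixels as the score, not the correlation-based measures that OpenCV template matching provides, so it only illustrates how a confidence threshold turns window scores into (x, y, w, h) results. `find_template` is a hypothetical helper and the images are single-channel nested lists for brevity.

```python
def find_template(image, template, confidence=0.95):
    """Return (x, y, w, h) for every window scoring at least `confidence`.

    Score = fraction of pixels in the window equal to the template
    (a simplification chosen for this sketch).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    total = th * tw
    hits = []
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            same = sum(
                image[y + dy][x + dx] == template[dy][dx]
                for dy in range(th)
                for dx in range(tw)
            )
            if same / total >= confidence:
                hits.append((x, y, tw, th))
    return hits


image = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 6, 0],
]
template = [[9, 8], [7, 6]]
noisy_template = [[9, 8], [7, 5]]  # one pixel differs from the image

exact_hits = find_template(image, template)
strict_hits = find_template(image, noisy_template)
relaxed_hits = find_template(image, noisy_template, confidence=0.75)
```

Lowering the threshold lets an imperfect template still match, at the cost of more false positives on larger images.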

find_contours(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0, min_area: int = 10, vertices: int | None = None) → list[ndarray[tuple[Any, ...], dtype[uint64]]]

Finds contours in the image that match the specified color.

Parameters:

color – The target RGB color.
rect – A tuple (x, y, w, h) specifying the search region. If None, the entire image is used.
tolerance – Color tolerance.
min_area – Minimum contour area to be considered.
vertices – If specified, only contours with this number of vertices are returned.

Returns:

A list of contours as NumPy arrays.

draw_points(points: Sequence[tuple[int, int]], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws points on the backbuffer image.

Parameters:

points – A sequence of (x, y) coordinates.
color – The drawing color as an RGB tuple. Defaults to red.
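
The drawing methods are typically combined with the search methods for debugging. The sketch below chains `refresh`, `find_color`, `draw_points`, and `save_backbuffer_to_file` using the signatures documented on this page. The `highlight_color` function and the `FakeVision` stand-in are hypothetical; the stand-in only records calls so the example runs without a window handle.

```python
class FakeVision:
    """Hypothetical stand-in that records calls; not part of autocv."""

    def __init__(self, matches):
        self.matches = matches
        self.calls = []

    def refresh(self, *, set_backbuffer=True):
        self.calls.append("refresh")

    def find_color(self, color, rect=None, tolerance=0):
        self.calls.append("find_color")
        return list(self.matches)

    def draw_points(self, points, color=(255, 0, 0)):
        self.calls.append("draw_points")

    def save_backbuffer_to_file(self, file_name):
        self.calls.append("save:" + file_name)


def highlight_color(vision, color, tolerance, file_name):
    """Mark every pixel matching `color` and save the annotated frame."""
    vision.refresh()  # capture the window into the backbuffer
    points = vision.find_color(color, tolerance=tolerance)
    if points:
        vision.draw_points(points, color=(0, 255, 0))  # green markers
        vision.save_backbuffer_to_file(file_name)
    return len(points)


fake = FakeVision(matches=[(3, 4), (5, 6)])
found = highlight_color(fake, (255, 0, 0), 10, "matches.png")
```

With the real library, the same function would presumably be called with a `Vision` instance constructed from a valid window handle in place of `FakeVision`.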

draw_contours(contours: tuple[tuple[tuple[tuple[int, int]]]], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws contours on the backbuffer image.

Parameters:

contours – The contours to draw.
color – The drawing color as an RGB tuple. Defaults to red.

draw_circle(circle: tuple[int, int, int], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws a circle on the backbuffer image.

Parameters:

circle – A tuple (x, y, r) specifying the center and radius.
color – The drawing color as an RGB tuple. Defaults to red.

draw_rectangle(rect: tuple[int, int, int, int], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws a rectangle on the backbuffer image.

Parameters:

rect – A tuple (x, y, w, h) defining the rectangle.
color – The drawing color as an RGB tuple. Defaults to red.

filter_colors(colors: tuple[int, int, int] | list[tuple[int, int, int]], tolerance: int = 0, *, keep_original_colors: bool = False) → None

Filters the backbuffer image to retain only specified colors within a given tolerance.

Parameters:

colors – An RGB tuple or list of RGB tuples to retain.
tolerance – Allowed color deviation (0-255).
keep_original_colors – If True, non-matching pixels are set to black in a copy; otherwise, the backbuffer is updated.