Window Vision Functions

class autocv.core.vision.Vision(hwnd: int = -1)

Bases: WindowCapture

A class for image processing and optical character recognition (OCR).

Extends the WindowCapture class to provide methods for capturing a window, processing the image, and extracting text and color information.

set_backbuffer(image: ndarray[tuple[Any, ...], dtype[uint8]] | Image) → None

Sets the image buffer to the provided NumPy array or PIL Image.

Parameters:

image – The image to set as the backbuffer. Can be a NumPy array (in OpenCV format) or a PIL Image.

refresh(*, set_backbuffer: bool = True) → ndarray[tuple[Any, ...], dtype[uint8]] | None

Captures the current window image and converts it to an OpenCV-compatible format.

Parameters:

set_backbuffer – If True, sets the captured image as the window’s backbuffer. If False, returns the captured image.

Raises:

InvalidHandleError – If the window handle is not valid.

Returns:

The captured image as a NumPy array (height, width, 3) if set_backbuffer is False; otherwise, None.

save_backbuffer_to_file(file_name: str) → None

Saves the backbuffer image to a file.

Parameters:

file_name – The name (and path) of the file to save the image to.

get_pixel_change(area: tuple[int, int, int, int] | None = None) → int

Calculates the number of pixels that have changed between the current image and a newly captured image.

Parameters:

area – A tuple (x, y, w, h) specifying the region of the image to consider. If None, the entire image is used.

Raises:

InvalidImageError – If the image data is invalid.

Returns:

The number of pixels that have changed between the two images.
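
The counting step can be illustrated with a minimal sketch. It is not the library's implementation: plain nested lists stand in for the NumPy frames, count_changed_pixels is a hypothetical helper, and treating a pixel as changed when any channel differs is an assumption.

```python
from typing import Optional

Pixel = tuple[int, int, int]
Frame = list[list[Pixel]]


def count_changed_pixels(before: Frame, after: Frame,
                         area: Optional[tuple[int, int, int, int]] = None) -> int:
    """Count pixels that differ between two equally sized frames."""
    if area is None:
        # No area given: compare the whole frame.
        x, y, w, h = 0, 0, len(before[0]), len(before)
    else:
        x, y, w, h = area
    changed = 0
    for row in range(y, y + h):
        for col in range(x, x + w):
            # Assumption: any channel difference counts as a change.
            if before[row][col] != after[row][col]:
                changed += 1
    return changed


black, white = (0, 0, 0), (255, 255, 255)
before = [[black, black, black],
          [black, black, black]]
after = [[black, white, black],
         [white, black, black]]

print(count_changed_pixels(before, after))                # 2
print(count_changed_pixels(before, after, (0, 0, 2, 1)))  # 1
```

Restricting the area to the top-left 2x1 strip leaves only one of the two changed pixels inside the comparison.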

get_text(rect: tuple[int, int, int, int] | None = None, colors: tuple[int, int, int] | list[tuple[int, int, int]] | None = None, tolerance: int = 0, confidence: float | None = 0.8) → list[dict[str, str | int | float | list[int]]]

Extracts text from the backbuffer using Tesseract OCR.

Only text with confidence greater than or equal to the provided threshold is returned.

Parameters:
  • rect – A tuple (x, y, w, h) specifying the search region.

  • colors – An RGB tuple or list of RGB tuples to filter the image.

  • tolerance – Maximum allowed color difference for filtering.

  • confidence – Minimum confidence (0 to 1) for text to be included.

Returns:

A list of dictionaries, each containing text, its bounding rectangle, and confidence.
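
The confidence threshold can be sketched as a simple filter over result dictionaries. The sample data, the keys "text", "rect" and "confidence", and the choice to keep everything when the threshold is None are all assumptions made for illustration; the actual result layout may differ.

```python
# Hypothetical OCR results; the key names are assumed, not taken from the library.
results = [
    {"text": "Start", "rect": [10, 12, 48, 16], "confidence": 0.97},
    {"text": "St4rt", "rect": [10, 40, 48, 16], "confidence": 0.42},
    {"text": "Quit", "rect": [70, 12, 40, 16], "confidence": 0.80},
]


def filter_by_confidence(items, threshold):
    """Keep entries whose confidence is at least the threshold."""
    if threshold is None:
        # Assumption: no threshold means no filtering.
        return list(items)
    return [item for item in items if item["confidence"] >= threshold]


kept = filter_by_confidence(results, 0.8)
print([item["text"] for item in kept])  # ['Start', 'Quit']
```

The comparison is inclusive, so an entry sitting exactly on the threshold is retained.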

get_color(point: tuple[int, int]) → tuple[int, int, int]

Returns the color of the pixel at the specified coordinates.

Parameters:

point – A tuple (x, y) specifying the pixel coordinates.

Returns:

A tuple (R, G, B) representing the pixel color.

Raises:

find_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0) → list[tuple[int, int]]

Finds all pixel coordinates matching the given color within the specified tolerance.

Parameters:
  • color – The target RGB color.

  • rect – A tuple (x, y, w, h) defining the region to search. If None, the whole image is searched.

  • tolerance – Maximum allowed difference per channel.

Returns:

A list of (x, y) tuples for pixels matching the color.
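
A minimal sketch of per-channel tolerance matching, using nested lists in place of the NumPy backbuffer. The helper names are hypothetical, and returning coordinates relative to the full image rather than to rect is an assumption.

```python
def matches(pixel, color, tolerance):
    """True when every channel is within `tolerance` of the target."""
    return all(abs(p - c) <= tolerance for p, c in zip(pixel, color))


def find_color_points(image, color, rect=None, tolerance=0):
    """Return (x, y) coordinates of matching pixels inside rect (x, y, w, h)."""
    if rect is None:
        x, y, w, h = 0, 0, len(image[0]), len(image)
    else:
        x, y, w, h = rect
    return [
        (col, row)  # images are indexed [row][col], points are reported (x, y)
        for row in range(y, y + h)
        for col in range(x, x + w)
        if matches(image[row][col], color, tolerance)
    ]


image = [
    [(255, 0, 0), (250, 5, 0), (0, 0, 0)],
    [(0, 0, 0), (255, 0, 0), (240, 0, 0)],
]
red = (255, 0, 0)

print(find_color_points(image, red))                                 # [(0, 0), (1, 1)]
print(find_color_points(image, red, tolerance=5))                    # [(0, 0), (1, 0), (1, 1)]
print(find_color_points(image, red, rect=(1, 0, 2, 2), tolerance=5)) # [(1, 0), (1, 1)]
```

With a tolerance of 5, the pixel (250, 5, 0) matches because each channel is at most 5 away from the target, while (240, 0, 0) is rejected.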

get_average_color(rect: tuple[int, int, int, int] | None = None) → tuple[int, int, int]

Calculates the average color within a specified region.

Parameters:

rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.

Returns:

A tuple (R, G, B) representing the average color.
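
As an illustration, the average can be computed channel by channel over the selected region. This sketch uses nested lists rather than NumPy, average_color is a hypothetical helper, and rounding to the nearest integer is an assumption (the library may truncate instead).

```python
def average_color(image, rect=None):
    """Per-channel mean of the pixels inside rect (x, y, w, h)."""
    if rect is None:
        x, y, w, h = 0, 0, len(image[0]), len(image)
    else:
        x, y, w, h = rect
    pixels = [image[row][col]
              for row in range(y, y + h)
              for col in range(x, x + w)]
    count = len(pixels)
    # Average each of the three channels independently.
    return tuple(round(sum(p[ch] for p in pixels) / count) for ch in range(3))


image = [
    [(0, 0, 0), (100, 50, 200)],
    [(100, 50, 200), (200, 100, 0)],
]

print(average_color(image))                # (100, 50, 100)
print(average_color(image, (1, 0, 1, 2)))  # (150, 75, 100)
```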

get_most_common_color(rect: tuple[int, int, int, int] | None = None, index: int = 1, ignore_colors: tuple[int, int, int] | list[tuple[int, int, int]] | None = None) → tuple[int, int, int]

Determines the most common color in the specified region.

Parameters:
  • rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.

  • index – The rank of the common color (1 for most common, 2 for second most common, etc.).

  • ignore_colors – A color or list of colors to ignore.

Returns:

A tuple (R, G, B) of the most common color.
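
The interaction between index and ignore_colors can be sketched with collections.Counter. The helper below is hypothetical and works on a flat list of pixels; it assumes ignored colors are dropped before ranking, so index counts positions among the remaining colors.

```python
from collections import Counter


def most_common_color(pixels, index=1, ignore_colors=None):
    """Return the color at 1-based rank `index`, skipping ignored colors."""
    if ignore_colors is None:
        ignored = set()
    elif isinstance(ignore_colors, tuple):
        ignored = {ignore_colors}      # a single (R, G, B) color
    else:
        ignored = set(ignore_colors)   # a list of colors
    counts = Counter(p for p in pixels if p not in ignored)
    ranked = counts.most_common()      # [(color, count), ...], most frequent first
    return ranked[index - 1][0]


black, red, blue = (0, 0, 0), (255, 0, 0), (0, 0, 255)
pixels = [black] * 4 + [red] * 3 + [blue] * 2

print(most_common_color(pixels))                       # (0, 0, 0)
print(most_common_color(pixels, index=2))              # (255, 0, 0)
print(most_common_color(pixels, ignore_colors=black))  # (255, 0, 0)
```

Ignoring a background color is often more convenient than guessing its rank, since the rank changes with the size of the region.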

get_count_of_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int | None = 0) → int

Counts the number of pixels matching a given color within a tolerance.

Parameters:
  • color – The target RGB color.

  • rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.

  • tolerance – Allowed difference per channel (default is 0 for exact match).

Returns:

The count of pixels matching the specified color.

get_all_colors_with_counts(rect: tuple[int, int, int, int] | None = None) → list[tuple[tuple[int, int, int], int]]

Retrieves all colors in the specified region along with their pixel counts.

Parameters:

rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.

Returns:

A list of tuples where each tuple contains an (R, G, B) color and its count.

get_median_color(rect: tuple[int, int, int, int] | None = None) → tuple[int, int, int]

Calculates the median color of the specified region.

Parameters:

rect – A tuple (x, y, w, h) specifying the region to analyze. If None, the entire image is used.

Returns:

A tuple (R, G, B) representing the median color.
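
A short sketch of a per-channel median, assuming each channel is treated independently (the source does not say how the median of a color is defined). median_color is a hypothetical helper operating on a flat list of pixels.

```python
from statistics import median


def median_color(pixels):
    """Median of each channel, computed independently."""
    return tuple(median(p[ch] for p in pixels) for ch in range(3))


# Two dark pixels and one bright outlier.
pixels = [(10, 20, 30), (12, 22, 32), (250, 240, 230)]

print(median_color(pixels))  # (12, 22, 32)
```

Unlike the average, the median is barely affected by the single bright outlier, which can make it a better summary for regions containing a few highlights or specks.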

maximize_color_match(rect: tuple[int, int, int, int], initial_tolerance: int = 100, tolerance_step: int = 1) → tuple[tuple[int, int, int], int]

Finds the color and tolerance that best match the region’s dominant color.

Parameters:
  • rect – A tuple (x, y, w, h) specifying the region to analyze.

  • initial_tolerance – The starting tolerance value.

  • tolerance_step – The decrement step for tolerance during search.

Returns:

A tuple containing the best matching RGB color and the tolerance value used.

erode_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) → None

Applies morphological erosion to the backbuffer image.

Parameters:
  • iterations – Number of erosion iterations to apply. Defaults to 1.

  • kernel – The structuring element to use; defaults to a 3x3 matrix of ones.

dilate_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) → None

Applies morphological dilation to the backbuffer image.

Parameters:
  • iterations – Number of dilation iterations to apply. Defaults to 1.

  • kernel – The structuring element to use; defaults to a 3x3 matrix of ones.
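
To show what erosion and dilation with a 3x3 kernel of ones do, here is a pure-Python sketch on a small binary grid. It is an illustration rather than the library's code: the erode and dilate helpers are hypothetical, and pixels outside the image are simply ignored at the borders, which is an assumption about border handling.

```python
def _neighbourhood(grid, row, col):
    """Values in the 3x3 window centred on (row, col), clipped to the grid."""
    height, width = len(grid), len(grid[0])
    return [grid[r][c]
            for r in range(max(0, row - 1), min(height, row + 2))
            for c in range(max(0, col - 1), min(width, col + 2))]


def _apply(grid, reducer, iterations):
    for _ in range(iterations):
        grid = [[reducer(_neighbourhood(grid, r, c))
                 for c in range(len(grid[0]))]
                for r in range(len(grid))]
    return grid


def erode(grid, iterations=1):
    """Each pixel becomes the minimum of its neighbourhood (shapes shrink)."""
    return _apply(grid, min, iterations)


def dilate(grid, iterations=1):
    """Each pixel becomes the maximum of its neighbourhood (shapes grow)."""
    return _apply(grid, max, iterations)


grid = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

for row in erode(grid):
    print(row)  # only the centre pixel of the 3x3 square survives
```

Eroding and then dilating (a morphological opening) restores the square here, while anything smaller than the kernel would be removed for good. That makes the pair useful for cleaning specks before OCR or contour detection.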

find_image(sub_image: ndarray[tuple[Any, ...], dtype[uint8]] | Image, rect: tuple[int, int, int, int] | None = None, confidence: float = 0.95, median_tolerance: int | None = None) → list[tuple[int, int, int, int]]

Finds occurrences of a subimage within the main image using template matching.

Parameters:
  • sub_image – The subimage to search for (as a NumPy array or PIL Image).

  • rect – A tuple (x, y, w, h) specifying the search region. If None, the entire image is used.

  • confidence – The matching confidence threshold (default 0.95).

  • median_tolerance – Maximum color difference allowed between the subimage and matched region (optional).

Returns:

A list of rectangles (x, y, w, h) where the subimage was found.
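
The idea of sliding a template over an image and keeping positions above a confidence threshold can be sketched as follows. This is a simplified stand-in, not the method the library uses: it scores each window with a normalized mean absolute difference on grayscale values, and find_template is a hypothetical helper.

```python
def find_template(image, template, confidence=0.95):
    """Return (x, y, w, h) for every window whose score meets the threshold."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    hits = []
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            # Sum of absolute differences between the window and the template.
            diff = sum(abs(image[y + r][x + c] - template[r][c])
                       for r in range(tpl_h)
                       for c in range(tpl_w))
            # Map to a score in [0, 1]; 1.0 means an exact match.
            score = 1.0 - diff / (255 * tpl_h * tpl_w)
            if score >= confidence:
                hits.append((x, y, tpl_w, tpl_h))
    return hits


image = [
    [0, 0, 0, 0, 0],
    [0, 200, 50, 0, 0],
    [0, 50, 200, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [200, 50],
    [50, 200],
]

print(find_template(image, template))  # [(1, 1, 2, 2)]
```

Lowering the confidence admits windows that only partly resemble the template, so a high default such as 0.95 keeps the result list limited to near-exact matches.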

find_contours(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0, min_area: int = 10, vertices: int | None = None) → list[ndarray[tuple[Any, ...], dtype[uint64]]]

Finds contours in the image that match the specified color.

Parameters:
  • color – The target RGB color.

  • rect – A tuple (x, y, w, h) specifying the search region. If None, the entire image is used.

  • tolerance – Color tolerance.

  • min_area – Minimum contour area to be considered.

  • vertices – If specified, only contours with this number of vertices are returned.

Returns:

A list of contours as NumPy arrays.

draw_points(points: Sequence[tuple[int, int]], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws points on the backbuffer image.

Parameters:
  • points – A sequence of (x, y) coordinates.

  • color – The drawing color as an RGB tuple. Defaults to red.

draw_contours(contours: tuple[tuple[tuple[tuple[int, int]]]], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws contours on the backbuffer image.

Parameters:
  • contours – The contours to draw.

  • color – The drawing color as an RGB tuple. Defaults to red.

draw_circle(circle: tuple[int, int, int], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws a circle on the backbuffer image.

Parameters:
  • circle – A tuple (x, y, r) specifying the center and radius.

  • color – The drawing color as an RGB tuple. Defaults to red.

draw_rectangle(rect: tuple[int, int, int, int], color: tuple[int, int, int] = (255, 0, 0)) → None

Draws a rectangle on the backbuffer image.

Parameters:
  • rect – A tuple (x, y, w, h) defining the rectangle.

  • color – The drawing color as an RGB tuple. Defaults to red.

filter_colors(colors: tuple[int, int, int] | list[tuple[int, int, int]], tolerance: int = 0, *, keep_original_colors: bool = False) → None

Filters the backbuffer image to retain only specified colors within a given tolerance.

Parameters:
  • colors – An RGB tuple or list of RGB tuples to retain.

  • tolerance – Allowed color deviation (0-255).

  • keep_original_colors – If True, non-matching pixels are set to black in a copy; otherwise, the backbuffer is updated.
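
The core filtering idea, keeping pixels close to any listed color and blacking out the rest, can be sketched as below. keep_only_colors is a hypothetical helper on nested lists; it does not model the keep_original_colors flag, whose exact behavior should be checked against the library.

```python
def keep_only_colors(image, colors, tolerance=0):
    """Return a copy where pixels not matching any color are set to black."""
    if isinstance(colors, tuple):
        colors = [colors]  # accept a single (R, G, B) tuple as well as a list

    def is_kept(pixel):
        # A pixel is kept when every channel is within tolerance of some target.
        return any(all(abs(p - c) <= tolerance for p, c in zip(pixel, color))
                   for color in colors)

    return [[pixel if is_kept(pixel) else (0, 0, 0) for pixel in row]
            for row in image]


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(250, 0, 0), (0, 0, 255)],
]

print(keep_only_colors(image, (255, 0, 0), tolerance=5))
# [[(255, 0, 0), (0, 0, 0)], [(250, 0, 0), (0, 0, 0)]]
```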