Window Vision Functions

class autocv.core.vision.Vision(hwnd: int = -1)[source]

Bases: WindowCapture

A class for image processing and optical character recognition (OCR).

Extends the WindowCapture class to provide methods for capturing a window, processing the image, and extracting text and color information.

set_backbuffer(image: ndarray[tuple[Any, ...], dtype[uint8]] | Image) None[source]

Sets the image buffer to the provided NumPy array or PIL Image.

Parameters:

image (npt.NDArray[np.uint8] | Image.Image) – Image data used to refresh the backbuffer.

refresh(*, set_backbuffer: bool = True) ndarray[tuple[Any, ...], dtype[uint8]] | None[source]

Captures the current window image and converts it to an OpenCV-compatible format.

Parameters:

set_backbuffer (bool) – When True, persist the capture to self.opencv_image and return None. When False, return the captured frame instead.

Raises:

InvalidHandleError – Raised when self.hwnd is not a valid window handle.

Returns:

Captured frame when set_backbuffer is False; None otherwise.

Return type:

npt.NDArray[np.uint8] | None

save_backbuffer_to_file(file_name: str) None[source]

Saves the backbuffer image to a file.

Parameters:

file_name (str) – Path where the backbuffer snapshot is stored.
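
A minimal usage sketch that captures a frame and writes it to disk, using only refresh and save_backbuffer_to_file as documented above. The helper name snapshot is hypothetical, and vision is assumed to be a Vision instance already bound to a valid window handle.

```python
def snapshot(vision, file_name: str) -> None:
    """Capture the window and write the resulting backbuffer to disk.

    `snapshot` is an illustrative helper, not part of autocv.
    """
    # With the default set_backbuffer=True the capture is stored on the
    # instance and refresh() returns None.
    vision.refresh()
    vision.save_backbuffer_to_file(file_name)
```

Calling refresh first ensures the file holds the current window contents rather than whatever was last stored in the backbuffer.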

get_pixel_change(area: tuple[int, int, int, int] | None = None) int[source]

Calculates the number of pixels that have changed between the current image and a newly captured image.

Parameters:

area (tuple[int, int, int, int] | None) – Region of interest expressed as (x, y, width, height). If None, the entire frame is inspected.

Raises:

InvalidImageError – Raised when the capture buffer is empty.

Returns:

Count of pixels with different intensities between frames.

Return type:

int
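
One possible use of get_pixel_change is to wait until a window stops animating. The sketch below polls the method until the change count drops to a threshold. The helper name wait_until_static and its max_changed, attempts, and delay parameters are hypothetical, and the sketch assumes the backbuffer has already been filled (for example by refresh) so that InvalidImageError is not raised.

```python
import time


def wait_until_static(vision, max_changed=0, attempts=10, delay=0.1, area=None):
    """Poll get_pixel_change until at most `max_changed` pixels differ.

    Hypothetical helper, not part of autocv. Returns True once the window
    looks static, or False if it is still changing after `attempts` polls.
    """
    for _ in range(attempts):
        if vision.get_pixel_change(area) <= max_changed:
            return True
        time.sleep(delay)  # give the window time to settle before re-polling
    return False
```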

get_text(rect: tuple[int, int, int, int] | None = None, colors: tuple[int, int, int] | list[tuple[int, int, int]] | None = None, tolerance: int = 0, confidence: float | None = 0.8) list[dict[str, str | int | float | list[int]]][source]

Extracts text from the backbuffer using Tesseract OCR.

Only text with confidence greater than or equal to the provided threshold is returned.

Parameters:
  • rect (tuple[int, int, int, int] | None) – Search region specified as (x, y, width, height).

  • colors (tuple[int, int, int] | list[tuple[int, int, int]] | None) – Colours (RGB) to isolate before OCR.

  • tolerance (int) – Per-channel tolerance when matching the colour filter.

  • confidence (float | None) – Minimum acceptable OCR confidence between 0 and 1.

Returns:

Text entries with bounding boxes and confidence levels.

Return type:

list[dict[str, str | int | float | list[int]]]
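
The following sketch shows one way to search the OCR results for a phrase. The keys of the returned dictionaries are not listed in this reference, so the key holding the recognised string is passed in as text_key; its default value "text" is an assumption, as is the helper name find_entries.

```python
def find_entries(vision, needle, rect=None, confidence=0.8, text_key="text"):
    """Return OCR entries whose text contains `needle` (case-insensitive).

    Hypothetical helper. `text_key` is an assumed dictionary key; check the
    actual entries returned by get_text before relying on it.
    """
    entries = vision.get_text(rect=rect, confidence=confidence)
    needle = needle.lower()
    return [e for e in entries if needle in str(e.get(text_key, "")).lower()]
```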

get_color(point: tuple[int, int]) tuple[int, int, int][source]

Returns the color of the pixel at the specified coordinates.

Parameters:

point (tuple[int, int]) – Pixel coordinates expressed as (x, y).

Returns:

Pixel colour as (R, G, B).

Return type:

tuple[int, int, int]

Raises:
find_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0) list[tuple[int, int]][source]

Finds all pixel coordinates matching the given color within the specified tolerance.

Parameters:
  • color (tuple[int, int, int]) – Target colour expressed as (R, G, B).

  • rect (tuple[int, int, int, int] | None) – Optional search region described as (x, y, width, height).

  • tolerance (int) – Allowed per-channel delta when searching for colour matches.

Returns:

Pixel coordinates in image space that match the colour constraint.

Return type:

list[tuple[int, int]]
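
To make the per-channel tolerance concrete, here is a pure-Python sketch of the matching rule on a tiny image stored as nested lists. It is not the library's implementation, which works on NumPy arrays. The inclusive comparison (a channel matches when its absolute difference is at most the tolerance) is an assumption, and both function names are hypothetical.

```python
def within_tolerance(pixel, target, tolerance=0):
    """True when every channel of `pixel` is within `tolerance` of `target`."""
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))


def find_color_in_rows(rows, color, tolerance=0):
    """Return (x, y) coordinates of matching pixels; rows[y][x] is (R, G, B)."""
    return [
        (x, y)
        for y, row in enumerate(rows)
        for x, pixel in enumerate(row)
        if within_tolerance(pixel, color, tolerance)
    ]
```

With tolerance 0 only exact matches are returned; raising the tolerance admits pixels whose channels each differ by at most that amount.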

get_average_color(rect: tuple[int, int, int, int] | None = None) tuple[int, int, int][source]

Calculates the average color within a specified region.

Parameters:

rect (tuple[int, int, int, int] | None) – Region to average; None uses the full image.

Returns:

Average RGB value inside the requested region.

Return type:

tuple[int, int, int]

get_most_common_color(rect: tuple[int, int, int, int] | None = None, index: int = 1, ignore_colors: tuple[int, int, int] | list[tuple[int, int, int]] | None = None) tuple[int, int, int][source]

Determines the most common color in the specified region.

Parameters:
  • rect (tuple[int, int, int, int] | None) – Region to inspect; None uses the full image.

  • index (int) – Rank of the dominant colour to extract (1-based).

  • ignore_colors (tuple[int, int, int] | list[tuple[int, int, int]] | None) – Colours to skip while ranking.

Returns:

Most common RGB colour in the region.

Return type:

tuple[int, int, int]

get_count_of_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int | None = 0) int[source]

Counts the number of pixels matching a given color within a tolerance.

Parameters:
  • color (tuple[int, int, int]) – Target colour expressed as (R, G, B).

  • rect (tuple[int, int, int, int] | None) – Region to search; None uses the full image.

  • tolerance (int | None) – Allowed per-channel difference; defaults to 0 for exact matches.

Returns:

Number of pixels matching the specified colour.

Return type:

int

get_all_colors_with_counts(rect: tuple[int, int, int, int] | None = None) list[tuple[tuple[int, int, int], int]][source]

Retrieves all colors in the specified region along with their pixel counts.

Parameters:

rect (tuple[int, int, int, int] | None) – Region to inspect; None uses the full image.

Returns:

Colour counts ordered by frequency.

Return type:

list[tuple[tuple[int, int, int], int]]
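
The frequency ordering returned here, and the 1-based index accepted by get_most_common_color, can be illustrated with a pure-Python sketch over a flat list of (R, G, B) tuples. This is not the library's code; both function names are hypothetical, and the handling of ties is whatever collections.Counter happens to produce.

```python
from collections import Counter


def colors_with_counts(pixels):
    """Return [(color, count), ...] ordered from most to least frequent."""
    return Counter(pixels).most_common()


def nth_most_common(pixels, index=1, ignore_colors=()):
    """Return the colour at 1-based rank `index`, skipping `ignore_colors`."""
    ignored = set(ignore_colors)
    ranked = [color for color, _ in colors_with_counts(pixels) if color not in ignored]
    return ranked[index - 1]
```

Ignoring a colour shifts every lower-ranked colour up by one position, which is useful for skipping a known background colour.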

get_median_color(rect: tuple[int, int, int, int] | None = None) tuple[int, int, int][source]

Calculates the median color of the specified region.

Parameters:

rect (tuple[int, int, int, int] | None) – Region to inspect; None uses the full image.

Returns:

Median RGB colour inside the region.

Return type:

tuple[int, int, int]
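
The difference between get_average_color and get_median_color matters when a region contains outliers. The pure-Python sketch below computes both per channel over a flat list of (R, G, B) tuples. It is illustrative only: the function names are hypothetical, and truncating to int is an assumption about rounding, not a statement of how the library behaves.

```python
from statistics import mean, median


def average_color(pixels):
    """Per-channel mean of (R, G, B) tuples, truncated to integers."""
    return tuple(int(mean(channel)) for channel in zip(*pixels))


def median_color(pixels):
    """Per-channel median of (R, G, B) tuples, truncated to integers."""
    return tuple(int(median(channel)) for channel in zip(*pixels))
```

A single bright pixel pulls the average noticeably, while the median stays close to the typical value in the region.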

maximize_color_match(rect: tuple[int, int, int, int], initial_tolerance: int = 100, tolerance_step: int = 1) tuple[tuple[int, int, int], int][source]

Finds the color and tolerance that best match the region’s dominant color.

Parameters:
  • rect (tuple[int, int, int, int]) – Region to evaluate when computing the dominant colour.

  • initial_tolerance (int) – Initial tolerance applied when searching.

  • tolerance_step (int) – Amount by which the tolerance is decreased on each search step.

Returns:

Matched RGB colour and the tolerance applied.

Return type:

tuple[tuple[int, int, int], int]
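
The returned pair can be passed straight to the colour-search methods. A minimal sketch, assuming vision is a Vision instance with a populated backbuffer; the helper name count_dominant_pixels is hypothetical.

```python
def count_dominant_pixels(vision, rect):
    """Count pixels in `rect` matching the colour/tolerance pair chosen by
    maximize_color_match. Hypothetical helper, not part of autocv.
    """
    color, tolerance = vision.maximize_color_match(rect)
    count = vision.get_count_of_color(color, rect, tolerance)
    return color, tolerance, count
```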

erode_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) None[source]

Applies morphological erosion to the backbuffer image.

Parameters:
  • iterations (int) – Number of erosion passes to run. Defaults to 1.

  • kernel (npt.NDArray[np.uint8] | None) – Structuring element to use; defaults to a 3x3 ones matrix.

dilate_image(iterations: int = 1, kernel: ndarray[tuple[Any, ...], dtype[uint8]] | None = None) None[source]

Applies morphological dilation to the backbuffer image.

Parameters:
  • iterations (int) – Number of dilation passes to run. Defaults to 1.

  • kernel (npt.NDArray[np.uint8] | None) – Structuring element to use; defaults to a 3x3 ones matrix.
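
Erosion followed by dilation is the standard morphological opening, commonly used to remove small specks while roughly preserving larger shapes. The sketch below chains the two methods with their default kernels. The helper name remove_speckle is hypothetical, and both calls modify the backbuffer.

```python
def remove_speckle(vision, passes=1):
    """Apply a morphological opening (erode, then dilate) to the backbuffer.

    Hypothetical helper, not part of autocv.
    """
    vision.erode_image(iterations=passes)   # shrink regions, dropping tiny ones
    vision.dilate_image(iterations=passes)  # grow the survivors back
```

Reversing the order (dilate, then erode) gives a closing instead, which fills small holes rather than removing small specks.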

find_image(sub_image: ndarray[tuple[Any, ...], dtype[uint8]] | Image, rect: tuple[int, int, int, int] | None = None, confidence: float = 0.95, median_tolerance: int | None = None) list[tuple[int, int, int, int]][source]

Finds occurrences of a subimage within the main image using template matching.

Parameters:
  • sub_image (npt.NDArray[np.uint8] | Image.Image) – Subimage to locate inside the backbuffer.

  • rect (tuple[int, int, int, int] | None) – Search region specified as (x, y, width, height). If None, the entire image is used.

  • confidence (float) – Matching confidence threshold (default 0.95).

  • median_tolerance (int | None) – Optional per-channel median colour tolerance for matches.

Returns:

Bounding boxes locating the subimage.

Return type:

list[tuple[int, int, int, int]]
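
A common follow-up is to turn each match into a single point, for instance a click target. The sketch below assumes each returned bounding box uses the same (x, y, width, height) layout as the rect parameter; this reference does not state the layout explicitly. The helper name match_centers is hypothetical.

```python
def match_centers(vision, sub_image, rect=None, confidence=0.95):
    """Return the centre point of every template match.

    Hypothetical helper. Assumes boxes are (x, y, width, height).
    """
    boxes = vision.find_image(sub_image, rect=rect, confidence=confidence)
    return [(x + w // 2, y + h // 2) for x, y, w, h in boxes]
```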

find_contours(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0, min_area: int = 10, vertices: int | None = None) list[ndarray[tuple[Any, ...], dtype[uint64]]][source]

Finds contours in the image that match the specified color.

Parameters:
  • color (tuple[int, int, int]) – Target colour expressed as (R, G, B).

  • rect (tuple[int, int, int, int] | None) – Search region specified as (x, y, width, height). If None, the entire image is used.

  • tolerance (int) – Allowed deviation per colour channel.

  • min_area (int) – Minimum contour area, in square pixels, for a contour to qualify.

  • vertices (int | None) – Required vertex count for returned contours.

Returns:

Contours matching the search criteria.

Return type:

list[npt.NDArray[np.uintp]]

draw_points(points: Sequence[tuple[int, int]], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draws points on the backbuffer image.

Parameters:
  • points (Sequence[tuple[int, int]]) – Coordinates to mark on the backbuffer.

  • color (tuple[int, int, int]) – Drawing colour (RGB). Defaults to red.

draw_contours(contours: tuple[tuple[tuple[tuple[int, int]]]], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draws contours on the backbuffer image.

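Parameters:
  • contours (tuple[tuple[tuple[tuple[int, int]]]]) – Contours to draw on the backbuffer.

  • color (tuple[int, int, int]) – Drawing colour (RGB). Defaults to red.
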
draw_circle(circle: tuple[int, int, int], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draws a circle on the backbuffer image.

Parameters:
  • circle (tuple[int, int, int]) – Circle definition (x, y, radius).

  • color (tuple[int, int, int]) – Drawing colour (RGB). Defaults to red.

draw_rectangle(rect: tuple[int, int, int, int], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draws a rectangle on the backbuffer image.

Parameters:
  • rect (tuple[int, int, int, int]) – Rectangle specified as (x, y, width, height).

  • color (tuple[int, int, int]) – Drawing colour (RGB). Defaults to red.
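
The drawing methods are handy for debugging searches. The sketch below outlines every find_image match on the backbuffer. It assumes the boxes returned by find_image use the same (x, y, width, height) layout that draw_rectangle expects, and the helper name outline_matches is hypothetical.

```python
def outline_matches(vision, sub_image, color=(255, 0, 0)):
    """Draw a rectangle around each template match; return the match count.

    Hypothetical helper. Assumes find_image boxes are valid draw_rectangle
    rects. Drawing modifies the backbuffer.
    """
    boxes = vision.find_image(sub_image)
    for box in boxes:
        vision.draw_rectangle(box, color=color)
    return len(boxes)
```

Saving the backbuffer afterwards with save_backbuffer_to_file gives an annotated image to inspect.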

filter_colors(colors: tuple[int, int, int] | list[tuple[int, int, int]], tolerance: int = 0, *, keep_original_colors: bool = False) None[source]

Filters the backbuffer image to retain only specified colors within a given tolerance.

Parameters:
  • colors (tuple[int, int, int] | list[tuple[int, int, int]]) – Colours to keep while filtering.

  • tolerance (int) – Per-channel tolerance threshold (0-255).

  • keep_original_colors (bool) – When True, retain source colours for matching pixels; otherwise, update the backbuffer in place.