Window Vision Functions

class autocv.core.vision.Vision(hwnd: int = -1, lang: str = 'en', device: str | None = None, conf_threshold: float = 0.6, speed: Literal['fast', 'balanced', 'accurate'] = 'balanced', *, disable_model_source_check: bool = False)[source]

Bases: WindowCapture

Capture windows, process images, and perform OCR.

The class maintains a persistent OpenCV-compatible backbuffer in opencv_image. Most image processing routines operate on this buffer.

Notes

  • Frames returned by refresh() and stored in opencv_image are in BGR channel order.

  • Public color values are expressed as RGB tuples.
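
Converting between the two orderings is a single channel flip in NumPy; the snippet below is a minimal illustration, independent of this class:

```python
import numpy as np

# A 1x2 BGR frame: pure blue, then pure red (as OpenCV stores them).
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the channel axis converts BGR to RGB (and back again).
rgb = bgr[..., ::-1]

print(tuple(int(c) for c in rgb[0, 0]))  # the blue pixel, now in RGB order
```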

set_backbuffer(image: ndarray[tuple[int, ...], dtype[uint8]] | Image) None[source]

Set the image buffer to the provided NumPy array or PIL Image.

Parameters:

image – Image data used to refresh the backbuffer.

refresh(*, set_backbuffer: bool = True) ndarray[tuple[int, ...], dtype[uint8]] | None[source]

Capture the current window image and optionally persist it to opencv_image.

Parameters:

set_backbuffer – When True, persist the capture to self.opencv_image; otherwise return the captured frame.

Raises:

InvalidHandleError – Raised when self.hwnd is not a valid window handle.

Returns:

Captured frame when set_backbuffer is False; None otherwise.

save_backbuffer_to_file(file_name: str) None[source]

Save the backbuffer image to a file.

Parameters:

file_name – Path where the backbuffer snapshot is stored.

get_pixel_change(area: tuple[int, int, int, int] | None = None) int[source]

Calculate how many pixels differ between the current backbuffer and a freshly captured frame.

Parameters:

area – Region of interest expressed as (x, y, width, height); None inspects the full frame.

Raises:

InvalidImageError – Raised when the capture buffer is empty.

Returns:

Count of pixels with different intensities between frames.

Return type:

int
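
The method's exact implementation is internal to the library; conceptually, the comparison amounts to counting pixels where any channel differs between two same-sized frames. A plain-NumPy sketch with illustrative names:

```python
import numpy as np

def count_changed_pixels(before: np.ndarray, after: np.ndarray) -> int:
    """Count pixels whose values differ between two same-sized frames."""
    # A pixel counts as changed if any one of its channels differs.
    return int(np.any(before != after, axis=-1).sum())

before = np.zeros((4, 4, 3), dtype=np.uint8)
after = before.copy()
after[0, 0] = (255, 255, 255)  # alter a whole pixel
after[1, 2] = (0, 0, 1)        # a single-channel change also counts
```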

get_text(rect: Rect | None = None, colors: Color | Sequence[Color] | None = None, tolerance: int = 0, confidence: float | None = 0.8) list[OcrTextEntry][source]

Extract text from the backbuffer using PaddleOCR.

Parameters:
  • rect – Search region (x, y, width, height).

  • colors – RGB color(s) to isolate before OCR.

  • tolerance – Per-channel tolerance when matching the color filter.

  • confidence – Minimum acceptable OCR confidence between 0 and 1. If None, no confidence filtering is applied.

Returns:

Text entries with bounding boxes and confidence levels, using backbuffer coordinates.

get_color(point: tuple[int, int]) tuple[int, int, int][source]

Return the color of the pixel at the specified coordinates.

Parameters:

point – Pixel coordinates expressed as (x, y).

Returns:

Pixel color as (R, G, B).

Raises:
  • InvalidImageError – Raised when self.opencv_image is empty.

  • ValueError – Raised when the backbuffer is not a 3-channel BGR image.

  • IndexError – If the coordinates are out of bounds.
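
A minimal sketch of the underlying lookup, assuming a BGR backbuffer indexed rows-by-columns; the helper name is hypothetical:

```python
import numpy as np

def pixel_rgb(bgr_image: np.ndarray, point: tuple[int, int]) -> tuple[int, int, int]:
    """Read the pixel at (x, y) from a BGR buffer and return it as RGB."""
    x, y = point
    b, g, r = bgr_image[y, x]  # rows index y, columns index x
    return int(r), int(g), int(b)

frame = np.zeros((2, 3, 3), dtype=np.uint8)
frame[1, 2] = (255, 0, 0)  # pure blue in BGR order
```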

find_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0) list[tuple[int, int]][source]

Find pixel coordinates matching a color within a tolerance.

Parameters:
  • color – Target color expressed as (R, G, B).

  • rect – Optional search region described as (x, y, width, height).

  • tolerance – Allowed per-channel delta when searching for color matches.

Returns:

Pixel coordinates in image space that match the color constraint.
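
Per-channel tolerance matching can be sketched in plain NumPy; the helper below is illustrative, not the library's implementation:

```python
import numpy as np

def find_color(rgb_image: np.ndarray, color: tuple[int, int, int], tolerance: int = 0) -> list[tuple[int, int]]:
    """Return (x, y) coordinates whose pixels lie within `tolerance` of `color` on every channel."""
    diff = np.abs(rgb_image.astype(np.int16) - np.array(color, dtype=np.int16))
    mask = np.all(diff <= tolerance, axis=-1)
    # np.argwhere yields (row, col) pairs, i.e. (y, x); swap to (x, y).
    return [(int(x), int(y)) for y, x in np.argwhere(mask)]

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 1] = (250, 5, 5)  # close to, but not exactly, pure red
```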

get_average_color(rect: tuple[int, int, int, int] | None = None) tuple[int, int, int][source]

Calculate the average color within a specified region.

Parameters:

rect – Region to average; None uses the full image.

Returns:

Average RGB value inside the requested region.
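
The averaging step reduces to a per-channel mean over the sampled pixels; a self-contained sketch (names illustrative):

```python
import numpy as np

def average_color(rgb_image: np.ndarray) -> tuple[int, int, int]:
    """Average each channel over every pixel and round to integers."""
    r, g, b = rgb_image.reshape(-1, 3).mean(axis=0)
    return int(round(r)), int(round(g)), int(round(b))

img = np.zeros((1, 2, 3), dtype=np.uint8)
img[0, 0] = (100, 0, 50)
img[0, 1] = (200, 0, 150)
```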

get_most_common_color(rect: Rect | None = None, index: int = 1, ignore_colors: Color | Sequence[Color] | None = None) Color[source]

Determine the most common color in the specified region.

Parameters:
  • rect – Region to sample; None uses the full image.

  • index – Rank of the dominant color to extract (1-based).

  • ignore_colors – RGB color(s) to skip while ranking.

Returns:

Most common RGB color in the region.

Raises:

ValueError – If index is less than 1 or the requested region contains no pixels after filtering.
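
Ranking colors by frequency, including the 1-based index and its ValueError, can be sketched with np.unique; this is an illustration, not the library's code:

```python
import numpy as np

def most_common_color(rgb_image: np.ndarray, index: int = 1) -> tuple[int, int, int]:
    """Return the index-th most frequent color (1-based) in the image."""
    if index < 1:
        raise ValueError("index must be >= 1")
    pixels = rgb_image.reshape(-1, 3)
    colors, counts = np.unique(pixels, axis=0, return_counts=True)
    order = np.argsort(counts)[::-1]  # most frequent first
    r, g, b = colors[order[index - 1]]
    return int(r), int(g), int(b)

img = np.zeros((2, 2, 3), dtype=np.uint8)  # three black pixels...
img[0, 0] = (255, 0, 0)                    # ...and one red one
```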

get_count_of_color(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int | None = 0) int[source]

Count the number of pixels matching a given color within a tolerance.

Parameters:
  • color – Target color expressed as (R, G, B).

  • rect – Region to sample; None uses the full image.

  • tolerance – Allowed per-channel difference; None is treated as 0 for exact matches.

Returns:

Number of pixels matching the specified color.

get_all_colors_with_counts(rect: tuple[int, int, int, int] | None = None) list[tuple[tuple[int, int, int], int]][source]

Retrieve all colors in the specified region along with their pixel counts.

Parameters:

rect – Region to sample; None uses the full image.

Returns:

Color counts ordered by frequency.

get_median_color(rect: tuple[int, int, int, int] | None = None) tuple[int, int, int][source]

Calculate the median color of the specified region.

Parameters:

rect – Region to sample; None uses the full image.

Returns:

Median RGB color inside the region.

maximize_color_match(rect: tuple[int, int, int, int], initial_tolerance: int = 100, tolerance_step: int = 1) tuple[tuple[int, int, int], int][source]

Find the color and tolerance that best match the region’s dominant color.

Parameters:
  • rect – Region to evaluate when computing the dominant color.

  • initial_tolerance – Initial tolerance applied when searching.

  • tolerance_step – Amount to decrease the tolerance by when narrowing the search.

Returns:

Matched RGB color and the tolerance applied.
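
The search strategy is not spelled out here; the sketch below shows one plausible interpretation: take the region's mean color, then shrink the tolerance from initial_tolerance until any further step would exclude region pixels. The helper name and the strategy itself are assumptions, not the library's implementation:

```python
import numpy as np

def narrow_tolerance(rgb_region: np.ndarray, initial_tolerance: int = 100, step: int = 1) -> tuple[tuple[int, int, int], int]:
    """One possible tolerance-narrowing search around the region's mean color (illustrative)."""
    pixels = rgb_region.reshape(-1, 3).astype(np.int16)
    target = pixels.mean(axis=0).round().astype(np.int16)
    # Largest per-channel deviation from the mean that actually occurs in the region.
    needed = int(np.abs(pixels - target).max())
    tolerance = initial_tolerance
    # Shrink while a smaller tolerance would still cover every region pixel.
    while tolerance - step >= needed:
        tolerance -= step
    return tuple(int(c) for c in target), tolerance

region = np.full((2, 2, 3), (120, 60, 30), dtype=np.uint8)
region[0, 0] = (124, 60, 30)  # one pixel deviates slightly in the red channel
```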

erode_image(iterations: int = 1, kernel: ndarray[tuple[int, ...], dtype[uint8]] | None = None) None[source]

Apply morphological erosion to the backbuffer image.

Parameters:
  • iterations – Number of erosion passes to run. Defaults to 1.

  • kernel – Structuring element to use; defaults to a 3x3 ones matrix.

dilate_image(iterations: int = 1, kernel: ndarray[tuple[int, ...], dtype[uint8]] | None = None) None[source]

Apply morphological dilation to the backbuffer image.

Parameters:
  • iterations – Number of dilation passes to run. Defaults to 1.

  • kernel – Structuring element to use; defaults to a 3x3 ones matrix.
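
With the default 3x3 kernel, erosion keeps a pixel only when its whole 3x3 neighborhood is set; dilation is the dual (a pixel survives if any neighbor is set). A pure-NumPy sketch of the erosion case on a binary mask, independent of the library:

```python
import numpy as np

def erode3x3(mask: np.ndarray) -> np.ndarray:
    """3x3 binary erosion: a pixel survives only if its full 3x3 neighborhood is set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, constant_values=0)
    out = np.ones_like(mask, dtype=bool)
    # AND together the mask shifted to each of the nine neighborhood offsets.
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out &= padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w].astype(bool)
    return out

mask = np.zeros((5, 5), dtype=bool)
mask[1:4, 1:4] = True  # a 3x3 block of foreground
eroded = erode3x3(mask)
```

One erosion pass shrinks the 3x3 block down to its single center pixel.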

find_image(sub_image: ndarray[tuple[int, ...], dtype[uint8]] | Image, rect: tuple[int, int, int, int] | None = None, confidence: float = 0.95, median_tolerance: int | None = None) list[tuple[int, int, int, int]][source]

Find occurrences of a subimage within the main image using template matching.

Parameters:
  • sub_image – Template image in RGB/RGBA ordering.

  • rect – Search region specified as (x, y, width, height). If None, the entire image is used.

  • confidence – Matching confidence threshold (default 0.95).

  • median_tolerance – Optional per-channel median color tolerance for matches.

Returns:

Bounding boxes locating the subimage.
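
Template matching slides the template over the image and scores each placement; the real method presumably uses a normalized correlation score against the confidence threshold, but the sliding-window idea can be shown with a brute-force exact-match sketch (names illustrative):

```python
import numpy as np

def find_exact(template: np.ndarray, image: np.ndarray) -> list[tuple[int, int, int, int]]:
    """Brute-force exact template match; returns (x, y, width, height) boxes."""
    th, tw = template.shape[:2]
    ih, iw = image.shape[:2]
    boxes = []
    # Slide the template over every valid top-left position.
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            if np.array_equal(image[y:y + th, x:x + tw], template):
                boxes.append((x, y, tw, th))
    return boxes

image = np.zeros((4, 5, 3), dtype=np.uint8)
image[1:3, 2:4] = (9, 9, 9)  # embed a 2x2 patch of the target color
template = np.full((2, 2, 3), (9, 9, 9), dtype=np.uint8)
```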

find_contours(color: tuple[int, int, int], rect: tuple[int, int, int, int] | None = None, tolerance: int = 0, min_area: int = 10, vertices: int | None = None, *, close_and_dilate: bool = False) list[ndarray[tuple[int, ...], dtype[int32]]][source]

Find contours in the backbuffer that match a color.

Parameters:
  • color – Target color expressed as (R, G, B).

  • rect – Search region specified as (x, y, width, height). If None, the entire image is used.

  • tolerance – Allowed deviation per color channel.

  • min_area – Minimum area, in pixels, for a contour to qualify.

  • vertices – Required vertex count for returned contours.

  • close_and_dilate – When True, apply a 3x3 morphological close and dilation pass to the filtered mask before extracting contours.

Returns:

Contours matching the search criteria.

draw_points(points: Sequence[Point], color: Color = (255, 0, 0)) None[source]

Draw points on the backbuffer image.

Parameters:
  • points – Coordinates to mark on the backbuffer.

  • color – Drawing color (RGB). Defaults to red.

draw_contours(contours: Contour | Sequence[Contour], color: Color = (255, 0, 0)) None[source]

Draw contours on the backbuffer image.

Parameters:
  • contours – Contour(s) as produced by OpenCV.

  • color – Drawing color (RGB). Defaults to red.

draw_circle(circle: tuple[int, int, int], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draw a circle on the backbuffer image.

Parameters:
  • circle – Circle definition (x, y, radius).

  • color – Drawing color (RGB). Defaults to red.

draw_rectangle(rect: tuple[int, int, int, int], color: tuple[int, int, int] = (255, 0, 0)) None[source]

Draw a rectangle on the backbuffer image.

Parameters:
  • rect – Rectangle specified as (x, y, width, height).

  • color – Drawing color (RGB). Defaults to red.

filter_colors(colors: Color | Sequence[Color], tolerance: int = 0, *, keep_original_colors: bool = False) None[source]

Filter the backbuffer image to retain only the specified colors within a given tolerance.

Parameters:
  • colors – Colors to keep while filtering.

  • tolerance – Per-channel tolerance threshold (0-255).

  • keep_original_colors – When True, retain source colors for matching pixels; otherwise replace the backbuffer with a binary mask.
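
The two output modes can be sketched in plain NumPy: matched pixels either keep their source values or become white in a binary mask, and everything else goes black. The helper below is illustrative, not the library's implementation:

```python
import numpy as np

def filter_colors(rgb_image: np.ndarray, color: tuple[int, int, int], tolerance: int = 0, keep_original: bool = False) -> np.ndarray:
    """Zero out pixels outside `tolerance` of `color`; optionally keep originals for matches."""
    diff = np.abs(rgb_image.astype(np.int16) - np.array(color, dtype=np.int16))
    mask = np.all(diff <= tolerance, axis=-1)
    if keep_original:
        # Matching pixels keep their source color; the rest become black.
        return np.where(mask[..., None], rgb_image, 0).astype(np.uint8)
    # Binary mask: matches become white, everything else black.
    return (mask.astype(np.uint8) * 255)[..., None].repeat(3, axis=-1)

img = np.zeros((1, 2, 3), dtype=np.uint8)
img[0, 0] = (200, 10, 10)  # within 15 of (205, 0, 0) on every channel
```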