AI API

Provides an API for interacting with large language models (LLMs).

ai.ask

Signatures

ask(question, opts \ [])

ask(arg, question, opts)

Description

Ask a yes/no question about the current screen or a specified region.

This function captures the current screen and optionally crops it to a specific region before asking the AI a yes/no question about it.

Parameters

  • question (string) - A yes/no question about the screen
  • opts (table) - Optional parameters

Options

  • region - Optional region name to focus the AI analysis on a specific area of the screen

Example

local is_playing = ai.ask("Is the video currently playing?")

if is_playing then
  print("Video is playing")
else
  print("Video is not playing")
end

-- With region
local has_error = ai.ask("Is there an error message?", {region = "error_dialog"})
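
Because ai.ask returns a boolean, it can serve as the condition in a polling loop. The sketch below is one possible way to wait for a screen state. The helper name wait_until_yes is hypothetical, and the global wait function (taking milliseconds) is assumed from the ai.codegen example output rather than documented here.

```lua
-- Hypothetical helper: poll a yes/no question until the answer is true.
-- ask_fn is expected to behave like ai.ask (returns a boolean).
-- wait_fn is optional; when given, it is called with delay_ms between attempts.
local function wait_until_yes(ask_fn, question, opts, max_attempts, wait_fn, delay_ms)
  for attempt = 1, max_attempts do
    if ask_fn(question, opts or {}) then
      return true, attempt
    end
    if wait_fn and attempt < max_attempts then
      wait_fn(delay_ms)
    end
  end
  return false, max_attempts
end

-- Usage inside the host runtime, where `ai` and `wait` are assumed to exist.
if ai and wait then
  local ok, attempts = wait_until_yes(
    ai.ask, "Is the video currently playing?", {}, 5, wait, 1000)
  print(ok, attempts)
end
```

Bounding the number of attempts keeps a script from hanging if the expected state never appears.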

ai.ask

Signatures

ask(question, opts \ [])

ask(arg, question, opts)

Description

Asks a yes/no question about a provided image and returns a boolean answer.

Uses OpenAI's structured output feature to guarantee a reliable true/false response.

Parameters

  • image (userdata) - Image object to analyze
  • question (string) - A yes/no question about the image
  • opts (table) - Optional parameters

Example

local screenshot = screen.capture()
local is_visible = ai.ask(screenshot, "Is the Netflix app visible on the screen?")

if is_visible then
  print("Netflix is visible!")
end
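
Passing an image explicitly makes it possible to ask several questions about the same captured frame. The following is a minimal sketch of that pattern. The helper name ask_all is hypothetical, and the second question is only an illustration.

```lua
-- Hypothetical helper: ask several yes/no questions about one image.
-- ask_fn is expected to behave like ai.ask(image, question).
local function ask_all(ask_fn, image, questions)
  local answers = {}
  for _, question in ipairs(questions) do
    answers[question] = ask_fn(image, question)
  end
  return answers
end

-- Usage inside the host runtime, where `ai` and `screen` are assumed to exist.
if ai and screen then
  local screenshot = screen.capture()
  local answers = ask_all(ai.ask, screenshot, {
    "Is the Netflix app visible on the screen?",
    "Is there an error message?",
  })
  for question, answer in pairs(answers) do
    print(question, answer)
  end
end
```

Capturing once and reusing the image means every answer refers to the same moment on screen.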

ai.codegen

Signatures

codegen(prompt)

Description

Generates Lua code from a natural-language prompt.

Parameters

  • prompt (string) - Natural language description of actions to convert to Lua code

Example

local code = ai.codegen("Navigate home, wait 5 seconds, then press up and enter")

print(code)
-- control.home() wait(5000) control.up() control.ok()
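
Since ai.codegen returns the code as a string, a script may want to confirm that the string at least compiles before doing anything with it. The sketch below uses Lua's standard load (loadstring on Lua 5.1), which compiles a chunk without running it. The helper name check_syntax is hypothetical, and this is not necessarily how ai.run handles generated code.

```lua
-- loadstring exists on Lua 5.1 and LuaJIT; on Lua 5.2+ load accepts a string.
local compile = loadstring or load

-- Hypothetical helper: returns true if the code compiles, or false plus the
-- compiler's error message. The code is compiled only, never executed.
local function check_syntax(code)
  local chunk, err = compile(code)
  if chunk then
    return true
  end
  return false, err
end

-- The sample string is the output shown in the example above.
print(check_syntax("control.home() wait(5000) control.up() control.ok()"))
-- prints: true
```

A successful compile only rules out syntax errors; it says nothing about whether the generated calls do what the prompt intended.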

ai.navigate

Signatures

navigate(target, opts \ [])

navigate(arg, target, opts)

Description

Navigate menus using AI with the current screen or a specified region.

This function captures the current screen and optionally crops it to a specific region before attempting AI-guided navigation to the target.

Parameters

  • target (string) - Navigation target description
  • opts (table) - Optional parameters

Options

  • region - Optional region name to focus the navigation on a specific area of the screen

Example

local sequence, prompt, result, viewpoint = ai.navigate("Settings menu")

-- With region
local sequence, prompt, result, viewpoint = ai.navigate("Volume control", {region = "audio_panel"})

ai.navigate

Signatures

navigate(target, opts \ [])

navigate(arg, target, opts)

Description

Navigate menus using AI with a provided image.

Parameters

  • image (userdata) - Image object to analyze for navigation
  • target (string) - Navigation target description
  • opts (table) - Optional parameters

Options

  • model - AI model to use for navigation
  • temperature - AI temperature setting
  • imageScale - Scale factor for image processing
  • promptType - Type of navigation prompt
  • additionalInstructions - Extra instructions for the AI
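
Example (sketch)

This section lists options but no example, so the following shows one way a call with an image and options might look. The helper name navigate_opts and all option values are hypothetical; accepted ranges and the valid values for model and promptType are not documented here and are left out. The four return values are assumed to match the screen-based ai.navigate example.

```lua
-- Hypothetical defaults; the values are illustrative assumptions.
local function navigate_opts(overrides)
  local opts = {
    temperature = 0,
    imageScale = 0.5,
    additionalInstructions = "Prefer the shortest path.",
  }
  -- Caller-supplied values replace the defaults key by key.
  for key, value in pairs(overrides or {}) do
    opts[key] = value
  end
  return opts
end

-- Usage inside the host runtime, where `ai` and `screen` are assumed to exist.
if ai and screen then
  local screenshot = screen.capture()
  local sequence, prompt, result, viewpoint =
    ai.navigate(screenshot, "Settings menu", navigate_opts({ imageScale = 1 }))
  print(result)
end
```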

ai.prompt

Signatures

prompt(user_prompt, opts \ [])

prompt(arg, prompt, opts)

Description

Runs a prompt with the current screen or a specified region and returns the response and viewpoint.

This function captures the current screen and optionally crops it to a specific region before sending the prompt to the AI.

Parameters

  • user_prompt (string) - The prompt to send to the AI
  • opts (table) - Optional parameters

Options

  • region - Optional region name to focus the AI analysis on a specific area of the screen

Example

local response, viewpoint = ai.prompt("What is the text on the screen?")

print(response)
-- "The screen shows a login form with username and password fields."

-- With region
local response, viewpoint = ai.prompt("What color is this button?", {region = "submit_button"})

ai.prompt

Signatures

prompt(user_prompt, opts \ [])

prompt(arg, prompt, opts)

Description

Runs a prompt with a provided image, or the current screen, and returns the response and the viewpoint of the AI.

Parameters

  • image (userdata) - Image object to analyze
  • prompt (string) - The prompt to send to the AI
  • opts (table) - Optional parameters

Example

-- No image is passed here, so the current screen is used
local response, viewpoint = ai.prompt("What is the color of the screen?")

print(response)
-- "The general color of the screen is blue."
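
To analyze a specific image instead of the current screen, the image goes first, matching the prompt(arg, prompt, opts) signature. The sketch below also wraps the call in pcall as a precaution. Whether ai.prompt can raise an error is an assumption, and the helper name prompt_image is hypothetical.

```lua
-- Hypothetical helper: run a prompt against an image and trap any error.
-- prompt_fn is expected to behave like ai.prompt(image, prompt, opts).
-- Returns response, viewpoint on success, or nil, nil, error message on failure.
local function prompt_image(prompt_fn, image, text)
  local ok, response, viewpoint = pcall(prompt_fn, image, text, {})
  if not ok then
    -- On failure, pcall's second return value is the error message.
    return nil, nil, response
  end
  return response, viewpoint, nil
end

-- Usage inside the host runtime, where `ai` and `screen` are assumed to exist.
if ai and screen then
  local screenshot = screen.capture()
  local response, viewpoint, err =
    prompt_image(ai.prompt, screenshot, "What is the color of the screen?")
  print(response or err)
end
```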

ai.run

Signatures

run(prompt, state)

Description

Works like ai.codegen, but runs the generated code immediately.

Parameters

  • prompt (string) - Natural language description of actions to execute

Example

ai.run("Navigate home, wait 5 seconds, then press up and enter")