
Agent Mode Tutorial: AI-Driven AS/400 Automation

IMTerm's Agent Mode lets you control AS/400, z/OS, and VT220 sessions using natural language. Instead of writing screen-by-screen automation scripts, you describe what you want to accomplish and the AI model navigates the terminal, reads the screens, fills in fields, and returns results. This tutorial covers setup, writing your first agent script, connecting to an AI provider, and building a real-world automation workflow.


What Agent Mode Is

Agent Mode is IMTerm's built-in integration between the Screen-to-HTML API and AI language models. It provides:

  • A script editor (powered by CodeMirror) for writing automation instructions
  • A WebSocket bridge that exposes the current terminal screen state to the AI model
  • A function-calling interface (imterm.screen(), imterm.submit(), imterm.keys()) that the AI uses to navigate the terminal
  • Support for five AI backends: Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google), Ollama (local/self-hosted), and Mock (for testing)
  • An audit trail that logs every screen the agent visited and every field it submitted

Prerequisites

Before enabling Agent Mode:

  1. IMTerm v2.0.1 or later installed and running
  2. At least one active terminal session connected to your AS/400 or z/OS host
  3. An API key for your chosen AI provider (or Ollama running locally)

Step 1: Configure the AI Provider

In your config.yaml, add an agent section:

agent:
  # Provider: claude | openai | gemini | ollama | mock
  provider: claude

  # API key for the chosen provider.
  # Use an environment variable to avoid storing keys in config files.
  api_key: ""   # or set IMTERM_AGENT_API_KEY in the environment

  # Model to use (provider-specific).
  # Claude: claude-sonnet-4-6, claude-haiku-4-5-20251001
  # OpenAI: gpt-4o, gpt-4-turbo
  # Gemini: gemini-1.5-pro, gemini-1.5-flash
  # Ollama: llama3.2, qwen2.5-coder
  model: "claude-sonnet-4-6"

  # Maximum tokens per agent turn (controls cost/depth tradeoff).
  max_tokens: 4096

  # Timeout for each AI API call.
  timeout: "30s"

  # Rate limit: maximum agent API calls per minute.
  rate_limit_per_minute: 10

  # Allow agents to submit forms (if false, agent is read-only).
  allow_submit: true

For Ollama (local AI, no API key needed):

agent:
  provider: ollama
  ollama_host: "http://localhost:11434"
  model: "llama3.2"
  allow_submit: true

Set your API key as an environment variable:

export IMTERM_AGENT_API_KEY=sk-ant-...

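Or on Windows (PowerShell; the Machine scope requires an elevated prompt, and processes that are already running must be restarted to see the new value):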

[System.Environment]::SetEnvironmentVariable("IMTERM_AGENT_API_KEY", "sk-ant-...", "Machine")
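
Before starting a session it can help to confirm that the provider and key settings line up. The sketch below is a hypothetical pre-flight check, not part of IMTerm: checkAgentConfig is an illustrative name, and it assumes the agent section of config.yaml has already been parsed into a plain object. It relies only on what the configuration above states: the five provider names, and that Ollama and Mock need no API key.

```javascript
// Hypothetical pre-flight check (not an IMTerm function).
// agentConfig: the parsed "agent" section of config.yaml.
// env: an object of environment variables, e.g. process.env.
function checkAgentConfig(agentConfig, env) {
  const problems = []
  const providers = ['claude', 'openai', 'gemini', 'ollama', 'mock']
  if (!providers.includes(agentConfig.provider)) {
    problems.push(`unknown provider: ${agentConfig.provider}`)
  }
  // Only the hosted providers need a key; Ollama and Mock do not.
  const needsKey = ['claude', 'openai', 'gemini'].includes(agentConfig.provider)
  const hasKey = Boolean(agentConfig.api_key) || Boolean(env.IMTERM_AGENT_API_KEY)
  if (needsKey && !hasKey) {
    problems.push('no API key: set api_key or IMTERM_AGENT_API_KEY')
  }
  return problems
}

// An empty api_key and no environment variable yields one problem.
console.log(checkAgentConfig({ provider: 'claude', api_key: '' }, {}))
```

An empty array means the check found nothing to complain about; it does not prove the key itself is valid.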

Step 2: Open the Script Editor

In an active terminal session, open Agent Mode:

  • Menu: Tools > Agent Mode
  • Keyboard shortcut: Ctrl+Shift+A

The panel splits: the terminal screen stays on the left, and the CodeMirror script editor opens on the right.

The editor supports JavaScript with IMTerm's built-in helper functions. Autocomplete is available (Ctrl+Space).


Step 3: Your First Agent Script

Start with a read-only script that queries the AS/400 and returns information:

// Read the active job count from the WRKACTJOB screen.
// It changes no data on the host: the agent only runs a display command and reads the result.

const instruction = `
  Navigate to the Work with Active Jobs screen (WRKACTJOB command).
  Count the total number of active jobs shown.
  Return just the number.
`

const result = await imterm.agent(instruction)
console.log('Active jobs:', result)

Click "Run" (or press Ctrl+Enter). The agent then:

  1. Reads the current screen via imterm.screen()
  2. Sends the instruction and screen state to the AI model
  3. Navigates as the AI directs: the model calls imterm.keys(['ENTER']) to reach the command line, then imterm.submit({ command: 'WRKACTJOB' })
  4. Reads the resulting screen and extracts the job count
  5. Returns the count to your script

The result appears in the output panel below the editor.


The IMTerm Agent API

Agent scripts have access to these functions:

imterm.screen()

Returns the current screen state as a JSON object (same format as the REST API's /screen/form endpoint):

const screen = await imterm.screen()
console.log(screen.title)        // e.g. "CUSTOMER MAINTENANCE"
console.log(screen.fields)       // array of field objects
console.log(screen.keys)         // array of PF key objects

imterm.submit(fields, aidKey)

Fills in fields and presses an AID key:

// Fill in a customer number and press Enter
await imterm.submit({ CUSTID: '0001234' }, 'ENTER')

// Press F3 to exit (no fields to fill)
await imterm.submit({}, 'F3')

imterm.keys(keys)

Send a sequence of keystrokes:

// Type a command on the command line
await imterm.keys(['TAB', 'WRKACTJOB', 'ENTER'])

imterm.agent(instruction)

The high-level function that invokes the AI model with an instruction and the current screen state. The AI can call imterm.screen(), imterm.submit(), and imterm.keys() autonomously to complete the task.

const result = await imterm.agent('Find customer 1234 and return their balance')

imterm.waitFor(condition, timeout)

Wait for a specific screen state before continuing:

// Wait until the customer detail screen appears
await imterm.waitFor(screen => screen.title.includes('CUSTOMER DETAIL'), 10000)

Real-World Example: Customer Data Extraction

This script navigates a Hebrew AS/400 customer management system, searches for a customer by ID, and returns their contact information:

// Extract customer contact info from AS/400 CUSTMAINT program.
// Assumes: session is at the main menu (screen title contains "MAIN MENU")

const customerId = '00012345'

// Step 1: Navigate to Customer Maintenance
await imterm.submit({ OPTION: '11' }, 'ENTER')   // option 11 = Customer Maintenance
await imterm.waitFor(s => s.title.includes('CUSTOMER'), 5000)

// Step 2: Enter the customer number and search
await imterm.submit({ CUSTNO: customerId }, 'ENTER')
await imterm.waitFor(s => s.title.includes('CUSTOMER DETAIL'), 5000)

// Step 3: Read the current screen and extract fields
const screen = await imterm.screen()
const fields = Object.fromEntries(screen.fields.map(f => [f.name, f.value]))

// Step 4: Return the customer info
const customer = {
  id:      fields.CUSTNO?.trim(),
  name:    fields.CUSTNAME?.trim(),
  address: fields.ADDR1?.trim(),
  city:    fields.CITY?.trim(),
  phone:   fields.PHONE?.trim(),
  balance: fields.BALANCE?.trim()
}

// Step 5: Exit back to the main menu
await imterm.submit({}, 'F3')

console.log(JSON.stringify(customer, null, 2))
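
The values read from the screen are strings, so a balance needs converting before you can compare or sum it. The sketch below shows one way to do that. It assumes the BALANCE field uses comma thousands separators, a period as the decimal mark, and an optional leading or trailing minus sign; your application's edit codes may differ, and parseBalance is an illustrative name.

```javascript
// Convert a displayed balance such as "1,234.56" or "1,234.56-" to a number.
// Returns null for a blank or missing field.
function parseBalance(text) {
  if (text == null) return null
  const cleaned = text.trim().replace(/,/g, '')
  if (cleaned === '') return null
  // Numeric fields on these screens are often shown with a trailing minus.
  const negative = cleaned.startsWith('-') || cleaned.endsWith('-')
  const value = Number(cleaned.replace(/-/g, ''))
  if (Number.isNaN(value)) {
    throw new Error(`unrecognized balance: "${text}"`)
  }
  return negative ? -value : value
}

console.log(parseBalance('  1,234.56 '))  // 1234.56
console.log(parseBalance('1,234.56-'))    // -1234.56
```

In the script above this would replace fields.BALANCE?.trim() with parseBalance(fields.BALANCE).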

Real-World Example: AI-Driven Workflow

When the screen navigation logic is too complex to script manually, delegate to the AI:

// Ask the AI to find the highest-balance overdue customer.
// The AI will navigate the screens it needs to find the answer.

const result = await imterm.agent(`
  In this AS/400 customer management system:
  1. Navigate to the accounts receivable aging report.
  2. Find the customer with the highest balance that is more than 90 days overdue.
  3. Return their customer number, name, and overdue balance.
  Format the result as: CUSTNO | Name | Balance
`)

console.log('Result:', result)

The AI model receives the current screen state, determines which navigation steps are needed, executes them using the function-calling interface, reads each resulting screen, and produces the answer.
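
Because the answer comes back as free text, check it against the format you asked for before using it. A minimal sketch, assuming the model follows the CUSTNO | Name | Balance format from the instruction; parseAgentResult and the sample values are illustrative.

```javascript
// Split "CUSTNO | Name | Balance" into an object, or throw if the
// text does not have exactly three non-empty parts.
function parseAgentResult(text) {
  const parts = text.trim().split('|').map(part => part.trim())
  if (parts.length !== 3 || parts.some(part => part === '')) {
    throw new Error(`unexpected agent result: "${text}"`)
  }
  const [custNo, name, balance] = parts
  return { custNo, name, balance }
}

// Sample text only; real output depends on the model and the host data.
console.log(parseAgentResult('0001234 | ACME LTD | 15,200.00'))
```

Throwing on an unexpected shape keeps a reply such as "I could not find the report" from being treated as customer data.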


Saving and Sharing Scripts

Scripts saved in Agent Mode are stored in IMTerm's data directory under scripts/. They are accessible to all users with the Agent role.

To save the current script: File > Save Script or Ctrl+S in the editor.

To load a saved script: File > Open Script or the script library panel (Ctrl+Shift+L).

Scripts are plain JavaScript files - you can edit them with any text editor outside of IMTerm and place them in the scripts directory to make them available in the editor.


Security and Permissions

Agent Mode has its own permission layer. In config.yaml:

agent:
  # Roles that can use Agent Mode at all.
  allowed_roles: ["admin", "agent"]

  # Roles that can submit forms (agent actions that modify AS/400 data).
  submit_roles: ["admin"]

  # Require approval before each AI-initiated submit.
  require_approval: false

  # Log every agent interaction to the audit trail.
  audit: true

With require_approval: true, each imterm.submit() call initiated by the AI pauses and shows the user a preview of what will be submitted. The user approves or rejects each step. This is appropriate for production environments where AI-initiated data changes must be reviewed.
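
The sketch below shows one reading of how these settings combine for a given user. It is an interpretation, not IMTerm's code: agentPermissions is an illustrative name, and it assumes that a user needs a role in allowed_roles before submit_roles is considered, and that allow_submit: false overrides submit_roles.

```javascript
// Illustrative decision function for the permission settings above.
// userRoles: array of role names held by the logged-in user.
// agentConfig: the parsed "agent" section of config.yaml.
function agentPermissions(userRoles, agentConfig) {
  const hasAny = (roles = []) => userRoles.some(role => roles.includes(role))
  const canUse = hasAny(agentConfig.allowed_roles)
  const canSubmit =
    canUse &&
    agentConfig.allow_submit !== false &&
    hasAny(agentConfig.submit_roles)
  const needsApproval = canSubmit && agentConfig.require_approval === true
  return { canUse, canSubmit, needsApproval }
}

const config = {
  allowed_roles: ['admin', 'agent'],
  submit_roles: ['admin'],
  require_approval: false,
}

// A user with only the "agent" role can run scripts but not submit.
console.log(agentPermissions(['agent'], config))
// → { canUse: true, canSubmit: false, needsApproval: false }
```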


Using Mock Mode for Testing

During script development, use the Mock provider to avoid consuming AI API credits:

agent:
  provider: mock

The Mock provider returns scripted responses based on the screen title. It is useful for testing script structure and error handling without live AI calls.

To test against a known screen sequence, add mock responses:

agent:
  provider: mock
  mock_responses:
    - screen_title_contains: "MAIN MENU"
      response: "Navigate to option 11"
    - screen_title_contains: "CUSTOMER MAINTENANCE"
      response: "Submit customer ID 0001234"

Monitoring Agent Runs

While an agent script is running, the Activity Panel (View > Activity Panel or Ctrl+Shift+P) shows:

  • Each screen visited (title, timestamp)
  • Each field submitted (field names and values, AID key)
  • Each AI API call (prompt tokens, completion tokens, latency)
  • Any errors or exceptions

After a run completes, the full activity log is available in the audit trail at /admin/audit. Filter by session or by user to review agent activity.


Cost Management

AI API calls consume tokens and cost money. To control costs:

  • Use a smaller model for simple navigation tasks (claude-haiku-4-5-20251001 instead of claude-sonnet-4-6)
  • Set max_tokens to a reasonable limit for your use case
  • Use rate_limit_per_minute to prevent runaway scripts
  • In development, use Mock mode
  • Use imterm.screen() + explicit imterm.submit() calls for well-understood screen sequences instead of imterm.agent() - direct API calls do not consume AI tokens

The Activity Panel shows token usage per run. Check the totals before deploying scripts to production.
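
If you copy the per-call token counts out of the panel, a few lines of arithmetic turn them into a cost estimate. This is a sketch under assumptions: the record shape ({ promptTokens, completionTokens }) is hypothetical, and the rates in the example are placeholders, not real prices. Use your provider's current price list.

```javascript
// Sum token counts and estimate cost from per-million-token rates.
function summarizeUsage(calls, usdPerMillionInput, usdPerMillionOutput) {
  const promptTokens = calls.reduce((sum, c) => sum + c.promptTokens, 0)
  const completionTokens = calls.reduce((sum, c) => sum + c.completionTokens, 0)
  const costUsd =
    (promptTokens * usdPerMillionInput +
      completionTokens * usdPerMillionOutput) / 1e6
  return { promptTokens, completionTokens, costUsd }
}

// Two calls from one run; the rates (2 and 10 USD per million tokens)
// are placeholders chosen for easy arithmetic.
const calls = [
  { promptTokens: 1000, completionTokens: 200 },
  { promptTokens: 1500, completionTokens: 300 },
]
console.log(summarizeUsage(calls, 2, 10))
```

Multiplying the per-run figure by the expected number of runs per day gives a rough budget to compare against rate_limit_per_minute and max_tokens.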


Provider Comparison

Provider         | Best for                             | Latency                     | Cost
Claude (Sonnet)  | Complex navigation, Hebrew           | Low                         | Medium
Claude (Haiku)   | Simple lookups, repetitive tasks     | Very low                    | Low
GPT-4o           | English-only workflows               | Low                         | Medium
Gemini 1.5 Flash | High volume, cost-sensitive          | Very low                    | Low
Ollama (local)   | Air-gapped environments, no API cost | Medium (hardware dependent) | Free
Mock             | Testing and development              | Instant                     | Free

For Hebrew AS/400 applications, Claude is recommended: it reads Hebrew field labels and follows instructions written in Hebrew.


Troubleshooting

Agent says "I cannot see the screen"

The screen state is not being passed correctly. Check that the session is connected and showing a screen (not disconnected or at the login prompt).

Agent loops without making progress

The AI is navigating in circles. This usually means the instruction is ambiguous. Add more detail about which menu option or command to use, or use explicit imterm.submit() calls for the navigation steps and only use imterm.agent() for the final task.

"Rate limit exceeded" error

The rate_limit_per_minute setting is too low for your script, or you have too many concurrent agent sessions. Increase the limit or reduce concurrency.

"Submit not allowed" error

The logged-in user's role is not in submit_roles. Add the user to a role that has submit permission, or change the role configuration.

Hebrew instructions not understood

Verify that model: "claude-sonnet-4-6" (or another multilingual model) is configured. GPT-3.5 and some Ollama models have limited Hebrew comprehension.
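
For the looping case above, a quick way to confirm the diagnosis is to count how often each screen title appears in a run. The sketch below works on a plain list of titles, for example copied from the Activity Panel. The function name and the threshold of three visits are arbitrary illustrative choices.

```javascript
// Return the first screen title visited more than maxVisits times,
// or null if no title exceeds the threshold.
function findRepeatedScreen(visitedTitles, maxVisits = 3) {
  const counts = new Map()
  for (const title of visitedTitles) {
    const visits = (counts.get(title) || 0) + 1
    counts.set(title, visits)
    if (visits > maxVisits) return title
  }
  return null
}

const visited = [
  'MAIN MENU', 'CUSTOMER', 'MAIN MENU',
  'CUSTOMER', 'MAIN MENU', 'MAIN MENU',
]
console.log(findRepeatedScreen(visited))  // MAIN MENU
```

A title that keeps recurring usually marks the screen where the instruction needs more detail.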