Agent Mode Tutorial: AI-Driven AS/400 Automation
IMTerm's Agent Mode lets you control AS/400, z/OS, and VT220 sessions using natural language. Instead of writing screen-by-screen automation scripts, you describe what you want to accomplish and the AI model navigates the terminal, reads the screens, fills in fields, and returns results. This tutorial covers setup, writing your first agent script, connecting to an AI provider, and building a real-world automation workflow.
What Agent Mode Is
Agent Mode is IMTerm's built-in integration between the Screen-to-HTML API and AI language models. It provides:
- A script editor (powered by CodeMirror) for writing automation instructions
- A WebSocket bridge that exposes the current terminal screen state to the AI model
- A function-calling interface (imterm.screen(), imterm.submit(), imterm.keys()) that the AI uses to navigate the terminal
- Support for five AI backends: Claude (Anthropic), GPT-4 (OpenAI), Gemini (Google), Ollama (local/self-hosted), and Mock (for testing)
- An audit trail that logs every screen the agent visited and every field it submitted
Prerequisites
Before enabling Agent Mode:
- IMTerm v2.0.1 or later installed and running
- At least one active terminal session connected to your AS/400 or z/OS host
- An API key for your chosen AI provider (or Ollama running locally)
Step 1: Configure the AI Provider
In your config.yaml, add an agent section:
agent:
  # Provider: claude | openai | gemini | ollama | mock
  provider: claude
  # API key for the chosen provider.
  # Use an environment variable to avoid storing keys in config files.
  api_key: "" # or set IMTERM_AGENT_API_KEY in the environment
  # Model to use (provider-specific).
  # Claude: claude-sonnet-4-6, claude-haiku-4-5-20251001
  # OpenAI: gpt-4o, gpt-4-turbo
  # Gemini: gemini-1.5-pro, gemini-1.5-flash
  # Ollama: llama3.2, qwen2.5-coder
  model: "claude-sonnet-4-6"
  # Maximum tokens per agent turn (controls cost/depth tradeoff).
  max_tokens: 4096
  # Timeout for each AI API call.
  timeout: "30s"
  # Rate limit: maximum agent API calls per minute.
  rate_limit_per_minute: 10
  # Allow agents to submit forms (if false, agent is read-only).
  allow_submit: true
For Ollama (local AI, no API key needed):
agent:
  provider: ollama
  ollama_host: "http://localhost:11434"
  model: "llama3.2"
  allow_submit: true
Set your API key as an environment variable:
export IMTERM_AGENT_API_KEY=sk-ant-...
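Or on Windows (PowerShell; the "Machine" scope requires an elevated session, and only processes started afterwards see the new value):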
[System.Environment]::SetEnvironmentVariable("IMTERM_AGENT_API_KEY", "sk-ant-...", "Machine")
Step 2: Open the Script Editor
In an active terminal session, open Agent Mode:
- Menu: Tools > Agent Mode
- Keyboard shortcut: Ctrl+Shift+A
The panel splits: the terminal screen stays on the left, and the CodeMirror script editor opens on the right.
The editor supports JavaScript with IMTerm's built-in helper functions. Autocomplete is available (Ctrl+Space).
Step 3: Your First Agent Script
Start with a read-only script that queries the AS/400 and returns information:
// Read the current AS/400 active job count from the WRKACTJOB screen.
// This script does not submit anything - it is read-only.
const instruction = `
Navigate to the Work with Active Jobs screen (WRKACTJOB command).
Count the total number of active jobs shown.
Return just the number.
`
const result = await imterm.agent(instruction)
console.log('Active jobs:', result)
Click "Run" (or Ctrl+Enter). The agent:
- Reads the current screen via imterm.screen()
- Sends the instruction and screen state to the AI model
- The AI decides it needs to navigate - it calls imterm.keys(['ENTER']) to get to the command line, then imterm.submit({ command: 'WRKACTJOB' })
- Reads the resulting screen and extracts the job count
- Returns the count to your script
The result appears in the output panel below the editor.
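The sequence above can be pictured as a loop in which the model either asks for a tool call or returns a final answer. The following is a conceptual sketch only, not IMTerm's implementation: runAgentLoop, the model.next interface, the reply shapes, and the maxSteps cap are all hypothetical names chosen for illustration.

```javascript
// Conceptual sketch of a function-calling loop. Every name here is hypothetical;
// this tutorial does not document how IMTerm's bridge is built internally.
async function runAgentLoop(model, tools, instruction, maxSteps = 10) {
  const history = [{ role: 'user', content: instruction }]
  for (let step = 0; step < maxSteps; step++) {
    // Give the model the latest screen plus everything done so far.
    const screen = await tools.screen()
    const reply = await model.next({ history, screen })
    if (reply.type === 'final') return reply.text
    // Otherwise the model asked for a tool call such as keys or submit.
    const tool = tools[reply.tool]
    if (typeof tool !== 'function') throw new Error(`Unknown tool: ${reply.tool}`)
    const output = await tool(...reply.args)
    history.push({ role: 'tool', tool: reply.tool, args: reply.args, output })
  }
  // A step cap keeps a confused model from navigating forever.
  throw new Error(`No final answer after ${maxSteps} steps`)
}
```

In this picture, tools would hold the three functions from the function-calling interface, and each pass through the loop corresponds to one AI API call.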
The IMTerm Agent API
Agent scripts have access to these functions:
imterm.screen()
Returns the current screen state as a JSON object (same format as the REST API's /screen/form endpoint):
const screen = await imterm.screen()
console.log(screen.title) // e.g. "CUSTOMER MAINTENANCE"
console.log(screen.fields) // array of field objects
console.log(screen.keys) // array of PF key objects
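Because screen.fields is an array, looking up a field by name means scanning it each time. A minimal sketch of a lookup helper follows, assuming each field object carries name and value properties (as the customer extraction example in this tutorial does). fieldsByName is a hypothetical helper, not part of the IMTerm API, and the sample screen is hand-written rather than captured from a host.

```javascript
// Build a name -> trimmed value lookup from a screen object.
// Assumes each entry in screen.fields has `name` and `value` properties.
function fieldsByName(screen) {
  const lookup = {}
  for (const field of screen.fields ?? []) {
    if (!field.name) continue // skip unnamed fields
    lookup[field.name] = typeof field.value === 'string' ? field.value.trim() : field.value
  }
  return lookup
}

// Hand-written sample shaped like the object described above.
const sample = {
  title: 'CUSTOMER MAINTENANCE',
  fields: [
    { name: 'CUSTID', value: '0001234 ' },
    { name: 'CUSTNAME', value: ' ACME LTD ' },
  ],
  keys: [],
}
console.log(fieldsByName(sample))
```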
imterm.submit(fields, aidKey)
Fills in fields and presses an AID key:
// Fill in a customer number and press Enter
await imterm.submit({ CUSTID: '0001234' }, 'ENTER')
// Press F3 to exit (no fields to fill)
await imterm.submit({}, 'F3')
imterm.keys(keys)
Send a sequence of keystrokes:
// Type a command on the command line
await imterm.keys(['TAB', 'WRKACTJOB', 'ENTER'])
imterm.agent(instruction)
The high-level function that invokes the AI model with an instruction and the current screen state. The AI can call imterm.screen(), imterm.submit(), and imterm.keys() autonomously to complete the task.
const result = await imterm.agent('Find customer 1234 and return their balance')
imterm.waitFor(condition, timeout)
Wait for a specific screen state before continuing:
// Wait until the customer detail screen appears
await imterm.waitFor(screen => screen.title.includes('CUSTOMER DETAIL'), 10000)
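To make the semantics concrete, a wait like this can be thought of as polling the screen until the condition holds or the time budget runs out. The sketch below is an illustration, not IMTerm's implementation: pollUntil, getScreen (a stand-in for imterm.screen()), and the 250 ms interval are assumptions, and this tutorial does not state how imterm.waitFor itself reports a timeout.

```javascript
// Illustrative polling loop; `getScreen` stands in for imterm.screen().
async function pollUntil(getScreen, condition, timeoutMs, intervalMs = 250) {
  const deadline = Date.now() + timeoutMs
  while (true) {
    const screen = await getScreen()
    if (condition(screen)) return screen
    if (Date.now() >= deadline) {
      // Include the last title seen to make timeouts easier to diagnose.
      throw new Error(`Condition not met within ${timeoutMs} ms (last title: ${screen.title})`)
    }
    await new Promise(resolve => setTimeout(resolve, intervalMs))
  }
}
```

Whatever the real mechanism, it is worth deciding in your script what should happen when the expected screen never appears, for example logging the last title and stopping.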
Real-World Example: Customer Data Extraction
This script navigates a Hebrew AS/400 customer management system, searches for a customer by ID, and returns their contact information:
// Extract customer contact info from AS/400 CUSTMAINT program.
// Assumes: session is at the main menu (screen title contains "MAIN MENU")
const customerId = '00012345'
// Step 1: Navigate to Customer Maintenance
await imterm.submit({ OPTION: '11' }, 'ENTER') // option 11 = Customer Maintenance
await imterm.waitFor(s => s.title.includes('CUSTOMER'), 5000)
// Step 2: Enter the customer number and search
await imterm.submit({ CUSTNO: customerId }, 'ENTER')
await imterm.waitFor(s => s.title.includes('CUSTOMER DETAIL'), 5000)
// Step 3: Read the current screen and extract fields
const screen = await imterm.screen()
const fields = Object.fromEntries(screen.fields.map(f => [f.name, f.value]))
// Step 4: Return the customer info
const customer = {
  id: fields.CUSTNO?.trim(),
  name: fields.CUSTNAME?.trim(),
  address: fields.ADDR1?.trim(),
  city: fields.CITY?.trim(),
  phone: fields.PHONE?.trim(),
  balance: fields.BALANCE?.trim()
}
// Step 5: Exit back to the main menu
await imterm.submit({}, 'F3')
console.log(JSON.stringify(customer, null, 2))
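Because each field in the script above is read with optional chaining, a field that is renamed or absent on the screen silently becomes undefined. A small check, sketched here with a hypothetical missingFields helper and hand-written sample data, can surface that before the record is used elsewhere.

```javascript
// Return the keys of `record` whose values are undefined, null, or empty strings.
function missingFields(record) {
  return Object.entries(record)
    .filter(([, value]) => value === undefined || value === null || value === '')
    .map(([key]) => key)
}

// Hand-written sample: PHONE was absent from the screen and BALANCE was blank.
const extracted = { id: '00012345', name: 'ACME LTD', phone: undefined, balance: '' }
const missing = missingFields(extracted)
if (missing.length > 0) {
  console.warn('Fields not found on screen:', missing.join(', '))
}
```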
Real-World Example: AI-Driven Workflow
When the screen navigation logic is too complex to script manually, delegate to the AI:
// Ask the AI to find the highest-balance overdue customer.
// The AI will navigate the screens it needs to find the answer.
const result = await imterm.agent(`
In this AS/400 customer management system:
1. Navigate to the accounts receivable aging report.
2. Find the customer with the highest balance that is more than 90 days overdue.
3. Return their customer number, name, and overdue balance.
Format the result as: CUSTNO | Name | Balance
`)
console.log('Result:', result)
The AI model receives the current screen state, determines which navigation steps are needed, executes them using the function-calling interface, reads each resulting screen, and produces the answer.
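The model's answer comes back as free text, so it is prudent to confirm it matches the requested CUSTNO | Name | Balance format before passing it on. A minimal sketch follows; parseAgentResult is a hypothetical helper and the sample values are invented.

```javascript
// Parse a "CUSTNO | Name | Balance" line into an object.
// Returns null when the text does not have exactly three non-empty parts.
function parseAgentResult(text) {
  const parts = String(text).trim().split('|').map(part => part.trim())
  if (parts.length !== 3 || parts.some(part => part === '')) return null
  const [custNo, name, balance] = parts
  return { custNo, name, balance }
}

// Invented sample reply in the requested format.
const parsed = parseAgentResult('00012345 | ACME LTD | 1,250.00')
if (parsed === null) {
  console.error('Agent reply was not in the expected format')
} else {
  console.log(parsed.custNo, parsed.name, parsed.balance)
}
```

Returning null instead of throwing lets the calling script decide whether to retry with a more specific instruction or stop.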
Saving and Sharing Scripts
Scripts saved in Agent Mode are stored in IMTerm's data directory under scripts/. They are accessible to all users with the Agent role.
To save the current script: File > Save Script or Ctrl+S in the editor.
To load a saved script: File > Open Script or the script library panel (Ctrl+Shift+L).
Scripts are plain JavaScript files - you can edit them with any text editor outside of IMTerm and place them in the scripts directory to make them available in the editor.
Security and Permissions
Agent Mode has its own permission layer. In config.yaml:
agent:
  # Roles that can use Agent Mode at all.
  allowed_roles: ["admin", "agent"]
  # Roles that can submit forms (agent actions that modify AS/400 data).
  submit_roles: ["admin"]
  # Require approval before each AI-initiated submit.
  require_approval: false
  # Log every agent interaction to the audit trail.
  audit: true
With require_approval: true, each imterm.submit() call initiated by the AI pauses and shows the user a preview of what will be submitted. The user approves or rejects each step. This is appropriate for production environments where AI-initiated data changes must be reviewed.
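One way to read how these settings combine is sketched below. This is an interpretation for illustration, not IMTerm's actual code: agentPermissions and userRoles are hypothetical names, and the assumption that submitting also requires membership in allowed_roles is not confirmed by this tutorial.

```javascript
// Illustration of the permission settings above; not IMTerm's actual code.
// `config` mirrors the agent section of config.yaml.
function agentPermissions(userRoles, config) {
  const hasAny = (allowed) => userRoles.some(role => allowed.includes(role))
  const canUse = hasAny(config.allowed_roles)
  // Assumption: submitting also requires access to Agent Mode itself.
  const canSubmit = canUse && hasAny(config.submit_roles)
  return { canUse, canSubmit, needsApproval: canSubmit && config.require_approval }
}

const config = { allowed_roles: ['admin', 'agent'], submit_roles: ['admin'], require_approval: false }
console.log(agentPermissions(['agent'], config)) // can use Agent Mode, cannot submit
console.log(agentPermissions(['admin'], config)) // can use Agent Mode and submit
```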
Using Mock Mode for Testing
During script development, use the Mock provider to avoid consuming AI API credits:
agent:
  provider: mock
The Mock provider returns scripted responses based on the screen title. It is useful for testing script structure and error handling without live AI calls.
To test against a known screen sequence, add mock responses:
agent:
  provider: mock
  mock_responses:
    - screen_title_contains: "MAIN MENU"
      response: "Navigate to option 11"
    - screen_title_contains: "CUSTOMER MAINTENANCE"
      response: "Submit customer ID 0001234"
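To reason about which response a given screen will trigger, the matching can be pictured as a substring test against each rule in order. The sketch below assumes the first matching rule wins and that matching is case-sensitive; neither detail is stated in this tutorial, and pickMockResponse is a hypothetical name.

```javascript
// Sketch of title-based matching; first matching rule wins (an assumption).
function pickMockResponse(screenTitle, mockResponses) {
  const rule = mockResponses.find(r => screenTitle.includes(r.screen_title_contains))
  return rule ? rule.response : null
}

// Same rules as the mock_responses configuration above.
const mockResponses = [
  { screen_title_contains: 'MAIN MENU', response: 'Navigate to option 11' },
  { screen_title_contains: 'CUSTOMER MAINTENANCE', response: 'Submit customer ID 0001234' },
]
console.log(pickMockResponse('CUSTOMER MAINTENANCE - INQUIRY', mockResponses))
```

If ordering does matter, placing the most specific screen_title_contains values first avoids a broad rule shadowing a narrower one.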
Monitoring Agent Runs
While an agent script is running, the Activity panel (View > Activity Panel or Ctrl+Shift+P) shows:
- Each screen visited (title, timestamp)
- Each field submitted (field names and values, AID key)
- Each AI API call (prompt tokens, completion tokens, latency)
- Any errors or exceptions
After a run completes, the full activity log is available in the audit trail at /admin/audit. Filter by session or by user to review agent activity.
Cost Management
AI API calls consume tokens and cost money. To control costs:
- Use a smaller model for simple navigation tasks (claude-haiku-4-5-20251001 instead of claude-sonnet-4-6)
- Set max_tokens to a reasonable limit for your use case
- Use rate_limit_per_minute to prevent runaway scripts
- In development, use Mock mode
- Use imterm.screen() + explicit imterm.submit() calls for well-understood screen sequences instead of imterm.agent() - direct API calls do not consume AI tokens
The Activity Panel shows token usage per run. Check the totals before deploying scripts to production.
Provider Comparison
| Provider | Best for | Latency | Cost |
|---|---|---|---|
| Claude (Sonnet) | Complex navigation, Hebrew | Low | Medium |
| Claude (Haiku) | Simple lookups, repetitive tasks | Very low | Low |
| GPT-4o | English-only workflows | Low | Medium |
| Gemini 1.5 Flash | High volume, cost-sensitive | Very low | Low |
| Ollama (local) | Air-gapped environments, no API cost | Medium (hardware dependent) | Free |
| Mock | Testing and development | Instant | Free |
For Hebrew AS/400 applications, Claude is recommended - it understands Hebrew field labels and instructions naturally and can follow instructions written in Hebrew.
Troubleshooting
Agent says "I cannot see the screen"
The screen state is not being passed correctly. Check that the session is connected and showing a screen (not disconnected or at the login prompt).
Agent loops without making progress
The AI is navigating in circles. This usually means the instruction is ambiguous. Add more detail about which menu option or command to use, or use explicit imterm.submit() calls for the navigation steps and only use imterm.agent() for the final task.
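A minimal sketch of that hybrid approach follows. The OPTION field and option 11 are taken from the customer extraction example earlier in this tutorial and will differ on your system; balanceViaHybrid is a hypothetical name, and imterm is passed in as a parameter only so the function can be exercised with a stub.

```javascript
// Scripted navigation first, AI only for the open-ended final step.
async function balanceViaHybrid(imterm, customerId) {
  // Deterministic navigation: no AI tokens spent and no room for wandering.
  await imterm.submit({ OPTION: '11' }, 'ENTER')
  await imterm.waitFor(s => s.title.includes('CUSTOMER'), 5000)
  // Hand only the final task to the model, with a narrowly worded instruction.
  return imterm.agent(
    `On this screen, look up customer ${customerId} and return only their balance.`
  )
}
```

In the script editor this would be called as await balanceViaHybrid(imterm, '00012345').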
"Rate limit exceeded" error
The rate_limit_per_minute setting is too low for your script, or you have too many concurrent agent sessions. Increase the limit or reduce concurrency.
"Submit not allowed" error
The logged-in user's role is not in submit_roles. Add the user to a role that has submit permission, or change the role configuration.
Hebrew instructions not understood
Verify that model: "claude-sonnet-4-6" (or another multilingual model) is configured. GPT-3.5 and some Ollama models have limited Hebrew comprehension.