Building Screen Apps with AI Agents: An MCP Workflow

Open a terminal. Type a description of what you want on a screen. Walk away. Come back to a working application deployed across your fleet.

That sequence sounds like a conference demo designed to impress and then disappoint. But the gap between the demo and reality has narrowed, and the reason isn't better language models. It's better tooling around them. The models were already capable of generating screen application code. What they lacked was the ability to see what they built, manipulate test data, read error logs, and verify rendering across the aspect ratios that real-world screens demand. Those are tooling problems, not intelligence problems.

TelemetryOS addresses this by treating AI agents as first-class users of the development platform, not afterthoughts bolted onto a human-centric IDE.

What MCP Is and Why It Matters Here

Model Context Protocol (MCP) is an open standard for connecting AI agents to external tools. Think of it as a USB-C port for AI: one protocol that lets any compatible agent interact with any compatible service. Anthropic released the specification in late 2024, adoption moved fast, and the protocol now sits under vendor-neutral governance at the Linux Foundation. Most major AI platforms support it, including Claude, ChatGPT, Cursor, GitHub Copilot in VS Code, and Gemini.

For screen development, MCP solves a fundamental problem. An AI agent writing code for a display can't see the display. It generates HTML, CSS, and JavaScript, then hopes the output looks correct. MCP provides the mechanism for tools that give the agent eyes, hands, and ears inside the development environment.
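To make the mechanism less abstract, here is a minimal sketch of what an agent's tool invocation looks like on the wire. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The tool name `capture_screenshot` and its `aspect_ratio` argument are hypothetical placeholders, not names taken from the TelemetryOS server.

```python
import json
from itertools import count

# Monotonic request ids so responses can be matched to requests.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and argument, for illustration only.
message = build_tool_call("capture_screenshot", {"aspect_ratio": "16:9"})
print(message)
```

In practice an MCP client library handles this framing, so the agent only ever sees a list of tools and their input schemas.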

18 Tools That Give AI Agents Control

The TelemetryOS Developer App includes a built-in MCP server with 18 tools across six categories. This isn't a wrapper around a few convenience functions. It's a complete control surface.

Store tools (6) let the agent read, write, delete, list, and inspect data across all four storage scopes: application-wide, instance-specific, device-local, and shared inter-app namespaces. An agent building a weather dashboard can seed test data for a specific city, switch locations, and verify the UI updates correctly, without the developer touching a browser.
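The weather example can be sketched with an in-memory stand-in for the store. This is not the TelemetryOS SDK: the class, its method names, and the scope identifiers below are assumptions chosen to mirror the four scopes described above.

```python
# Hypothetical scope identifiers mirroring the four scopes in the text.
SCOPES = ("application", "instance", "device", "shared")


class FakeStore:
    """In-memory stand-in for a scoped key-value store (illustrative only)."""

    def __init__(self):
        self._data = {scope: {} for scope in SCOPES}

    def _bucket(self, scope: str) -> dict:
        if scope not in self._data:
            raise ValueError(f"unknown scope: {scope}")
        return self._data[scope]

    def write(self, scope: str, key: str, value) -> None:
        self._bucket(scope)[key] = value

    def read(self, scope: str, key: str, default=None):
        return self._bucket(scope).get(key, default)

    def delete(self, scope: str, key: str) -> bool:
        """Return True if the key existed and was removed."""
        return self._bucket(scope).pop(key, None) is not None

    def list_keys(self, scope: str) -> list:
        return sorted(self._bucket(scope))


# Seed test data for one city, then switch locations as the agent would.
store = FakeStore()
store.write("instance", "city", "Vancouver")
store.write("instance", "forecast", {"temp_c": 12, "condition": "rain"})
store.write("instance", "city", "Lisbon")
print(store.list_keys("instance"), store.read("instance", "city"))
```

Keeping each scope in its own bucket means a key written for one instance cannot leak into device-local or shared state, which is the property the agent relies on when seeding test data.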

Visual capture tools (2) provide screenshots at any aspect ratio, including an automatic sweep across all eight presets from 5:1 ultra-wide through 1:5 portrait skyscraper. This is the "eyes" part: implement a change, capture, evaluate, iterate.
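A rough sketch of what a sweep has to cover, expressed as viewport sizes. Only the 5:1 and 1:5 endpoints come from the description above; the six presets in between, the 1920-pixel long edge, and the helper name are assumptions made for illustration.

```python
# Only "5:1" and "1:5" are stated in the text; the six ratios between them
# are guesses at plausible presets, not the Developer App's actual list.
PRESETS = ["5:1", "3:1", "16:9", "4:3", "3:4", "9:16", "1:3", "1:5"]


def viewport_for(ratio: str, long_edge: int = 1920) -> tuple:
    """Return (width, height) in pixels with the longer side fixed at long_edge."""
    w, h = (int(part) for part in ratio.split(":"))
    if w >= h:
        return long_edge, round(long_edge * h / w)
    return round(long_edge * w / h), long_edge


for ratio in PRESETS:
    width, height = viewport_for(ratio)
    print(f"{ratio:>5}  {width}x{height}")
```

At these assumed sizes the same layout has to survive a 1920x384 strip and a 384x1920 column, which is why a single 16:9 screenshot proves very little.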

Canvas control tools (4) switch aspect ratios, set background presets, toggle between render/settings/web views, and switch color schemes. The agent tests a corporate dashboard against a dark conference room background, then flips to a bright retail context.

Log and diagnostics tools (3) read application console output, build server logs, and platform operation logs. When something breaks, the agent reads the error, diagnoses the issue, and fixes it in the same conversation turn.

Runtime control tools (2) and a data tool (1) round out the set with application reload, server restart, and simulation data reset.

The key insight: the AI agent never leaves the code editor. The MCP server provides everything that would normally require switching to a browser, opening DevTools, or checking a terminal.

15 Skills That Encode Platform Expertise

Raw tool access isn't enough. An agent with 18 hammers still needs to know which nail to hit first. TelemetryOS equips every new project with 15 domain-specific skills that encode deep platform knowledge into structured guidance.

A requirements skill walks through a six-phase interactive conversation: Vision, Render, Data, Settings, Multi-Mode, and Summary. This happens before a single line of code is generated. Architecture skills guide mount point structure and multi-mode patterns. Design skills (five of them) cover constraints specific to signage, kiosks with touch and session management, web portals with SPA routing, and settings interfaces using the SDK's 23-component design system. Data skills handle store synchronization, CORS proxy integration, weather API access, and media library usage. Testing and debugging skills close the loop.

The skills build on each other deliberately. Requirements feed architecture decisions, which inform design patterns, which guide implementation. AI agents generate better code when they have explicit constraints to work within.

The Six-Phase Conversation

Most AI coding workflows start with code generation. That's backwards for screen applications, where physical context matters as much as logic. A drive-through menu board has different requirements from a lobby dashboard, even if both display data from the same API.

The requirements skill enforces six phases before code exists:

Vision establishes what the application does and what problem it solves.
Render defines layout patterns, animation behavior, and information hierarchy.
Data maps store keys, external sources, refresh intervals, and fallback behavior.
Settings determines what operators can configure without touching code.
Multi-Mode addresses entity-scoped variations: different stores, departments, or buildings.
Summary produces a structured specification that becomes the implementation blueprint.
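One way to picture the gate these phases create is a small structure that refuses to skip ahead. The phase order comes from the list above; the class, its methods, and the example notes are hypothetical and say nothing about how the skill is actually implemented.

```python
from dataclasses import dataclass, field

# Phase order as described in the text.
PHASES = ("vision", "render", "data", "settings", "multi_mode", "summary")


@dataclass
class RequirementsSpec:
    """Collects one set of notes per phase, strictly in order (illustrative)."""

    answers: dict = field(default_factory=dict)

    def next_phase(self):
        """Return the first phase without notes, or None when all are done."""
        for phase in PHASES:
            if phase not in self.answers:
                return phase
        return None

    def record(self, phase: str, notes: str) -> None:
        expected = self.next_phase()
        if expected is None:
            raise ValueError("all phases are already complete")
        if phase != expected:
            raise ValueError(f"expected phase '{expected}', got '{phase}'")
        self.answers[phase] = notes

    def ready_for_code(self) -> bool:
        return self.next_phase() is None


spec = RequirementsSpec()
spec.record("vision", "Lobby dashboard showing meeting room availability")
spec.record("render", "Portrait first, no animation, room name largest")
print(spec.next_phase(), spec.ready_for_code())
```

The point of the ordering check is that questions like orientation and offline behavior get answered while they are still cheap to change.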

Anyone who has built screen applications recognizes the failure mode this prevents: jumping straight to code, then discovering three weeks later that the client expected portrait orientation, offline operation, or a settings interface that doesn't exist.

The Visual Feedback Loop

Here's where the workflow gets concrete. The agent implements a component, then calls the screenshot tool with a sweep across all eight aspect ratios. The images reveal that a two-column layout collapses on portrait screens, or that text truncates on ultra-wide displays. The agent adjusts, sweeps again, verifies.

This cycle runs dozens of times in a single session. It's the same feedback loop a human developer runs by resizing a browser window, except it's systematic (every aspect ratio, every time) and fast (no context switching).
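The shape of that loop can be sketched without any real screenshots. In the stub below, a simple column-width check stands in for the agent's evaluation of a captured image, and dropping a column stands in for a code change. The 600-pixel minimum, the viewport sizes, and both function names are assumptions.

```python
def check_layout(columns: int, width: int, height: int,
                 min_column_width: int = 600) -> list:
    """Stand-in for evaluating a screenshot: flag columns that are too narrow."""
    issues = []
    if width / columns < min_column_width:
        issues.append(f"{columns} column(s) too narrow at {width}x{height}")
    return issues


def settle_columns(width: int, height: int, start_columns: int = 2,
                   max_iterations: int = 5) -> int:
    """Capture, evaluate, adjust: repeat until clean or out of options."""
    columns = start_columns
    for _ in range(max_iterations):
        issues = check_layout(columns, width, height)
        if not issues or columns == 1:
            break
        columns -= 1  # the "adjust" step: collapse to fewer columns
    return columns


# Assumed viewport sizes for three of the aspect ratios in a sweep.
VIEWPORTS = {"16:9": (1920, 1080), "9:16": (1080, 1920), "1:5": (384, 1920)}
results = {name: settle_columns(w, h) for name, (w, h) in VIEWPORTS.items()}
print(results)
```

The iteration cap matters in a real loop too: an agent that cannot resolve an issue after a few sweeps should surface it to a human rather than keep adjusting.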

The limitation worth acknowledging: screenshot evaluation works well for layout issues but poorly for animation timing, transition smoothness, and interaction flow. An agent can verify that a kiosk idle screen renders correctly but cannot evaluate whether the touch response feels natural. Visual feedback loops have clear boundaries, and pretending otherwise leads to false confidence.

From Prompt to Physical Screen

Deployment closes the loop. A single Publish action inside the Developer App, which is also exposed to agents through the MCP server, handles version bumping, archive creation, upload, cloud build, and deployment to assigned screens in one step. For teams using Git workflows, pushing to a connected GitHub repository triggers automatic builds and deployment, with branch-specific deployments for staging screens.

That said, deploying AI-generated code to physical screens in public spaces demands more governance than deploying to a web server. Staged rollouts, instant rollback to any previous version, and audit trails are not optional. The speed of AI-generated development makes these guardrails more important, not less. When you can go from prompt to production in an afternoon, the temptation to skip review is real. Resist it.
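As a sketch of what a staged rollout can mean in practice, the snippet below assigns each screen to a stable bucket by hashing its identifier, so widening a stage from 10% to 50% only ever adds screens. This is a generic technique, not a description of how TelemetryOS selects devices; the device ids and function names are made up.

```python
import hashlib


def rollout_bucket(device_id: str) -> int:
    """Map a device id to a stable bucket in the range 0-99."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def devices_for_stage(device_ids: list, percent: int) -> list:
    """Return the devices whose bucket falls below the rollout percentage."""
    return [d for d in device_ids if rollout_bucket(d) < percent]


# Hypothetical fleet of 200 screens.
fleet = [f"screen-{i}" for i in range(200)]
canary = devices_for_stage(fleet, 10)
half = devices_for_stage(fleet, 50)
print(len(canary), len(half), set(canary) <= set(half))
```

Because the bucket depends only on the device id, the same canary screens receive every new version first, which keeps review and rollback focused on a known set of locations.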

An Opinionated Environment as Taste

The TelemetryOS Developer App makes deliberate choices about what good screen applications look like. Five mount points (render, settings, web, background workers, Docker containers) with clear responsibilities. Four storage scopes with explicit synchronization patterns. A settings design system with 23 pre-built components. These constraints are opinionated, and that's the point.

AI agents generate better code against opinionated platforms because the decisions are already made. The agent doesn't need to choose between localStorage and IndexedDB, invent a settings UI pattern, or decide how to handle offline state. The SDK has answers. The opinionated surface area means the agent's output is more predictable, more testable, and more maintainable.

The tradeoff is flexibility. If your application needs something outside the prescribed patterns, the SDK works against you rather than for you. Standard web technologies remain the escape hatch, since TelemetryOS applications are web applications underneath, but the AI skills won't guide you through uncharted territory. The opinionated path works best for data displays, interactive kiosks, sensor-driven signage, and connected screen experiences.

What Comes Next

The interesting question isn't whether AI agents can build screen applications. They already can, given the right tools. The interesting question is what happens when the bottleneck shifts from writing code to reviewing it. When an AI agent goes from requirements to deployed application in hours, the scarce resource becomes taste: knowing what to build, how it should behave in physical space, and what "good enough" looks like when screens are in front of real people.

That tension between creation speed and governance isn't going away. The teams that navigate it well won't be the ones with the best AI models. They'll be the ones with the best processes for deciding what should exist on their screens in the first place.