engineering, architecture

Point, Copy, Paste: An Element Inspector Built for Vibe Coders

We built a silent DevTools inspector that copies any UI element's identity — selector, text, aria labels, parent chain — to your clipboard, ready to paste into an LLM. No browser extension required.

Callipso Team · March 11, 2026 · 11 min read

You are staring at a button in your app. You want to change it. You open Claude Code or Cursor and type: "change the button in the header." Which button? There are twelve buttons in the header. The LLM asks for clarification. You squint at the code, search for class names, grep for text content. Three minutes later you have a selector. Three minutes you did not need to spend.

This happens constantly when you vibe code. You move fast, the UI grows faster, and you lose track of what things are called. There are no tooltips because you never wrote them. There are no comments because the LLM did not add them. You are looking at your own app and you cannot name the thing you want to change.

We built a tool that solves this with two keyboard shortcuts. Press F8, hover over any element, press Cmd+Shift+X. Your clipboard now contains the element's identity: its CSS selector, text content, aria labels, data attributes, and its parent chain for disambiguation. Paste it into your LLM, and it no longer has to guess which element you mean.

The Problem: "That Button"

The fundamental issue is a context gap between what you see and what you can describe.

You see a button. It is blue, it says "Deploy", it is in the top-right area of some panel. You know exactly which button you mean. But your LLM does not have eyes. It needs a selector, a class name, or enough structural context to find the right element in a codebase with thousands of DOM nodes.

This gets worse in large codebases. There might be three elements with the class .deploy-btn — one in the header, one in the sidebar, one in a modal. Telling the LLM "the deploy button" is ambiguous. Telling it "the .deploy-btn inside the .header-actions container that is a child of .main-toolbar" is precise. But you do not know that hierarchy off the top of your head.

Manual inspection works but breaks the flow. You open DevTools, navigate the Elements panel, click through the DOM tree, find the element, read its attributes, mentally compose the parent chain, switch back to your LLM, and type it all out. That is eight steps for one piece of information.

How It Works

The solution has three parts: a silent DevTools window, a point-and-identify mechanism, and a parent chain resolver.

Silent DevTools

When Callipso starts, it creates a BrowserWindow for Chrome DevTools but never shows it:

typescript
devToolsWindow = new BrowserWindow({
    width: 800,
    height: 600,
    show: false,  // invisible
    title: 'DevTools - Callipso',
});

mainWindow.webContents.setDevToolsWebContents(devToolsWindow.webContents);
mainWindow.webContents.openDevTools({ mode: 'detach' });

The DevTools window exists in memory, fully functional, but invisible. The user never sees it. This gives us programmatic access to the DevTools inspector without any visual noise.

The window can still be shown on demand (Cmd+Shift+F12 toggles it). If the user then closes it, accidentally or intentionally, we intercept the close event and hide the window instead of destroying it:

typescript
devToolsWindow.on('close', (e) => {
    e.preventDefault();
    devToolsWindow.hide();
});

The window survives for the entire app session. No re-creation overhead, no state loss.

F8: Toggle Inspector

Pressing F8 sends Cmd+Shift+C to the hidden DevTools window, which toggles the element inspector overlay:

typescript
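globalShortcut.register('F8', () => {
    const devTools = mainWindow.webContents.devToolsWebContents;
    if (!devTools) return;  // DevTools is not attached (yet)
    // Cmd+Shift+C toggles inspect mode, so the same key event
    // switches the overlay on and off
    devTools.sendInputEvent({
        type: 'keyDown',
        keyCode: 'C',
        modifiers: ['shift', 'meta']
    });
    f8InspectorActive = !f8InspectorActive;
});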

The inspector highlight appears over your app. You see colored outlines around elements as you move your mouse. This is the standard Chrome DevTools inspector — we are just activating it through a global keyboard shortcut instead of clicking the magnifying glass icon.

Cmd+Shift+X: Copy Element Identity

This is where the real work happens. When you press Cmd+Shift+X, we:

  1. Get the cursor position relative to the app window
  2. Run document.elementFromPoint(x, y) in the renderer
  3. Extract the element's identity: selector, text, aria labels, data attributes, title, placeholder
  4. Walk up the parent chain to find meaningful ancestors for disambiguation
  5. Format everything and copy to clipboard
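Step 1 is easy to get subtly wrong, because the cursor is reported in screen coordinates while document.elementFromPoint() expects viewport coordinates. A minimal sketch of the conversion follows. The helper name toViewportPoint and the zoom handling are assumptions for illustration, not Callipso's code; in Electron the inputs would come from screen.getCursorScreenPoint() and the window's content bounds.

```typescript
// Plain shapes so the helper can be exercised without Electron.
interface Point { x: number; y: number; }
interface Rect { x: number; y: number; width: number; height: number; }

// Hypothetical helper: map a screen-space cursor position to viewport
// coordinates, or return null when the cursor is outside the content area.
function toViewportPoint(cursor: Point, content: Rect, zoomFactor: number = 1): Point | null {
    const dx = cursor.x - content.x;
    const dy = cursor.y - content.y;
    if (dx < 0 || dy < 0 || dx >= content.width || dy >= content.height) {
        return null;
    }
    // Assumption: the page may be zoomed, and elementFromPoint works in
    // CSS pixels, so the offset is divided by the zoom factor.
    return { x: Math.round(dx / zoomFactor), y: Math.round(dy / zoomFactor) };
}
```

Returning null for an out-of-window cursor lets the shortcut handler skip the renderer round trip entirely.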

The parent chain resolution is the critical part. Here is a simplified version of the logic:

javascript
// Find the most specific selector for this element
function getSelector(element) {
    if (element.id) return '#' + element.id;
    // classList also works on SVG elements, where className is not a string
    const classes = Array.from(element.classList);
    if (classes.length) {
        // Prefer hyphenated names, which tend to be component-specific
        const best = classes.find(c => c.includes('-')) || classes[0];
        return '.' + best;
    }
    return element.tagName.toLowerCase();
}

// Walk up to find meaningful parent context
let parent = element.parentElement;
let depth = 0;
let parentType = null;  // stays null when no meaningful ancestor is found
const meaningfulParents = ['button', 'a', 'input', 'select', 'label', 'li', 'tr'];
const meaningfulClass = /btn|button|tab|item|card|entry/;
while (parent && depth < 4) {
    const tag = parent.tagName.toLowerCase();
    const classes = Array.from(parent.classList);
    if (meaningfulParents.includes(tag)) {
        parentType = tag + (classes.length ? '.' + classes[0] : '');
        break;
    }
    // Also check for semantically meaningful class names
    const meaningful = classes.find(c => meaningfulClass.test(c));
    if (meaningful) {
        parentType = tag + '.' + meaningful;
        break;
    }
    parent = parent.parentElement;
    depth++;
}

The output looks like this:

.deploy-btn (in .header-actions)
text: "Deploy"
data-action: "deploy-production"

Or for a more complex element:

.status-indicator (in li.terminal-item)
aria: "Terminal status"
title: "Running"
data-terminal-id: "abc-123"

You paste this into your LLM and say "change the color of this element." The LLM has the exact selector, knows it is inside an li.terminal-item, has the data attribute, and can search your codebase for those strings instead of guessing.
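For reference, here is one way the attribute collection and clipboard formatting could look. This is a sketch under assumptions: the names collectAttrs, formatIdentity, and ElementIdentity are hypothetical, and the attribute order (aria, then title and placeholder, then data attributes) is inferred from the sample output above rather than taken from Callipso's source.

```typescript
// Minimal structural view of a DOM element. Real Elements satisfy it,
// and so do plain objects.
interface AttrSource {
    getAttributeNames(): string[];
    getAttribute(name: string): string | null;
}

interface ElementIdentity {
    selector: string;
    parent: string | null;
    text: string;
    attrs: Array<[string, string]>;
}

// Collect the attributes that help an LLM find the element:
// aria-label, title, placeholder, and every data-* attribute.
function collectAttrs(el: AttrSource): Array<[string, string]> {
    const out: Array<[string, string]> = [];
    const aria = el.getAttribute('aria-label');
    if (aria) out.push(['aria', aria]);
    for (const name of ['title', 'placeholder']) {
        const value = el.getAttribute(name);
        if (value) out.push([name, value]);
    }
    for (const name of el.getAttributeNames()) {
        if (name.indexOf('data-') === 0) {
            const value = el.getAttribute(name);
            if (value) out.push([name, value]);
        }
    }
    return out;
}

// Render the identity as the plain-text block that goes to the clipboard.
function formatIdentity(id: ElementIdentity): string {
    const lines = [id.parent ? `${id.selector} (in ${id.parent})` : id.selector];
    if (id.text) lines.push(`text: "${id.text}"`);
    for (const [key, value] of id.attrs) lines.push(`${key}: "${value}"`);
    return lines.join('\n');
}
```

Empty fields are skipped so the pasted block stays short, which matters when several elements end up in the same prompt.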

The Race Condition That Broke Everything

We shipped this feature and it worked, most of the time. On some startups, F8 did nothing. Cmd+Shift+E (another DevTools shortcut) silently failed. The inspector toggle worked on one launch but not the next.

We added a diagnostic endpoint (/dev/devtools-debug) to capture a rolling event log of every DevTools operation. The log revealed the problem immediately:

open → open → close-existing → window-created → devtools-closed

Two open events. DevTools was being opened twice on startup through two independent paths, 220ms apart:

  1. Main process did-finish-load handler reads localStorage and opens DevTools (correct path)
  2. Renderer loadDevToolsOnLaunch() sends open-devtools IPC after settings load (redundant path)

The second open destroyed the first window (as intended — we clean up existing windows before creating new ones). But the first window's destruction fired a devtools-closed event asynchronously. By the time that event arrived, the second window was already alive. The stale event cleared the references to the second window: devToolsWindow = null, devToolsWebContents = null.

The DevTools window still existed (hidden, as designed), but the app had no reference to it. Every shortcut that checked isDevToolsAvailable() returned false. Every attempt to send input to the DevTools web contents hit a null reference.
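A rolling event log like the one behind /dev/devtools-debug does not need much code. The sketch below is an assumption about its shape, not the actual endpoint implementation: RollingEventLog, its capacity parameter, and duplicatesWithin are hypothetical names. It keeps the most recent events with timestamps and flags back-to-back duplicates, which is the pattern that exposed the double open.

```typescript
interface LogEntry { at: number; event: string; }

// Hypothetical fixed-capacity log: only the newest `capacity` entries survive.
class RollingEventLog {
    private entries: LogEntry[] = [];
    private capacity: number;
    private now: () => number;

    // `now` is injectable so the clock can be controlled in tests.
    constructor(capacity: number, now: () => number = () => Date.now()) {
        this.capacity = capacity;
        this.now = now;
    }

    record(event: string): void {
        this.entries.push({ at: this.now(), event });
        if (this.entries.length > this.capacity) this.entries.shift();
    }

    // Event names in order, joined the way the log line above is written.
    summary(): string {
        return this.entries.map(e => e.event).join(' → ');
    }

    // Consecutive identical events that arrived less than `windowMs` apart.
    duplicatesWithin(windowMs: number): Array<{ event: string; gapMs: number }> {
        const found: Array<{ event: string; gapMs: number }> = [];
        for (let i = 1; i < this.entries.length; i++) {
            const prev = this.entries[i - 1];
            const cur = this.entries[i];
            const gapMs = cur.at - prev.at;
            if (cur.event === prev.event && gapMs < windowMs) {
                found.push({ event: cur.event, gapMs });
            }
        }
        return found;
    }
}
```

Capping the log keeps memory flat over a long session, and the timestamps are what turn "it sometimes fails" into "two opens, 220ms apart".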

The Fix

Two changes. No new files.

Remove the redundant open. The renderer's loadDevToolsOnLaunch() was sending open-devtools IPC to do the same thing the main process was already doing. We deleted those four lines. The main process handler is the single source of truth.

Add window ID tracking. Even with the duplicate removed, we added defense-in-depth. Each DevTools window gets a numeric ID from Electron. When devtools-closed fires, we check: is this event for the current window ID? If not, it is a stale event from a destroyed window, and we ignore it.

typescript
let currentDevToolsWindowId: number | null = null;

// On create:
currentDevToolsWindowId = devToolsWindow.id;

// On devtools-closed:
window.webContents.on('devtools-closed', () => {
    const eventWindowId = devToolsWindow?.isDestroyed() ? null : devToolsWindow?.id;
    const isStale = eventWindowId !== null && eventWindowId !== currentDevToolsWindowId;
    if (isStale) return; // ignore stale events
    // ... clear refs ...
});

After the fix, the diagnostic endpoint shows exactly one open event, hasDevToolsWebContents: true, and all shortcuts work on every startup.
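The same guard can be expressed without Electron, which makes the race easy to reproduce in a test. The sketch below is one possible formulation and not the shipped code: the DevToolsTracker class is hypothetical, and it binds each close handler to its window ID through a closure at creation time rather than comparing IDs inside a shared handler.

```typescript
// Hypothetical tracker for "which DevTools window is current".
class DevToolsTracker {
    private currentId: number | null = null;

    // Call when a DevTools window is created. Returns the close handler
    // for that specific window; the ID is captured by the closure.
    attach(windowId: number): () => boolean {
        this.currentId = windowId;
        return () => {
            if (this.currentId !== windowId) {
                return false; // stale event from a replaced window: ignore
            }
            this.currentId = null; // the current window really closed
            return true;
        };
    }

    isAvailable(): boolean {
        return this.currentId !== null;
    }
}

// Replaying the startup race: two opens, then the first window's late close.
const tracker = new DevToolsTracker();
const onFirstClosed = tracker.attach(1);
const onSecondClosed = tracker.attach(2);
const firstHandled = onFirstClosed();       // false: window 1 was already replaced
const stillAvailable = tracker.isAvailable(); // true: refs to window 2 survive
```

Because each handler only knows its own ID, a late event from a destroyed window has no way to clear the references of its successor.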

What Already Exists

We researched this space before writing this post. There are roughly six Chrome extensions, two YC-funded startups, an IDE with a built-in browser, and an open issue on Chrome DevTools MCP that address pieces of this problem.

| Tool | Type | Element Picking | Parent Chain | Aria / Data Attrs | LLM-Agnostic | Install Required |
| --- | --- | --- | --- | --- | --- | --- |
| UISelector2AI | Chrome extension | Yes (Alt+O toggle) | No | No | Yes | Yes |
| DOM Extractor for LLMs | Chrome extension | Yes (Cmd+Shift+.) | No | No | Yes | Yes |
| Element to LLM | Chrome extension | Yes (click) | Siblings only | Partial | Yes | Yes |
| Copy DOM and CSS for AI | Firefox extension | Yes (click) | Yes (to body) | No | Yes | Yes |
| Inspector (YC F25) | Desktop app | Yes (click) | React source link | No | Yes | Yes |
| Onlook (YC F25) | Desktop app | Yes (click) | React component tree | No | Yes | Yes |
| Cursor Visual Editor | Built-in browser | Yes (click) | React components | No | No | No |
| Callipso | Built-in | Yes (F8 + Cmd+Shift+X) | Yes (4 levels + semantic) | Yes | Yes | No |

The Chrome extensions are the closest. UISelector2AI has a toggle mode and generates AI-formatted prompts. DOM Extractor for LLMs has drag and precision modes with Claude Code-formatted output. Both are useful tools.

Cursor's Visual Editor: Same Problem, Different Trade-Offs

Cursor shipped a Visual Editor in their built-in browser starting with Cursor 2.2 (December 2025). It solves the same context gap — you see a button, you want the AI to know which one — but the approach is fundamentally different.

In Cursor, you click a "Select Element" tool in the built-in browser, click an element in your rendered app, and the AI agent immediately gets the element's context. On React apps, it resolves the component name and props and searches your codebase for the corresponding file. A design sidebar opens with visual controls for margins, padding, colors, and layout. You can type "make this bigger" and the agent writes the edit. It is a tight, integrated loop.

The trade-offs become clear when you step outside Cursor's boundaries:

It only works inside Cursor's browser. If you are building an Electron app, a Tauri app, a mobile app in a simulator, or anything that does not render in Cursor's built-in browser tab, the Visual Editor cannot see it. Callipso's approach targets any app with a web renderer: your own Electron app, a browser tab via CDP, and in principle CEF-based applications, which we have not tested. It operates at the DOM level through DevTools, not through a specific IDE's browser.

It only feeds Cursor's agent. The selected element's context goes directly into Cursor's AI chat. You cannot paste it into Claude Code, ChatGPT, or any other tool. If you switch IDEs — or use multiple agents across different terminals, which is Callipso's core workflow — the element context is locked in. Callipso copies to clipboard. Paste it anywhere.

React-first component resolution vs. universal DOM inspection. Cursor's strength is React: it resolves component names, props, and can navigate the component tree. On a plain HTML page, a Vue app, a Svelte app, or a legacy jQuery codebase, the component resolution does not apply. Callipso's parent chain walk uses raw DOM — CSS selectors, class names, semantic tags, aria labels, data attributes. It works on any web technology because it does not depend on a framework's runtime.

Click vs. hover. Cursor requires clicking, which selects the element and opens the design sidebar. Callipso uses hover (via the DevTools inspector overlay) followed by a keyboard shortcut. The hover approach lets you explore multiple elements quickly without committing to a selection, and the keyboard shortcut means your hands never leave the keyboard.

Cursor's Visual Editor is good for what it does — if you are building a React web app inside Cursor, the click-to-component flow is fast and well-integrated. But it is a feature inside one IDE's browser. Callipso's inspector is a pattern that works across any web-rendered application, any LLM, and any development setup.

The Cursor community has been requesting a "click-to-source" feature — click an element, open the source file at the right line — that has not shipped yet. Callipso sidesteps that problem entirely: the clipboard output gives any LLM enough context to find the source itself.

The Broader Gaps

Beyond Cursor, none of the tools in the table above combine a standalone desktop app, global keyboard shortcuts that work outside the browser, parent chain extraction with semantic filtering, aria label and data attribute capture, and a silent DevTools window that requires zero setup. The Chrome extensions require installing an extension, which only works in Chrome or Firefox, not in an Electron app you are building.

Chrome DevTools MCP has an open issue (#268) requesting exactly this kind of "select element in page for AI inspection" feature. It has been open since October 2025. The demand is clearly there.

Platform Support: Where This Works

This pattern works for any application that uses a web-based rendering layer.

Electron apps — this is what we ship. Callipso is an Electron app inspecting its own renderer. If you are building an Electron app, you can copy the exact pattern: create a hidden BrowserWindow, set its webContents as the DevTools target, and use document.elementFromPoint() for inspection. Your users get element identification for free with no additional tooling.

Web apps via Chrome DevTools Protocol (CDP) — if you are building a web app (React, Vue, Svelte, anything in a browser), you can achieve the same thing by launching Chrome with --remote-debugging-port=9222 and connecting via CDP:

bash
# Launch Chrome with debug port
"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome" \
  --remote-debugging-port=9222 \
  --user-data-dir=/tmp/chrome-debug-profile

# Execute elementFromPoint via CDP
curl -s http://localhost:9222/json/list  # get tab WebSocket URL
# Connect via WebSocket and call Runtime.evaluate with the same JS

The document.elementFromPoint() logic, parent chain walking, and clipboard formatting are all plain JavaScript. They work identically in Electron and in a browser tab accessed through CDP. The only difference is how you get programmatic access to the renderer — Electron gives it to you natively, Chrome gives it through the debug protocol.
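The /json/list response is a JSON array with one entry per debuggable target, so the first job of a CDP client is picking the right tab. The sketch below shows that selection step only. The function name findPageTarget and the match-by-URL-prefix rule are assumptions; the fields in CdpTarget are a subset of what Chrome returns for each target.

```typescript
// Subset of the fields Chrome reports per target in /json/list.
interface CdpTarget {
    type: string;
    url: string;
    title: string;
    webSocketDebuggerUrl?: string;
}

// Hypothetical helper: pick the first debuggable page whose URL starts
// with `urlPrefix` (for example the dev server origin).
function findPageTarget(targets: CdpTarget[], urlPrefix: string): CdpTarget | null {
    for (const target of targets) {
        const isPage = target.type === 'page';
        // The WebSocket URL can be absent, for example while another
        // debugger client is already attached to the target.
        const debuggable = !!target.webSocketDebuggerUrl;
        if (isPage && debuggable && target.url.indexOf(urlPrefix) === 0) {
            return target;
        }
    }
    return null;
}
```

The webSocketDebuggerUrl of the returned target is the address to open the WebSocket against.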

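CEF-based apps (Spotify, Steam, other apps using Chromium Embedded Framework) and Tauri apps (WebView-based) also render with web technology, so the same DOM APIs apply and similar inspection should be possible. We have not tested these ourselves, and how you reach the renderer depends on the debugging interface each platform's web view exposes.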

Implementing It Yourself

If you want to add this to your own Electron app, here is the minimal implementation.

Step 1: Silent DevTools Window

typescript
import { BrowserWindow } from 'electron';

let devToolsWindow: BrowserWindow | null = null;

function openSilentDevTools(mainWindow: BrowserWindow) {
    devToolsWindow = new BrowserWindow({
        width: 800,
        height: 600,
        show: false,
    });

    mainWindow.webContents.setDevToolsWebContents(devToolsWindow.webContents);
    mainWindow.webContents.openDevTools({ mode: 'detach' });

    devToolsWindow.on('close', (e) => {
        e.preventDefault();
        devToolsWindow?.hide();
    });
}

Step 2: Element Identification

typescript
import { globalShortcut, screen, clipboard } from 'electron';

globalShortcut.register('CommandOrControl+Shift+X', async () => {
    const cursor = screen.getCursorScreenPoint();
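    // Content bounds exclude the title bar and frame, so the offsets
    // line up with the viewport coordinates elementFromPoint expects
    const bounds = mainWindow.getContentBounds();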
    const x = cursor.x - bounds.x;
    const y = cursor.y - bounds.y;

    const info = await mainWindow.webContents.executeJavaScript(`
        (function() {
            const el = document.elementFromPoint(${x}, ${y});
            if (!el) return null;

            function getSelector(el) {
                if (el.id) return '#' + el.id;
                const cls = Array.from(el.classList)
                    .filter(c => c.includes('-'))[0];
                return cls ? '.' + cls : el.tagName.toLowerCase();
            }

            function getParentContext(el) {
                let p = el.parentElement, depth = 0;
                while (p && depth < 4) {
                    const tag = p.tagName.toLowerCase();
                    const cls = p.classList[0];
                    if (['button','a','li','tr','label'].includes(tag))
                        return tag + (cls ? '.' + cls : '');
                    if (cls && /btn|button|tab|item|card/.test(cls))
                        return tag + '.' + cls;
                    p = p.parentElement;
                    depth++;
                }
                return null;
            }

            const selector = getSelector(el);
            const parent = getParentContext(el);
            const text = el.innerText?.trim().substring(0, 80) || '';
            const aria = el.getAttribute('aria-label') || '';

            return { selector, parent, text, aria };
        })()
    `);

    if (info) {
        const parts = [];
        parts.push(info.parent
            ? info.selector + ' (in ' + info.parent + ')'
            : info.selector);
        if (info.text) parts.push('text: "' + info.text + '"');
        if (info.aria) parts.push('aria: "' + info.aria + '"');
        clipboard.writeText(parts.join('\n'));
    }
});

That is roughly 50 lines. Add the F8 inspector toggle (send Cmd+Shift+C to the hidden DevTools web contents) and you have the full feature.

For Web Apps (CDP)

If you are not in Electron, the same JavaScript runs through CDP's Runtime.evaluate:

javascript
// Node.js script using the 'ws' package
const WebSocket = require('ws');

async function inspectElement(tabWsUrl, x, y) {
    return new Promise((resolve, reject) => {
        const ws = new WebSocket(tabWsUrl);
        // Fail fast if the connection cannot be made (e.g. no debug port)
        ws.on('error', reject);
        ws.on('open', () => {
            ws.send(JSON.stringify({
                id: 1,
                method: 'Runtime.evaluate',
                params: {
                    expression: `(function() {
                        const el = document.elementFromPoint(${x}, ${y});
                        if (!el) return null;
                        // ... same getSelector, getParentContext logic ...
                        return JSON.stringify({ selector, parent, text, aria });
                    })()`
                }
            }));
        });
        ws.on('message', (data) => {
            const resp = JSON.parse(data.toString());
            if (resp.id !== 1) return;  // not the reply to our request
            resolve(JSON.parse(resp.result?.result?.value || 'null'));
            ws.close();
        });
    });
}

The DOM logic is identical. Only the transport layer changes.
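Two details of the CDP exchange are worth isolating from the socket handling: building the request and reading the reply. The sketch below does both as pure functions. buildEvaluateRequest and readEvaluateResponse are hypothetical names. Setting returnByValue on Runtime.evaluate is an alternative to the JSON.stringify round trip used above, since it asks Chrome to serialize the returned object itself.

```typescript
interface EvaluateRequest {
    id: number;
    method: 'Runtime.evaluate';
    params: { expression: string; returnByValue: boolean };
}

// Build the message for one inspection. `body` is the source of a function
// body that may use `x` and `y`, for example the elementFromPoint logic.
function buildEvaluateRequest(id: number, x: number, y: number, body: string): EvaluateRequest {
    // Only finite numbers are interpolated into page code, never raw input.
    if (!isFinite(x) || !isFinite(y)) {
        throw new Error('coordinates must be finite numbers');
    }
    const expression = `(function(x, y) { ${body} })(${x}, ${y})`;
    return { id, method: 'Runtime.evaluate', params: { expression, returnByValue: true } };
}

// Read one raw WebSocket message. Returns undefined when the message is
// not the reply to request `id`, and throws when the evaluation failed.
function readEvaluateResponse(raw: string, id: number): unknown {
    const msg = JSON.parse(raw);
    if (msg.id !== id) return undefined;
    if (msg.error) throw new Error(String(msg.error.message));
    if (msg.result && msg.result.exceptionDetails) {
        throw new Error(String(msg.result.exceptionDetails.text));
    }
    return msg.result.result.value;
}
```

Checking the id matters once any CDP domain is enabled, because event notifications then arrive on the same socket as replies.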

Where This Needs Work

We ship this in Callipso today and use it daily, but it has gaps.

Parent chain depth. We walk four levels up. In deeply nested UIs (a table inside a card inside a tab inside a panel), four levels may not be enough to disambiguate. We plan to make the depth configurable and to add smarter heuristics that stop at semantic boundaries like [role="dialog"] or <section>.
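A configurable walk with semantic boundaries might look like the sketch below. It is unshipped and illustrative only: the names parentChain, describeNode, and NodeLike are hypothetical, and the boundary lists contain just the two examples named above.

```typescript
// Minimal structural view of an element, satisfied by real DOM Elements.
interface NodeLike {
    tagName: string;
    classList: ArrayLike<string>;
    getAttribute(name: string): string | null;
    parentElement: NodeLike | null;
}

// Assumed boundaries: the walk stops once it has recorded one of these.
const BOUNDARY_TAGS = ['section'];
const BOUNDARY_ROLES = ['dialog'];

function describeNode(node: NodeLike): string {
    const tag = node.tagName.toLowerCase();
    const role = node.getAttribute('role');
    if (role) return `${tag}[role="${role}"]`;
    return node.classList.length > 0 ? `${tag}.${node.classList[0]}` : tag;
}

// Collect up to `maxDepth` ancestors, nearest first, ending early at the
// first semantic boundary (which is included in the result).
function parentChain(el: NodeLike, maxDepth: number = 4): string[] {
    const chain: string[] = [];
    let p = el.parentElement;
    while (p && chain.length < maxDepth) {
        chain.push(describeNode(p));
        const tag = p.tagName.toLowerCase();
        const role = p.getAttribute('role') || '';
        if (BOUNDARY_TAGS.indexOf(tag) !== -1 || BOUNDARY_ROLES.indexOf(role) !== -1) {
            break;
        }
        p = p.parentElement;
    }
    return chain;
}
```

Stopping at a dialog or section keeps the chain short while still naming the container a reader would recognize.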

Duplicate class names. If two sibling elements share the same class and the same text content, the current output does not distinguish them. Adding positional context (:nth-child(2)) or nearby sibling info would help. We have not shipped this yet.
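The positional idea can be sketched as follows. Again this is unshipped and hypothetical: disambiguate, hasTwin, and childPosition are illustrative names, and the rule "append :nth-child only when a sibling shares the class" is one possible policy among several.

```typescript
// Minimal structural view of an element and its siblings.
interface SiblingNode {
    tagName: string;
    classList: ArrayLike<string>;
    parentElement: { children: ArrayLike<SiblingNode> } | null;
}

// 1-based position among the parent's children, as :nth-child counts.
function childPosition(el: SiblingNode): number {
    if (!el.parentElement) return 1;
    const siblings = el.parentElement.children;
    for (let i = 0; i < siblings.length; i++) {
        if (siblings[i] === el) return i + 1;
    }
    return 1;
}

// True when some other sibling carries the same class name.
function hasTwin(el: SiblingNode, className: string): boolean {
    if (!el.parentElement) return false;
    const siblings = el.parentElement.children;
    for (let i = 0; i < siblings.length; i++) {
        const sibling = siblings[i];
        if (sibling === el) continue;
        for (let j = 0; j < sibling.classList.length; j++) {
            if (sibling.classList[j] === className) return true;
        }
    }
    return false;
}

// Add positional context only when the class alone is ambiguous.
function disambiguate(className: string, el: SiblingNode): string {
    const base = '.' + className;
    return hasTwin(el, className) ? `${base}:nth-child(${childPosition(el)})` : base;
}
```

Adding the position only on collision keeps the common case as readable as the current output.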

Non-visible elements. The inspector only works on elements you can hover over. Overflow-hidden content, elements behind modals, or elements that only appear on hover are not reachable. This is a fundamental limitation of the point-and-click approach.

Formatting options. The output format is plain text, optimized for pasting into a chat. Some users might prefer JSON, or a format that includes the full CSS path, or one that includes computed styles. We have not added output format options yet.

What We Learned

  1. The context gap is the bottleneck. In vibe coding, the slowest step is not the LLM's response time or the code generation quality. It is the human describing what they want precisely enough for the LLM to act on it. Any tool that reduces the description overhead directly increases vibe coding speed.

  2. Silent DevTools is an underused pattern. Most Electron apps that open DevTools do it visibly — either docked or in a separate window. Keeping it hidden but functional gives you programmatic access to the inspector, the console, the network panel, and the profiler without any UI cost. We use it for four different features (F8 inspector, Cmd+Shift+X copy, Cmd+Shift+E toggle, Cmd+Shift+F12 show/hide).

  3. Parent chain beats flat selectors. A selector like .deploy-btn is ambiguous. A selector like .deploy-btn (in .header-actions) is not. The four-level parent walk with semantic filtering (looking for meaningful container classes like -item, -card, -tab) is simple to implement and, in our experience, makes the LLM far more likely to pick the right element.

  4. Race conditions in window lifecycle are subtle. Two independent code paths opening DevTools 220ms apart, with asynchronous close events from the first window clearing refs for the second — this is the kind of bug that works 80% of the time and fails silently the other 20%. Diagnostic event logs with timestamps caught it instantly. Without the /dev/devtools-debug endpoint, we would have been guessing for hours.

  5. The ecosystem is fragmented. Six Chrome extensions, two YC startups, an open MCP issue. Everyone sees the problem. Nobody has combined the pieces into a zero-install, global-shortcut, parent-chain-aware solution that works outside the browser. The demand is real and growing as vibe coding becomes the default way people build software.

Two keyboard shortcuts. Fifty lines of code. Your clipboard knows exactly which button you are looking at. That is the kind of leverage that makes vibe coding actually work.
