I wanted to start sending some of my screen caps over to an LLM (securely of course) and realized it’s really easy to do on macOS from the CLI. The trick is to enable Terminal to capture the screen, and to learn how to easily grab the window ID for a running process.
First Step
There’s a command I didn’t know about called screencapture, which you invoke from the CLI with a destination filename (it will overwrite the file if it already exists). You can give it a variety of image format extensions.
screencapture foo.png
It won’t work unless you have granted Terminal the Screen Recording permission in Privacy settings. Go to Settings > Privacy & Security > Screen Recording (named Screen & System Audio Recording on newer macOS versions)
and then hit “+” to add Terminal (it’s inside Utilities within your Applications folder).
If you want to do the thing where you select an area of the screen in interactive mode or to capture a full window with spacebar, just do this:
screencapture -i interactive.png
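Since screencapture overwrites an existing file without asking, a timestamped filename can keep repeated captures from clobbering each other. Here is a minimal sketch; shot_name is a hypothetical helper I made up for illustration, not something that ships with screencapture.

```shell
# shot_name [PREFIX]: print a timestamped PNG filename such as
# screen-20250101-093000.png. Hypothetical helper, not part of screencapture.
shot_name() {
  local prefix="${1:-screen}"
  printf '%s-%s.png\n' "$prefix" "$(date +%Y%m%d-%H%M%S)"
}
```

You would then run something like screencapture "$(shot_name)" or screencapture -i "$(shot_name region)".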
Now if you want to capture a window, you’ll need to do a little more work.
Second Step
You want to install an old repo that seems to still work — it helps you get the WindowID parameter you need to capture a specific window from the CLI.
brew install smokris/getwindowid/getwindowid
To see where it got installed type
which getwindowid
Hopefully that’s in your path and you can just do:
getwindowid Terminal --list
And assuming you have Terminal open, you can see the names of its different windows. But most importantly you can see, for example, an id=3233 field. Next do (that’s a lowercase “L” flag)
screencapture -l 3233 terminal.png
That will save the window. Cool, huh?
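If you would rather not copy the id by hand, you can pull it out of the listing with sed. This is only a sketch: it assumes each line of getwindowid --list output carries a field of the form id=<digits>, as in the id=3233 example above, and window_id_from_line is a hypothetical helper name.

```shell
# window_id_from_line: read listing lines on stdin and print the numeric
# window id from each one.
# Assumption: every line has an id=<digits> field; lines without one are skipped.
window_id_from_line() {
  sed -n 's/.*id=\([0-9][0-9]*\).*/\1/p'
}

# Example with a made-up line in the assumed format:
echo '"notes.txt" size=800x600 id=3233' | window_id_from_line
```

With that in place, something like screencapture -l "$(getwindowid Terminal --list | head -n 1 | window_id_from_line)" terminal.png should capture the first listed Terminal window, provided the first line is the window you actually want.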
Third Step
You’re used to seeing all the windows you could possibly screenshare and want to know how to do that. Here’s what you do:
osascript -e 'tell application "System Events" to get name of every application process whose visible is true'
This will give you a comma-delimited list of the visible running applications. Use this script to make it easy for you to match applications to window ids:
#!/bin/bash
# Check if an application name was provided
if [ -z "$1" ]; then
    echo "Usage: $0 <partial_app_name>"
    exit 1
fi
# Get the list of visible applications from osascript
apps=$(osascript -e 'tell application "System Events" to get name of every application process whose visible is true')
# Convert the comma-separated list to a line-separated list (preserve spaces in names)
apps_clean=$(echo "$apps" | tr ',' '\n' | sed 's/^ *//;s/ *$//')
# Filter the application names based on the input (case-insensitive partial match)
matched_app=$(echo "$apps_clean" | grep -i "$1")
# Debugging: Show all matched applications
if [ -n "$matched_app" ]; then
    echo "Matched applications:"
    echo "$matched_app"
else
    echo "No matching applications found for \"$1\"."
    exit 1
fi
# Fetch and display window information for all matches
while read -r app; do
    echo "=== Checking windows for \"$app\" ==="
    echo "Using application name: \"$app\""
    # Query getwindowid with the matched application name
    output=$(getwindowid "$app" --list 2>/dev/null | grep -v '^""')
    if [ -n "$output" ]; then
        echo "$output"
    else
        echo "No windows found for \"$app\"."
    fi
done <<< "$matched_app"
Store this as findw.sh
and then install it with
sudo mv findw.sh /usr/local/bin/findw
chmod +x /usr/local/bin/findw
This lets you do things like
findw outlook
findw microsoft
This gives you the related window ids (and also filters out any windows with an empty title “”).
Fourth Step
Use it with the LLM CLI and enjoy life. Start by copying the window capture to the clipboard with the -c flag
screencapture -l 14934 -c
Now you can send it to gpt-4o easily enough by installing pngpaste
brew install pngpaste
And then just
pngpaste - | llm "describe this image" -a -
If you want to do it all in one incantation
(screencapture -l 14934 -c && pngpaste -) | llm "describe this image" -a -
You can make this into a simple command by opening your shell resource file
vi ~/.zshrc
And add
describew() {
    # Capture the window to the clipboard, then pipe the PNG to llm via stdin ("-a -")
    screencapture -l "$1" -c && pngpaste - | llm "describe this image" -a -
}
Then reload your shell config by running
source ~/.zshrc
And then
describew 14934
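The clipboard route overwrites whatever you had copied. If that bothers you, here is a sketch of a variant that writes the capture to a temporary file and attaches that file instead, with an optional custom prompt. describew_file is a hypothetical name, and the sketch relies on llm accepting a file path after -a, the same way it accepts - for stdin.

```shell
# describew_file WINDOW_ID [PROMPT]: capture a window to a temporary PNG and
# attach that file to llm, leaving the clipboard untouched.
# Hypothetical variant of describew; the default prompt matches the original.
describew_file() {
  local id="$1" prompt="${2:-describe this image}" tmp rc
  tmp="$(mktemp -d)/window.png"
  screencapture -l "$id" "$tmp" && llm "$prompt" -a "$tmp"
  rc=$?
  # Clean up the temporary directory whether or not the capture succeeded.
  rm -rf "$(dirname "$tmp")"
  return $rc
}
```

Usage would look like describew_file 14934 or describew_file 14934 "summarize the error shown in this window".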