Most ways of wiring an AI model into your own shortcuts mean building raw API calls, juggling headers, and parsing JSON by hand. Grok AI Nano skips all of that. It’s a drop-in action that hands xAI’s Grok models a dictionary and returns a tidy result you can feed straight into the rest of your workflow.
How it works
Think of Nano as a replacement for the old “Ask ChatGPT” action, except it talks to Grok. You pass it a Dictionary with a text, image, imageGen, or videoGen key, and it returns a result that matches the mode you asked for. Text comes back as a dictionary that also tracks token usage and cost; generated images and videos come back as the file itself.
Under the hood it leans on the text and vision models’ agentic tools: web_search for fresh information, x_search for real-time X posts, and code_interpreter for running Python on the fly.
What you’ll need
- Grok AI Chat: the companion shortcut that stores your key, defaults, and tool settings. Nano won’t run without it set up in API mode.
- An xAI API key: generate one at console.x.ai. It needs access to at least one text, image, or video model.
- iCloud Drive: where your settings and key actually live, handled through Grok AI Chat.
- An internet connection: every request goes out to xAI over Wi-Fi or cellular.
Adding it to your iPhone
- Install Grok AI Chat first and switch it to API mode.
- Tap Add Shortcut on this page to open Grok AI Nano in the Shortcuts app.
- When the import sheet appears, scroll to the bottom and tap Add Shortcut to confirm.
First-run setup
Before Nano can do anything useful, Grok AI Chat has to be configured. Open it, choose API mode, paste your key from console.x.ai, and pick the default models you want for text, image, and video. One thing worth knowing: the API charges for usage, so you’ll need a payment method on file in the xAI console before your first request goes through. Once Chat is holding your key and defaults, Nano reads them automatically and you never touch the key again.
Using it day-to-day
The pattern is always the same: build a Dictionary, then run Nano with it as the input. For a question, add a text key with your prompt, then read the answer back with a Get Dictionary Value action set to response. Want the cost of that call? Pull usage.cost_in_usd from the same result.
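To make the shape of that exchange concrete, here is a minimal Python sketch that models the dictionaries outside of Shortcuts. The key names text, response, and usage.cost_in_usd come from the description above; the get_value helper, the prompt, and the sample result values are hypothetical and only stand in for what a real run would return.

```python
from functools import reduce

def get_value(result, key_path):
    """Follow a dotted key path such as 'usage.cost_in_usd' through nested dicts."""
    return reduce(lambda node, key: node[key], key_path.split("."), result)

# Input dictionary: a single "text" key holding the prompt.
request = {"text": "Summarize today's top tech headline in one sentence."}

# Hypothetical result, shaped the way the article describes a text reply.
# The values are placeholders, not real output.
sample_result = {
    "response": "Placeholder answer text.",
    "usage": {"cost_in_usd": 0.0012},
}

answer = get_value(sample_result, "response")
cost = get_value(sample_result, "usage.cost_in_usd")
print(answer)
print(f"Cost: ${cost:.4f}")
```

In the shortcut itself, the two get_value calls correspond to two Get Dictionary Value actions pointed at the same Nano result.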
Swapping modes is just swapping keys. Use imageGen instead of text to generate a picture, or videoGen to make a five-second clip. Add an image key holding Base64-encoded image data (up to three images for text and vision, one for editing) and Grok will analyze or transform it. To override the default model for a single call, add a model key set to a name such as grok-4.3 or grok-imagine-image-quality.
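The key-swapping idea can be sketched in Python as a small builder. This is an illustration under assumptions, not Nano's actual internals: build_request is a hypothetical helper, the sketch attaches only a single image, and it assumes the image key sits alongside the prompt key in the same dictionary.

```python
import base64

# Prompt keys named in the article; exactly one carries the prompt.
PROMPT_KEYS = ("text", "imageGen", "videoGen")

def build_request(mode_key, prompt, image_bytes=None, model=None):
    """Build an input dictionary for one call (hypothetical helper)."""
    if mode_key not in PROMPT_KEYS:
        raise ValueError(f"unknown mode key: {mode_key}")
    request = {mode_key: prompt}
    if image_bytes is not None:
        # The image key holds Base64-encoded data, not raw bytes.
        request["image"] = base64.b64encode(image_bytes).decode("ascii")
    if model is not None:
        # Per-call override of the default model.
        request["model"] = model
    return request

question = build_request("text", "What is in this picture?", image_bytes=b"hi")
picture = build_request("imageGen", "A lighthouse at dusk",
                        model="grok-imagine-image-quality")
print(question)
print(picture)
```

In Shortcuts, the Base64 step is the built-in Base64 Encode action applied to the image before it goes into the Dictionary.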
Tips
Video generation takes noticeably longer than text or images, so don’t assume the shortcut has stalled. If you’re animating a still and can’t think of a direction, give it something plain like “make a video out of this image” anyway, because Grok tends to fail on an empty instruction. And when a call goes sideways, check for an error key in the output before assuming your dictionary was wrong.
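Both habits from the tips, falling back to a plain instruction and checking for an error key first, can be sketched in Python. The function names are hypothetical, and the sketch assumes the error key appears at the top level of a dictionary result; the exact error format is not documented here.

```python
# Plain fallback instruction suggested in the tips above.
DEFAULT_VIDEO_PROMPT = "make a video out of this image"

def video_prompt(instruction):
    """Return the instruction, or the plain fallback if it is empty or blank."""
    text = (instruction or "").strip()
    return text or DEFAULT_VIDEO_PROMPT

def check_result(result):
    """Raise if the result carries an error key; otherwise pass it through.

    Assumes errors arrive as a top-level "error" key in a dictionary result.
    """
    if isinstance(result, dict) and "error" in result:
        raise RuntimeError(f"Grok AI Nano reported an error: {result['error']}")
    return result

print(video_prompt("   "))
print(check_result({"response": "ok"}))
```

In a shortcut, the equivalent is an If action that tests whether the error key has any value before the rest of the workflow reads response.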
FAQ
Do I really need a second shortcut just to use this?
Yes, and there’s a reason for it. Grok AI Chat handles key storage and model defaults so Nano can stay tiny and focused. Set up Chat once and you can call Nano from as many of your own shortcuts as you like.
Is the API actually free?
The shortcut is, but the Grok API isn’t. You pay xAI per request based on tokens and the model you pick, which is why Nano reports cost_in_usd on every text call so you can keep an eye on spend.
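If you want a running total rather than a per-call figure, the arithmetic is a simple sum over the text results. The sketch below is Python with made-up placeholder costs; total_cost is a hypothetical helper, and it treats any result without a usage block (for example an image file) as costing nothing, which is an assumption rather than documented behavior.

```python
def total_cost(results):
    """Sum usage.cost_in_usd across results, skipping entries without it."""
    total = 0.0
    for result in results:
        if isinstance(result, dict):
            total += result.get("usage", {}).get("cost_in_usd", 0.0)
    return total

# Placeholder results; the numbers are illustrative, not real prices.
session = [
    {"response": "first answer", "usage": {"cost_in_usd": 0.001}},
    {"response": "second answer", "usage": {"cost_in_usd": 0.002}},
    {"error": "request failed"},  # no usage block, counted as zero
]

print(f"Session total: ${total_cost(session):.4f}")
```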
What happened to the older Grok models?
Nano only works with the current grok-4 family. The grok-3 series and anything older are gone, and xAI now reroutes requests for retired models straight to grok-4.3.