v0.1.0 · MIT · World-first

One API call. Eight assets.

The first Claude Code plugin that generates AI images — from your terminal. Claude reads your brief, writes the prompt, dispatches one well-chosen call, then derives every variant you need via sharp.

Step 1 — Add from marketplace
/plugin marketplace add Sakaax/img-pilot
Step 2 — Install
/plugin install img-pilot@img-pilot
Logo generator: icon · text · combo
Favicon set: 16 → 512 + webmanifest
Social cards: OG · Twitter · Discord
GitHub banner: 1280 × 640
App icons: iOS · Android
9 providers: plug your own key
At a glance

Cost-optimized by design.

One generated logo (1024 × 1024) becomes eight brand assets via sharp. You pay $0.04. You don't pay $0.32. The math is in the decision tree.
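The arithmetic behind those two figures can be sketched as follows. This is only an illustration using the numbers quoted above ($0.04 per generated image, eight assets); the function names are hypothetical and not part of img-pilot.

```typescript
// Figures quoted on this page; amounts are in cents to avoid floating-point drift.
const PRICE_PER_CALL_CENTS = 4;
const ASSET_COUNT = 8;

// Naive approach: one paid API call per asset.
function costPerAssetApproach(assets: number, priceCents: number): number {
  return assets * priceCents;
}

// Derivation approach: one paid call for the source image, the rest resized locally.
function costSingleSourceApproach(priceCents: number): number {
  return priceCents;
}

const naiveCents = costPerAssetApproach(ASSET_COUNT, PRICE_PER_CALL_CENTS); // 32
const derivedCents = costSingleSourceApproach(PRICE_PER_CALL_CENTS);        // 4
const savedCents = naiveCents - derivedCents;                               // 28
```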

1 API call
8 derived assets
9 providers supported
120 tests, all mocked
0 telemetry
Why img-pilot

Every AI image tool lives in a browser tab.

You context-switch, you prompt, you download, you drag into your project, you repeat for every variant. img-pilot lives inside Claude Code — the terminal you're already in — and makes Claude the art director.

Aspect | Browser tool | img-pilot
Prompt quality | User guesses at specifics | Claude builds from full UX + brand context
API calls per set | One per asset (5+ for full set) | One source → 8+ derived assets via sharp
Cost for full set | $0.20 – $0.40 | $0.03 – $0.08
Consistency | Each asset designed in isolation | All derived from same source, guaranteed
Favicon set | Forgotten or default | Auto-derived, all sizes, webmanifest
Provider lock-in | Hardcoded to one | 9 providers, one config line
API key safety | Hope you gitignored | Auto-gitignore + chmod 600 + pre-commit hook
How it works

Four steps. One call.

Every command runs through the same decision tree. The full plan is shown in chat before any paid API call. You approve, you pay.

01

Claude reads your context

Priority order: brand-pilot/tokens.css + palette cache + Tailwind snippet → ux-pilot/ux-brief.md → img-pilot/brief.md. If nothing exists, img-pilot runs a 6-question discovery and saves img-pilot/brief.md for future runs. You never repeat yourself across plugins.
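A minimal sketch of that priority lookup, assuming each source can be detected by a single file. The file paths are the ones named above; the type, the function name, and the injected exists callback are hypothetical and only there to keep the example free of disk access.

```typescript
type ContextSource = "brand-pilot" | "ux-pilot" | "img-pilot" | "discovery";

// Candidate files in priority order (paths as named on this page).
const CONTEXT_CANDIDATES: ReadonlyArray<{ source: ContextSource; path: string }> = [
  { source: "brand-pilot", path: "brand-pilot/tokens.css" },
  { source: "ux-pilot", path: "ux-pilot/ux-brief.md" },
  { source: "img-pilot", path: "img-pilot/brief.md" },
];

// `exists` is injected so the lookup can be exercised without a real project tree.
function resolveContext(exists: (path: string) => boolean): ContextSource {
  for (const candidate of CONTEXT_CANDIDATES) {
    if (exists(candidate.path)) return candidate.source;
  }
  // Nothing found: fall back to the 6-question discovery.
  return "discovery";
}
```

In a real project the callback would typically wrap a file-existence check; here it can be any predicate.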

02

Claude builds the prompt

200–300 word production-quality prompts. Exact hex values, specific style words from the brief, explicit constraints (works at 16×16, transparent background, flat design), and a curated anti-slop list (no gradients on logo marks, no stock aesthetics, no generic tech clichés). The full prompt is shown in chat before anything is dispatched.

03

The cost optimizer picks the cheapest path

Checked in order, first match wins:
Exists already? The skill asks "regenerate?"
SVG-pure viable? Zero cost, no API.
Derivable from an existing asset? Resize / compose / round corners, zero cost.
Derivable from another asset in this run's plan? One API call + N sharp derivations.
Otherwise: API call with prompt + provider + cost, emitted as a plan step.
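The decision tree above could be expressed roughly like this. It is a sketch under assumptions: the field names, step kinds, and function name are hypothetical, and only the order of the checks comes from the description above.

```typescript
type StepKind =
  | "ask-regenerate"
  | "svg-pure"
  | "derive-existing"
  | "derive-in-plan"
  | "api-call";

interface PlanStep {
  kind: StepKind;
  costUsd: number;
}

// Hypothetical facts the optimizer would know about one requested asset.
interface AssetState {
  alreadyExists: boolean;
  svgPureViable: boolean;
  derivableFromExisting: boolean;
  derivableFromPlannedAsset: boolean;
}

// Walk the checks in order; only the final branch costs money.
function cheapestPath(state: AssetState, pricePerCallUsd: number): PlanStep {
  if (state.alreadyExists) return { kind: "ask-regenerate", costUsd: 0 };
  if (state.svgPureViable) return { kind: "svg-pure", costUsd: 0 };
  if (state.derivableFromExisting) return { kind: "derive-existing", costUsd: 0 };
  if (state.derivableFromPlannedAsset) return { kind: "derive-in-plan", costUsd: 0 };
  return { kind: "api-call", costUsd: pricePerCallUsd };
}
```

Under this sketch, a favicon requested alongside a logo in the same run resolves to derive-in-plan at zero cost, while the logo itself falls through to the single api-call step.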

04

Gallery

Every run updates img-pilot/gallery.html — a persistent dark-themed audit log of every asset. Each card shows the image, the prompt used, the provider, the cost, dimensions, and timestamp. Latest run on top. Browse on localhost with --serve.

The six commands

Six commands. One install.

Every command runs standalone or as part of the guided flow via /img-pilot. Outputs land in img-pilot/ at your project root (auto-gitignored).

/img-pilot logo
Logo mark
Icon · text · combo variants. 1024 × 1024 source, used as the seed for every derivation downstream.
/img-pilot favicon
Full favicon set
16 / 32 / 180 / 192 / 512 px + apple-touch-icon + site.webmanifest. Derived from the logo at zero API cost.
/img-pilot social
OG + Twitter + Discord
1200 × 630 · 1200 × 628 · 1280 × 720. Logo composed on brand background with exact palette applied. Zero API cost when deriving.
/img-pilot banner
GitHub banner
1280 × 640. Logo + tagline composed over brand background. Ready for .github/banner.png.
/img-pilot icons
iOS + Android app icons
180 / 192 / 512 px with proper rounded corners via SVG mask. Matches each platform's visual expectations.
/img-pilot config
First-time setup
Interactive config flow. Auto-gitignore + chmod 600 + pre-commit hook installed before any key is written.
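As an illustration of the favicon command, the sketch below builds a derivation plan for the sizes listed above and a matching web manifest. The output file names and both function names are assumptions, not img-pilot's actual output; the pixel work itself (done with sharp in img-pilot) is left out so the example stays dependency-free.

```typescript
// Sizes listed for /img-pilot favicon on this page.
const FAVICON_SIZES = [16, 32, 180, 192, 512] as const;

interface DerivedIcon {
  file: string; // hypothetical output file name
  size: number; // square edge length in px
}

// One entry per size; 180 px is treated as the apple-touch-icon.
function faviconPlan(): DerivedIcon[] {
  return FAVICON_SIZES.map((size) => ({
    file: size === 180 ? "apple-touch-icon.png" : `favicon-${size}x${size}.png`,
    size,
  }));
}

// Build a minimal manifest that lists the 192 px and 512 px icons.
function webManifest(name: string, icons: DerivedIcon[]): string {
  const manifestIcons = icons
    .filter((icon) => icon.size >= 192)
    .map((icon) => ({
      src: `/${icon.file}`,
      sizes: `${icon.size}x${icon.size}`,
      type: "image/png",
    }));
  return JSON.stringify({ name, icons: manifestIcons }, null, 2);
}
```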
Providers

Nine providers. No lock-in.

Thin adapters, same interface. Plug your own key. Switch per run with --provider <name>. Cost estimates shown before every call.

Provider | Best for | ~$/image
OpenAI | General quality baseline (GPT Image 1.5) | $0.04
Black Forest Labs | Photorealism, sharp edges (FLUX 2 Pro) | $0.04
Google Imagen | Best value, strong text rendering (Vertex AI) | $0.04
Stability AI | Dev-friendly, self-hostable path (SD 3.5) | $0.03
Ideogram | Text-in-image: logos, wordmarks (v3) | $0.08
Leonardo AI | Custom models, brand fine-tuning | $0.035
Replicate | Access to any open-source model | variable
Recraft | Native SVG output: icons (v3) | $0.04
fal.ai | Ultra-fast, async webhooks | variable
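One way to picture "thin adapters, same interface" is the sketch below. The interface shape, the registry, and the lowercase provider keys are assumptions for illustration; the adapters are mocks that never touch a real provider API, and the prices are the approximate per-image figures listed above.

```typescript
interface GenerateRequest {
  prompt: string;
  width: number;
  height: number;
}

// Hypothetical common interface every provider adapter would implement.
interface ProviderAdapter {
  readonly name: string;
  estimateCostUsd(request: GenerateRequest): number;
  generate(request: GenerateRequest): Promise<Uint8Array>;
}

// Mock adapter: returns empty bytes instead of calling a provider.
function mockAdapter(name: string, pricePerImageUsd: number): ProviderAdapter {
  return {
    name,
    estimateCostUsd: () => pricePerImageUsd,
    generate: async () => new Uint8Array(0),
  };
}

const providerRegistry = new Map<string, ProviderAdapter>([
  ["openai", mockAdapter("openai", 0.04)],
  ["ideogram", mockAdapter("ideogram", 0.08)],
]);

// Resolve the adapter named by a --provider <name> style argument.
function pickProvider(name: string): ProviderAdapter {
  const adapter = providerRegistry.get(name);
  if (!adapter) throw new Error(`Unknown provider: ${name}`);
  return adapter;
}
```

Because callers only see ProviderAdapter, switching providers is a lookup by name rather than a code change.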
The pilot ecosystem

Works with ux-pilot + brand-pilot.

img-pilot is the third plugin in the Sakaax pilot family. It reads both sister plugins' outputs automatically — palette, fonts, tone, style, validated design tokens — and inherits every design decision. Zero reconfiguration.

ux-pilot     → UX discovery + brief
brand-pilot  → Brand tokens (CSS + Tailwind + palette)
img-pilot    → AI-generated visual assets

Same typography. Same palette. Same voice. The pilot plugins share one visual identity on purpose — when you install a second or third one, you already know where everything lives. Consistency at the ecosystem level is the same discipline img-pilot enforces at the asset level.

Cost & security

Built to not burn your credit card.

API keys are protected at three layers. Every call is confirmed before it costs money. Zero telemetry, zero outbound traffic you didn't trigger.

Cost transparency

Dry-run first. CLI emits a JSON plan with asset list, provider, full prompt preview, and total cost. Shown in chat before anything is dispatched.
Explicit confirm. Claude Code waits for your approval. No path in the code fires an API call without --confirm.
Hard session limit. max_api_calls_per_session = 5 by default. Configurable. Enforced in the CLI, not just the skill.
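The confirm gate and session limit could be enforced along these lines. This is a sketch, not img-pilot's code: the class and method names are hypothetical, and only the default of 5 calls and the requirement for explicit confirmation come from the text above.

```typescript
class SessionBudget {
  private used = 0;
  private readonly maxCalls: number;

  // Default mirrors max_api_calls_per_session = 5.
  constructor(maxCalls: number = 5) {
    this.maxCalls = maxCalls;
  }

  // Must be called before every paid API call; throws instead of dispatching.
  authorize(confirmed: boolean): void {
    if (!confirmed) {
      throw new Error("Not confirmed: refusing to dispatch a paid call");
    }
    if (this.used >= this.maxCalls) {
      throw new Error(`Session limit of ${this.maxCalls} API calls reached`);
    }
    this.used += 1;
  }

  get remaining(): number {
    return this.maxCalls - this.used;
  }
}
```

Putting the check in one place means a rejected or unconfirmed call never consumes budget.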

Security triple-layer

Auto-gitignore. img-pilot/ is appended to .gitignore before any config write. Write aborts if the check fails.
chmod 600. config.toml is set to owner read/write only immediately after write (POSIX; icacls equivalent on Windows).
Pre-commit hook. Installed in .git/hooks/pre-commit, scans staged files for sk-… / AIza… / key-… / api_key = "…" and blocks matching commits.
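The kind of scan the pre-commit hook performs might look like the sketch below. The regular expressions are assumptions modeled on the prefixes listed above (the exact patterns and minimum lengths img-pilot uses are not stated here), and the function name is hypothetical.

```typescript
// Assumed patterns; the real hook's expressions may differ.
const SECRET_PATTERNS: ReadonlyArray<{ label: string; pattern: RegExp }> = [
  { label: "sk- prefix", pattern: /\bsk-[A-Za-z0-9_-]{16,}/ },
  { label: "AIza prefix", pattern: /\bAIza[A-Za-z0-9_-]{20,}/ },
  { label: "key- prefix", pattern: /\bkey-[A-Za-z0-9]{16,}/ },
  { label: "api_key assignment", pattern: /api_key\s*=\s*"[^"]+"/ },
];

// Return the labels of every pattern found in the staged file content.
function findSecrets(content: string): string[] {
  return SECRET_PATTERNS
    .filter(({ pattern }) => pattern.test(content))
    .map(({ label }) => label);
}

// A hook would block the commit when any staged file yields a non-empty result.
function shouldBlockCommit(stagedContents: string[]): boolean {
  return stagedContents.some((content) => findSecrets(content).length > 0);
}
```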
FAQ

Still wondering?

How does img-pilot keep costs down?
One generated logo (1024 × 1024) becomes the seed for every derivation. Favicon 16/32, apple-touch-icon, OG image, Twitter card, Discord embed, GitHub banner, iOS/Android icons — all derived via sharp at zero API cost. You pay for one call, you get eight assets. The cost optimizer runs this decision tree before every request and always proposes the cheapest path.
Which providers should I start with?
If you want the quickest setup: OpenAI (paste an API key, done). If you want the best logo with text: Ideogram (higher per-image cost but worth it for wordmarks). If you want open-source access to every model: Replicate. If you want ultra-fast async: fal.ai. You can configure multiple and switch per run with --provider <name>.
What happens to my API keys?
They never leave your machine except to hit the provider URL you've picked. Zero telemetry, zero feature flags, zero remote rule fetching. tcpdump will show you the only outbound traffic is the provider call you explicitly approved. On disk, keys live in config.toml with chmod 600 and the file is gitignored before it's even written. A pre-commit hook blocks any accidental commit of common key patterns.
Can I use it without ux-pilot or brand-pilot?
Yes. If neither exists, img-pilot runs a 6-question discovery (ABCD choices + free text) and saves an img-pilot/brief.md you can reuse on future runs. The ecosystem is additive — more briefs mean richer prompts, but one is enough.
Does it work with Midjourney?
Not in v0.1. Midjourney has no official REST API (Discord-only), and the workaround adapters are fragile and in a legal grey zone. When Midjourney ships an API, we'll add an adapter the same day. In the meantime, FLUX 2 Pro via Black Forest Labs or Replicate gets you comparable quality.
What's on the roadmap?
Self-hosted inference (local Stable Diffusion via Ollama/ComfyUI). Batch mode (N variants of the same prompt). Fine-tuning and style-reference upload (LoRA, IP-adapter). CI wrapper to regenerate assets when the brief changes. Provider benchmarking (same prompt across 3 providers, side-by-side). Live pricing from provider APIs at dry-run time. Follow the repo for progress.