v0.1.0 · MIT · World-first

One API call. Eight assets.

Name: img-pilot
Author: Sakaax

The first Claude Code plugin that generates AI images — from your terminal. Claude reads your brief, writes the prompt, dispatches one well-chosen call, then derives every variant you need via sharp.

Step 1 — Add from marketplace

/plugin marketplace add Sakaax/img-pilot

Step 2 — Install

/plugin install img-pilot@img-pilot


Logo generator: icon · text · combo

Favicon set: 16 → 512 + webmanifest

Social cards: OG · Twitter · Discord

GitHub banner: 1280 × 640

App icons: iOS · Android

9 providers: plug your own key

Why img-pilot

Every AI image tool lives in a browser tab.

You context-switch, you prompt, you download, you drag into your project, you repeat for every variant. img-pilot lives inside Claude Code — the terminal you're already in — and makes Claude the art director.

	Browser tool	img-pilot
Prompt quality	User guesses at specifics	Claude builds from full UX + brand context
API calls per set	One per asset (5+ for full set)	One source → 8+ derived assets via sharp
Cost for full set	$0.20 – $0.40	$0.03 – $0.08
Consistency	Each asset designed in isolation	All derived from same source, guaranteed
Favicon set	Forgotten or default	Auto-derived, all sizes, webmanifest
Provider lock-in	Hardcoded to one	9 providers, one config line
API key safety	Hope you gitignored	Auto-gitignore + chmod 600 + pre-commit hook

How it works

Four steps. One call.

Every command runs through the same decision tree. The full plan is shown in chat before any paid API call. You approve, you pay.

Claude reads your context

Priority order: brand-pilot/tokens.css + palette cache + Tailwind snippet → ux-pilot/ux-brief.md → img-pilot/brief.md. If nothing exists, img-pilot runs a 6-question discovery and saves img-pilot/brief.md for future runs. You never repeat yourself across plugins.
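
The lookup order above can be sketched as a small pure function. This is a minimal illustration, not the plugin's actual code: the three file paths come from the text, while `findContextSources`, `needsDiscovery`, and the injected `exists` callback are hypothetical names. Whether img-pilot merges every source it finds or stops at the first one is not stated here, so the sketch simply returns all available sources in priority order.

```typescript
// Hypothetical sketch: paths are from the text above, names are assumptions.
interface ContextCandidate {
  source: "brand-pilot" | "ux-pilot" | "img-pilot";
  path: string;
}

// Priority order as described: brand tokens, then the UX brief,
// then img-pilot's own saved brief.
const CONTEXT_PRIORITY: readonly ContextCandidate[] = [
  { source: "brand-pilot", path: "brand-pilot/tokens.css" },
  { source: "ux-pilot", path: "ux-pilot/ux-brief.md" },
  { source: "img-pilot", path: "img-pilot/brief.md" },
];

// `exists` is injected so the logic can be exercised without touching disk.
function findContextSources(exists: (path: string) => boolean): ContextCandidate[] {
  return CONTEXT_PRIORITY.filter((candidate) => exists(candidate.path));
}

// When no source is available, the 6-question discovery would run instead.
function needsDiscovery(found: readonly ContextCandidate[]): boolean {
  return found.length === 0;
}
```

In a real project, `exists` would most likely wrap a file-system check relative to the project root.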

Claude builds the prompt

200–300 word production-quality prompts. Exact hex values, specific style words from the brief, explicit constraints (works at 16×16, transparent background, flat design), and a curated anti-slop list (no gradients on logo marks, no stock aesthetics, no generic tech clichés). The full prompt is shown in chat before anything is dispatched.

The cost optimizer picks the cheapest path

Checked in order:

1. Exists already? The skill asks "regenerate?"
2. SVG-pure viable? Zero cost, no API.
3. Derivable from an existing asset? Resize / compose / round corners, zero cost.
4. Derivable from another asset in this run's plan? One API call + N sharp derivations.
5. Otherwise: API call with prompt + provider + cost emitted as a plan step.
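
The decision tree reads naturally as a chain of early returns. The sketch below assumes each check has already been reduced to a boolean; `AssetState`, `PlanStep`, and `cheapestPath` are illustrative names, not the plugin's real types.

```typescript
// Hypothetical sketch of the decision tree; all names are assumptions.
type PlanStep =
  | { kind: "ask-regenerate" }
  | { kind: "svg-pure"; costUsd: 0 }
  | { kind: "derive-existing"; costUsd: 0 }
  | { kind: "derive-in-run"; costUsd: 0 }
  | { kind: "api-call"; provider: string; costUsd: number };

interface AssetState {
  alreadyExists: boolean;
  svgPureViable: boolean;
  derivableFromExistingAsset: boolean;
  derivableFromPlannedAsset: boolean;
}

function cheapestPath(state: AssetState, provider: string, estimatedCostUsd: number): PlanStep {
  // Same order as the text: the first matching branch wins.
  if (state.alreadyExists) return { kind: "ask-regenerate" };
  if (state.svgPureViable) return { kind: "svg-pure", costUsd: 0 };
  if (state.derivableFromExistingAsset) return { kind: "derive-existing", costUsd: 0 };
  // The derivation itself is free; the API call is paid once for the source asset.
  if (state.derivableFromPlannedAsset) return { kind: "derive-in-run", costUsd: 0 };
  return { kind: "api-call", provider, costUsd: estimatedCostUsd };
}
```

Only the last branch costs money, which is why a full set can resolve to a single paid step plus free derivations.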

Gallery

Every run updates img-pilot/gallery.html — a persistent dark-themed audit log of every asset. Each card shows the image, the prompt used, the provider, the cost, dimensions, and timestamp. Latest run on top. Browse on localhost with --serve.

The six commands

Six commands. One install.

Every command runs standalone or as part of the guided flow via /img-pilot. Outputs land in img-pilot/ at your project root (auto-gitignored).

/img-pilot logo

Logo mark

Icon · text · combo variants. 1024 × 1024 source, used as the seed for every derivation downstream.

/img-pilot favicon

Full favicon set

16 / 32 / 180 / 192 / 512 px + apple-touch-icon + site.webmanifest. Derived from the logo at zero API cost.
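
As a rough illustration of what the generated site.webmanifest might contain, the sketch below builds the manifest object from the size list above. The `name` and `icons` members (with `src`, `sizes`, `type`) follow the Web Application Manifest format; the file names, and the choice to list only the 192 and 512 px icons in the manifest, are assumptions rather than img-pilot's documented output.

```typescript
// Sizes from the text above; file names are hypothetical.
const FAVICON_SIZES = [16, 32, 180, 192, 512] as const;

interface ManifestIcon {
  src: string;
  sizes: string;
  type: string;
}

function faviconFileName(size: number): string {
  // Assumption: the 180 px output doubles as the apple-touch-icon.
  return size === 180 ? "apple-touch-icon.png" : `favicon-${size}x${size}.png`;
}

function buildWebManifest(appName: string): { name: string; icons: ManifestIcon[] } {
  // Assumption: only the larger PNGs are listed as manifest icons.
  const icons = FAVICON_SIZES.filter((size) => size >= 192).map((size) => ({
    src: faviconFileName(size),
    sizes: `${size}x${size}`,
    type: "image/png",
  }));
  return { name: appName, icons };
}
```

Passing the result through `JSON.stringify(manifest, null, 2)` would give the text to write to disk.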

/img-pilot social

OG + Twitter + Discord

1200 × 630 · 1200 × 628 · 1280 × 720. Logo composed on brand background with exact palette applied. Zero API cost when deriving.

/img-pilot banner

GitHub banner

1280 × 640. Logo + tagline composed over brand background. Ready for .github/banner.png.

/img-pilot icons

iOS + Android app icons

180 / 192 / 512 px with proper rounded corners via SVG mask. Matches each platform's visual expectations.
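
A rounded-corner mask of this kind can be a single SVG rectangle with `rx`/`ry` set. The sketch below only builds the mask string; `roundedMaskSvg` and the 0.2 radius ratio in the usage note are illustrative assumptions, since the exact radii img-pilot uses per platform are not stated here.

```typescript
// Hypothetical helper: returns an SVG mask with rounded corners.
// radiusRatio is the corner radius as a fraction of the icon size.
function roundedMaskSvg(size: number, radiusRatio: number): string {
  const radius = Math.round(size * radiusRatio);
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}">` +
    `<rect x="0" y="0" width="${size}" height="${size}" rx="${radius}" ry="${radius}"/>` +
    `</svg>`
  );
}

// With sharp, a mask like this is commonly applied by compositing it over the
// resized icon with the "dest-in" blend mode, which keeps only the pixels
// covered by the mask. That step is left out here to keep the sketch
// dependency-free.
```

For example, `roundedMaskSvg(180, 0.2)` yields a 180 px mask with a 36 px corner radius.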

/img-pilot config

First-time setup

Interactive config flow. Auto-gitignore + chmod 600 + pre-commit hook installed before any key is written.

Providers

Nine providers. No lock-in.

Thin adapters, same interface. Plug your own key. Switch per run with --provider <name>. Cost estimates shown before every call.
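
One way to picture "thin adapters, same interface" is a shared TypeScript interface that every provider implements. Everything below is a hypothetical sketch: the interface shape, `createStubAdapter`, and `pickAdapter` are not taken from the img-pilot source, and the stub makes no network calls.

```typescript
// Hypothetical adapter contract; names and fields are assumptions.
interface GenerateRequest {
  prompt: string;
  width: number;
  height: number;
}

interface GenerateResult {
  provider: string;
  costUsd: number;
  image: Uint8Array;
}

interface ProviderAdapter {
  readonly name: string;
  estimateCostUsd(request: GenerateRequest): number;
  generate(request: GenerateRequest): Promise<GenerateResult>;
}

// Stub used for illustration only: returns an empty image.
function createStubAdapter(name: string, costPerImageUsd: number): ProviderAdapter {
  return {
    name,
    estimateCostUsd: () => costPerImageUsd,
    generate: async () => ({ provider: name, costUsd: costPerImageUsd, image: new Uint8Array(0) }),
  };
}

// Mirrors the idea behind --provider <name>: look up by name, fail loudly if unknown.
function pickAdapter(adapters: readonly ProviderAdapter[], requested: string): ProviderAdapter {
  const match = adapters.find((adapter) => adapter.name === requested);
  if (!match) throw new Error(`Unknown provider: ${requested}`);
  return match;
}
```

Because the cost estimate is part of the contract, the caller can show it before any call is made, whichever provider is selected.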

Provider	Best for	~$/image
OpenAI	General quality baseline (GPT Image 1.5)	$0.04
Black Forest Labs	Photorealism, sharp edges (FLUX 2 Pro)	$0.04
Google Imagen	Best value, strong text rendering (Vertex AI)	$0.04
Stability AI	Dev-friendly, self-hostable path (SD 3.5)	$0.03
Ideogram	Text-in-image — logos, wordmarks (v3)	$0.08
Leonardo AI	Custom models, brand fine-tuning	$0.035
Replicate	Access to any open-source model	variable
Recraft	Native SVG output — icons (v3)	$0.04
fal.ai	Ultra-fast, async webhooks	variable

The pilot ecosystem

Works with ux-pilot + brand-pilot.

img-pilot is the third plugin in the Sakaax pilot family. It reads both sister plugins' outputs automatically — palette, fonts, tone, style, validated design tokens — and inherits every design decision. Zero reconfiguration.

ux-pilot     → UX discovery + brief
brand-pilot  → Brand tokens (CSS + Tailwind + palette)
img-pilot    → AI-generated visual assets


Same typography. Same palette. Same voice. The pilot plugins share one visual identity on purpose — when you install a second or third one, you already know where everything lives. Consistency at the ecosystem level is the same discipline img-pilot enforces at the asset level.

Cost & security

Built to not burn your credit card.

API keys are protected at three layers. Every call is confirmed before it costs money. Zero telemetry, zero outbound traffic you didn't trigger.

Cost transparency

Dry-run first. CLI emits a JSON plan with asset list, provider, full prompt preview, and total cost. Shown in chat before anything is dispatched.

Explicit confirm. Claude Code waits for your approval. No path in the code fires an API call without --confirm.

Hard session limit. max_api_calls_per_session = 5 by default. Configurable. Enforced in the CLI, not just the skill.
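
The three guarantees above (dry-run plan, explicit confirm, session limit) can be sketched together. The default of 5 calls comes from the text; the plan shape, `buildPlan`, and `canDispatch` are illustrative assumptions, not the CLI's real JSON schema.

```typescript
// Hypothetical plan shape; field names are assumptions.
interface PlanItem {
  asset: string;
  provider: string | null; // null means derived locally at zero API cost
  costUsd: number;
}

interface DryRunPlan {
  items: PlanItem[];
  apiCalls: number;
  totalCostUsd: number;
}

function buildPlan(items: PlanItem[]): DryRunPlan {
  const apiCalls = items.filter((item) => item.provider !== null).length;
  // Sum in tenths of a cent to avoid floating-point drift on values like 0.035.
  const totalMills = items.reduce((sum, item) => sum + Math.round(item.costUsd * 1000), 0);
  return { items, apiCalls, totalCostUsd: totalMills / 1000 };
}

type DispatchDecision = { ok: true } | { ok: false; reason: string };

function canDispatch(
  plan: DryRunPlan,
  confirmed: boolean,
  callsSoFar: number,
  maxApiCallsPerSession = 5,
): DispatchDecision {
  if (!confirmed) return { ok: false, reason: "plan not confirmed" };
  if (callsSoFar + plan.apiCalls > maxApiCallsPerSession) {
    return { ok: false, reason: "session API call limit reached" };
  }
  return { ok: true };
}
```

A plan with one generated logo and two derived assets would report one API call and a total equal to that single image's price.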

Security triple-layer

Auto-gitignore. img-pilot/ is appended to .gitignore before any config write. Write aborts if the check fails.

chmod 600. config.toml is set to owner read/write only immediately after write (POSIX; icacls equivalent on Windows).

Pre-commit hook. Installed in .git/hooks/pre-commit, scans staged files for sk-… / AIza… / key-… / api_key = "…" and blocks matching commits.
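
To give a feel for the kind of scan such a hook performs, here is a minimal sketch. The four prefixes come from the text above, but the exact regular expressions (character classes, minimum lengths) are assumptions, and `findSecrets` and `filesToBlock` are hypothetical names.

```typescript
// Assumed patterns: the prefixes are from the text, the rest is illustrative.
const SECRET_PATTERNS: readonly { label: string; regex: RegExp }[] = [
  { label: "sk- prefix", regex: /\bsk-[A-Za-z0-9_-]{16,}/ },
  { label: "AIza prefix", regex: /\bAIza[A-Za-z0-9_-]{20,}/ },
  { label: "key- prefix", regex: /\bkey-[A-Za-z0-9]{16,}/ },
  { label: "api_key assignment", regex: /api_key\s*=\s*"[^"]+"/ },
];

// Returns the labels of every pattern found in one file's content.
function findSecrets(content: string): string[] {
  return SECRET_PATTERNS.filter((pattern) => pattern.regex.test(content)).map(
    (pattern) => pattern.label,
  );
}

// Given staged files (path -> content), lists the paths that should block the commit.
function filesToBlock(stagedFiles: Record<string, string>): string[] {
  return Object.keys(stagedFiles).filter((path) => findSecrets(stagedFiles[path] ?? "").length > 0);
}
```

A real hook would feed this the staged file contents and exit non-zero when `filesToBlock` returns anything.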

FAQ

Still wondering?

How does img-pilot keep costs down?

One generated logo (1024 × 1024) becomes the seed for every derivation. Favicon 16/32, apple-touch-icon, OG image, Twitter card, Discord embed, GitHub banner, iOS/Android icons — all derived via sharp at zero API cost. You pay for one call, you get eight assets. The cost optimizer runs this decision tree before every request and always proposes the cheapest path.

Which providers should I start with?

If you want the quickest setup: OpenAI (paste an API key, done). If you want the best logo with text: Ideogram (higher per-image cost but worth it for wordmarks). If you want open-source access to every model: Replicate. If you want ultra-fast async: fal.ai. You can configure multiple and switch per run with --provider <name>.

What happens to my API keys?

They never leave your machine except to hit the provider URL you've picked. Zero telemetry, zero feature flags, zero remote rule fetching. tcpdump will show you the only outbound traffic is the provider call you explicitly approved. On disk, keys live in config.toml with chmod 600 and the file is gitignored before it's even written. A pre-commit hook blocks any accidental commit of common key patterns.

Can I use it without ux-pilot or brand-pilot?

Yes. If neither exists, img-pilot runs a 6-question discovery (ABCD choices + free text) and saves an img-pilot/brief.md you can reuse on future runs. The ecosystem is additive — more briefs mean richer prompts, but one is enough.

Does it work with Midjourney?

Not in v0.1. Midjourney has no official REST API (Discord-only), and the workaround adapters are fragile and in a legal grey zone. When Midjourney ships an API, we'll add an adapter the same day. In the meantime, FLUX 2 Pro via Black Forest Labs or Replicate gets you comparable quality.

What's on the roadmap?

Self-hosted inference (local Stable Diffusion via Ollama/ComfyUI). Batch mode (N variants of the same prompt). Fine-tuning and style-reference upload (LoRA, IP-adapter). CI wrapper to regenerate assets when the brief changes. Provider benchmarking (same prompt across 3 providers, side-by-side). Live pricing from provider APIs at dry-run time. Follow the repo for progress.