
Running a multi-£m Shopify Plus store on Claude Code.

The case study we point to when prospects ask whether AI-native ecommerce operations actually work. This anonymised Shopify Plus retailer is where we built the content verticals, listing verticals, bulk-editing systems, custom Shopify tooling, dashboard surfaces and runbook workflows that now define the Hollow Point operating model.

£10k+/mo: Operating cost removed (conservative baseline)

100s: Hours saved every month (catalogue, content, ops)

7,000+: Variants under active management (bulk edited, repriced, retagged)

200+: SEO content verticals shipped (collections + commercial pages)

§ 00 — OPERATOR BRIEF · case summary

CLIENT: Anonymised UK consumer retail operator (Shopify Plus)

CATALOGUE: ~2,000 products / 7,000+ variants (active lifecycle)

CUSTOMER BASE: 500,000+ accounts

STACK: Shopify Plus + Claude Code + custom MCP (in-house)

SYSTEMS: Content verticals, listing verticals, bulk editing, dashboard, private app (operator-built)

ENGAGEMENT: Operator-built; deployed to clients since (self → service)

§ 01 — CONTEXT

The store.

This is a UK Shopify Plus retail operation with multi-£m annual revenue, 500,000+ customer accounts, around 2,000 products, and 7,000+ variants once you count every flavour, size and configuration. Active SKU lifecycle management is not occasional admin here — new launches, clearance cycles, supplier changes, margin updates, stock changes and compliance-sensitive product data all move constantly.

The store has been running for years. The AI-engineering layer on top of it is more recent: a stack of tools, Claude Code skills, MCP servers, custom dashboards and private Shopify surfaces that turned the daily ops grind into something tractable for a small team.

This is also where our frontier SEO work came from. We were not just writing collection copy. We were building vertical systems: repeatable workflows for finding opportunities, producing content, uploading it safely, linking it internally, measuring it in GSC, and improving it again without adding headcount.

——

§ 02 — BRIEF — TO OURSELVES

We didn’t get hired for this. We’re the operators.

The brief evolved over time but settled on four things we wanted from the AI tooling:

One — replace the merchandising grind. Variant-level decisions across thousands of SKUs every quarter were eating most of the team’s week. Clearance triage, restock prioritisation, slow-mover identification, attribute normalisation. All of it doable by a person, none of it valuable to do by a person.

Two — make content scale linearly with effort, not headcount. SEO needs collection content, product descriptions, FAQ blocks, meta titles and descriptions across hundreds of pages. Doing that by hand limits how often you can refresh. Doing it with a generic AI tool makes it sound generic. We needed brand-tuned output we’d actually publish.

Three — own the tooling. SaaS tools hit a wall when you want them to do something specific to your store. We wanted code we owned, prompts we tuned, and workflows that fit our specific catalogue rather than the lowest common denominator.

Four — make the business context reusable. Every recurring workflow needed to become a runbook, skill or vertical that could be run again by the team. Not hidden in Billy’s head. Not buried in a chat transcript. Captured as a system the business could keep using.

——

§ 03 — WHAT WE BUILT

The stack, component by component.

Catalogue operations engine

A Python agent against the Shopify Admin GraphQL API, used for any workflow involving doing the same thing to hundreds or thousands of variants. First job was a cost-update run — 7,000+ variants updated in one bulk operation, 83% coverage on active products. Since then it’s done bulk metafield writes, meta title and description rewrites, redirect imports, attribute normalisation, and product creation from supplier spec sheets.

If a workflow involves the same operation across the catalogue, this is the engine.
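A minimal sketch of the preparation step for a bulk cost update, assuming the run goes through Shopify's bulk mutation flow, which reads a JSONL file with one set of mutation variables per line. The function name, the input keys ("sku", "inventory_item_id") and the variables shape are illustrative assumptions, not the engine's actual code. Uploading the file and starting the bulk operation are left out.

```python
import json
from decimal import Decimal


def build_cost_update_jsonl(variants, supplier_costs):
    """Return (jsonl_text, coverage) for a bulk cost update.

    variants: list of dicts with hypothetical keys "sku" and "inventory_item_id".
    supplier_costs: dict mapping SKU -> Decimal unit cost.
    """
    lines = []
    for variant in variants:
        cost = supplier_costs.get(variant["sku"])
        if cost is None:
            continue  # no known cost: skip the variant rather than guess
        # One JSON object per line; each line is the variables for one mutation call.
        # The variables shape here is an assumption for illustration.
        lines.append(json.dumps({
            "id": variant["inventory_item_id"],
            "input": {"cost": str(cost)},
        }))
    coverage = len(lines) / len(variants) if variants else 0.0
    return "\n".join(lines), coverage


# Usage with made-up data: three of four variants have a supplier cost.
variants = [
    {"sku": "A-1", "inventory_item_id": "gid://shopify/InventoryItem/1"},
    {"sku": "A-2", "inventory_item_id": "gid://shopify/InventoryItem/2"},
    {"sku": "B-1", "inventory_item_id": "gid://shopify/InventoryItem/3"},
    {"sku": "B-2", "inventory_item_id": "gid://shopify/InventoryItem/4"},
]
costs = {"A-1": Decimal("4.20"), "A-2": Decimal("4.20"), "B-1": Decimal("9.95")}
jsonl_text, coverage = build_cost_update_jsonl(variants, costs)
```

Reporting coverage alongside the file makes it obvious how many variants were skipped for lack of data before anything is written.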

Variant-level merchandising agent

Variant-aware clearance and lifecycle decisions. Pulls 90-day sales velocity per variant, applies a rule that took some painful learning to land on — only clearance-tag a product if every variant is slow, cut only the slow variants if the product is mixed — and writes the changes back through GraphQL.

The first full run did 248 SKU lifecycle decisions in a single day, variant-aware: 47 hardware, 201 e-liquid. Replaces roughly 40 hours of manual triage per week when run regularly.
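The clearance rule above can be sketched as a small pure function. The threshold value, the function name and the action labels are assumptions for illustration; the rule itself (tag the product only if every variant is slow, otherwise cut only the slow variants) is the one described.

```python
def clearance_decision(units_sold_90d, slow_threshold=5):
    """Decide what to do with one product.

    units_sold_90d: dict mapping variant ID -> units sold in the last 90 days.
    slow_threshold: hypothetical cut-off; a variant selling fewer units is "slow".
    """
    slow = sorted(v for v, units in units_sold_90d.items() if units < slow_threshold)
    if not slow:
        return {"action": "keep", "variants": []}
    if len(slow) == len(units_sold_90d):
        # Every variant is slow: the whole product goes to clearance.
        return {"action": "clearance_tag_product", "variants": slow}
    # Mixed product: leave the sellers alone, cut only the slow variants.
    return {"action": "cut_slow_variants", "variants": slow}


# Usage with made-up velocities.
all_slow = clearance_decision({"red": 1, "blue": 0})
mixed = clearance_decision({"red": 40, "blue": 2, "green": 0})
healthy = clearance_decision({"red": 40, "blue": 12})
```

Keeping the decision separate from the GraphQL write makes it easy to review a full run as a list of proposed actions before anything changes in the store.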

SEO content sub-agents

Custom Claude Code agents for collection page rewrites, product descriptions, FAQ generation and meta titles. Each one is brand-voice tuned, fed structured input from Ahrefs and GSC, and writes output that meets our internal SEO checklist — primary keyword in H1 and first paragraph, semantic variations, FAQ blocks where they fit, internal links to sibling collections.

Used across 200+ collections on our own store. The output isn’t perfect — nothing AI writes is — but editing time is roughly 80% less than writing from scratch, and the quality bar is high enough that the published content wouldn’t read as AI-generated to a careful reader.
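Parts of that checklist are mechanical enough to check in code before a human edits the draft. The sketch below covers only the checkable items (keyword placement, variation usage, presence of internal links); the function name and input shape are assumptions, and whether an FAQ block fits remains an editorial call.

```python
def seo_checklist(h1, paragraphs, primary_keyword, variations=(), internal_links=()):
    """Run the mechanical checks on one drafted collection page.

    paragraphs: list of body paragraphs as plain text, in page order.
    variations: semantic variations of the keyword we hope to see in the body.
    internal_links: URLs of sibling collections linked from the draft.
    """
    keyword = primary_keyword.lower()
    first_paragraph = paragraphs[0].lower() if paragraphs else ""
    body = " ".join(paragraphs).lower()
    return {
        "keyword_in_h1": keyword in h1.lower(),
        "keyword_in_first_paragraph": keyword in first_paragraph,
        "variations_used": sum(1 for v in variations if v.lower() in body),
        "has_internal_links": len(internal_links) > 0,
    }


# Usage with a made-up draft.
report = seo_checklist(
    h1="Ceramic Plant Pots",
    paragraphs=[
        "Our ceramic plant pots are glazed by hand.",
        "Browse indoor planters in every size.",
    ],
    primary_keyword="ceramic plant pots",
    variations=["indoor planters", "glazed pots"],
    internal_links=["/collections/plant-stands"],
)
```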

Content and listing verticals

Built vertical workflows for the work that repeats across the store: content verticals for collection SEO, listing verticals for new supplier catalogues, bulk-editing verticals for price and metafield changes, and optimisation verticals for pages already getting impressions in GSC.

This matters because scale is not just “write faster”. The system has to know what data to pull, what checks to run, what fields are safe to change, what needs human approval, and how to leave a traceable output the team can trust.
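One way to encode "what fields are safe to change" and "what needs human approval" is a plain allowlist that every proposed change passes through before a write. The field names and the three buckets below are hypothetical, chosen only to show the shape of the guard.

```python
# Hypothetical policy: which fields a vertical may write without review.
SAFE_FIELDS = {"tags", "seo_title", "seo_description"}
# Hypothetical policy: fields that always wait for a human.
APPROVAL_FIELDS = {"price", "status"}


def triage_changes(changes):
    """Split proposed changes into auto-apply, needs-approval and rejected.

    changes: list of dicts with at least a "field" key.
    """
    result = {"auto": [], "needs_approval": [], "rejected": []}
    for change in changes:
        field = change["field"]
        if field in SAFE_FIELDS:
            result["auto"].append(change)
        elif field in APPROVAL_FIELDS:
            result["needs_approval"].append(change)
        else:
            # Unknown fields are refused by default rather than written.
            result["rejected"].append(change)
    return result


# Usage with made-up changes.
plan = triage_changes([
    {"product": "p1", "field": "seo_title", "value": "New title"},
    {"product": "p1", "field": "price", "value": "19.99"},
    {"product": "p2", "field": "vendor", "value": "Acme"},
])
```

The returned plan doubles as the traceable output: it records what was applied, what is waiting, and what was refused.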

Private Shopify app for a client-specific use case

Designed and built a custom Shopify app from scratch for a specific operating requirement that off-the-shelf apps did not cover cleanly. The app gave the team a dedicated surface inside the Shopify workflow instead of forcing them to manage the process in spreadsheets or generic SaaS tools.

The broader point: when a store has a workflow that is specific, high-frequency, and commercially important, the right answer is sometimes a small private app, not another subscription duct-taped onto the stack.

Operator dashboard

Created dashboard surfaces for the decisions the team actually makes: slow movers, clearance candidates, content gaps, margin issues, stock risk, collection health, and search opportunities. The dashboard sits between the raw data and the operator, turning a messy stack into a weekly control surface.

That dashboard pattern now informs how we scope systems work for other CEOs: the goal is not more data, it is a cockpit for the operating rhythm of the business.

Custom MCP servers across the stack

In-house MCP servers connecting Claude to:

Shopify Admin GraphQL (read + write — variants, metafields, metaobjects, redirects, bulk ops)
Ahrefs (keyword data, ranking, SERP overview, competitor research)
Google Search Console (keywords, pages, performance, anonymised queries)
Sanity (cross-pollinated from the property finance build)
Internal merchandising data store (sales velocity, stock, supplier data)

These let Claude do real work on real systems. The MCP layer is the difference between AI tooling that helps and AI tooling that runs the operation.
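For readers who have not seen MCP, a tool call is a JSON-RPC 2.0 request with the method "tools/call", a tool name and its arguments. The hand-rolled dispatcher below only illustrates that message shape; a real server would normally be built on an MCP SDK, and the tool name, its stand-in data and the helper names are all hypothetical.

```python
import json

TOOLS = {}


def tool(name):
    """Register a function as a callable tool under the given name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("get_variant_velocity")
def get_variant_velocity(variant_id):
    # Stand-in data; a real tool would query the merchandising data store.
    return {"variant_id": variant_id, "units_90d": 3}


def handle_request(raw):
    """Handle one JSON-RPC request string and return the response string."""
    request = json.loads(raw)
    response = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") != "tools/call":
        response["error"] = {"code": -32601, "message": "Method not found"}
        return json.dumps(response)
    params = request.get("params", {})
    fn = TOOLS.get(params.get("name"))
    if fn is None:
        response["error"] = {"code": -32602, "message": "Unknown tool"}
        return json.dumps(response)
    result = fn(**params.get("arguments", {}))
    # Tool output goes back as a list of content items.
    response["result"] = {
        "content": [{"type": "text", "text": json.dumps(result)}],
        "isError": False,
    }
    return json.dumps(response)


# Usage: the kind of request a client would send on Claude's behalf.
reply = json.loads(handle_request(json.dumps({
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "get_variant_velocity", "arguments": {"variant_id": "v1"}},
})))
```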

Claude Code skills library

Authored skills for every recurring workflow — collection creation and content upload, Shopify product listing imports, weekly merchandiser reporting, clearance triage workflows, content writing for collection and product pages, competitor analysis, SEO content optimisation.

Skills turn ambiguous requests into repeatable workflows. “List these 12 products” goes from a 30-minute conversation about the brief to a 2-minute call to a skill that knows the brief. This is the same pattern we can build for any company that wants its own project system with context, verticals and repeatable CEO-level workflows.

Runbook operating system

Documented the recurring work as runbooks: what inputs are required, what the skill does, what checks happen before anything writes to Shopify, what gets reviewed by a human, and where the output is stored. This is what makes the system transferable. A CEO can ask for a content vertical, a listing vertical, a reporting vertical or a research vertical, and the business gets a durable workflow rather than a one-off answer.

Weekly merchandiser report agent

Produces a weekly report covering revenue by collection, top movers, slow movers needing attention, cost-margin alerts, low-stock warnings, and content opportunities surfaced from GSC. Pulls from Shopify, GSC and our internal merchandising data store. Does what a junior merchandiser used to spend a day on, in roughly 10 minutes of compute, every Monday. Output is a Markdown document that goes straight into the team channel.
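The last step, turning collected figures into the Markdown document, can be sketched as below. The function name, section headings and input shapes are assumptions, and only three of the report sections are shown; the data pulls from Shopify and GSC are out of scope here.

```python
def render_weekly_report(week, revenue_by_collection, slow_movers, low_stock):
    """Render a Markdown report from already-collected figures.

    revenue_by_collection: dict mapping collection name -> revenue in GBP.
    slow_movers, low_stock: lists of SKU strings.
    """
    lines = [f"# Merchandiser report: {week}", "", "## Revenue by collection"]
    # Highest-revenue collections first.
    ranked = sorted(revenue_by_collection.items(), key=lambda kv: kv[1], reverse=True)
    lines += [f"- {name}: £{revenue:,.2f}" for name, revenue in ranked]
    lines += ["", "## Slow movers needing attention"]
    lines += [f"- {sku}" for sku in slow_movers] or ["- None this week"]
    lines += ["", "## Low-stock warnings"]
    lines += [f"- {sku}" for sku in low_stock] or ["- None this week"]
    return "\n".join(lines)


# Usage with made-up figures.
report_md = render_weekly_report(
    week="2024-W01",
    revenue_by_collection={"Hardware": 1234.5, "Accessories": 9876.0},
    slow_movers=["SKU-17"],
    low_stock=[],
)
```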

——

§ 04 — RESULTS

The honest numbers.

At least £10k/month in operating cost removed. Conservative baseline from reduced manual ops load, avoided hires, faster content production, fewer external-tool gaps, and less senior time spent on repetitive catalogue work.

Team size cut roughly in half. What used to need a small operations team — merchandiser, content person, junior dev for catalogue maintenance, plus periodic SEO consultancy — now runs with a much smaller core team plus the AI tooling.

Hundreds of hours saved per month. Conservative estimate. Clearance triage was ~40 hours/week previously, run quarterly. Bulk catalogue ops used to be a multi-day project; now a script. Content for new collections used to take 1–2 days each; now 1–2 hours. Listing and editing workflows that previously sat in spreadsheets now run through repeatable verticals.

Catalogue health. Every one of the 7,000+ variants has been touched in the last year. Most stores at this scale have stale tail-end inventory gathering dust. We don’t, because the tooling makes maintenance cheap.

SEO foundation at a scale the team could not have manually sustained. 200+ collections with rewritten content, hundreds of meta titles and descriptions improved, internal-link opportunities identified from real search data, and commercial verticals refreshed without waiting for a traditional content sprint.

Custom systems that stayed owned by the business. The private app, dashboard, MCP servers, skills and runbooks are not generic AI demos. They are owned operating infrastructure for the store.

ops.delta

 — BEFORE —
x Headcount:        5 ops + content + dev
x Clearance run:    40 hrs/wk, quarterly
x Collection copy:  1–2 days each
x Custom workflows: spreadsheets + memory
x Stale variants:   ~30% of tail
 
 — AFTER —
+ Headcount:        ~50% of prior
+ Cost removed:     £10k+/mo baseline
+ Clearance run:    minutes, scripted
+ Collection copy:  1–2 hours each
+ Custom workflows: skills + runbooks
+ Stale variants:   ~0%, all touched

——

§ 05 — WHAT IT MEANS FOR CLIENTS

The work we sell was built here first.

The reason this case study is on the site is straightforward: the AI engineering and ecommerce systems work we sell to clients was built here first. We use it daily. We know where it works, where it doesn't, and how long each part took to get right.

When we deploy a similar stack into a client's Shopify Plus store, we're not learning on the job. We're transplanting workflows that already run a store we own: content verticals, listing verticals, bulk-editing systems, dashboards, private apps, runbooks and context-aware skills.

Most of our AI engineering engagements involve some subset of the above — typically 8 to 12 weeks to ship the agents, MCP servers and skills that compound the most for that specific store. The deliverable looks different for every client because every catalogue, customer base and operating model is different. What’s consistent is the engineering pattern: in-house code, owned by you, deployed inside your environment, ready to run after we leave.

The same pattern also works outside Shopify. Any company with recurring operational work, fragmented context and a CEO carrying too much process in their head can have its own runbook-style operating system. This anonymised retail operator is the proof that the model works under real commercial load.

If you run a Shopify Plus store doing £1m+, or a business where the operating system lives in people and spreadsheets, the conversation is worth having.

——

hollowpoint.io / contact

operator@hollowpoint:~$ cat next-step.txt

See the AI engineering engagement model.

The pattern, the scope, the deliverables. Everything described above as a fixed-scope engagement.

AI engineering brief · Talk to us