Validate the Generator

There is a sentence that reliably derails conversations about accessibility and AI:
“AI can’t write proper alt text.”
It sounds responsible. It sounds ethical. It sounds like care.
It is none of those things.
It is a refusal to confront scale, history, and reality.
Alt text exists for one reason: to prevent blind and low-vision users from being excluded from visual content. That’s it. It is not literary criticism, not moral authorship, not an expression of the creator’s inner life. It is infrastructure. And infrastructure only works if it exists everywhere.
Here is the uncomfortable truth we need to start with:
Humans have never written alt text for 100% of images, and they never will.
Not on social media.
Not on ecommerce platforms.
Not on news sites.
Not on personal blogs.
Not in 1999, not in 2025, not in any plausible future.
This is not because people are bad. It is because humans are inconsistent, distracted, underpaid, and operating inside incentive structures that do not reward meticulous accessibility work. We have decades of evidence. The web is still full of missing, broken, or useless alt text.
Continuing to insist that humans should do this manually, perfectly, forever is not ethics. It is wishful thinking dressed up as virtue.
Accessibility at scale is an engineering problem.
And engineering problems require systems.
The category error
Much of the resistance to AI-generated alt text comes from a fundamental category error: treating alt text as expressive authorship rather than functional description.
Alt text does not require knowing what the photographer “meant.”
It does not require guessing intention.
It does not require emotional interpretation.
Proper alt text answers a narrow question: what is visibly present and relevant here, given the context? That is a constrained descriptive task — object recognition, spatial relationships, visible text, basic actions.
Vision–language models already do this well. In many cases, they do it more consistently than humans rushing to fill a required field or pasting boilerplate like “image may contain…”
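To make “constrained descriptive task” concrete, here is one possible shape for that constraint expressed as a prompt. The wording, and the call_vision_model placeholder it feeds, are illustrative assumptions rather than any particular vendor’s API.

```python
# Illustrative prompt for keeping a vision-language model on functional
# description. `call_vision_model` is a placeholder to be wired to
# whichever model a team actually uses.

ALT_TEXT_PROMPT = """Describe this image for a screen reader user.
- Mention only what is visibly present: objects, people, actions,
  spatial layout, and any readable text.
- Use the page context to decide what is relevant.
- Do not guess intent, emotion, or artistic meaning.
- One or two sentences; no preamble such as 'Image of'."""

def call_vision_model(prompt: str, image: bytes) -> str:
    raise NotImplementedError("connect to a real vision-language model")

def generate_alt_text(image: bytes, page_context: str) -> str:
    """Return a short, functional description for the alt attribute."""
    return call_vision_model(f"{ALT_TEXT_PROMPT}\n\nPage context: {page_context}", image)
```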
When someone says “AI can’t write proper alt text,” what they usually mean is one of three things:
– AI might be wrong sometimes
– AI might scale mistakes
– AI removes visible human effort
All of these are deployment concerns. None of them are capability arguments.
Word processors can make spelling mistakes. We still use spellcheck.
Machine translation can be imperfect. We still translate at scale.
Fraud detection produces false positives. We still run it continuously.
We do not ban tools because they are imperfect. We validate them.
The real goal: 100% coverage
From a blind user’s point of view, the hierarchy of harm is very simple:

1. No alt text (total exclusion)
2. Bad alt text (misleading, but at least present)
3. Adequate alt text (functional access)
4. Excellent alt text (rare, contextual, crafted)

The current system overwhelmingly produces number 1.
Missing alt text is not neutral. It is absolute exclusion.
A world where every image has adequate alt text is dramatically more accessible than a world where a small percentage of images have perfect alt text and the rest have nothing.
This is the point many debates carefully avoid: “good enough” at scale is not an evil compromise — it is often the most ethical outcome available.
Validate the generator, not every sentence
The mistake is thinking the choice is between “AI writes everything blindly” and “humans write everything manually.”
That is not how any serious system works.
The correct model is this (see the sketch after the list):
– Require alt text for 100% of images
– Use AI to generate a consistent baseline
– Validate the generator, not each individual output
– Classify images by risk
– Spot-check low-risk images statistically
– Escalate high-risk images to human review
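As a rough illustration, here is a minimal sketch of that workflow. The helpers (generate_alt_text, classify_risk, the review queues) are hypothetical stand-ins for whatever a real platform would supply, and the 2% sampling rate is an assumed figure, not a recommendation.

```python
import random

RISK_SAMPLE_RATE = 0.02  # assumed: audit 2% of low-risk output

def generate_alt_text(image: bytes, context: str) -> str:
    """Stand-in for the AI baseline generator."""
    raise NotImplementedError

def classify_risk(context: str) -> str:
    """Stand-in for a content risk classifier; returns 'low' or 'high'."""
    raise NotImplementedError

def queue_for_human_review(image: bytes, alt: str) -> None: ...
def queue_for_spot_check(image: bytes, alt: str) -> None: ...

def process_image(image: bytes, context: str) -> str:
    """Every image leaves this function with alt text attached."""
    alt = generate_alt_text(image, context)    # AI baseline for 100% coverage
    if classify_risk(context) == "high":
        queue_for_human_review(image, alt)     # escalate sensitive content
    elif random.random() < RISK_SAMPLE_RATE:
        queue_for_spot_check(image, alt)       # statistical validation of the generator
    return alt                                 # review refines; it never blocks coverage
```

The design choice that matters is the last line: review refines descriptions after they exist, it never gates them, so the 100% coverage target survives contact with reality.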
This is not radical. This is how we already handle:
– spam filtering
– content moderation
– credit card fraud
– search ranking
– live captioning
– automated translation
Amazon does not hand-write every product description.
YouTube does not manually caption every video.
Banks do not review every transaction.
They validate pipelines. They measure error rates. They monitor drift. They intervene where harm is plausible.
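In practice, “validate the generator, not each output” can mean something as ordinary as this: sample recent descriptions, have reviewers label them, estimate the error rate with a confidence interval, and raise an alarm when it drifts past an agreed threshold. A minimal sketch, with the 5% threshold and the sample figures as assumed illustrative values:

```python
import math

def error_rate_interval(errors: int, sample_size: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for the generator's error rate."""
    if sample_size == 0:
        return (0.0, 1.0)
    p = errors / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    margin = (z * math.sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2))) / denom
    return (max(0.0, centre - margin), min(1.0, centre + margin))

# Assumed policy: treat the generator as unhealthy if we cannot rule out
# an error rate above 5% on the sampled, human-labelled output.
MAX_ACCEPTABLE_ERROR = 0.05

def generator_is_healthy(errors: int, sample_size: int) -> bool:
    _, high = error_rate_interval(errors, sample_size)
    return high <= MAX_ACCEPTABLE_ERROR

# Example: reviewers flagged 9 bad descriptions in a sample of 400.
print(error_rate_interval(9, 400))   # roughly (0.012, 0.042)
print(generator_is_healthy(9, 400))  # True under the assumed threshold
```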
Accessibility deserves the same seriousness.
Risk is contextual, not universal
Not all images carry the same risk.
A stock photo of a mug.
A product shot of socks.
A landscape photo.
A meme with visible text.
These are low-risk images. The harm of an imperfect description is limited, and the harm of no description is guaranteed.
High-risk images are different:
– news photography
– medical imagery
– legal evidence
– sensitive personal content
– images involving identity, race, disability, or children
These deserve human review, additional context, or multiple descriptive layers.
Treating all images as high-risk is not ethical caution. It is paralysis.
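One way to make that split operational is a deliberately simple rule set keyed to content categories like the ones above. The tag names and the default of escalating unrecognised content are illustrative choices, not a standard:

```python
# Hypothetical risk tiers based on the categories discussed above.
HIGH_RISK_TAGS = {
    "news", "medical", "legal_evidence",
    "sensitive_personal", "identity", "race", "disability", "children",
}
LOW_RISK_TAGS = {"product", "stock", "landscape", "meme_with_text"}

def classify_risk(content_tags: set[str]) -> str:
    """Return 'high' or 'low'; unknown content defaults to 'high'."""
    if content_tags & HIGH_RISK_TAGS:
        return "high"   # escalate to human review
    if content_tags and content_tags <= LOW_RISK_TAGS:
        return "low"    # AI baseline plus statistical spot-checks
    return "high"       # unrecognised content: err on the side of review

print(classify_risk({"product"}))                 # low
print(classify_risk({"news", "meme_with_text"}))  # high
print(classify_risk(set()))                       # high (no information, so escalate)
```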
Accountability improves, not weakens
One of the quiet advantages of AI-generated alt text is that responsibility becomes traceable.
If a system produces biased, misleading, or harmful descriptions, accountability sits with:
– the model choice
– the prompt design
– the evaluation criteria
– the deployment context
Contrast that with the current reality, where missing alt text often has no owner at all. No one is accountable for absence.
“Validate the generator” is not an abdication of responsibility. It is a demand for systemic accountability instead of individual guilt.
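One concrete shape that traceability can take is an audit record stored with every generated description, capturing exactly the things listed above. The field names and example values are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class AltTextRecord:
    """Audit trail for one generated description (illustrative fields)."""
    image_id: str
    alt_text: str
    model: str              # the model choice
    prompt_version: str     # the prompt design
    eval_suite: str         # the evaluation criteria the generator passed
    deployment: str         # the deployment context (site, surface, locale)
    risk_tier: str          # 'low' or 'high'
    human_reviewed: bool
    generated_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

record = AltTextRecord(
    image_id="img-001",
    alt_text="A blue ceramic mug on a wooden desk beside a laptop.",
    model="example-vlm-v3",                 # hypothetical model name
    prompt_version="alt-text-prompt/1.4",
    eval_suite="alt-text-eval/2025-01",
    deployment="shop.example.com/product-pages",
    risk_tier="low",
    human_reviewed=False,
)
```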
This does not replace human craft
None of this argues against richer description, extended narratives, or interpretive layers. It argues for separating them properly.
Alt text is infrastructure.
Extended descriptions are optional content.
Vibe-based narratives are contextual and opt-in.
Conflating these layers creates fear, confusion, and bad outcomes for everyone — especially blind users who just want to scroll without being trapped in a novella.
AI is exceptionally good at producing the baseline layer. Humans are better at handling the edge cases, the sensitive contexts, and the meaning-heavy moments.
That division of labour is not a threat to accessibility. It is how accessibility finally becomes real.
The honest position
The honest position is not “AI can’t write proper alt text.”
The honest position is this:
AI can write proper alt text.
Humans should design, validate, and govern the systems that do so.
Blind users benefit more from universal coverage than from theoretical perfection.
Refusing tools does not make the web accessible.
Building validated systems does.
If we actually care about access — not optics, not moral theatre, not performative purity — then it is time to stop arguing about whether AI should be allowed to help and start insisting that it be used well, transparently, and at scale.
Accessibility is not a handcrafted artefact.
It is a public utility.
And public utilities work because we validate the generator.
Charlotte Joanne

Charlotte Joanne is the editor of Through the AIs of the Blind. She curates essays, experiments, and voices exploring AI, perception, and access — shaping a publication where lived experience, design, and speculation meet.