🎰 Day 777 - SaaS idea: A11y × AI - https://golifelog.com/posts/saas-idea-a11y-ai-1676508270598

I've been brewing a SaaS idea for a while that combines several of my wide-ranging interests: web accessibility, design for disability, social impact work, SaaS, indie hacking, and GPT-3 AI.

Idea:
Consumer/Enterprise GPT-3 AI app for image captioning, and generating alt text for images on the internet.

Problem:
- Inclusive design and web accessibility (aka "a11y") are important work but often deprioritised because they're deemed "expensive" or "time-consuming" or "not relevant to our target customers". But companies might take it up if it's dead easy to implement, like just *npm install a11y*.
- One important a11y feature is alt text for the images on your website. Without alt text describing the images, folks with visual impairment or blindness who use screen readers are deprived of social participation and miss the context that the image carries. Take memes, for example: all the context is in the image, and people don't often caption or describe it in the post or in alt text.
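
To make the problem concrete, here's a rough TypeScript sketch of how a tool could spot images that have no alt text in a chunk of HTML. The function name `findImagesMissingAlt` is made up for illustration, and the regex matching is a shortcut that a real tool would likely swap for a proper HTML parser.

```typescript
// Hypothetical helper: list the `src` of every <img> tag that has no alt attribute.
// Regex matching is a simplification and will trip on unusual markup.
function findImagesMissingAlt(html: string): string[] {
  const missing: string[] = [];
  const imgTags: string[] = html.match(/<img\b[^>]*>/gi) || [];
  for (const tag of imgTags) {
    // An empty alt="" is left alone: it is the usual way to mark a decorative image.
    if (/\salt\s*=/i.test(tag)) continue;
    const src = /\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
    missing.push(src ? src[1] || src[2] || "" : "(no src)");
  }
  return missing;
}

const samplePage = '<p><img src="meme.png"><img src="logo.svg" alt="Company logo"></p>';
const needsAltText = findImagesMissingAlt(samplePage); // ["meme.png"]
```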

Opportunity:
- Platforms are now giving more algorithm weight to images with alt text. So it's good for business/marketing. Personally I add alt text to all my images on Twitter, not just for the higher algo juice but also for inclusion.
- AI models like GPT-3 aren't just good for text-to-image generation. They can also read images and describe them. There are already [2 MVPs](https://gpt3demo.com/category/image-captioning) doing that. I tested out one of the [apps called CLIP](https://huggingface.co/spaces/akhaliq/CLIP_prefix_captioning) using a Spiderman meme, and the result, "A group of cartoon characters standing on top of a blue skateboard", isn't the most accurate description tbh. Lots of room for improvement, and maybe that's where the opportunity is?

![Meme of 7 Spidermen pointing at one another](https://i.ibb.co/h9N5ypz/Screen-Shot-2023-02-16-at-7-24-54-AM.png)

Features & business model:
- Imagine if alt text is automated and hassle-free. All it takes is to install a package and the software adds all the required alt text for you into the code. Or you're uploading a picture to Twitter and the app auto-generates a description in 1 second. Most basic MVP version would be like CLIP – a drag and drop for an image to get a text description.
- B2C: Chrome extension to write alt text for you when triggered on a site. Or a mobile app that adds an alternative AI keyboard option to your normal keyboard that you can trigger on any text field using a command like `/gen`.
- B2B: Premium npm package or plugin that reads all images in your site, adds alt text descriptions to HTML automatically. Payment can be a mix of recurring and non-recurring: one-time generation vs monthly ongoing generation. Great for ecommerce sites with lots of images uploaded every month.
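
The B2B package idea above could be sketched roughly like this in TypeScript: scan the HTML, leave images that already have alt text alone, and fill in the rest with a generated description. Everything here is an assumption for illustration. `addAltText`, `escapeAttr` and `placeholderCaption` are hypothetical names, the regex matching stands in for a proper HTML parser, and the captioner is a stub where a real image-captioning model call (most likely async) would go.

```typescript
// A captioner takes an image URL and returns a description.
// Assumption: kept synchronous to keep the sketch short.
type Describe = (src: string) => string;

// Escape characters that would break out of an HTML attribute value.
function escapeAttr(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Add an alt attribute to every <img> tag that lacks one.
function addAltText(html: string, describe: Describe): string {
  return html.replace(/<img\b[^>]*>/gi, (tag: string): string => {
    if (/\salt\s*=/i.test(tag)) return tag; // keep human-written alt text
    const srcMatch = /\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)')/i.exec(tag);
    if (srcMatch === null) return tag; // nothing to describe
    const src = srcMatch[1] || srcMatch[2] || "";
    const alt = escapeAttr(describe(src));
    const selfClosing = /\/>$/.test(tag);
    const body = tag.replace(/\s*\/?>$/, ""); // strip the closing ">" or "/>"
    return `${body} alt="${alt}"${selfClosing ? " />" : ">"}`;
  });
}

// Stand-in captioner: a real one would send the image to a captioning model.
const placeholderCaption: Describe = (src: string): string => `Description of ${src}`;

const sampleHtml = '<img src="spiderman.png"><img src="logo.svg" alt="Logo">';
const withAltText = addAltText(sampleHtml, placeholderCaption);
```

Skipping images that already have an `alt` attribute is a deliberate choice in this sketch: a description written by a person should win over a generated one.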

*What do you think?*
Carl Poppa 🛸

wait, they're standing on a blue skateboard??

Jason Leow Author

LOL i know right.. just found out it's built on GPT-2, not 3 🤔
