Day 777 - SaaS idea: A11y × AI - https://golifelog.com/posts/saas-idea-a11y-ai-1676508270598
For a while now I've been brewing an idea for a SaaS that combines several of my wide-ranging interests: web accessibility, design for disability, social impact work, SaaS, indie hacking, and GPT-3 AI.
Idea:
Consumer/Enterprise GPT-3 AI app for image captioning, and generating alt text for images on the internet.
Problem:
- Inclusive design and web accessibility (aka "a11y") are important work but often get deprioritised because they're deemed "expensive", "time-consuming" or "not relevant to our target customers". But companies might take it up if it's dead easy to implement, like just *npm install a11y*.
- One important a11y feature is alt text for the images on your website. Without alt text describing the images, folks with visual impairment or blindness who use screen readers miss the context the image carries and are deprived of social participation. Take memes, for example - all the context is in the image, and people don't often caption or describe it in the post or in alt text.
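To make the alt text gap concrete, here's a minimal TypeScript sketch of an audit that sorts `<img>` tags into missing, empty and described. The `auditAltText` name and the regex-based scanning are assumptions for illustration; a real tool would use a proper HTML parser. An empty `alt=""` is counted separately because it marks an image as decorative, so screen readers skip it.

```typescript
// Sketch only: classify <img> tags by their alt attribute.
// Regex scanning keeps this dependency-free; it is not a full HTML parser.

type AltStatus = "missing" | "empty" | "described";

interface ImageAudit {
  tag: string; // the raw <img ...> tag as found in the HTML
  status: AltStatus;
}

function auditAltText(html: string): ImageAudit[] {
  const imgTags = html.match(/<img\b[^>]*>/gi) ?? [];
  return imgTags.map((tag): ImageAudit => {
    // Handles alt="..." and alt='...'; unquoted or valueless alt
    // attributes are not handled and will be reported as "missing".
    const alt = tag.match(/\salt\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
    if (!alt) return { tag, status: "missing" };
    const value = (alt[1] ?? alt[2] ?? "").trim();
    return { tag, status: value === "" ? "empty" : "described" };
  });
}
```

Running it over a page's HTML and counting the `"missing"` entries gives a rough idea of how much alt text still needs writing.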
Opportunity:
- Platforms are now giving more algorithm weight to images with alt text. So it's good for business/marketing. Personally I add alt text to all my images on Twitter, not just for the higher algo juice but also for inclusion.
- Generative AI isn't just good for text-to-image generation. Some models can also read images and describe them (GPT-3 on its own only handles text, so captioning apps pair a language model with a vision model such as CLIP). There are already [2 MVPs](https://gpt3demo.com/category/image-captioning) doing that. I tested one of them, the [CLIP prefix captioning demo](https://huggingface.co/spaces/akhaliq/CLIP_prefix_captioning), on a Spiderman meme, and the result, "A group of cartoon characters standing on top of a blue skateboard", isn't the most accurate description tbh. Lots of room for improvement, and maybe that's where the opportunity is?
![Meme of 7 Spidermen pointing at one another](https://i.ibb.co/h9N5ypz/Screen-Shot-2023-02-16-at-7-24-54-AM.png)
Features & business model:
- Imagine if alt text were automated and hassle-free. All it takes is installing a package, and the software adds all the required alt text into the code for you. Or you're uploading a picture to Twitter and the app auto-generates a description in 1 second. The most basic MVP would be like that CLIP demo - drag and drop an image to get a text description.
- B2C: Chrome extension to write alt text for you when triggered on a site. Or a mobile app that adds an alternative AI keyboard option to your normal keyboard that you can trigger on any text field using a command like `/gen`.
- B2B: Premium npm package or plugin that reads all images in your site and adds alt text descriptions to the HTML automatically. Payment can be a mix of recurring and non-recurring: one-time generation vs monthly ongoing generation. Great for ecommerce sites with lots of images uploaded every month.
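The core of the B2B package idea could look roughly like this. It's a minimal sketch, not a real package: `applyAltText` and `escapeAttribute` are hypothetical names, the `captions` map stands in for the output of whichever captioning model gets used, and regex scanning takes the place of the HTML parser a production version would need.

```typescript
// Sketch only: add alt text to <img> tags that have none.
// `captions` maps an image src to a caption that some captioning model
// (not shown here) has already produced.

/** Escape a caption so it is safe inside a double-quoted HTML attribute. */
function escapeAttribute(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

/**
 * Add an alt attribute to every <img> that has none, using a src -> caption lookup.
 * Images that already have an alt attribute (even an empty one) are left alone,
 * and so are images with no caption available.
 */
function applyAltText(html: string, captions: Map<string, string>): string {
  return html.replace(/<img\b[^>]*>/gi, (tag) => {
    // Errs on the side of skipping: anything that looks like an existing
    // alt attribute means the author's version wins.
    if (/\salt(?=[\s=>\/])/i.test(tag)) return tag;
    const src = tag.match(/\ssrc\s*=\s*(?:"([^"]*)"|'([^']*)')/i);
    const caption = src ? captions.get(src[1] ?? src[2] ?? "") : undefined;
    if (caption === undefined) return tag;
    // A replacer function avoids "$" in captions being read as a pattern.
    return tag.replace(/^<img\b/i, (open) => `${open} alt="${escapeAttribute(caption)}"`);
  });
}
```

Keeping the captioning step separate from the HTML rewrite means the same function works for a one-time run or for a monthly job over newly uploaded images. Never overwriting an existing `alt` keeps human-written descriptions intact.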
*What do you think?*