
Accessible UML Diagrams

A page from a 1057 AD book about magic, containing a grid of strange magic diagrams. I've drawn in red question marks on top of some of the diagrams, to increase the confusion.
Thoughts on trying to make technical diagrams accessible to blind readers.

Back in January (how has it been two months already!?), I was working on the UML diagrams for my Kindle Display post and realized I had no idea how to make them accessible for screen readers.

For those who don’t know, UML (Unified Modeling Language) is a specification for creating visual diagrams to explain computer systems and code.

Specifically, I was making UML sequence diagrams, which are images used to explain how data flows around a system.

Below is an example from the post. Don’t worry about understanding it, it’s just here to give you a sense of what they look like.

Diagram Code

@startuml

Kindle->WebServer: send request for new image

WebServer->CalendarAPI: what's happening today?

CalendarAPI->WebServer: list of calendar events

WebServer->WeatherAPI: what's the current weather?

WeatherAPI->WebServer: fucking freezing!

WebServer->WebServer: Calculate current moon phase

WebServer->WebServer: Choose random daily image

WebServer->WebServer: Make an HTML page using all the info above

WebServer->WebServer: Use Playwright to take a screenshot of that HTML page

OrangePi->OrangePi: Use Imagemagick to convert the\nscreenshot to greyscale

OrangePi->Kindle: respond with the image

@enduml

UML diagrams are a visual, spatial representation of a system that helps the reader build up a cognitive map of it. They’re really handy when an idea is too difficult or unwieldy to explain with just text.

When I’m reading a UML diagram, I often physically trace the flow of it with my finger, and I rely on subtle graphical differences for additional information, e.g. solid vs dashed arrows.

But given that these diagrams are so visual, how can I possibly convey them to blind readers?


Computers were designed as visual tools. You hit your fingers against visually-labeled plastic keys and watch words appear on the screen. They are an extension of printed text, allowing us to communicate with panels of tiny lights instead of ink.

There’s no inherent reason that we had to design computers around text. You could imagine an entirely sound-based interface (see telephones and phone phreaking), or something more tactile (or all of the above!), but instead we ended up with keyboards and screens.

The more we have made computers an inescapable part of our lives, the more we have excluded people that can’t rely on sight, people that navigate the world through touch, and sound, and smell.

Screen readers are one kind of software that attempts to bridge that gap, turning the computer into a more auditory interface by reading the screen aloud.

The standard accessibility approach for images on computers is to add what’s known as “alt text”: a textual representation of the image that gets read aloud by screen-reader software.

For example, the image below has the alt text: “A cartoon drawing of a brown cat from the Tabby Cat browser extension. The cat’s name is ‘Silly Cupcake’ and their head is cocked to the side quizzically”.

A cartoon drawing of a brown cat from the Tabby Cat browser extension. The cat's name is 'Silly Cupcake' and their head is cocked to the side quizzically.

My sighted readers might be noticing that I didn’t completely describe the picture with my alt text. I didn’t mention the pink background the cat is painted on, nor the shadow under the cat, nor the fact that the font ‘Silly Cupcake’ is written in is sans-serif.

We run into the “a picture is worth a thousand words” problem. I could have included all that information in the alt-text, but I’m never going to be able to perfectly conjure the image in my readers’ heads. I have to use my judgement about what text best captures the emotion or information I was trying to communicate with the image in the first place. It’s an art, not a science.

(For an inspiring NSFW example of people turning alt text into an art form, rather than just using it to check an accessibility box, check out ALT After Dark.)


Going back to UML diagrams, how can we write alt text to describe them, given that they’re generally used to replace large blocks of text?

If I could describe my system easily with just text, I wouldn’t have any need for a diagram in the first place!

I turned to Mastodon to see if anybody had any thoughts about this.

I didn’t have enough people comment to form a consensus, but AndrĂ© Polykanine shared that textual code-based representations of diagrams and maths have been helpful in the blind community.

One of the tools that was mentioned in the thread and which came up a lot in my own searching is PlantUML, a Java program that allows you to create UML diagrams with a text-based syntax.

You write code like this:

Alice -> Bob: Authentication Request
Bob --> Alice: Authentication Response

Alice -> Bob: Another authentication Request
Alice <-- Bob: Another authentication Response

and PlantUML turns it into an image like this:

Diagram Code

@startuml

Alice -> Bob: Authentication Request

Bob --> Alice: Authentication Response

Alice -> Bob: Another authentication Request

Alice <-- Bob: Another authentication Response

@enduml

As opposed to visual diagramming tools (e.g. Microsoft Visio), tools like PlantUML let you use pure text to create visual images. For blind readers, if the code is included along with the diagram image, they’ll have access to all the same information that sighted users have, albeit in a less ergonomic format.

I’ve used other text-based diagram tools in the past (most recently js-sequence-diagrams), and some Mastodon commenters recommended mermaid.ai (formerly mermaid.js; the push to shove AI into everything makes me feel like the Hide the Pain Harold meme), but I decided to go with PlantUML.

It’s been around for a while (it was started in 2009), it’s open source, and I found some research literature about it being used as an accommodation for blind readers.

Specifically, I found a really interesting research paper titled “Growing an Accessible and Inclusive Systems Design Course with PlantUML”, where a college instructor used PlantUML to accommodate a blind student who used a screen reader in a Systems Analysis and Design course.

The course was team-based, with students working together in groups “on a project that covers all stages of the systems development lifecycle”. Diagrams were used heavily throughout the course, both by the instructor to explain concepts in class, and as “project deliverables at almost all stages of the project”.

The blind student was “initially reluctant to participate in the team project based on previous experience working on a team”, but after the student and the instructor sat down to figure out what accommodations might help, the student decided to give it a try.

Between them, they decided that the members of the student’s team would use PlantUML to create their diagrams, so that everyone could contribute and collaborate equally. Additionally, every team, regardless of whether they used PlantUML, was required to include a description of their diagrams alongside the image version.

I’m happy to report that these accommodations worked! The blind student was able to engage in the course, learned with and from their sighted classmates, and generally had a much less frustrating experience than they’d had in other comp-sci classes.

To quote the paper:


the experience of working with a professor who was understanding of the requirements for accessibility was a great experience for the student, and enabled them to fully engage with all aspects of the learning process for the course, including learning multiple new diagram types. PlantUML allowed the student to work effectively with a team in a course that was heavily diagram-based.

Interestingly, it seems that many of the sighted students, even those outside of the blind student’s team, ended up adopting PlantUML themselves. One sighted student even “realized that the declarative nature of the PlantUML markup was a better cognitive fit for them than the visual diagramming tool that had served them well in the past.”

That’s certainly been my experience as a sighted writer of diagrams – I find visual diagram tools fiddly and hard to use, whereas text-based tools like PlantUML come much more easily to me.

I recommend giving the full paper a read; there’s lots of interesting stuff in there, in particular some guidance on how best to collaborate on and present PlantUML diagrams in an in-person setting.


Ok, so what’s this all mean for making accessible diagrams on the web?

Going off that paper, it seems that text-based diagram tools like PlantUML are a big help for blind readers.

On my blog, I decided to generate the diagrams using PlantUML and to automatically include the code for the diagram in a <details> element below the image.

Click for Technical Details

For my nerdy readers, here’s a quick technical summary of how I integrated PlantUML with my static site generator.

I was already using Pandoc to convert the Markdown files I write my blog posts in into HTML, so I added a new Pandoc filter to generate the diagrams. It looks for code blocks starting with @startuml and passes that code to the PlantUML Java app, which turns it into an SVG file.
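The core of that step can be sketched like this. To be clear, this is a minimal sketch, not my actual filter code: the function names and the jar path are made up, though the `-pipe` and `-tsvg` flags are real PlantUML CLI options.

```python
import subprocess

def is_plantuml_block(code: str) -> bool:
    # Heuristic the filter uses: a code block whose body starts
    # with @startuml is treated as a PlantUML diagram definition.
    return code.lstrip().startswith("@startuml")

def render_svg(code: str, jar_path: str = "plantuml.jar") -> str:
    # -pipe reads the diagram source from stdin and writes the result
    # to stdout; -tsvg selects SVG output instead of the default PNG.
    result = subprocess.run(
        ["java", "-jar", jar_path, "-tsvg", "-pipe"],
        input=code.encode("utf-8"),
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8")
```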

I do a string replace on the generated SVG to update its internal CSS to refer to my CSS variables (e.g. var(--text-color)). This ensures that the diagram looks good in both light and dark mode. I’m sure there’s some PlantUML config I could set to update the colors, but the hacky string replace solution works well enough for now.
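That hacky string replace might look something like the following. The specific hex values and CSS variable names here are illustrative guesses; the real mapping depends on your PlantUML theme and your site’s stylesheet.

```python
import re

# Hypothetical mapping from PlantUML's hard-coded default colors to the
# site's CSS custom properties; actual hex values depend on the theme.
COLOR_TO_VAR = {
    "#000000": "var(--text-color)",
    "#FFFFFF": "var(--background-color)",
}

def themeify_svg(svg: str) -> str:
    # Swap each hard-coded color for a CSS variable so the diagram
    # follows the page's light/dark mode automatically.
    for hex_color, css_var in COLOR_TO_VAR.items():
        svg = re.sub(re.escape(hex_color), css_var, svg, flags=re.IGNORECASE)
    return svg
```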

The Pandoc filter then replaces the code block with the SVG, copies the aria-label over from the Markdown code block, and adds a collapsed <details> element containing the diagram code.
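As a sketch of the markup that step might emit (the element structure and attribute choices here are my own guesses at the idea, not the blog’s exact output):

```python
import html

def wrap_diagram(svg: str, code: str, aria_label: str) -> str:
    # The labelled SVG is what sighted readers see; the collapsed
    # <details> element gives screen-reader users (and curious sighted
    # readers) access to the underlying diagram source.
    return (
        f'<figure role="img" aria-label="{html.escape(aria_label)}">'
        f"{svg}"
        "<details><summary>Diagram Code</summary>"
        f"<pre><code>{html.escape(code)}</code></pre>"
        "</details></figure>"
    )
```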

The Java app takes a fair amount of time to generate the SVG, so to keep my static site generation quick, I store the SVG in a cache (keyed by a hash of the code block). That way I only run PlantUML when the actual diagram code changes.
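The caching layer is simple; something along these lines (the function signature and on-disk layout are illustrative, not my real implementation):

```python
import hashlib
from pathlib import Path

def cached_render(code: str, render, cache_dir: Path) -> str:
    # Key the cache on a hash of the exact diagram source, so the slow
    # PlantUML invocation only happens when a diagram actually changes.
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(code.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.svg"
    if cache_file.exists():
        return cache_file.read_text()
    svg = render(code)  # the expensive call into the Java app
    cache_file.write_text(svg)
    return svg
```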

That’s the gist of it!

Still, my own opinion is that including the PlantUML code alone is insufficient.

While I’d rather use PlantUML over a visual tool to write diagrams, I wouldn’t choose reading PlantUML code over looking at the generated image. Even though the code contains all the same information as the image, it’s presented in a much less ergonomic way.

If the code itself was a sufficient way to describe a system, we wouldn’t need to generate the image at all.

Can we do better? I’m really not sure.

Being sighted myself, and as a fairly visual learner at that, I have trouble imagining a medium that could convey spatial information as effectively as an image. To be clear, this isn’t me saying that such a medium doesn’t exist, only that I don’t have the personal experience required to come up with it!

In a world of infinite means, maybe there could be some magic device that creates a tactile representation of a diagram? It seems like there have been some attempts at making braille tablets (for example, this $15,000 đŸ€Ż one), but they’re currently far too expensive.

If you use screen readers and have any ideas or feedback about my approach I’d love to hear it :)


Finally, I’d like to share why I think this is important.

Having worked in the tech industry for a decade, I can say with some authority that accessibility, inclusivity, and security are afterthoughts to most software companies. Nearly every product I’ve worked on (including at big companies like Google) has had serious accessibility flaws, and was often broken when used through a screen reader.

The majority of the people making software these days (myself included) are white, male, not disabled, and wealthy. We develop on expensive hardware (I’m typing this on a Macbook Pro) in ideal conditions. We have external monitors, fast internet connections, ergonomic keyboards, $800 cell phones, and accurate mental models of how software interfaces work.

If we make software that doesn’t work for disabled people, that eats up a limited data plan, that doesn’t work at all on an old phone, well, we won’t notice. Because it works on our machines, and on those of our even more privileged shareholders.

Even when you do get the odd individual working at these companies who cares, who might even be impacted by an inaccessible user experience themselves, they generally don’t have the freedom to work on accessibility. They’re told that it’s out of scope, that the product works well enough, and that they’ll only get promoted by launching new things, not by improving old ones.

The only time these companies tell their devs to work on accessibility is when they’re required to do so by external regulation (e.g. the ADA in the US), and even then a lot of them will only act if they’re actively being threatened with a lawsuit (see fireborn’s “because fuck you” essay for a deep dive into this topic).

But goddamn it, I think building things that work for everybody matters.

I want to live in the world where we take the time to build things right, where we listen to the people we’re building things for, where people of all sorts get the opportunity to build their own infrastructure, to create their own environments.

I want to live in the world where we take care of one another.

Trying to make my own personal blog comfy for as many of my potential readers as I can is a step in that direction. It doesn’t matter that I don’t have many readers, and that few of them will ever use the accessibility features I make. Hell, even if nobody ever uses them, I don’t think my time has been wasted.

Because by doing it, I am, however transiently, living in that world that I want to create.

I hope that, one day, we can all share in that world together.