In yesterday’s reading list, I wrote about the web page WEB ACCESSIBILITY DIRECTIVE: THIS IS WHAT THE DISABILITY MOVEMENT EXPECTS. The European Disability Forum is angry: “2/3 of public websites in Europe are still not accessible” it thunders after it turned off the caps lock. It has a plan to make the Web more accessible, published on its website. As a Word document, naturally.
This annoyed me, as a markup wonk and as a man-with-MS, so I spent a couple of hours today turning the Word document into HTML, which I imaginatively titled EDF Position on the Proposal for a Directive on the Accessibility of Public Sector Bodies’ Websites. I’ve written to them and offered the document for their site if they want it.
Here’s how I did it, in case you’re lucky enough to have to convert Word documents to HTML (this one was pretty good, though – they used proper headings etc).
- Save as “HTML” in Word (ha!)
- Run it through DocToHtml – Doc to HTML Converter (free 30 day trial)
- Search and replace to turn old-fashioned named anchors into ids on elements, so that headings open with <h1 id="
- HTML Tidy didn’t work for me, leaving lots of redundant </a>s lurking around after step 3 above. So I had a brainwave: I knew the browser’s parsing algorithm would dump those closing tags, so I went to Opera Dragonfly’s JS console, typed in document.documentElement.innerHTML and pasted the returned code into my document. Thanks Mathias!
- Some tedious replacement of funny characters with their character entities (isn’t there some utility that will do that?). No need to do this if you use UTF-8 and (d’oh) use a font with the right glyphs
- Some very light styling
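
The anchor and entity steps above lend themselves to scripting. Here is a minimal Python sketch, assuming Word’s output wraps heading text as <h1><a name="intro">Introduction</a></h1> (your export may well differ); the function names anchors_to_ids and to_entities are made up for illustration, and a regex is only good enough for a one-off job like this, not for HTML in general.

```python
import re

# Assumed input shape: <h1><a name="intro">Introduction</a></h1>
# The tempered dot stops the heading text at the first </a>, so one
# match can never swallow a later heading.
NAMED_ANCHOR = re.compile(
    r'<(h[1-6])([^>]*)>\s*<a\s+name="([^"]+)"\s*>((?:(?!</a>).)*)</a>\s*</\1>',
    re.IGNORECASE | re.DOTALL,
)


def anchors_to_ids(html: str) -> str:
    """Move name="..." from the wrapping <a> onto the heading as id="..."."""
    return NAMED_ANCHOR.sub(r'<\1\2 id="\3">\4</\1>', html)


def to_entities(html: str) -> str:
    """Replace each non-ASCII character with a numeric character reference."""
    return html.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    sample = '<h1><a name="intro">Introduction</a></h1><p>café – naïve</p>'
    print(to_entities(anchors_to_ids(sample)))
```

Headings that don’t match the assumed shape are left untouched, so it is safe to run and then eyeball the leftovers. The entity step is only worth running if you can’t serve the page as UTF-8.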
Don’t tell the boss, though; he thinks I’ve been working.