As the editor of a site that often deals with contributed content, I'm sometimes
forced to deal with contributions that don't reach me in their optimal format.
Being web-based, our format of choice is naturally good old HTML. Most authors,
on the other hand, prefer writing in a word processor because of the added value it
brings to the writing process. Most HTML editors simply don't have a thesaurus
or spelling and grammar checking built into them yet. As such, many of our contributions
come to me as Microsoft Word documents and I'm stuck trying to convert them to HTML.
Luckily I've learned something that helps make the process easier.
We all know that Word has an option on the File menu called "Save as Web Page..."
that does exactly this conversion. The problem is, have you seen the HTML it produces? As a test
I opened up Word, typed one sentence into a blank file, and saved it as HTML. The
resulting file is 2.5KB in size and contains all sorts of weird tags. Anyone know
what <o:p> means? My browser certainly doesn't!
Fortunately there's another option. Instead of using the "Save as Web Page..."
menu item, choose File -> Save As... and in the "Save as type:"
dropdown box select "Web Page, Filtered (*.htm, *.html)". The HTML
produced this way is much less cluttered and the resulting file is only 650 bytes,
roughly a quarter the size of the file the other method produces, and your browser won't be able to tell the difference.
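If you'd like to check a saved file rather than eyeball it, something like the following Python sketch might help. It reports the size of an HTML string in bytes and counts tags that carry an Office-style namespace prefix such as <o:p>. The function name, the list of prefixes, and the two sample strings are my own assumptions for illustration; they are hand-written stand-ins, not actual Word output.

```python
import re

# Matches opening or closing tags with an Office-style namespace prefix,
# e.g. <o:p> or </o:p>. The prefix list (o, w, v, m, st1) is an assumption
# and may not cover everything Word emits.
OFFICE_TAG = re.compile(r"</?(?:o|w|v|m|st1):[A-Za-z0-9]+[^>]*>")


def summarize_word_html(html: str) -> dict:
    """Return the UTF-8 size in bytes and the number of Office-specific tags."""
    return {
        "bytes": len(html.encode("utf-8")),
        "office_tags": len(OFFICE_TAG.findall(html)),
    }


if __name__ == "__main__":
    # Hypothetical samples standing in for the two kinds of saved file.
    cluttered = "<p>One sentence.<o:p></o:p></p>"
    clean = "<p>One sentence.</p>"
    print(summarize_word_html(cluttered))
    print(summarize_word_html(clean))
```

To try it on a real file, read the file's contents into a string and pass that string to summarize_word_html.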
For more information about using filtered HTML you might want to check out the following links: