I stumbled upon this nice little paper about steganography (the art of hiding a message inside another message) using HTML documents.

It describes a few ways to conceal information:

  1. Changing the order of HTML attributes
  2. Including invisible characters (e.g. null space or white space)
  3. Modifying the case of letters in HTML tags
  4. Adding id attributes containing encoded information

My personal favorite is #3, editing the case of letters in HTML tags, because:

  • It doesn’t change the size of the HTML document
  • You can’t see it if you just inspect the DOM in your browser; you need to open the page source
  • It’s easy to code, and I’m lazy
  • It looks like the mOcKiNg SpOnGeBoB meme

The obvious limitation is that you need an HTML document big enough to contain your message: every character of the message uses up 8 letters in the HTML tags.
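To get a feel for that limit, here is a rough capacity check in Python. It is only a sketch and rests on a few assumptions: one bit is stored per letter of a tag name, each message character costs 8 bits, and attribute names and values are not used. The names `TAG_NAME` and `capacity_in_chars` are made up for this example.

```python
import re

# Assumption: only letters in tag names (opening and closing tags) carry bits.
# Comments and <!DOCTYPE ...> are skipped because they start with "<!".
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def capacity_in_chars(carrier: str) -> int:
    """Return how many 8-bit characters the carrier could hold."""
    letters = sum(
        ch.isalpha()
        for name in TAG_NAME.findall(carrier)
        for ch in name
    )
    # One bit per tag letter, 8 bits per message character.
    return letters // 8
```

For instance, `<html><body><div>hello</div></body></html>` has 22 tag letters, so it could hide only 2 characters.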

A basic implementation would work like this:

  1. Inputs: a message (e.g. "Test") and an HTML document (e.g. <html><body><div>hello</div>...)
  2. Convert the message to ASCII: "Test" becomes 84 101 115 116
  3. Convert the ASCII codes to a string of bits (8 bits per character), and match each bit to a character in the HTML tags: 0h 1t 0m 1l 0b 1o 0d 0y 0d 1i 1v 0d 0i 1v ...
  4. Change the HTML tag character to uppercase when the matched bit is equal to 1: hTmLbOdydIVdiV...
  5. Output: the HTML document with the message hidden in it, <hTmL><bOdy><dIV>hello</diV>...
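The algorithm above could be sketched in Python roughly like this. It is not the code behind the demo, just a minimal version under a few assumptions of mine: only letters in tag names carry bits, letters matched to a 0 bit are forced to lowercase, and any tag letters left over after the message are lowercased as well. The names `TAG_NAME`, `message_to_bits` and `encode` are hypothetical.

```python
import re

# Assumption: only tag names carry bits; attributes and text are left untouched.
TAG_NAME = re.compile(r"(</?)([A-Za-z][A-Za-z0-9]*)")


def message_to_bits(message: str) -> list[int]:
    """Convert an ASCII message to bits, 8 per character, most significant first."""
    bits = []
    for byte in message.encode("ascii"):
        bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


def encode(message: str, carrier: str) -> str:
    """Hide the message in the letter case of the carrier's tag names."""
    bits = iter(message_to_bits(message))

    def recase(match: re.Match) -> str:
        out = []
        for ch in match.group(2):
            if ch.isalpha():
                # Assumption: once the message is exhausted, pad with 0 bits.
                bit = next(bits, 0)
                ch = ch.upper() if bit else ch.lower()
            out.append(ch)
        return match.group(1) + "".join(out)

    package = TAG_NAME.sub(recase, carrier)
    if next(bits, None) is not None:
        raise ValueError("carrier has too few tag letters for this message")
    return package
```

Calling `encode("Te", "<html><body><div>hello</div></body></html>")` returns `<hTmL><bOdy><dIV>hello</diV></bOdy></html>`, which starts the same way as the example output above. A plain regex like this can be fooled by a `<` inside scripts or text, so a real tool would probably want a proper HTML tokenizer.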

Here is a demo of this algorithm (if it breaks on edge cases, it just means I’m not a good developer):

Encoder

input message (payload)
input html (carrier)
output html (package)
...

Decoder

input html (package)
output message (payload)

...
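For reference, the decoding side could look something like the Python sketch below. Again, this is not the demo's actual code. It assumes that only letters in tag names carry bits (uppercase means 1, lowercase means 0), that bits are grouped 8 at a time with the most significant bit first, and that an all-zero byte marks the end of the message. That last rule is my own convention and only works if the encoder leaves unused tag letters in lowercase. `TAG_NAME` and `decode` are hypothetical names.

```python
import re

# Assumption: only letters in tag names (opening and closing tags) carry bits.
TAG_NAME = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)")


def decode(package: str) -> str:
    """Recover the hidden message from the letter case of tag names."""
    bits = [
        int(ch.isupper())
        for name in TAG_NAME.findall(package)
        for ch in name
        if ch.isalpha()
    ]
    chars = []
    # Walk over complete groups of 8 bits; a trailing partial group is ignored.
    for start in range(0, len(bits) - 7, 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        if byte == 0:
            # Assumption: an all-lowercase group (NUL) ends the message.
            break
        chars.append(chr(byte))
    return "".join(chars)
```

With this, `decode("<hTmL><bOdy><dIV>hello</diV></bOdy></html>")` gives back `"Te"`.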

Does it have any real-world application? Probably not.

Is it cool? Yes.