I was asked to provide translations for a couple of words in “tricky” languages, and the message said “Unicode values will do.” Well, I can ask vendors for the translations, but I started wondering about the Unicode values. This time it isn’t too difficult to look up the character values one by one, but what if I needed to process bigger chunks of text?

FileFormat.info is a wonderful site — I use it all the time (at work). There you can find oodles of information on a character, and you can even enter a character, e.g. Devanagari as I did, in the search field and it really finds it! Of course there is the official Unicode site but I haven’t yet learnt to use it to my full advantage. Its best feature — in my opinion — is the ≡ information (a character is identical to another character or a combination of characters).

Macromedia Dreamweaver is quite handy for determining the HTML entity (decimal) behind a character. (I’m not actually sure whether you can choose to convert the characters to HTML hex instead.) You just paste the text into the design view and the entities appear in the code view. For this particular assignment, the client will eventually need the HTML entities.
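The same kind of conversion can also be done without Dreamweaver. Here is a minimal Python sketch; it is not what Dreamweaver does internally, and the function names `to_decimal_references` and `to_hex_references` are made up for illustration. It assumes that only non-ASCII characters should be converted and that plain ASCII text stays as it is.

```python
def to_decimal_references(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference like &#2344;."""
    # The built-in "xmlcharrefreplace" error handler writes &#NNNN; for every
    # character that cannot be encoded as ASCII.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


def to_hex_references(text: str) -> str:
    """Replace each non-ASCII character with a hex reference like &#x928;."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):X};"
        for ch in text
    )


if __name__ == "__main__":
    sample = "a\u0928"  # Latin "a" followed by DEVANAGARI LETTER NA
    print(to_decimal_references(sample))
    print(to_hex_references(sample))
```

Note that this sketch leaves ASCII characters such as `&` and `<` untouched, so text containing them would still need ordinary HTML escaping.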

But the question is, if I needed to find out the Unicode value of each character for a big chunk of text, how would I do it?
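One possible answer, as a rough sketch rather than a finished tool: a few lines of Python can walk through a text and list the code point of every character. The function name `describe` and the three-column output are my own assumptions. `ord` and the standard `unicodedata` module do the actual work.

```python
import unicodedata


def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (code point, decimal HTML reference, Unicode name) per character."""
    rows = []
    for ch in text:
        cp = ord(ch)  # the character's Unicode code point as an integer
        # Some characters (e.g. control characters) have no name, so supply
        # a placeholder instead of letting unicodedata.name raise ValueError.
        name = unicodedata.name(ch, "<unnamed>")
        rows.append((f"U+{cp:04X}", f"&#{cp};", name))
    return rows


if __name__ == "__main__":
    # Devanagari sample written with escapes so the file stays ASCII-only.
    for code_point, html_ref, name in describe("\u0928\u092e"):
        print(code_point, html_ref, name)
```

For a really big chunk of text, the same loop could read from a file opened with the right encoding (for example `open(path, encoding="utf-8")`), provided the encoding of the source file is known.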