12 May 2009 11 comments Python
I needed to find out which letters are the least used in the English language. I pulled down a list of 100,000+ English words, split them all into individual letters and ended up with a list of about 1,000,000 letters. I sorted them by usage and came up with this as the result:
It would be interesting to make a heatmap of this over an image of a QWERTY keyboard.
Below is the same list, but with each letter's count shown as a ratio to the average letter count:
e 3.0
s 2.3
i 2.1
a 2.0
r 1.9
n 1.8
t 1.6
o 1.5
l 1.4
d 1.1
c 0.9
u 0.9
g 0.8
p 0.7
m 0.7
h 0.6
b 0.5
y 0.4
f 0.4
k 0.3
w 0.3
v 0.3
z 0.1
x 0.1
j 0.1
q 0.0
I hope I got that right, because I did that calculation in a quick one-liner just now. It basically means that the letter e is about 3 times as common as the average letter.
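For anyone who wants to try this themselves, here is a rough sketch of how the counting and the ratio step could be done. It is not the one-liner I actually used. The function name `letter_ratios` and the tiny sample word list are made up for illustration, and it assumes that only the plain letters a to z count and that the ratio is each letter's count divided by the average count over all 26 letters.

```python
from collections import Counter
import string


def letter_ratios(words):
    """Return (letter, ratio) pairs, most common letter first.

    Assumption: ratio = letter count / average count over all 26 letters,
    so a ratio of 1.0 means "exactly as common as the average letter".
    """
    # Start every letter at zero so unused letters still show up in the result.
    counts = Counter({letter: 0 for letter in string.ascii_lowercase})
    for word in words:
        # Count only plain a-z; digits, apostrophes, accents etc. are skipped.
        counts.update(ch for ch in word.lower() if ch in string.ascii_lowercase)

    total = sum(counts.values())
    if total == 0:
        raise ValueError("no letters found in the word list")

    average = total / len(string.ascii_lowercase)
    return [(letter, count / average) for letter, count in counts.most_common()]


if __name__ == "__main__":
    # Hypothetical sample; a real run would use the full word list instead.
    sample_words = ["tree", "see", "queue", "jazz"]
    for letter, ratio in letter_ratios(sample_words):
        print(letter, round(ratio, 1))
```

Because every ratio is relative to the average, the 26 ratios always add up to 26, which is a handy sanity check on a list like the one above.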