Conference article

The Use of Project Gutenberg and Hexagram Statistics to Help Solve Famous Unsolved Ciphers

Richard Bean
School of Information Technology and Electrical Engineering, University of Queensland, Australia

Published in: Proceedings of the 3rd International Conference on Historical Cryptology HistoCrypt 2020

Linköping Electronic Conference Proceedings 171:5, p. 31-35

NEALT Proceedings Series 44:5, p. 31-35

Published: 2020-05-19

ISBN: 978-91-7929-827-2

ISSN: 1650-3686 (print), 1650-3740 (online)


Project Gutenberg, begun by Michael Hart in 1971, is an attempt to make public domain electronic texts available to the public in an easily accessible and usable form. The number of available texts reached 60,000 by 2019. Classical cryptanalysis methods rely on the development and use of high-quality frequency tables of letter arrangements from a variety of sources. As the amount of text grows, frequency tables of higher orders can be developed and may provide more solving power for classical cryptographic algorithms. As a side effect of the availability of a wide range of public domain texts, we were able to develop hexagram frequency tables of letters in the English language, which were then a crucial factor in solving an unsolved transposition cipher of Mahon and Gillogly (2008). The texts themselves were then used as input to solve a book cipher of Thouless (1948) using the same scoring method.
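As an illustration of how a hexagram frequency table can serve as a scoring method, the following minimal Python sketch builds log-frequencies of letter 6-grams from a corpus and sums them over a candidate plaintext. It is not the implementation used in the paper: the function names `build_ngram_table` and `score`, the letter normalisation, and the floor value for unseen hexagrams are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from typing import Dict, Tuple


def build_ngram_table(corpus: str, n: int = 6) -> Tuple[Dict[str, float], float]:
    """Return log10 relative frequencies of letter n-grams, plus a floor score.

    Hypothetical helper: the paper does not specify how its tables were built.
    """
    # Keep only the letters A-Z, so n-grams run across word boundaries.
    letters = re.sub(r"[^A-Z]", "", corpus.upper())
    if len(letters) < n:
        raise ValueError("corpus is shorter than one n-gram")
    counts = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    total = sum(counts.values())
    table = {gram: math.log10(c / total) for gram, c in counts.items()}
    # Assumed penalty for n-grams never seen in the corpus.
    floor = math.log10(0.01 / total)
    return table, floor


def score(text: str, table: Dict[str, float], floor: float, n: int = 6) -> float:
    """Sum the log-frequencies of every n-gram in the text (higher is more English-like)."""
    letters = re.sub(r"[^A-Z]", "", text.upper())
    return sum(table.get(letters[i:i + n], floor)
               for i in range(len(letters) - n + 1))


if __name__ == "__main__":
    # Toy corpus; a real table would be built from a large body of text.
    corpus = "the quick brown fox jumps over the lazy dog " * 50
    table, floor = build_ngram_table(corpus)
    print(score("THE QUICK BROWN FOX", table, floor))
    print(score("QHT KCIUE NWORB XOF", table, floor))
```

In a hill-climbing search, a score of this kind would be recomputed after each trial change to the key, and the change kept only if the score improves. A toy corpus is used here only to keep the example self-contained.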


cryptanalysis; frequency tables; hill climbing; book ciphers



