Autocorrelation analysis measures the similarity between two sequences: the original data and a copy of that data shifted linearly by a lag "t". Applied to the plaintext (original data) and the ciphertext (encrypted data), it can help determine the size of the secret key, which drastically reduces the effort needed to discover the key and, consequently, to gain access to the information.
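
One common way to write this similarity measure down is the sample autocorrelation coefficient used in statistics. The formula below is a sketch under that assumption: the symbols x_i (the i-th byte), N (the number of bytes) and \bar{x} (their mean) are introduced here for illustration, and the program behind this page may normalise the values differently.

```latex
r_t = \frac{\sum_{i=1}^{N-t} (x_i - \bar{x})(x_{i+t} - \bar{x})}{\sum_{i=1}^{N} (x_i - \bar{x})^2},
\qquad
\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i
```

A value of r_t close to 0 means the data and its shifted copy look unrelated, while a value close to +1 or -1 means a strong pattern repeats at lag t.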

For a cipher to pass this analysis, the similarity between the byte sequences should be as small as possible, i.e. the ciphertext should appear as random as possible in its distribution, whatever the distribution of the original data.

For our test, we encrypted a TXT file containing a single sequence, repeated on each line, in plain text (ASCII), to make the attack as easy as possible:

abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

abcdefghijklmnopqrstuvwxyz0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ

The OBAKE files were generated with the OBAKE-512, AES-CBC, and ChaCha algorithms, all using the same secret key (a single letter, "a"), without compression or columnar transposition. The CR-LF bytes at the end of each line are included in the analysis so that all files remain binary-compatible.

As the results below show, the output of the OBAKE-512 algorithm exhibits no autocorrelation pattern, which makes it resistant to this type of attack.

Below we have two sections:

- Test results with LAG=1 (sequential binary comparison)
- Test results with LAG=64 (comparison among the lines of the text)
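
To make the two lags concrete, here is a minimal Python sketch that rebuilds the plaintext and measures it at LAG=1 and LAG=64. It is not the program used for this page: the function name `autocorrelation`, the standard sample-autocorrelation normalisation, and the line count `N_LINES` are all assumptions made for illustration. Each line is 62 characters plus CR-LF, i.e. 64 bytes, which is why LAG=64 compares every byte with the byte directly below it on the next line.

```python
import string


def autocorrelation(data: bytes, lag: int) -> float:
    """Sample autocorrelation coefficient of a byte sequence at one lag."""
    n = len(data)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(data) - 1")
    mean = sum(data) / n
    dev = [b - mean for b in data]
    denom = sum(d * d for d in dev)
    if denom == 0:
        return 0.0  # constant data: there is no variation to correlate
    num = sum(dev[i] * dev[i + lag] for i in range(n - lag))
    return num / denom


# One line of the test file: 62 characters plus CR-LF = 64 bytes.
LINE = (
    string.ascii_lowercase + string.digits + string.ascii_uppercase
).encode("ascii") + b"\r\n"

N_LINES = 16  # assumption: the line count of the real file is not stated
plaintext = LINE * N_LINES

r1 = autocorrelation(plaintext, 1)    # neighbouring bytes
r64 = autocorrelation(plaintext, 64)  # same column, next line

print(f"LAG=1:  {r1:.4f}")
print(f"LAG=64: {r64:.4f}")
```

Because this plaintext repeats exactly every 64 bytes, the LAG=64 coefficient equals (N - 64) / N, which is 0.9375 for the 16 lines assumed here. A ciphertext that hides the structure should give values close to 0 at both lags.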

If you want details about the program and the files used here, please visit this page.

For a better understanding, look at the autocorrelation plot of the source file, which clearly shows a sequential, ordered shape. Note also that the correlation values lie far from the upper and lower critical values. These critical values are calculated from the total number of samples.
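
The exact calculation is not reproduced here. A widely used convention for autocorrelation plots places the critical values at plus and minus z / sqrt(N), where N is the number of samples and z = 1.96 corresponds to roughly 95% confidence. The sketch below assumes that convention; the function name `critical_values`, the default `z`, and the sample count of 1024 are hypothetical and may differ from what the original program uses.

```python
import math


def critical_values(n_samples: int, z: float = 1.96) -> tuple:
    """Return (lower, upper) bounds at -z/sqrt(N) and +z/sqrt(N)."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    bound = z / math.sqrt(n_samples)
    return -bound, bound


# Hypothetical sample count: 16 lines of 64 bytes each.
lower, upper = critical_values(1024)
print(f"lower={lower:.5f} upper={upper:.5f}")
```

Under this convention the band narrows as the file grows, so larger files make even small residual correlations stand out.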

Bibliographic references

T. Siegenthaler, "Correlation-Immunity of Nonlinear Combining Functions for Cryptographic Applications", IEEE Transactions on Information Theory, vol. 30, no. 5, IEEE, 1984

B. Schneier, Applied Cryptography: Protocols, Algorithms and Source Code in C, 2nd Ed., John Wiley & Sons, Inc., 1996

C. G. Mehmet E. Dalkilic, "An Interactive Cryptanalysis Algorithm for the Vigenere Cipher", 15 Dec 2000.

D. Stinson, Cryptography: Theory and Practice, Chapman & Hall/CRC, 2006.

A. Biryukov, Encyclopedia of Cryptography and Security, H. C. A. van Tilborg, Ed., Springer Science+Business Media LLC, 2011.

B. Schneier, "Detecting Words and Phrases in Encrypted VoIP Calls", https://www.schneier.com/blog/archives/2011/03/detecting_words.html

Y. Zhou, A. Zhang, Y. Cao, "The Cryptographic Properties of the Autocorrelation Functions for Encryption Algorithm", in Proceedings of International Conference on Mechatronics and Intelligent Robotics