Vigenère cipher (how-to) by andycyca

Introduction

A Vigenere cipher is, in its simplest form, encoding a message with a Caesar shift cipher, but using a different shift for every letter of the plaintext. Let’s take, for example, the message AAAAAAAAAA (the letter ‘A’ repeated 10 times).

Under a regular Caesar shift, you “move” your characters down the alphabet by a specific amount. For instance, with a shift of 1, the message becomes BBBBBBBBBB. With a shift of 2, it becomes CCCCCCCCCC and so on. As you might imagine, this kind of cipher can be attacked by analysing letter frequency, because the same letter in plaintext is always encoded as the same letter in the ciphertext.

A way around this limitation is to use a different shift for every letter of the plaintext. This, in essence, is the Vigenère cipher.

Let’s imagine, then, that we will shift our message by 0 units for the first letter, 1 unit for the second one and so on. The shift then, could be established as 0-1-2-3-4-5-6-7-8-9. The ciphertext, in this case, would be ABCDEFGHIJ.

What, then, if you want to shift by more than 9 units? Well, you could turn to hexadecimal, but it would be easier to just use the 26 letters of the alphabet to represent these shifts, from 0–25. Then, in order to encode and decode a Vigenère cipher, you need not just the message, but a code word (keyword) that will be used repeatedly to encode and decode.

A mathematical approach

It’s easy to see now that the Vigenère cipher can be understood as a simple operation of modular addition or subtraction, repeated for every letter of the plaintext, with shifts determined by the keyword.

Simple, isn’t it?

Well, this simplicity does have its drawbacks, as others have written here. The most important one is that it can still be attacked by either frequency analysis or key elimination.

Kasiski examination

The main idea is that, if your message is long enough, there is a high chance for a single letter to be encoded with the same key-letter more than once. This could result in the ciphertext having short strings of repeated characters. This in turn can be used to deduce the keyword length n, and from there it’s possible to break down the message in n-long columns, which greatly speeds up regular frequency analysis.

A naive Vigenère implementation in Python 3.x

# -*- coding: utf-8 -*-
"""
Created on Thu Aug  8 00:25:37 2019

@author: Andy

Vigenere cypher, naive approach.

Assume both plaintext and cypher are in all caps. The Vigenere cypher is
done algebraically with additions modulo 26, 'adding' the letters from
column (plaintext_i) and the letters from row (key_i).

***For demonstration purposes only***: This module stores message, key and
encoded text in plain text.

'alpha' is the basis for the Vigenere square. Originally, you'd use only the
26 letters of the Latin alphabet, but by extending this string with symbols
you could extend this 
"""

import math
import re

alpha = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
plain = input('What is your message? > ')
plain = re.sub('[\W]+', '', plain)
plain = str.upper(plain)
cypher = input('What is your cypher code? > ')
cypher = re.sub('[\W]+', '', cypher)
cypher = str.upper(cypher)

# Using ceiling function to ensure a longer than necessary key
times = math.ceil(len(plain) / len(cypher))
proto_key = cypher * times
key = proto_key[:len(plain)]
# Store the encoded message
encoded = ''
# Calculate addition modulo 26
for idx, letter in enumerate(plain):
    new = (alpha.index(plain[idx]) + alpha.index(key[idx])) % len(alpha)
    encoded += alpha[new]
print(plain)    # test: ATTACKATDAWN
print(key)      # test: LEMON
print(encoded)  # test: LXFOPVEFRNHR

THE IRON NODER CHALLENGE XII: WE'LL RUST WHEN WE'RE DEAD

Breaking the Vigenère cipher	Everything2 Crypto Project	polyalphabetic substitution cipher	Vigenère Stream
Caesar shift	THE IRON NODER CHALLENGE XII: WE'LL RUST WHEN WE'RE DEAD	If cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl	encryption algorithm
code wheel	codebreaking	Charles Babbage	symmetric cipher
teledildonics	ROT14	C++ input iterator	Cartesian product
Cryptonomicon	1854	Spooks	One Time Pad
categorical syllogism	Numbers Station	March 13, 2000	monoalphabetic substitution cipher