Generators

by Benjamin (Bill) Planche

feat. Philipp Jovanovic

Slides at github.com/Aldream/presentations

- Randomness - defs + pseudo-randomness
- RNGs & PRNGs - defs + implementation
- Test Tools - suites + implementation

*Lack of pattern, predictability or determinism in events.*

**However.**

*Is it really random?*

*Or are we simply ignorant of the underlying pattern?*

Basic definition

Sequence of independent random variables

Formal definition

???

An infinite sequence can be considered random if:

**It has the***Frequency Stability Property*- $S \in \mathbb{A}^n$ with $\mathbb{A}$ alphabet of $m$ symbols
- $\forall a \in \mathbb{A}, \lim_{n \to \infty} |\{s, s \in S \land s = a\}| = \frac{1}{m}$

**Any sub-seq. selected by a**ex:*proper method*isn't biased too.- $ S = (1,0,1,0,1,0,1,0) $ not biased
- $f(X) \to (X^i | \forall i \leq |X| \land i \equiv 0 \pmod 2)$
- $\implies f(S) \to (1,1,1,1)$ biased

**However.**

- How to mathematize this
*proper method*of selection? - Yields an empty set (demo by Jean Ville in 1939)

A Random Sequence has no "exceptional and effectively verifiable" property

- No properties verifiable by a recursive algorithm
- Frequency / measure-theoretic
- Quite satisfactory

- Important measure for Information Theory

Length of the shortest program able to generate the sequence.

A finite string is random

if it requires a program at least as long as itself to be computed

$\exists c \geq 0$, such as $H(S_0^n) \geq n - c$

with $H$ complexity of Kolmogorov

- "
*Incomprehensible informational content*" - Complexity / Compressibility approach

A sequence is statistically random if it has norecognizablepatterns.

- Less strict than previous definitions
- Doesn't imply objective unpredictability

... leaves room to the concept of ...

Exhibits statistical randomness...

... though generated by a deterministic causal method.

Generators

Device which can produce a sequence of random numbers, i.e. a without deterministic properties and patterns.

Dice tossing, coin flipping, bird trajectories, ...

- Often only random in appearances
*Cheating*by knowing the rules / initial state

Nuclear decay, Behavior of photons hitting a semi-transparent mirror, ...

- Golden solutions
- Globally too costly to be democratized

Thermal signal from transistor, radio noise, Analog-to-digital conversion noise, ...

- Easier to detect
- Offer good results

- Unix Systems
- $/dev/urandom$ & $/dev/random$
- Device files probing analog sources (mouse, keyboard, disk accesses, etc.)

- Windows Systems
- $CryptGenRandom$
- Gather through system state (CPU counters, env. var, threads IDs, etc.)

- Based on the unpredictable IO + behavior of the users
- Harvest
**entropy**, Output random bytes

In both cases, entropy decreases during *inactivity*

**... Shortages.**

Ramdomness through Determinism...??

Pseudo-Random Sequences

- Clever implementations $\rightarrow$ Long-enough period
- Determinism $\rightarrow$ totally defined by init config
*State**Seed*

- Sequential shift register
- New bit = linear function of previous state
- Combinational logic
- $\mathfrak{F}$ mapping in vector space of binary $n$-tuples
- $f$ feedback function = boolean operation of $n$ variables

$$\mathfrak{F}:\mathbb{F}_2^n \to \mathbb{F}_2^n$$ $$\mathfrak{F}:(x_1, x_2, ..., x_n)\mapsto (x_2, x_3, ..., x_n, f((x_1, x_2, ..., x_n))$$

- $f$ $\equiv$ poly mod 2 in finite field arithmetic
*Feedback polynomial*

*Taps*= bits of the register used in the linear operation- Conditions to maximal-length LFSR (period $2^n-1$):
- Having an even number of taps
- Using a relatively-prime set of taps

$f(x) = x_{16} + x_{14} + x_{13} + x_{11} + 1$

Same theory as for LFSRs

Only one difference

**The feedback function $f$ is non-linear**

ex: $f(x) = x^4 + x^1 \cdot x^2 + 1$

- Makes NLFSRs harder to predict than LFSRs
- Makes it harder to ensure a max period of $2^n-1$ bits.

Applications in every area

where unpredictable behavior is desirable/required

Various applications $ \rightarrow $ Various requirements

- Crypto-secure RNGs for security applications
- Outputs uniqueness for shuffling methods
- ...

- RNGs $ \rightarrow $ ~ safer but less abundant
- PRNGs $ \rightarrow $ ~ weaker but lighter

- Def. depending on the field $ \rightarrow $
*Which one to test?* - Large number of possibilities $ \rightarrow $ Impossible to fully cover

- Statistical tests or complexity evaluations
- Battery of tests to identify statistical bias
- Checking hypothesis of
*perfect behavior*

- Impossible to fully cover $ \rightarrow $ no universal battery of tests
- Good RNGs $ \approx $ pass complicated or numerous tests

- DIEHARD Tests
- TestU1 Suite

*Break*linearly recurrent sequences in $\mathbb{F}_n$- Find min degree $L$ and annihilator poly $F(x)$ of the seq $S$

- Make a first guess for $F(x)$
- At each iteration $l$:
- Generate $S_l'$ of $l$ elements, using reverse of $F(x)$
- Compare $S$ and $S_l'$: $ \beta_0^l(S, S_l') $
- We know $S_l'$ correct up to the $(l-1)^{th}$ symbol
- If $l^{th}$ symbol not correct, ie $ \beta_0^l(S, S_l') = (0,0,0,...,1) $:
- Last iteration $m$ when this happened, we had $ \beta_0^m(S, S_m') = (0,0,0,...,1) $
- So $ \beta_0^l(S, S_l') + \beta_{l-m}^l(S, S_m') = (0,0,0,...,0) \to$ correction to apply
- $F(x) \gets F(x) + x^mF_l(x)$

Iteration **0**

$\beta$ = **0**

(*stopped*)

- Various characteristics / Various Uses
- Choose wisely!
- Don't implement your own RNG!
- ... especially for crypto!
- ... but if you try, test test test!

- Downey, R.: Some recent progress in algorithmic randomness. In Mathematical Foundations of Computer Science 2004. Springer Berlin Heidelberg (2004)
- Wikipedia: Random Number Generation (2014)
- Aumasson, J.P.: Crypto for Developers - Part 2, Randomness. AppSec Forum Switzerland 2013 (2013)
- Raymond, S., Andrew, S., Patrick, C., Jason, M.: Linear Feedback Shift Register (2001)
- Joux, A.: Algorithmic cryptanalysis. CRC Press (2009)
- Szmidt, J.: The Search and Construction of Nonlinear Feedback Shift Registers. Military Communication Institute, Zegrze, Poland (2013)
- Wikipedia: Linear Feedback Shift Register (2014)
- Dubrova, E.: A List of Maximum Period NLFSRs. IACR Cryptology ePrint Archive 2012 (2012) 166
- Ritter, T.: Randomness Tests: A Literature Survey (2007)
- L'Ecuyer, P, S.R.: TestU01 - A Software Library in ANSI C for Empirical Testing of Random Number Generators (2002)
- Marsaglia, G.: The Marsaglia Random Number CDROM including the Diehard Battery of Tests of Randomness (2005)
- Soto, J.: Statistical Testing of Random Number Generators. Proceedings of the 22nd National Information Systems Security Conference NIST, 1999 (1999)
- Berlekamp, E.R.: Nonbinary BCH decoding. University of North Carolina. Department of Statistics (1967)
- Massey, J.L.: Shift-register synthesis and BCH decoding. Information Theory. IEEE Transactions 15(1) (1969) 122-127
- Feng, G.-L., T.K.: A generalization of the Berlekamp-Massey algorithm for multisequence shift-register. Information Theory, IEEE Transactions 37(5) (2012) 1274-1287
- Rodrigez, S.: Implementation of a decoding algorithm for codes from algebraic curves in the programming language Sage. diploma thesis, Faculty of San Diego State University (2013)

A random sequence must not be predictable.

No effective strategy should lead to an infinite gain if we bet on its symbols.

- Predictability approach

```
def createLFSRgenerator(taps, seed):
""" Returns a LFSR generator, defined by the given sequence of taps and initial value.
@param taps (Tuple[int]): Sequence of taps defining the register.
ex: (1, 0, 0, 1) -> f(x) = x^4 + x^3 + 1
@param seed (int): Initial value given to the register
@return LFSR Generator """
def lfsrGen(): """ @yield Pseudo-Random value from the defined LFSR """
deg = len(taps) # Degree of the feedback polynomial
period = math.pow(2,deg) - 1 # Max period of the LFSR
value = seed # Initial value
it = 0
while (it < period): # Computing new value of most-significant bit:
bit = 0
for j in range(deg): # AND-operation between the current value and the taps-tuple
if taps[j]:
bit ^= value >> j
bit &= 1 # XOR-operation to get the new value of the bit
# Final value in register by popping less-sign bit and appending the new most-sign one:
value = (value >> 1) | (bit << (deg-1))
it += 1
yield value
return lfsrGen
```

```
function createLFSRGenerator(taps, seed) {
/** Returns a LFSR generator, defined by the given sequence of taps and initial value.
@param taps (Tuple[int]): Sequence of taps defining the register.
ex: (1, 0, 0, 1) -> f(x) = x^4 + x^3 + 1
@param seed (int): Initial value given to the register
@return LFSR Generator */
return function *lfsrGen() { /** @yield Pseudo-Random value from the defined LFSR */
var deg = taps.length, // Degree of the feedback polynomial
period = Math.pow(2, deg) - 1, // Max period of the LFSR
value = seed; // Initial value
for (var it = 0; it < period; it++) { // Computing new value of most-significant bit:
var bit = 0;
for (var j = 0; j < deg; j++) { // AND-operation between the current value and the taps-tuple
if (taps[j])
bit ^= value >> j;
}
bit &= 1; // XOR-operation to get the new value of the bit
// Final value in register by popping less-sign bit and appending the new most-sign one:
yield (value = (value >> 1) | (bit << (deg - 1)));
}
}
}
```

```
def createNLFSRgenerator(taps, seed):
""" Returns a NLFSR generator, defined by the given combination of taps and initial value.
@param taps (Tuple[Array[int]]): Sequence of combination of taps defining the non-linear register.
ex: ([0,0],[],[2],[1,2]) -> f(x) = x^4*x^4 + x^2 + x^1*x^2 + 1 (poor choice)
@param seed (int): Initial value given to the register
@return NLFSR Generator """
def nlfsrGen(): """ @yield Pseudo-Random value generated by a pre-defined NLFSR """
deg = len(taps) # Degree of the feedback polynomial
period = math.pow(2,deg) - 1 # Max Period of the NLFSR (read Warning above)
value = seed # Initial value
it = 0
while (it < period): # Computing the new value of the most-significant bit:
bit = 0
for tap in taps:
# Computing the binary multiplication x^K_0 * x^K_1 * ... * x^K_n with [K_0, K_1, ..., K_n] the j-th taps array
if len(tap):
element = 1
for k in tap:
if not (value >> k & 1):
element = 0 # Binary multiplication of terms returns 1 iif none of the terms is null.
break # So if we encounter a null bit, we simply return 0, else 1.
else:
element = 0
bit ^= element # Binary addition of the multiplication results
bit &= 1
# Getting the final value in the register by popping the less-significant bit and appending the new most-significant one:
value = (value >> 1) | (bit << (deg-1))
it += 1
yield value
return nlfsrGen
```

```
function createNLFSRgenerator(taps, seed) {
/** Returns a NLFSR generator, defined by the given combination of taps and initial value.
@param taps (Tuple[Array[int]]): Sequence of combination of taps defining the non-linear register.
ex: ([0,0],[],[2],[1,2]) -> f(x) = x^4*x^4 + x^2 + x^1*x^2 + 1 (poor choice)
@param seed (int): Initial value given to the register
@return NLFSR Generator */
return function *nlfsrGen() { /** @yield Pseudo-Random value generated by a pre-defined NLFSR */
var deg = taps.length, // Degree of the feedback polynomial
period = Math.pow(2,deg) - 1, // Max Period of the NLFSR (read Warning above)
value = seed, // Initial value
it = 0
while (it < period) { // Computing the new value of the most-significant bit:
var bit = 0
for (var j = 0; j < taps.length; j++) {
var element = 1;
if (taps[j].length) { // Computing the binary multiplication x^K_0 * x^K_1 * ... * x^K_n with [K_0, K_1, ..., K_n] the j-th taps array
for (var k = 0; k < taps[j].length; k++) {
if (!(value >> taps[j][k] & 1)) {
element = 0; // Binary multiplication of terms returns 1 iif none of the terms is null.
break; // So if we encounter a null bit, we simply return 0, else 1.
}
}
} else { element = 0; }
bit ^= element; // Binary addition of the multiplication results:
}
bit &= 1;
// Getting the final value in the register by popping the less-significant bit and appending the new most-significant one:
it += 1;
yield (value = (value >> 1) | (bit << (deg-1)));
}
}
}
```

- Developed by George Marsaglia, in 1995
- 15 tests run over a large file containing the sequence

birthday spacings, overlapping permutations, ranks of 31x31 and 32x32 matrices, ranks of 6x8 matrices, monkey tests, count the 1's, parking lot, minimum distance, random spheres, squeeze, overlapping sums, runs, and craps

- Software library, initiated in 1985
- Collection of utilities in ANSI C
- Classical stat tests + others from literature + original ones
- Tools to implement specific stat tests.

- At each iteration $l$:
- Evaluate the
*discrepancy* - If null:
- $F(x)$ and $L$ still correct
- Go to next iteration

- Else:
- $F(x)$ should be concordantly adjusted
- Shift & Scale syndromes added since last update

- If $l > 2L$:
- Update $L$ to keep track of progression

- Evaluate the

```
def BerlekampMasseyAlgorithm(sequence):
""" Applies the Berlekamp-Massey Algorithm to the given sequence of bits;
Returns the smallest annihilating polynomial F, ie. the smallest inverse
feedback polynomial corresponding to the generating LFSR.( F(sequence) = 0 )
@param sequence (Array[int] or Tuple[int]): Sequence of bits to analyze
@returns Array defining the computed inverse feedback polynomial
ex: [1, 0, 0, 1, 1] represents the inverse polynomial x^4 + x^3 + 1,
and thus the feedback polynomial x^4 + x + 1 (taps = (1, 0, 0, 1)) """
def discrepancy(sequence, poly, i, L):
""" Returns the discrepancy.
@param sequence (Array[int] or Tuple[int]): Sequence of bits to analyze
@param poly (Array[int]): Current version of the inverse polynomial
@param i (int): Current position in the sequence
@param L (int): Current number of assumed errors
@return Binary value of the discrepancy """
return sum([sequence[i-j]&poly[j] for j in range(0,L+1)])%2 # = s[i]*p[i] + s[i-1]*p[1] + ... + s[i-L]*p[L]
def addPoly(poly1, poly2, length):
""" Computes the addition of two F2 polynomials.
@param poly1 (Array[int]): Array representing the 1st polynomial
@param poly2 (Array[int]): Array representing the 2nd polynomial
@param length (int): Length to be covered by the addition (trusting user to avoid testing)
@returns Resulting Binary Array """
return [poly1[j]^poly2[j] for j in range(0, length)]
# Initializing:
N = len(sequence)
F, f = [0]*N, [0]*N # Polynomials, with F being the one returned at the end (inverse feedback polynomial)
F[0] = f[0] = 1
L = 0 # Current number of assumed errors
delta = 1 # Number of iterations since last update of L
for l in range(N): # Computing F and L:
beta = discrepancy(sequence, F, l, L)
if beta != 0: # Adjusting F for this term:
g = F.copy()
F = addPoly(F, [0]*delta + f, N)
if 2 * L <= l: # If it is not the case, we must update L (and thus re-initalize delta), and also f:
L = l + 1 - L # number of available syndromes used to calculate discrepancies
delta = 1
f = g # f get the previous value of F
else: delta += 1
else: delta += 1
return F[:L+1] # output the polynomial
```

```
function BerlekampMasseyAlgorithm(sequence) {
/** Applies the Berlekamp-Massey Algorithm to the given sequence of bits;
Returns the smallest annihilating polynomial F, ie. the smallest inverse
feedback polynomial corresponding to the generating LFSR.( F(sequence) = 0 )
@param sequence (Array[int] or Tuple[int]): Sequence of bits to analyze
@returns Array defining the computed inverse feedback polynomial
ex: [1, 0, 0, 1, 1] represents the inverse polynomial x^4 + x^3 + 1,
and thus the feedback polynomial x^4 + x + 1 (taps = (1, 0, 0, 1)) */
function discrepancy(sequence, poly, i, L) {
/** Returns the discrepancy.
@param sequence (Array[int] or Tuple[int]): Sequence of bits to analyze
@param poly (Array[int]): Current version of the inverse polynomial
@param i (int): Current position in the sequence
@param L (int): Current number of assumed errors
@return Binary value of the discrepancy */
var disc = 0;
for (var j = 0; j < L+1; j++) disc += (sequence[i-j] & poly[j]) // disc = s[i]*p[i] + s[i-1]*p[1] + ... + s[i-L]*p[L]
return disc%2;
}
function addPoly(poly1, poly2, length) {
/** Computes the addition of two F2 polynomials.
@param poly1 (Array[int]): Array representing the 1st polynomial
@param poly2 (Array[int]): Array representing the 2nd polynomial
@param length (int): Length to be covered by the addition (trusting user to avoid testing)
@returns Resulting Binary Array */
var poly = [];
for (var j = 0; j < length; j++) poly.push(poly1[j] ^ poly2[j]);
return poly;
}
// Initializing:
var N = sequence.length;
var F = [], f = [] // Polynomials, with F being the one returned at the end (inverse feedback polynomial)
for (var i = 0; i < N; i++) { F.push(0); f.push(0); }
F[0] = f[0] = 1
var L = 0 // Current number of assumed errors
var delta = 1 // Number of iterations since last update of L
for (var l = 0; l < N; l++) { // Computing F and L:
var beta = discrepancy(sequence, F, l, L);
if (beta != 0) { // Adjusting F for this term:
var g = F.slice(0);
var fShifted = f.slice(0); for (var k = 0; k < delta; k++) { fShifted.unshift(0); }
F = addPoly(F, fShifted, N);
if (2 * L <= l) {
L = l + 1 - L; // number of available syndromes used to calculate discrepancies
delta = 1;
f = g; // f get the previous value of F
} else delta += 1;
} else delta += 1;
}
for (var k = L+1; k < N; k++) { F.pop(); }
return F; // output the polynomial
}
```