Fall 2016, University of Kentucky, Models of Computation, CS 575, with Dr
Judy Goldsmith. These incomplete lecture notes are just my own, not proofread or
endorsed by Dr G or UKY.

Introducing context free grammars (CFG)

A CFG has four things:

A set of symbols $N$. (nonterminals)

A set of terminal symbols $T$ which is disjoint from $N$.

A set of rules $R$ where each rule maps some symbol in $N$ to some string
$\alpha$ made of symbols from $N \cup T$

A chosen start symbol $S \in N$.

And we say “the language $L$ generated from a grammar $G$” is the set of
terminal-only strings reachable from by following rules from the start symbol
$S$.

Example: make a grammar that generates a language

Let the language $L = \{w \in (a+b)^*: n_a(w) > n_b(w)\}$. In words,
$L$ is the set of strings with more $a$’s than $b$’s.

Let’s make a grammar $G$ that generates this language.

Let our nonterminals be $N = \{S, T\}$, let our terminals be $T = \{a, b\}$,
let our start state be $S$. Here are our rules, we have two:

So to generate $baaba$, we would start with $S$, turn that into $bSa$, then
$baSba$, then $baTba$, then $baaba$. A string is in the language generated by
the grammar only if (first) you can follow rules from the start state and reach
it and (second) only terminal symbols are in it.

Example 2: making a grammar that generates the palindromes

A palindrome is a string that reads the same backwards and forwards. Like
racecar. Let’s make a grammar that generates all palindromes made from the
alphabet $\{a, b, c\}$.

Here is a set of rules that almost does it:

($\epsilon$ means “nothing”, the empty string)

So to make $abcb bcba$ we could go

But we can’t make $aabaa$ or any other odd-length palindromes. To do that, we
extend our rule:

And now $G$ generates all $a, b, c$ palindromes.

CFG’s are closed under concatenation

We say a language is context-free if there is some CFG that generates exactly
it. We claim that if $L_1$ is context free and $L_2$ is context free, then
the language $L_3 = \{xy : x \in L_1, y \in L_2 \} $ is also context
free. ($L_3$ is the set of strings that you can get by concatenating something
from $L_1$ and something from $L_2$.)

Why? Call $L_1$’s start state $S_1$ and $L_2$’s start state $S_2$. Then
we’ll make $L_3$’s start state $S_3$ and our grammar will only have one rule: