### Unit 9: Regular Expressions and Finite-State Automata

Computer language compilers frequently use regular languages (which feature regular expressions) to match patterns within text or perform lexical analysis. This unit will introduce formal languages and state automata to prove the correctness or falsity of a language. We will see that some languages might not be valid for a state automaton, but that we can also define and create state automata for languages defined using regular expressions.

The definition of regular expressions is recursive. This definition and that of regular sets and regular expressions requires careful contemplation on your part.

The topic of formal languages and the related topic of finite state automata are not directly addressed in our two references. Our study of discrete structures has given us the background to develop these topics in instruction paragraphs.

We begin with notation, terminology, and definitions and expressions for formal languages. Then we introduce finite automata and their relationship with regular expressions, and use them to solve problems.

**Completing this unit should take you approximately 5 hours.**

Upon successful completion of this unit, you will be able to:

- define formal languages using operator precedence and regular expressions;
- construct and analyze finite state automata; and
- relate formal languages and automata.

### 9.1: Formal Languages

Read this discussion of grammar and formal languages.

### 9.1.1: Polish Notation

- Read this discussion of Polish notation for describing expressions.

### 9.1.2: Languages Defined by Regular Expressions

Read this discussion of languages defined by regular expressions.

### 9.1.3: Order of Precedence Rules

Read this discussion of operator precedence.

### 9.1.4: Deciding Whether Regular Expressions Define the Same Language

Download the linked file for a discussion of operator precedence.

### 9.2: Finite State Automata

Read this discussion on finite state automata.

### 9.2.1: Automaton and Accepted Language

Read this discussion of regular sets and their acceptance by finite state automata.

### 9.2.2: Designing a Finite State Automaton

Read this discussion of hints on designing a finite state automaton to recognize a given regular set.

Not all languages are regular. Because a finite state machine has a finite number of states, it can only remember a finite number of inputs. Consider the set {o

_{n}1_{n}for all n >= 0}. Because this set requires a recognizing machine to remember n 0's for any n (so that it can then look for n 1's), the number of states is infinite, i.e. not a finite state machine. Thus, the language is not regular. Hence, one way to show that a language is not regular is to show that the automaton that recognizes it has an infinite number of states.

### 9.2.3: Simplifying Finite State Automata

Read this discussion of ways to simplify finite state automata.

### Unit 9 Assessment

Please complete this assignment. Then, check your answers against the answer key.