
PRFs vs PRGs; Variable-Hybrid Proofs
In this section we show that a PRG can be used to construct a PRF, and vice-versa. The construction of a PRG from PRF is practical, and is one of the more common ways to obtain a PRG in practice. The construction of a PRF from PRG is more of theoretical interest and does not reflect how PRFs are designed in practice.
Constructing a PRG from a PRF
As promised, a PRF can be used to construct a PRG. The construction is quite natural. For simplicity, suppose we have a PRF i.e., in
out
. We can build a length-doubling PRG in the following way:
Construction 6.2 (Counter PRG)
There is nothing particularly special about the inputs and
to
. All that matters is that they are distinct. The construction can be extended to easily give more than 2 blocks of output, by treating the input to
as a simple counter (hence the name of this construction).
The guarantee of a PRF is that when its seed is chosen uniformly and it is invoked on distinct inputs, its outputs look independently uniform. In particular, its output on inputs and
are indistinguishable from uniform. Hence, concatenating them gives a string which is indistinguishable from a uniform
-bit string.
That really is all there is to the security of this construction, but unfortunately there is a slight technical issue which makes the security proof more complicated than you might guess. We will have to introduce a new technique of variable hybrids to cope with it.
Claim 6.3
If is a secure PRF, then the counter PRG construction
above is a secure PRG.
Proof
In order to prove that is a secure PRG, we must prove that the following libraries are indistinguishable:
During the proof, we are allowed to use the fact that is a secure PRF. That is, we can use the fact that the following two libraries are indistinguishable:
The inconvenience in the proof stems from a mismatch of the variable in
the
variable in
In
is local to the QUERY subroutine. Over the course of an execution,
will take on many values. Since
is used as the PRF seed, we must write the calls to
in terms of the LOOKUP subroutine of
. But in
the PRF seed is fixed for the entire execution. In other words, we can only use
to deal with a single PRF seed at a time, but
deals with many PRG seeds at a time.
To address this, we will have to apply the security of (i.e., replace
with
) many times during the proof - in fact, once for every call to QUERY made by the calling program. Previous security proofs had a fixed number of hybrid steps (e.g., the proof of Claim
used 7 hybrid libraries to show
). This proof will have a variable number of hybrids that depends on the calling program. Specifically, we will prove
where is the number of times the calling program calls QUERY.
Don't be overwhelmed by all these hybrids. They all follow a simple pattern. In fact, the th hybrid looks like this:
In other words, the hybrid libraries all differ in the value that is inserted into the code above. If you're familiar with C compilers, think of this as adding "#define i 427 " to the top of the code above, to obtain
First note what happens for extreme choices of :
- In
, the if-branch is never taken (count
is never true). This library behaves exactly like
by giving PRG outputs on every call to QUERY.
- If
is the total number of times that the calling program calls QUERY, then in
, the if-branch is always taken (count
is always true). This library behaves exactly like
by giving truly uniform output on every call to QUERY.
In general, will respond to the first
calls to QUERY by giving truly random output. It will respond to all further calls by giving outputs of our PRG construction.
We have argued that and
. To complete the proof, we must show that
for all
. The main reason for going to all this trouble of defining so many hybrid libraries is that
and
are completely identical except in how they respond to the
th call to QUERY. This difference involves a single call to the PRG (and hence a single PRF seed), which allows us to apply the security of the PRF.
In more detail, let be arbitrary, and consider the following sequence of steps starting with
:
We have taken
and simply expanded the else-branch
(count
) into two subcases
count
and count
). However, both cases lead to the same block of code (apart
from a change to a local variable's name), so the change has no effect
on the calling program.
We have factored out the calls
to that use seed
(corresponding to the count
case) in terms of
change no
effect on the calling program.
From the fact that is a
secure PFR, we can replace
with
, and the overall change is
indistinguishable.
Since count happens only once, only two calls to LOOKUP will be made across the entire lifetime of the library, and they are on distinct inputs. Therefore, the if-branch in LOOKUP will always be taken, and
is never needed (it is only needed to "remember" values and give the same answer when the same
is used twice as argument to LOOKUP). Simplifying the library therefore has no effect on the calling program:
Inlining the subroutine has no effect on the calling program. The resulting library responds with uniformly random output to the first calls to QUERY, and responds with outputs of our PRG
to the others. Hence, this library has identical behavior to
We showed that , and therefore:
This shows that
, so
is a secure
A Theoretical Construction of a PRF from a PRG
We have already seen that it is possible to feed the output of a PRG back into the PRG again, to extend its stretch (Claim 5.7). This is done by making a long chain (like a linked list) of PRGs. The trick to constructing a PRF from a PRG is to chain PRGs together in a binary tree. The leaves of the tree correspond to final outputs of the PRF. If we want a PRF with an exponentially large domain (e.g., in , the binary tree itself is exponentially large! However, it is still possible to compute any individual leaf efficiently by simply traversing the tree from root to leaf. This tree traversal itself is the PRF algorithm. This construction of a PRF is due to Goldreich, Goldwasser, and Micali, in the paper that defined the concept of a PRF.
Imagine a complete binary tree of height in (in will be the input length of the PRF). Every node in this tree has a position which can be written as a binary string. Think of a node's position as the directions to get there starting at the root, where a 0 means "go left" and 1 means "go right." For example, the root has position (the empty string), the right child of the root has position 1 , etc.
The PRF construction works by assigning a label to every node in the tree, using the a length-doubling PRG . For convenience, we will write
and
to denote the first
bits and last
bits of
, respectively. Labels in the tree are
-bit strings, computed according to the following two rules:
- The root node's label is the PRF seed.
- If the node at position
has label
, then its left child (at position
) gets label
, and its right child (at position
) gets label
.
In the picture above, a node's label is the string being sent on its incoming edge. The tree has leaves, whose positions are the strings
. We define
to be the label of node/leaf
. To compute this label, we can traverse the tree from root to leaf, taking left and right turns at each node according to the bits of
and computing the labels along that path according to the labeling rule. In the picture above, the highlighted path corresponds to the computation of
.
It is important to remember that the binary tree is a useful conceptual tool, but it is exponentially large in general. Running the PRF on some input does not involve computing labels for the entire tree, only along a single path from root to leaf.
Construction 6.4

Claim 6.5
If is a secure PRG, then Construction
is a secure PRF.
Proof
We prove the claim using a sequence of hybrids. The number of hybrids in this case depends on the input-length parameter in. The hybrids are defined as follows:
The hybrids differ only in their hard-coded value of . We will show that
We first start by understanding the behavior of for extreme choices of
. Simplifications to the code are shown on the right.
In , we always have
, so the only entry of
that is accessed is
. Then renaming
to
, we see that
. The only difference is when the PRF seed
is sampled: eagerly at initialization time in
vs. at the last possible minute in
.
In
and the body of the forloop is always unreachable. In that case, it is easy to see that
has identical behavior to
The general pattern is that "chops off" the top
levels of the conceptual binary tree. When computing the output for some string
, we don't start traversing the tree from the root but rather
levels down the tree, at the node whose position is the
-bit prefix of
(called
in the library). We initialize the label of this node as a uniform value (unless it has already been defined), and then continue the traversal to the leaf
.
To finish the proof, we show that and
are indistinguishable:
The library that is shown here is different from in the highlighted parts. However, these dif ferences have no effect on the calling program. The library here advances
levels down the tree (to the node at location
), initializes that node's label as a uniform value, then computes the labels for both its children, and finally continues computing labels toward the leaf. The only significant difference from
is that it computes the labels of both of
's children, even though only one is on the path to
. Since it computes the label correctly, though, it makes no difference when (or if) this extra label is computed.
We have factored out the body of the if-statement in terms of since it involves an call to
on uniform input. Importantly, the seed to
(called
in the previous hybrid) was not used anywhere else - it was a string of length
while the library only reads
for
of length
. The change has no effect on the calling program.
We have applied the security of and replaced
. The change is indistinguishable. Hence,
is a secure PRF.
We have inlined and split the sampling of
bits into two separate statements sampling
bits each. In this library, we advance
levels down the tree, assign a uniform label to a node (and its sibling), and then proceed to the leaf applying
as usual. The only difference between this library and
is that we sample the label of a node that is not on our direct path. But since we sample it uniformly, it doesn't matter when (or if) that extra value is sampled. Hence, this library has identical behavior to
.