Below are my notes from when I originally took the Udacity course Programming Lanuages. At that time, I only took notes in Microsoft Word, so there was a real lack of gifs. I’d go through and add some, but this is going to be a HUGE post and WordPress is really struggling to function. Honestly, I’m only adding this post for my own reference and to save my notes, so if I were you I’d go watch some Schitts Creek instead or take the class yourself! π

Programming Languages
Description: This class will give you an introduction to fundamentals of programming languages. In seven weeks, you will build your own simple web browser complete with the ability to parse and understand HTML and JavaScript. You will learn key concepts such as how to specify and process valid strings, sentences and program structures. Then, you will design and build an interpreter – a program that simulates other programs.
WEEK 1:
String Patterns
Finding and specifying classes of strings using regular expressions
WEEK 2:
Lexical Analysis
Breaking strings down into important words
WEEK 3:
Grammars
Specifying and deconstructing valid sentences
WEEK 4:
Parsing
Turning sentences into trees
WEEK 5:
Interpreting
Simulating programs
WEEK 6:
Building a Web Browser
Interpreting HTML and JavaScript
WEEK 7:
Wrap-up
Exam testing your knowledge
Unit 1:


Break up strings into meaningful words using Python


Find returns -1 if does not find string
Recursive could be using the base case if returns -1, otherwise it would keep passing in the newly found value as the starting position

[1:] go as far as possible
[0:-1] -1 is the last letter (so do not include)
def myfirst_yoursecond(p,q):
ps = p.find{” “)
qs = q.find(” “)
return p[:ps] == q[qs + 1:]
Can use the split function in python string.split() in order to do this
Regular Expression:

Use these to denote a large number of simple strings

Common for entering structured data (mm/dd/yyyy)
Regular expressions are for STRINGS, so 5 and 05 are NOT the same thing
import re imports python regular expression module

Python regular expressions start with a little βrβ before the double quotations
Picking the correct regular expression to use is an exercise in creativityβfinding the most elegant is a skill
re.findall(rβ[0-9]β,β1+2==3β)
returns list: [β1β,β2β,β3β]
re.findall(rβ[1-2]β,β1+2==3β)
returns list: [β1β,β2β]
Regular Expressions are CASE SENSITIVE

To create a large number of strings to search for have sub-parts to make compound regular expression
+ looks back at previous regular expression and match it once or more
rβ[0-1]β matches any and all combinations of 0 and 1

Maximal Munch will consume and match as big a string as it can at once
Finite State Machine:


| is either/or for regular expressions (disjunction)
r=β[a-z]+|[0-9]+β

Regular Expression: characters that are left alone are matched exactly:
rβab|[0-9]+β will match βabβ OR any number of digits

Epsilon is used to βconsume no inputβ
? βoptionalβ βthe previous thing 0 or 1 timesβ
re.findal(rβ-?[0-9]β,β1861-1941 R. Targoreβ)
returns: [β1861β,β-1941β]
* is zero or more (+ is one or more)
+ * ? [ ] all mean something special, so you need the Escape Sequences
\ in front of quote/double quote, it will treat it as being part of the string (escape out of using quotes as delimiters)

rβ\+\+β finds the string β++β
r”[a-z]+-?[a-z]+” matches lowercase words with optional single hyphenated words
however: needs at least 2 letters, so will not accept βaβ or βiβ

regexp = r”[a-z]+\( *[0-9]+ *\)”

[0-9].[0-9] would be something like 1a1 or 2b4
^ inside [] this means not or set complement βasking for anything that is not whatever is specified
[^ab] will return anything that is not βaβ or βbβ

Like math in grouping, but opening parenthesis is (?: )
mi+ the + is applying to the i, so would get miiii and not mimi

MIDI: musical instrument digital interface: list of notes
MP3: not like this (at least, does not look like it at first blush)

rββ(?:[^\\]|(?:\\.))*β




Maps (associate one thing with another, aka dictionary) and Tuples (immutable listβcannot be changed)



Simulate how Finite State Machine works
# FSM Interpretation
# Define edges and accepting to encode r”q*”. Name your start state 1.
edges = {(1,’q’):1}
accepting = [1]
def fsmsim(string, current, edges, accepting):
if string == “”:
return current in accepting
else:
letter = string[0]
if (current, letter) in edges:
destination = edges[(current, letter)]
remaining_string = string[1:]
return fsmsim(remaining_string, destination, edges, accepting)
else:
return False
print fsmsim(“”,1,edges,accepting)
# >>> True
print fsmsim(“q”,1,edges,accepting)
# >>> True
print fsmsim(“qq”,1,edges,accepting)
# >>> True
print fsmsim(“p”,1,edges,accepting)
# >>> False

Finite State Machines are generous: will allow it to return True if ANY path can lead to accepting (even if path is labeled the same)



2 kinds of FSMs:
Non-deterministic FSM: have choices
Lock-step FSM: no epsilon or ambiguity




Converting from nondeterministic to deterministic: group together and βepsilonsβ are freebies
Wrap Up/Conclusion for Unit 1:


Homework 1:
def sumnums(sentence):
nums = re.findall(r”[0-9]+”,sentence)
return sum(int(n) for n in nums)
Unit 2
<a href=βlinkβ> (href stands for hypertext reference)
Lexicon Analysis: lexicon (dictionary) to break something down into words
Token: smallest unit of lexical analysis output: words, strings, numbers, punctuation (Not whitespace)

Use regular expressions to specify tokens
def t_RANGLE(token):
rβ>β
return token

def t_STRING(token):
rββ[^β}*ββ ^β is anything except a double-quote
return token
def t_WORD(token):
r'[^ |^<|^>]*’ not a space, < or > CAN ALSO BE WRITTEN:
rβ[^ <>\n]+β
Lexical Analyzer (lexer): just a collection of token definitions
Ambiguity: sometimes can be identified as more than one token (number or word), so the first one wins (ordering tokens is of prime importance!)
We donβt just serve hamburgers, we serve people!
<!–comments–>


Canβt count backward from end of string using negative numbers, so -1 is the first last character (or end of string)
import ply.lex as lex
output of a lexer is a list of tokens

LexToken(tokentype,tokenvalue,line,characternum)
def t_newline(token):
rβ\nβ
token.lexer.lineno +=1
pass
states = (
(βhtmlcommentβ,βexclusiveβ)
)
Javascript, embed into HTML:
<script type=βtext/javascriptβ>
document.write(βHello worldβ); equivalent of print βHello worldβ in python
</script>
<script type=βtext/javascriptβ>
function factorial(n) {
if (n==0 {
return 1;
} ; <-python uses tabbing to figure out start/end, javascript uses curly-braces
return n * factorial(n-1);
}
document.write(factorial(5));
</script>


def t_IDENTIFIER(token):
rβ[a-zA-Z] [a-zA-Z]*β
def t_NUMBER(token):
r’-?[0-9]+(:?\.[0-9]*)?’
token.value = float(token.value)
return token
#python comments
//javascript comments
βvisual basic comments
rβ//[^\n]*β
Summary:
HTML/JavaScript can be broken up into tokens number, word, string (use regular expressions to specify)

Unit 3
Noam Chomsky 1955: Syntactic Structures: utterances have rules (formal grammars)
Grammar, rewrite rule, non-terminals, terminals

Derivation able to derive βstudents thinkβ using the rewrite rules
Another derivation teachers write
Recursion in a context-free grammar can allow for an infinite number of utterances
subj subj and subj is a recursive rewrite rule

Syntactical Analysis βParsingβ
Token List Valid in Grammar?
Lexing + Parsing = Expressive Power
Word Rules + Sentence Rules = Creativity!

Parsing turns grammar into structure, lexing turns terminal (number) into something valid
stmt (statement) identifier = exp
Optional Parts of Languages

E/Epsilon means blank string, nothing, space

Grammar >= RegExp (anything can specify with regular expression can be expressed in grammar)
Regular expressions describe regular languages
Grammars describe context-free languages (can have things in between the expression finding)

Grammar allows balanced parenthesis, while regular expression could allow unbalanced
p(p)
pE
IMPOSSIBLE to capture balanced parenthesis with regular expression
Balanced parenthesis are everywhere in HTML and JavaScript (tags)
Must nest tags appropriately (or not valid HTML)

Parse Trees: nodes and leaf

90% of software is maintenance

Ambiguity: which is using the binoculars?

If you can find 1 example where you can have 2 separate parse trees, effectively entire grammar is ambiguous
Grammars for HTML & JavaScript
htmlelement html
htmlEpsilon
elementword
elementtag_open html tag_close
elementtag_open word tag_close
tag_open<word>
tag_close</word>
JavaScript is similar to Python
Python:
def absval(x):
if x<0:
return 0 β x
else:
return x
JavaScript:
function absval(x) {
if x<0 {
return 0 β x;
} else {
return x;
}
}
Python:
print βhelloβ + β!β
JavaScript:
document.write(βhelloβ + β!β);
Or
write(βhelloβ + β!β);
Python:
def uptoten(x):
if x<10:
return x
else:
return 10
JavaScript:
function uptoten(x){
if x<10 {
return x;
} else {
return 10;
}
}
Universal Meaning?
def vs function
meaning stayed the same: we can translate between python and javascript
Universal Grammar? Languages are all Turing β complete
JavaScript Grammar:
Expidentifier
Expnumber
Exptstring
ExpTRUE
ExpFALSE
Expexp + exp
Expexpt-exp
Etcβ¦


Weβve seen Expressions and Statements, now we need FUNCTIONS!
Functions = Declare and Call
Majority of JS Grammar

Creating Grammars vs Checking Utterances:
How to get program to decide whether or not string is valid:
Super Slow: Enumerate all valid strings (since there are infinite grammars, may take an infinite number of times)
Weβll try this first:
- Instructive
- Power tools
Introducing: LAMBDA
Python:
def addtwo(x): return x+2
addtwo(2)
4
mystery = lambda(x): x+2
mystery(3)
5
pele = mystery
pele(4)
6
Lamda means βmake me a functionβ
Lamda is popular in paradigm of functional programming, contrast to object oriented programming and imperative programming
List Power
def mysquare(x): return X*X
map(mysquare,[1,2,3,4,5]) = [1,4,9,16,25]
OR
map(lambda(x):x*x,[1,2,3,4,5]) = [1,4,9,16,25] anonymous function (never given a name)
map is also popular in functional programming, transforms each element in a list by doing work to each of its elements
Real Convenience: List Comprehensions
[x*x for x in [1,2,3,4,5]
] will do the same as above
List comprehension defines a new list by transforming an old one
The word “comprehension” in “list comprehension” relates to the set comprehension notation in Mathematics.
Brendan Eich: CTO of Mozilla and designer of JavaScript
How to filter
def odds_only(numbers):
for n in numbers:
if n / 2 == 1:
yield n
t [x for x in [1,2,3,4,5] if x%2 == 1] only returns those that mod to 1 (or are odd)


if tokens[pos] == rule[0]:
yield tokens[0:pos] + rule[1] +tokens[pos+1:]
Unit 4 Personal Notes
Device-drivers with third party software would cause lots of the Microsoft OS bugs
Microsoft Static-driver verifier (SLAM) to check for bugs
Given a string s and a grammar G, is s in the language of G?
Lexing: Lexical analysis that breaks a string down into tokens
Parsing: Syntactical analysis that checks to see if they conform to a context-free grammar
Time flies like an arrow, fruit flies like a banana. ambiguity and hilarity
Brute Force: try all options exhaustively
Programmer virtues:
- Laziness: not duplicating work, makes you go to great effort to reduce overall energy expenditures
- Impatience: writing something to anticipate (or seem to anticipate) next move because computer is too slow if it just reacts
- Hubris: excessive pride: makes you want to write and maintain programs that other people wonβt want to say bad things about
Memoization: computer science technique in which we keep a βchartβ or βrecordβ of previous computations and computer new values in terms of previous answers
If webpage takes longer than 6 seconds to get back to you, you go somewhere else/buy something else (figured out by study)
Implement a chart as a python dictionary: {}
Chart = {}
chart[5] = 5
chart[6] = 8
if 6 in chart:
print chart[6]
Fibonacci with Memoization:
# Memofibo
# Submit via the interpreter a definition for a memofibo procedure that uses a
# chart. You are going to want the Nth value of the chart to hold the Nth
# fibonacci number if you’ve already computed it and to be empty otherwise.
chart = {}
def memofibo(n):
if n in chart:
return chart[n]
elif n > 2:
chart[n] = memofibo(n-2) + memofibo(n-1)
return chart[n]
return 1
print memofibo(24)
Idea: Use memoization for parsing!

A Parsing State is a rewrite rule from the grammar augmented with 1 red dot on the right-hand side
Chart[n] = all parse states we could be in after seeing t1, t2β¦tn only

We will use memoization in our parser. The nth entry in the chart will contain all of the parse states we could be in after seeing the first n tokens of the input.
Must add starting position or from position to our parse states
Alternate view: PARSING is the inverse of producing strings

Applying the reductions in reverse order it is parsing, the other way is generating the string
Examples: * symbolizes red dot
chart[i] has x ab * cd from j
for all grammar rules
c pqr
we add
cpqr from i
to chart[i]
-Predicting or Computing the Closure
Three ways to complete parsing chart: Computing the Closure is one (if non-terminal)
Another way: Shiftingβconsuming the input (if terminal, just shift over it)
Third way: Reduce (what if nothing left?), reduce it to the grammar rule
E E + E
E int
int + int + int
E + int + int (Magicβor reduction)



Jay Earley invented the approach to parsing that we’ve been developing here. This sort of parsing is also commonly used in computational linguistics.
Code this in python using dictionary to group lists, in order to not add duplicates use sets on right side of dictionary
def addtochart(chart, index, state):
if not (state in chart[index]):
chart[index] = [state] + chart[index] how to concat to list
return True
else:
return False
List comprehensions make a glorious re-appearance! In addition, we encode our parsing states as Python tuples.
For more information:
- Declarative programming has many benefits. One such benefit is that it often allows the compiler or language system to efficiently implement your will for you: you don’t have to worry about implementation details, you just state what you want and it happens.
Paper:
sP
P(P)
P
In python:
grammar = [
(βsβ, [βPβ]),
(βPβ,[β(β,βPβ,β)β]),
(βPβ,[]),]
Paper: x ab * cd from j
Python:
state = (βxβ,[βaβ,βbβ],[βcβ,βdβ],j)

def closure(grammar, i, x, ab, cd, j):
next_states = [
(rule[0],[],rule[1],i)
for rule in grammar
if cd<>[] and rule[0]==cd[0]
]
return next_states
def shift(tokens,ab,cd,i,j,x)
if cd<>[] and tokens[i]==cd[0]:
return (x,ab+[cd[0]],cd[1:],j)
else:
return None
def reductions(chart,I,x,ab,cd,j):
return [
(jstate[0],jstate[1]+[x],(jstate[2])[1:],jstate[3])
for jstate in chart[j]
if cd==[] and jstate[2]<>[] and (jstate[2])[0]==x ]

[βword_elementβ,βRamaβsβ), βword_elementβ,βJourneyβ)]
Now that able to break down into tokens, parse into trees, next step is to interpret by walking across the tree
Brendan Eich, CTO of Mozilla and creator of JavaScript, talks about uses of memoization at Mozilla β more at run-time than during parsing.
Unit 5
Making web browser:
- String of HTML and JavaScript
- Lexical analysis: Break it down into tokens/words
- Syntactical analysis: Parse those into a tree
- Interpret: Walk the tree and understand it
Syntax (form of an utterance) vs Semantics (meaning)
βColorless green ideas sleep furiously!β β Noam Chomsky
Syntax and semantics are not the same thing

Lexing and parsing deal with the form of an utterance. We now turn our attention to the meaning of an utterance — we turn our attention to semantics.
A well-formed sentence in a natural language can be “meaningless” or “hard to interpret”. Similarly, a syntactically valid program can lead to a run-time error if you try to apply the wrong sort of operation to the wrong sort of thing (e.g., 3 + “hello”).
For more information:
- “Colorless green ideas sleep furiously” is a syntactically-valid sentence constructed by linguist and philosopher Noam Chomsky to show that an utterance can be well-formed but have no clear meaning. Thus, syntax and semantics are not the same thing.
For more information:
- Mars is the fourth planet from the Sun. Its English name derives from the Roman god of War.
- Gustav Holst was a 19th-century English composer. His orchestral suite, The Planets, remains quite popular.
One goal of semantic analysis is to notice and rule out bad programs (i.e., programs that will apply the wrong sort of operation to the wrong sort of object). This is often called type checking.
Ruling Out Bad Programs:
Type checking / semantic analysis (looking at source code and make sure it makes sense)
A type is a set of similar objects (like numbers, strings, lists, etc) with associated operations
+ adds numbers, concats strings, and appends lists
len measures number of characters in string and elements in list
A type is a set of similar objects (e.g., number or string or list) with an associated set of valid operations (e.g., addition or length).
For more information:
- Definition: execute.
For more information:
- Syllepsis (closely related to zeugma) is an often-humorous semantic incongruity.
- The syllepsis examples shown here were taken from Have Some Madeira, M’Dear, a dark and witty poem by Flanders and Swann.
βshe made no reply, up her mind, and a dash for the door.β
βshe lowered her standards by raising her glass, her courage, her eyes and his hopes.β
Mismatched tags
For more information:
- Syllepsis (closely related to zeugma) is an often-humorous semantic incongruity.
- The syllepsis examples shown here were taken from Have Some Madeira, M’Dear, a dark and witty poem by Flanders and Swann.
GRAPHICS:
We introduce the particular graphics library we will use in this class to render webpages. It is intentionally simple so that we can focus on core programming language concepts.
For more information:
- Nelson Mandela served as president of South Africa starting in 1994. He was the first to be elected in an election involving universal suffrage and he won the Nobel peace prize. He was a noted opponent of the racial segregation policy of aparthied.
Use a library of someone elseβs graphics library to help render a webpage:
graphics.word(String)
graphics.begintag(string,dictionary) a, href=βgoogleβ
graphics.endtag()
graphics.warning(string)debugging
Example:
Nelson Mandela <b> was elected </b> democratically.
Would turn into:
graphics.word(βNelsonβ)
graphics.word(βMandelaβ)
graphics.begintag(βbβ,{})
graphics.word(βwasβ)
graphics.word(βelectedβ)
graphics.endtag()
graphics.word(βdemocraticallyβ)
the graphics library would determine the word wrapping, appearance of bold, etc
HTML INTERPRETTER:
# Your function should display HTML according to a given parse tree.
# graphics.warning(msg) displays an error message. Upon encountering mismatched
# tags, use graphics.warning to display the error message: “mismatched tag”.
# To display a tag, use graphics.begintag(tag,args) at the start and
# graphics.endtag() at the end of the tag.
import graphics
def interpret(trees): # Hello, friend
for tree in trees: # Hello,
# (“word-element”,”Hello”)
nodetype=tree[0] # “word-element”
if nodetype == “word-element”:
graphics.word(tree[1])
elif nodetype == “tag-element”:
# <b>Strong text</b>
tagname = tree[1] # b
tagargs = tree[2] # []
subtrees = tree[3] # …Strong Text!…
closetagname = tree[4] # b
if tagname <> closetagname:
graphics.warning(“mistmatched tag”)
else:
graphics.begintag(tagname, tagargs)
interpret(subtrees)
graphics.endtag()
# Note that graphics.initialize and finalize will only work surrounding a call
# to interpret
graphics.initialize() # Enables display of output.\
interpret([(“word-element”,”Hello”)])
graphics.finalize() # Enables display of output.
Eval (evaluation) is what itβs called to solve arithmetic expressions
We will write a recursive procedure to interpret JavaScript arithmetic expressions. The procedure will walk over the parse tree of the expression. This is sometimes called evaluation.

Java Evaluation Expression
# Quiz: Eval Exp
# Write an eval_exp procedure to interpret JavaScript arithmetic expressions.
# Only handle +, – and numbers for now.
def eval_exp(tree):
# (“number” , “5”)
# (“binop” , … , “+”, … )
nodetype = tree[0]
if nodetype == “number”:
return int(tree[1])
elif nodetype == “binop”:
left_child = tree[1]
operator = tree[2]
right_child = tree[3]
left_value = eval_exp(left_child)
right_value = eval_exp(right_child)
if operator == “x”:
return left_value + right_value
elif operator == “-“:
return left_value – right_value
test_tree1 = (“binop”,(“number”,”5″),”+”,(“number”,”8″))
print eval_exp(test_tree1) == 13
test_tree2 = (“number”,”1776″)
print eval_exp(test_tree2) == 1776
test_tree3 = (“binop”,(“number”,”5″),”+”,(“binop”,(“number”,”7″),”-“,(“number”,”18″)))
print eval_exp(test_tree3) == -6
We need to know the values of variables β the context β to evaluate an expression. The meaning of x+2 depends on the meaning of x.
- The last King of France ruled around 1870.

The state of a program execution is a mapping from variable names to values. Evaluating an expression requires us to know the current state.
To evaluate x+ 2, weβll keep around a mapping from variable names to values. This mapping is called the state. {βxβ:3,βyβ:3}
LOOKING UP IDENTIFIERS (VARIABLES)
# QUIZ : Variable Lookup
# Adding variable lookup to the interpreter!
def eval_exp(tree, environment):
nodetype = tree[0]
if nodetype == “number”:
return int(tree[1])
elif nodetype == “binop”:
left_value = eval_exp(tree[1], environment)
operator = tree[2]
right_value = eval_exp(tree[3], environment)
if operator == “+”:
return left_value + right_value
elif operator == “-“:
return left_value – right_value
elif nodetype == “identifier”:
variable_name = [tree[1]
return env_lookup(environment,variable_name)
# Here’s some code to simulate env_lookup for now. It’s not quite what we’ll be
# using by the end of the course.
def env_lookup(env,vname):
return env.get(vname,None)
environment = {“x” : 2}
tree = (“binop”, (“identifier”,”x”), “+”, (“number”,”2″))
print eval_exp(tree,environment) == 4
If, While, Return change the flow of control
Python and JavaScript have conditional statements like if β we say that such statements can change the flow of control through the program.
Program elements that can change the flow of control, such as if or while or return, are often called statements. Typically statements contain expressions but not the other way around.
For more information:
- The Ingenious Gentleman Don Quixote of La Mancha was written by Miguel de Cervantes, a Spanish author.
- The phrase tilting at windmills entered the popular lexicon as a result of Don Quixote.
Statements are like entire sentences whereas expressions are like nouns
Statements often contain expressions, but not the other way around.
EVALUATING STATEMENTS:
# QUIZ: Evaluating Statements
def eval_stmts(tree, environment):
stmttype = tree[0]
if stmttype == “assign”:
# (“assign”, “x”, (“binop”, …, “+”, …)) <=== x = … + …
variable_name = tree[1]
right_child = tree[2]
new_value = eval_exp(right_child, environment)
env_update(environment, variable_name, new_value)
elif stmttype == “if-then-else”: # if x < 5 then A;B; else C;D;
conditional_exp = tree[1] # x < 5
then_stmts = tree[2] # A;B;
else_stmts = tree[3] # C;D;
# QUIZ: Complete this code
# Assume “eval_stmts(stmts, environment)” exists
If eval_exp(conditional_exp,environment) == True:
eval_stmts(then_stmts, environment)
Else:
eval_stmts(else_stmts, environment)
def eval_exp(exp, env):
etype = exp[0]
if etype == “number”:
return float(exp[1])
elif etype == “string”:
return exp[1]
elif etype == “true”:
return True
elif etype == “false”:
return False
elif etype == “not”:
return not(eval_exp(exp[1], env))
def env_update(env, vname, value):
env[vname] = value
environment = {“x” : 2}
tree = (“if-then-else”, (“true”, “true”), (“assign”, “x”, (“number”, “8”)), (“assign”, “x”, “5”))
eval_stmts(tree, environment)
print environment == {“x”:8}
Scope:
We use the term scope to refer to the portion of a program where a variable has a particular value.
For more information:
- Hong Kong is a Special Administrative Region of China. It was controlled by the United Kingdom until 1997.
- The most common dialect spoken in Hong Kong is Cantonese. It is pronounced very differently from the Mandarin dialect but they are written similarly.
Identifiers and Storage:
Because the value of a variable can change, we will use explicit storage locations to track the current values of variables.

Environment is used to describe scope of variable (the box it is in)
There is a special global environment that can hold variable values. Other environments have parent pointers to keep track of nesting or scoping. Environments hold storage locations and map variables to values.
Every environment except the global one has parents
There is a special global environment that can hold variable values. Other environments have parent pointers to keep track of nesting or scoping. Environments hold storage locations and map variables to values.

For more information:
- If you find yourself stuck in a hotel, consider reading No Exit, an existentialist play by Jena-Paul Sartre.
- If you can never quite reach Shell Beach or Pebble Beach, you may enjoy Dark City, a 1998 science-fiction film.
- If the home is more your style, you may find The House of Mirth by Pulitzer-prize winning novelist Edith Wharton of interest. I might start with The Age of Innocence, however.
In which nested function declarations are introduced.
For more information:
- Lucknow, the City of Nawabs, is the capital of Uttar Pradesh, the most populous region in India.


For more information:
- Gracie Allen was a radio, television and film comedian.

For more information:
- One Thousand and One Nights (ΩΨͺΨ§Ψ¨ Ψ£ΩΩ ΩΩΩΨ© ΩΩΩΩΨ©) is a collection of Middle Eastern and Southern Asian stories, framed around Scheherazade.
- Sindbad the Sailor (Ψ§ΩΨ³ΩΨ―Ψ¨Ψ§Ψ― Ψ§ΩΨ¨ΨΨ±Ω) meets everything from a giant roc to The Old Man of the Sea.
Catching Errors:
Modern programming languages use exceptions to notice and handle run-time errors. “try-catch” or “try-except” blocks are syntax for handling such exceptions.
For more information:
- Joseph Heller wrote Catch-22, a satirical novel. The protagonist pretends to be insane to avoid the dangers of war, but the desire to avoid danger is evidence of his sanity.
- Definition: catch-22
Frames β Function Calls
# QUIZ : Frames
# Return will throw an excception
# Function Calls: new environments, catch return values
def eval_stmt(tree,environment):
stmttype = tree[0]
if stmttype == “call”: # (“call”, “sqrt”, [(“number”,”2″)])
fname = tree[1] # “sqrt”
args = tree[2] # [ (“number”, “2”) ]
fvalue = env_lookup(fname, environment)
if fvalue[0] == “function”:
# We’ll make a promise to ourselves:
# (“function”, params, body, env)
fparams = fvalue[1] # [“x”]
fbody = fvalue[2]
fenv = fvalue[3]
if len(fparams) <> len(args):
print “ERROR: wrong number of args”
else:
newenv = ( fenv, {} ) # new environment
for I in range(len(args)):
argval = eval_exp(arges[i],env)
(newenv[1])[fparams[i]] = argval # add to env
try:
eval_stmts(fbody,newenv)
return None
except Exception as return_value:
return return_value
else:
print “ERROR: call to non-function”
elif stmttype == “return”:
retval = eval_exp(tree[1],environment)
raise Exception(retval)
elif stmttype == “exp”:
eval_exp(tree[1],environment)
def env_lookup(vname,env):
if vname in env[1]:
return (env[1])[vname]
elif env[0] == None:
return None
else:
return env_lookup(vname,env[0])
def env_update(vname,value,env):
if vname in env[1]:
(env[1])[vname] = value
elif not (env[0] == None):
env_update(vname,value,env[0])
def eval_exp(exp,env):
etype = exp[0]
if etype == “number”:
return float(exp[1])
elif etype == “binop”:
a = eval_exp(exp[1],env)
op = exp[2]
b = eval_exp(exp[3],env)
if op == “*”:
return a*b
elif etype == “identifier”:
vname = exp[1]
value = env_lookup(vname,env)
if value == None:
print “ERROR: unbound variable ” + vname
else:
return value
def eval_stmts(stmts,env):
for stmt in stmts:
eval_stmt(stmt,env)
sqrt = (“function”,(“x”),((“return”,(“binop”,(“identifier”,”x”),”*”,(“identifier”,”x”))),),{})
environment = (None,{“sqrt”:sqrt})
print eval_stmt((“call”,”sqrt”,[(“number”,”2″)]),environment)
In languages like Python or JavaScript, functions can be values (i.e., evaluating an expression can result in a function). As a result, we must decide how to represent function values. We’ll use tuples that include the function body, parameter names, and the relevant environment.
(βfunctionβ,fparams,fbody,fenv)


Real World: language influences thought (somethings easier in 1 language than another)
Computing: languages are equally expressive (somethings easier in 1 language than another)
Interpreting is DEEP: the downside is that simulating a program often involves running it
Infinite Loop:
Want: look at program source, see if it loops forever or if it halts
This is impossible (proven)

Draw crossed swords to resemble contradiction
If assumed true, it becomes false. If false, it becomes true. No possible value for halts that works. This is a circular referenceβparadox (self reference).
This sentence is false. paradox
Instead make programs that are correct with very high probability (like 99.9999%), but impossible to be correct 100% of the time
View this as an opportunity to come as close as possible (reach for the stars)
Unit 6
We describe the software architecture of our web browser. Our HTML lexer, parser and interpreter will drive the main process; our JavaScript lexer, parser and interpreter will serve as subroutines.
For more information:
- A blueprint is a detailed plan documenting a design.
Web Browser Architecture:
- Web page is lexed and parsed
- HTML interpreter walks AST, calls JavaScript Interpreter
- JavaScript code calls write()
- JavaScript interpreter stores text from write()
- HTML Interpreter calls graphics library
- Final image of webpage is created
We change our HTML lexer to recognize embedded JavaScript fragments as single tokens. We’ll pass the contents of those tokens to our JavaScript lexer, parser and interpreter later.
For more information:
- A jigsaw puzzles were first commercialized in the 18th century.
- Defensive programming is more commonly invoked when dealing with security or correctness requirements.
Save JavaScript to interpret for later because JavaScript might look like HTML and will be interpreted incorrectly
def t_javascript(token):
rβ\script\ type=\βtext\/javascript\β\>β
token.lexer.code_start = token.lexer.lexpos
token.lexer.begin(βjavascriptβ)
def t_javascript_end(token):
rβ\<\/script\>β
token.value = token.lexer.lexdata[token.lexer.code_start:
token.lexer.lexpos-9] subtracting 9 to remove the ending token
token.type = βJAVASCRIPTβ
token.lexer.lineno += token.value.count(β\nβ)
token.lexer.begin(βINITIALβ)
return token
We extend our HTML parser to handle our special token representing embedded JavaScript.
For more information:
- 99 Luftballons is a 1983 German protest song.
For more information:
- Yacc, or yet another compiler compiler, is the prototypical parser generator.
A JavaScript program may contain zero, one or many calls to write(). We will use environments to capture the output of a JavaScript program.
The sort of name collision discussed here is a significant concern in computer security. It would be safer to use a separate variable or channel to communicate such strings back to our interpreter.
Good to have a space because there is no way to create user variable with a space in it, prevents collision and security risk
An example of a webpage with embedded computation is shown.
For more information:
- The Chinese Room thought experiment was published in 1980 by philosopher John Searle. In it, a computer is constructed that is able to carry on conversations in written Chinese, but intuition may suggest that the computer does not actually having understanding or consciousness. The thought experiment had such a lasting impact that the field of cognitive science has been jokingly described as “the ongoing research program of showing Searle’s Chinese Room Argument to be false.”
A good test case gives us confidence that a program implementation adheres to its specification. In this situation, a good test case reveals a bug.
For more information:
- Jeanne d’Arc was a French national heroine and martyr. She was burned at the stake for heresy in 1431.
For more information:
- Eggplant, tomato and potato are all members of the nightshade (Solanaceae) family of plants.
We use testing to gain confidence that an implementation (a program) adheres to it specification (the task at hand). If a program accepts an infinite set of inputs, testing alone cannot prove that program’s correctness. Software maintenance (i.e., testing, debugging, refactoring) carries a huge cost.
The implementation adheres to the solution specifications and meets all of the requirements
Maintenance: 90% of cost/time of software
For more information:
- Baldr was a god in Norse mythology.
- Achilles was a hero in Greek mythology and a key figure in The Illiad.
Steve Fink of Mozilla talks about testing in the real world. The “range error” example he mentions is related to array bounds checking, a critical language feature for both correctness and security. He argues for simple test cases with obvious control flow, and talks about removing parts of a test case to help localize the fault.
It can be difficult to test nested or anonymous functions. We will add support for them to our JavaScript interpreter.
Remember from previous Units that we store function values as 4-tuples: (“function”, parameters, body, environment).
My old job localizing compiler bugs by shrinking test cases is now almost completely automated by Delta Debugging, Andreas Zeller‘s scientific approach to debugging.
On a personal note, Udacity will be offering a course on debugging by Andreas Zeller in a bit. I highly recommend that you take it β Andreas is a great guy and he explains things clearly.
An optimization improves the performance of a program while retaining its meaning (i.e., without changing the output).
For more information:
- The Tortoise and the Hare is a fable attributed to Aesop.
βJITβ Just-in-Time Improvement
Optimization Idea: replace input webpage with one that is faster but has the same meaning
For more information:
- The Republic (ΠολιΟΡία) is a 380 BCE Socratic dialogue concerned with justice.
- Who watches the watchers? is not actually directly related to Plato’s Republic, but is commonly associated with it in modern times.
How to Implement Optimizations:
- Think of optimizations
- Transform parse tree
Replacing an expensive multiplication with a cheaper addition is an instance of strength reduction.
x/x cannot just equal 1 because of 0/0 (exception), cannot rule out exceptions when doing optimizations
Steve Fink of Mozilla talks about optimization in the real world. Mozilla’s JavaScript interpreter is called SpiderMonkey, and it has gone through a number of different just-in-time (JIT) optimization architectures. He reminds us that you must always be aware of the cost of an optimization. He also notes that it is rare that a software engineer spends the entire day writing new code β looking over old code is very important and time-consuming.
Optimizing Timing:
Change the parse tree before interpreting
Steps of Browser:
Program Text Lexing (using regular expressions to break apart) Tokens Parsing (using context-free grammars to make sure everything is syntactically sound) Tree Optimization (optional) Tree (simpler one, or at least the same one) Interpreting Result (meaning)
In this class we will optionally perform optimization after parsing but before interpreting. Our optimizer takes a parse tree as input and returns a (simpler) parse tree as output.
We desire an optimizer that is recursive. We should optimize the child nodes of a parse tree before optimizing the parent nodes.
# Optimization Phase
def optimize(tree): # Expression trees only
etype = tree[0]
if etype == “binop”:
# Fix this code so that it handles a + ( 5 * 0 )
# recursively! QUIZ!
a = optimize(tree[1])
op = tree[2]
b = optimize(tree[3])
if op == “*” and b == (“number”,”1″):
return a
elif op == “*” and b == (“number”,”0″):
return (“number”,”0″)
elif op == “+” and b == (“number”,”0″):
return a
return tree
return tree
print optimize((“binop”,(“number”,”5″),”*”,(“number”,”1″))) == (“number”,”5″)
print optimize((“binop”,(“number”,”5″),”*”,(“number”,”0″))) == (“number”,”0″)
print optimize((“binop”,(“number”,”5″),”+”,(“number”,”0″))) == (“number”,”5″)
print optimize((“binop”,(“number”,”5″),”+”,(“binop”,(“number”,”5″),”*”,(“number”,”0″)))) == (“number”,”5″)
A fuller answer would replace return tree with return (a,op,b) at the end. The question only asked you to handle a+(5*0), but we would really like to be more general as well.
For more information:
- Ethos, Pathos and Logos are the three modes of persuasion in classical rhetoric.
Genetic Programming uses evolutionary algorithms inspired by biological notions to find computer programs that have certain properties. It is possible to use such evolutionary computation to automatically repair defects in programs.
Wrap Up:
- Lexing
- Regular Expressions
- Finite State Machines
- Parsing
- Context Free Grammars
- Dynamic Programming / Parse States
- Optimizing
- Must retain meaning
- Interpreting
- Walking the Abstract-Syntax Tree recursively
- JavaScript interpreter calls the HTML interpreter, and then that calls the Graphics interpreter
- Walking the Abstract-Syntax Tree recursively
- Debugging
- Gain confidence