DM565 - Formal Languages and Data Processing
 
Fall 2023
Kim Skak Larsen

Home Innovation

Exercises
In preparation for the class, you should download and test SCIL, so you know if you encounter any problems. To run it, you need python3 and the ply package. If you don't have the package, you can install it with pip3 install ply. If you don't have pip3, you need to install that first, but Ubuntu (possibly linux via WSL) or your Mac should suggest how to do that.
  1. Consider the visitor example and explain why, in LeavesVisitor, it is OK to omit code for Node, and why recursive traversal through the Nodes still take place. Make a PrettyPrintVisitor where you use indentation in some way you define yourself to print the entire tree in a form where one can graphically identify the subtrees. The purpose of this exercise is to explain visitor patterns, which are used in the SCIL compiler, in a simpler setting.
  2. Resolve possible problems with downloading and running SCIL.
  3. Get a feeling for the syntax of SCIL just by looking at a few programs in the Test directory. Write a program with two functions, both with one argument, n. One function should iteratively add the numbers from 1 through n, and the other should obtain the same recursively. Print the two results for confirmation. ツ
  4. Make changes in lexer_parser.py:
    1. Allow the user to choose between using the keyword while and the keyword as_long_as.
    2. Change the end-of-statement symbol from semicolon to period.
    3. Allow vertical lines for line-up, so this is a legal program:
      while i < 10 do {
      |  s = s + i;
      |  i = i + 1;
      |  if i == 10 then {
      |  |  print 42;
      |  } else {
      |  |  dummy = 0;
      |  }
      }
      
      You are not supposed to check that the user uses the line-up in any particular way; you just have to allow the symbol. Note that an if-statement must include an else-part.
    4. Explain what happens if the line
      t.lexer.lineno += t.value.count("\n")
      
      is changed to
      t.lexer.lineno += 1
      
  5. Define the following three regular expressions:
    1. Numbers in scientific notation (E-notation).
    2. URLs (it is OK that your expression matches some reasonable superset of what is allowed).
    3. Strings as in normal programming languages, including the beginning and ending double quotes (it is OK that backslashed double quotes are not allowed inside the strings).
  6. Make and test the following four Flex scanners:
    1. Make texts (more) politically correct by replacing "idiot" with "intellectually challenged person", etc.
    2. Remove all whitespace and produce lines in lengths of 80 characters.
    3. Replace all sequences of whitespace with one blank and produce lines as long as possible, but at most 80 characters, by dividing only at blanks (that is, between words).
    4. Remove all tags from an HTML document. For those who do not speak HTML fluently, HTML is just regular text with some extra interpreted constructions. A tag consists of a "less than" symbol followed by some text and closed by a "greater than" symbol, or it might have slash after the "less than" symbol (you can view the source of this page to see an example).
  7. Appel 2.9. Just draw the DFA and mark final states with the action (1, 2, or 3) that should be taken.

 


   Data protection at SDUDatabeskyttelse på SDU