DM565 - Formal Languages and Data Processing
 
Fall 2023
Kim Skak Larsen


Exercises
  1. Run the DFA Minimization Algorithm on the example from the lecture slides to make sure you can reproduce the result. Then try it on the following DFA:
  2. Find all the problems in the following data set, which is described as listing given name, family name, CPR number, zip code (Danish), city, date of the MS degree, and whether or not the person has a Ph.D. Let the lecture slides on data cleaning inspire your search. Discuss which kinds of problems you found, the methods you used to find them, and which ones can be fixed and how. [Some browsers may open the file in a (csv) application; to avoid this, you may be able to right-click and save the file ("Save link as..." or something similar).]
  3. Save this page, using the browser or wget, and using Beautiful Soup, find
    1. all files that are referenced using an a-tag with an href attribute,
    2. all exercise questions that have subquestions (all entries in an ol-tag that has ol-subtags),
    3. all the subquestions in exercises (all entries that have a parent that is an li-tag inside an ol-tag),
    4. all 2nd subquestions,
    5. all li-tags containing the word "Using", and
    6. all texts in code-tags.
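
For Exercise 1, it can help to check a hand computation against a small program. The following is a minimal sketch of partition refinement (Moore's algorithm), which may differ in presentation from the version in the lecture slides. The function name `minimize` and the four-state DFA are made up for illustration; they are not the DFA from the exercise. The sketch assumes a total transition function and that unreachable states have already been removed.

```python
def minimize(states, alphabet, delta, accepting):
    """Return the classes of equivalent states of a total DFA."""
    # Initial split: accepting versus non-accepting states.
    partition = [set(accepting), set(states) - set(accepting)]
    partition = [block for block in partition if block]
    while True:
        # Map each state to the index of the block it currently belongs to.
        block_of = {s: i for i, block in enumerate(partition) for s in block}
        refined = {}
        for s in states:
            # Two states stay together only if they share a block and
            # every symbol sends them to the same blocks.
            signature = (block_of[s],) + tuple(
                block_of[delta[s, a]] for a in alphabet
            )
            refined.setdefault(signature, set()).add(s)
        new_partition = list(refined.values())
        # Refinement only ever splits blocks, so an unchanged count
        # means the partition is stable.
        if len(new_partition) == len(partition):
            return new_partition
        partition = new_partition


# Hypothetical DFA: B and C behave identically and should be merged.
states = ["A", "B", "C", "D"]
alphabet = ["0", "1"]
delta = {
    ("A", "0"): "B", ("A", "1"): "C",
    ("B", "0"): "B", ("B", "1"): "D",
    ("C", "0"): "B", ("C", "1"): "D",
    ("D", "0"): "D", ("D", "1"): "D",
}
classes = minimize(states, alphabet, delta, accepting={"D"})
print(sorted(sorted(block) for block in classes))
```

Each returned block becomes one state of the minimized DFA; the transitions of a block are those of any of its members.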

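For Exercise 2, one possible starting point is a set of per-field format checks. The sketch below uses made-up rows, not the course data set, and covers only the first five fields. It assumes that CPR numbers are written as DDMMYY-SSSS and that Danish zip codes have four digits; the function name `problems` and the messages are illustrative. Format checks of this kind find only some problems; inconsistencies between fields, such as a zip code that does not match the city, need other methods.

```python
import re
from datetime import datetime

# Hypothetical rows: (given name, family name, CPR number, zip code, city).
ROWS = [
    ("Anna", "Hansen", "010203-1234", "5230", "Odense M"),
    ("Bo", "Jensen", "310299-4321", "5000", "Odense C"),  # 31 February
    ("Carl", "", "0102031234", "523", "Odense M"),        # several problems
]

CPR_PATTERN = re.compile(r"\d{6}-\d{4}")  # assumed layout DDMMYY-SSSS
ZIP_PATTERN = re.compile(r"\d{4}")        # four-digit zip code


def problems(row):
    """Return a list of format problems found in one row."""
    given, family, cpr, zip_code, city = row
    found = []
    if not given.strip() or not family.strip():
        found.append("missing name")
    if not CPR_PATTERN.fullmatch(cpr):
        found.append("malformed CPR number")
    else:
        # The first six digits should form an existing calendar date.
        try:
            datetime.strptime(cpr[:6], "%d%m%y")
        except ValueError:
            found.append("impossible birth date in CPR number")
    if not ZIP_PATTERN.fullmatch(zip_code):
        found.append("malformed zip code")
    return found


for row in ROWS:
    print(row[0], problems(row))
```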
 
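For Exercise 3, the six queries could be approached along the following lines. This is a sketch on a small made-up HTML fragment (the `HTML` string), not the saved course page, and it reflects one reading of each subquestion; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the saved page.
HTML = """
<html><body>
<ol>
  <li>Run the algorithm on <a href="dfa.pdf">this DFA</a>.</li>
  <li>Find the problems in <a href="data.csv">this data set</a>.</li>
  <li>Using Beautiful Soup, find
    <ol>
      <li>all files,</li>
      <li>all questions, and</li>
      <li>all texts in <code>code</code>-tags.</li>
    </ol>
  </li>
</ol>
<a name="top">no href here</a>
</body></html>
"""

soup = BeautifulSoup(HTML, "html.parser")

# 1. Files referenced by an a-tag that has an href attribute.
hrefs = [a["href"] for a in soup.find_all("a", href=True)]

# 2. Questions with subquestions: li-tags that contain an ol-tag.
with_subs = [li for li in soup.find_all("li") if li.find("ol") is not None]

# 3. Subquestions: li-tags nested inside another li-tag.
subquestions = [
    li for li in soup.find_all("li") if li.find_parent("li") is not None
]

# 4. Second subquestions: the second direct li child of each nested ol-tag.
second_subs = []
for ol in soup.find_all("ol"):
    if ol.find_parent("li") is not None:
        items = ol.find_all("li", recursive=False)
        if len(items) >= 2:
            second_subs.append(items[1])

# 5. li-tags whose text (including nested tags) contains the word "Using".
using = [li for li in soup.find_all("li") if "Using" in li.get_text()]

# 6. Texts in code-tags.
code_texts = [code.get_text() for code in soup.find_all("code")]

print(hrefs, len(with_subs), len(subquestions), code_texts)
```

For the real page, the same queries would run on a `soup` built from the saved file instead of the inline string.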

