DM841 (E23) - | Heuristics and Constraint Programming for |
Discrete Optimization |
This is the first of two assignments on discrete optimization that will be used to determine your final grade. The assignment will be graded with internal censorship. The final grade for the course will be an overall assessment of the two assignments starting from the arithmetic average.
The assignment has to be carried out individually. Communication for feedback and inspiration with peers is allowed but it should be kept to a minimum. In case, you are welcome to ask to the teacher. (Use the form at the bottom of this page.)
The submission is electronic via git.imada.sdu.dk
.
Deadline: Monday, November 20, 2023, at noon
You have to submit:
a PDF document containing max 10 pages of description of the model and the results you have obtained on the data sets available
source code of your implementation of the model in MiniZinc.
The precise format for the submission is the same as for the first assignment. You find the details at the end of this page.
The submission procedure is via git.imada.sdu.dk as for the other assignments in this course (see Assignment 1 for details). The structure of your repository is listed below:
Repo_march
asg3
├── data
│ ├── F23
│ ├── F23re
│ └── README.md
├── README.md
├── src
│ ├── cml_parser.py
│ ├── data.py
│ ├── main.py
│ ├── utils.py
│ └── verifier.py
└── tex
The report must be called main.pdf and be located inside tex
.
We will use Constraint Programming to schedule the exams at the Faculty of Science at SDU.
A starting package for this case is available in
your git repository. The description of the instances has been moved
there in the README.md
file of the data/
directory.
Let $E$ be the set of exams for courses that need to have a date scheduled. Let $D$ be the set of days that are available for scheduling exams. Each exam $e$ from $E$ has a set of enrolled students $S_e$ and requires a number of days depending on whether it is written or oral.
Exams that for some reason cannot have overlapping schedules are called conflicting exams. In particular, exams with the same teacher or censor and exams that belong to the same study program are considered conflicting.
An exam schedule is an assignment of exams to days for the exam, that is, it is a function $\sigma : E \to 2^D$. An exam schedule is feasible if it satisfies the following constraints:
Each exam in input is scheduled to start and finish in the days available.
Each exam is assigned a number of days equal to the number of required days to carry out the exam.
Exams with exam duration more than one day receive consecutive days. No holes due to weekend or holidays are allowed within an exam schedule.
Exams with preassigned dates must respect those dates and need not necessarily to satisfy the previous three constraints.
Exams that are joined must have schedules starting on the same day. (The durations for both should be the sum of the durations of the two exams but you can ignore this).
Exams that are conflicting must be scheduled in sets of days with empty intersection.
The total number of students having written exams in a day must be
smaller than MAX_SEATS_PER_DAY
. Exams with preassigned dates do not
contribute to this constraint. (This constraint must be relaxed on the reexam instances.)
The total number of oral exams in a day must be smaller than
MAX_ROOMS_PER_DAY
in order to ensure that a room to host them can
be found. Exams with preassigned dates do not count in this
constraint.
If a schedule that satisfies all the above constraints is found, then one must maximize the distance between the starting times of exams that share students. This optimization task can be achieved in different ways, for example, by introducing a new constraint that enforces a distance of at least $R$ days between exams that share students and then maximizing $R$ and/or minimizing the number of students violating this constraints, or by devising a penalty scheme that penalizes exams sharing students that are scheduled too close.
If a feasible schedule cannot be found, then one must relax the
constraints on the conflicting courses and treat them as those sharing
students but with higher priority in avoiding overlaps. In this case,
the parameter WEIGHT_PROGRAMS
can be used to indicate the relative
importance between avoiding overlaps for conflicting courses and
avoiding overlaps for courses that only share students.
In the following a penalty scheme for evaluating schedules is proposed. Alternative proposals will be accepted if well motivated.
The value of an exam plan is given by the total penalty and we aim at finding the exam plan of minimal total penalty.
The total penalty is the sum of the penalties associated to each exam. For an exam $a$ scheduled to start on day $d$ its penalty value is calculated as follows. Let $E_a$ be the set of exams sharing students with exam $a$. For $a$ and a day $d$ let the penalty with respect to an exam $b$ from $E_a$ be defined as a function of the distance between the day $d$ and the starting day of $b$ by the following procedure:
The penalty for $a$ on $d$ is then given by the sum of the penalties due to all exams $b$ in $E_a$ and the total penalty of the exam plan by the sum of the penalties of all exams $a$ in $E$. Preassigned exams are included in this calculation.
In your git repository you will find input instances. You should focus the experimental analysis on the smallest instances.
New instances might be made available at a later date or kept hidden to test your programs after submission.
You have to address the following tasks:
Model the problem in constraint programming terms and describe the model in the report. Ignore initially the objective and focus on feasibility only.
Implement the model at point 2. in MiniZinc. You might need python scripting to parse the data and put it in the format needed by MiniZinc. An initial script for doing this is available.
Report the results on the small instances available in a table with the statistics that you deem important.
Modify your model to include an objective function. Report the results in a table. Indicate whether the instance is solved to optimality or not and report meaningful measures to describe the effort needed to find the solutions. If the solution found is not optimal within a given time limit, report the best solution found so far.
Improve your model by fine tuning some parameters of your choice and present the analysis with numerical results and discussion. The parameters can be: branching rules, level of consistency enforced, redundant constraints, alternative models, symmetry breaking.
Study the use of random restarts. Describe the restart strategy and the configuration you chose and find out whether random restarts improve or not the performance of the solver. Start your implementation from the best model found so far, but make sure that a restart is worth by ensuring that at each restart a modified problem is solved.
There is no predefined time limit. You can choose to run your experiments with the time limit that best fit with your computational resources (eg, time left from deadline, number of cores and machines, etc). Performance can be assessed in several ways (max size of the instances solved, computation time, solution quality, etc.) It is up to you to identify meaningful measures and comment on the performance.
Your scripts will be tested. In order to do this include a file main.py
in src. The script takes two required arguements the name of the instance and the name of the output file. The output file contains the best solution found for the instance.
python3 src/main.py data/F23/biologi.json -o output_file
See the starting package.
The output file contains a solution in text form as follows:
N100008102,170
N100037102,160
N100038102,163,164,165,166
N100040102,173
N100042102,170
N110028102,163,164
...
Alternatively the solution can be in json format as shown below:
The json object is a dictionary with keys the STADS code of the exam and value and array with the days in which the exam has been scheduled. Days are indicated by number corresponding to the day number from the start of the year. For example day 01/01/2022 is day 1.
In Python the json file can be easily obtained with the folloing lines of code:
json.dump(dates, fp=filehandle, sort_keys=False, ignore_nan=True,
indent=4, separators=(',', ': '), ensure_ascii=False)
where dates
is a dictionary with exam STADS code as key and the list of days as values.
In the instances given in the git repositories there are also rooms. What should we do with them?
Nothing. Ignore everything that has to do with rooms.
Der også typen “o” eller “u” som er for Bachelorprojekt. Skal dettes ses som “oral”, “written” eller slet ikke planlægges?
You are right that o
and u
are mostly used for bachelorprojekt but
it has to be interpreted as requiring one single day rather than, as
in the oral exams, requiring as many days as the number of
participants requires. Bachelorprojekter should be scheduled in the
first day of those available. This is because it is supposed to be the
submission of the report. All those exams should have a date already
given and hence should not require to be scheduled. But they take part
in the objective of maximing the distance between exams.
It is also possible that exams with already given schedules are scheduled before the first day. If possible they should also be taken into account in context of minimizing distances.
Should the “forbidden” and “available” information that is given at some of the exams be used when scheduling them?
That would be nice, yes. They give days where the exam cannot be planned or the only days where it can be planned.
Exams listed in conflicting
are courses that belong to the same
study program. Avoiding overlaps among these exams is a hard
constraint. Avoiding overlaps among exams that share students is a
soft constraint.
For E20 it should be possible to avoid overlaps among conflicting
exams on all instances. For E21 and later it is not known. The
assignment states "If a feasible schedule cannot be found then one
must relax the constraints on the conflicting courses and treat them
as those sharing students but with higher priority in avoiding
overlaps." In this case, the WEIGHT_PROGRAMS
indictaes the
relative importance of the two overlap constraints.
Is there a checker or a way to verify the correctness/quality of the solutions?
Yes, one will be soon be made available.
Do weekends and holidays count towards the R days between exams or is it ok to remove these before calculating R, so there is no distance between e.g. Friday and Monday?
Weekends count in the R days.
Is it ok to remove exams without any students in the pre-processing?
Yes, it is ok to remove exams without students.
I found an example of a problematic pregiven scheduling that does not use consecutive days (E22, samf.json, B540035102, screenshot below), is it ok to remove it?
Yes, it is ok to remove that specific exam. Anyway, constraints do not need to be checked for already scheduled exams.