DM560, Introduction to Programming in C++

Bike traffic

The file data.txt contains a list of timestamps, each recording the transit of a bike past a counter placed between Niels Bohr Alle and Rødegårdsvej. The recording period runs from 2017-01-01 to 2017-09-30 (hence before the tunnel was constructed) and covers only one direction. There are 445883 timestamps in total.

The task of this assignment is to parse the file, interpret the data as time data, calculate some statistics from the imported data, and finally plot a bar chart representing the time series and a textual histogram representing the flow distribution.

Task 1

Implement a class Time for data of type timestamp. Objects of this class should have at least the following functionalities:

  • A constructor that takes as input the string corresponding to the timestamp, in the following form: yyyy-mm-dd hh:nn:ss where yyyy is the year, mm is the month, dd is the day, hh is the hour, nn is the minutes, ss is the seconds. Internally, the date and time must be stored in variables with names year, month, day, hour, minutes, seconds.
  • The constructor must perform a validity check and throw an exception if the string passed as argument is not in the given form or does not represent a valid date.

  • The copy constructor and the copy assignment operator must be implemented so that objects of the datetime type can be copied.

  • All accessors to the data of the class must be const member functions; there must be no non-const accessors.

  • The operators <= and >= must be overloaded for objects of this class.

  • The operator += must be overloaded for objects of this class assuming the addition is between the datetime and an integer number representing the number of minutes.

In the main function make sure to include tests to assess all these functionalities (in particular the case of incorrect input data).
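
As a starting point, the requirements above could be sketched roughly as follows. This is one possible layout, not a reference solution: the accessor names (get_year, ...), the helpers well_formed, days_in_month and key, the choice of std::invalid_argument as the exception type, and the restriction of += to non-negative offsets are all assumptions of this sketch. The defaulted copy operations simply copy the six members.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <tuple>

class Time {
public:
    // Parses "yyyy-mm-dd hh:nn:ss"; throws std::invalid_argument on bad input.
    explicit Time(const std::string& s) {
        if (!well_formed(s)) throw std::invalid_argument("bad format: " + s);
        year = std::stoi(s.substr(0, 4));
        month = std::stoi(s.substr(5, 2));
        day = std::stoi(s.substr(8, 2));
        hour = std::stoi(s.substr(11, 2));
        minutes = std::stoi(s.substr(14, 2));
        seconds = std::stoi(s.substr(17, 2));
        if (month < 1 || month > 12 || day < 1 || day > days_in_month(year, month) ||
            hour > 23 || minutes > 59 || seconds > 59)
            throw std::invalid_argument("invalid date or time: " + s);
    }
    Time(const Time&) = default;             // member-wise copy
    Time& operator=(const Time&) = default;  // member-wise copy assignment

    // Const accessors only (names are hypothetical).
    int get_year() const { return year; }
    int get_month() const { return month; }
    int get_day() const { return day; }
    int get_hour() const { return hour; }
    int get_minutes() const { return minutes; }
    int get_seconds() const { return seconds; }

    friend bool operator<=(const Time& a, const Time& b) { return a.key() <= b.key(); }
    friend bool operator>=(const Time& a, const Time& b) { return a.key() >= b.key(); }

    // Adds n minutes; this sketch assumes n is non-negative.
    Time& operator+=(int n) {
        assert(n >= 0);
        const long long total = hour * 60LL + minutes + n;
        hour = static_cast<int>(total % 1440 / 60);
        minutes = static_cast<int>(total % 60);
        for (long long carry = total / 1440; carry > 0; --carry) {
            // Advance one calendar day, rolling over month and year.
            if (++day > days_in_month(year, month)) {
                day = 1;
                if (++month > 12) { month = 1; ++year; }
            }
        }
        return *this;
    }

private:
    int year, month, day, hour, minutes, seconds;

    // Lexicographic comparison key, most significant field first.
    std::tuple<int, int, int, int, int, int> key() const {
        return std::make_tuple(year, month, day, hour, minutes, seconds);
    }
    // Checks digits and separators against the fixed layout.
    static bool well_formed(const std::string& s) {
        const std::string pattern = "dddd-dd-dd dd:dd:dd";
        if (s.size() != pattern.size()) return false;
        for (std::size_t i = 0; i < s.size(); ++i) {
            if (pattern[i] == 'd') {
                if (!std::isdigit(static_cast<unsigned char>(s[i]))) return false;
            } else if (s[i] != pattern[i]) {
                return false;
            }
        }
        return true;
    }
    static int days_in_month(int y, int m) {
        static const int d[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
        const bool leap = (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
        return (m == 2 && leap) ? 29 : d[m - 1];
    }
};
```

A test in main could, for instance, construct Time("2017-02-28 23:50:00"), add 15 minutes, and check that the result is 2017-03-01 00:05:00, and wrap Time("2017-02-30 10:00:00") in a try block to confirm that the exception is thrown.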

Task 2

Implement a class Data with the following functionalities:

  • The constructor takes the name of the file containing the input data, parses the file, and stores the time of each timestamp read in an STL vector.

  • A function that processes the vector from the previous point and returns the number of bikes that passed in each interval of a given length. The function takes as input parameter the length of the interval in minutes (e.g., 15 for quarters of an hour, 60 for hours and 1440 for days). For example, given the following timestamps:
    2017-01-01 01:14:32
    2017-01-01 01:23:18
    2017-01-01 01:25:01
    2017-01-01 01:27:57
    2017-01-01 01:40:32
    2017-01-01 01:45:15
    ....
    2017-01-02 01:05:00
    2017-01-02 01:10:00
    

    the counters calculated by the function for intervals of 15 minutes would be:

    [2017-01-01 01:00:00; 2017-01-01 01:14:59] -> 1
    [2017-01-01 01:15:00; 2017-01-01 01:29:59] -> 3
    [2017-01-01 01:30:00; 2017-01-01 01:44:59] -> 1
    [2017-01-01 01:45:00; 2017-01-01 01:59:59] -> 1
    ...
    [2017-01-02 01:00:00; 2017-01-02 01:14:59] -> 2
    

    and for intervals of 10 minutes: 0,1,3,0,2,0,...,1,1

  • A function that determines the interval (e.g., quarter of an hour, hour, day) with the largest number of bikes passing the counter. The function takes as parameter an interval length in minutes and returns either the two timestamps between which the maximum count occurred or an identifier of that interval. An identifier can be the index of the interval counted from the beginning of the day, or the start time of the interval, with the convention that the interval starts at that time and ends the given number of minutes later.

  • A function that takes as input a day between 2017-01-01 and 2017-09-30, written as a string in that format, and an interval length in minutes, and outputs a bar chart in text format of the number of bikes that transited in each interval of the specified day. For example, for hourly intervals:

    00:00-01:00 ***
    01:00-02:00 **
    02:00-03:00 *
    03:00-04:00
    ...
    07:00-08:00 ****************
    ...
    23:00-24:00 ***
    

Test these functionalities in the main file.
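
The three functions of this task could be sketched as free functions along the following lines. Everything here is an assumption made for illustration: the names minute_of_year, count_per_interval, busiest_interval and bar_chart are hypothetical, timestamps are kept as already validated strings rather than Time objects, all timestamps are assumed to fall in 2017 (not a leap year), intervals are aligned to 2017-01-01 00:00:00, and the chart prints one star per bike without any scaling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Minutes elapsed since 2017-01-01 00:00 for a "2017-mm-dd hh:nn:ss" string.
// Assumes a well-formed timestamp in 2017 (no validation in this sketch).
long minute_of_year(const std::string& ts) {
    static const int days_before[] = {0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334};
    const int month = std::stoi(ts.substr(5, 2));
    const int day = std::stoi(ts.substr(8, 2));
    const int hour = std::stoi(ts.substr(11, 2));
    const int minute = std::stoi(ts.substr(14, 2));
    return ((days_before[month - 1] + day - 1) * 24L + hour) * 60 + minute;
}

// counts[k] is the number of bikes in the k-th interval of the given length,
// counted from 2017-01-01 00:00; empty intervals are kept as zeros.
std::vector<int> count_per_interval(const std::vector<std::string>& stamps, int minutes) {
    assert(minutes > 0);
    std::vector<int> counts;
    for (const auto& ts : stamps) {
        const std::size_t k = static_cast<std::size_t>(minute_of_year(ts) / minutes);
        if (k >= counts.size()) counts.resize(k + 1, 0);
        ++counts[k];
    }
    return counts;
}

// Index of the interval with the largest count (first one on ties);
// the interval starts index * minutes after 2017-01-01 00:00.
std::size_t busiest_interval(const std::vector<int>& counts) {
    return static_cast<std::size_t>(std::max_element(counts.begin(), counts.end()) - counts.begin());
}

// Text bar chart for one day ("2017-mm-dd"), one star per bike.
// Requires the interval length to divide a day evenly.
std::string bar_chart(const std::vector<std::string>& stamps, const std::string& day, int minutes) {
    assert(minutes > 0 && 1440 % minutes == 0);
    std::vector<int> bars(1440 / minutes, 0);
    for (const auto& ts : stamps) {
        if (ts.compare(0, 10, day) == 0) {
            const int m = std::stoi(ts.substr(11, 2)) * 60 + std::stoi(ts.substr(14, 2));
            ++bars[m / minutes];
        }
    }
    auto hhmm = [](int m) {
        std::ostringstream s;
        s << std::setfill('0') << std::setw(2) << m / 60 << ':' << std::setw(2) << m % 60;
        return s.str();
    };
    std::ostringstream out;
    for (std::size_t k = 0; k < bars.size(); ++k) {
        const int start = static_cast<int>(k) * minutes;
        out << hhmm(start) << '-' << hhmm(start + minutes);
        if (bars[k] > 0) out << ' ' << std::string(static_cast<std::size_t>(bars[k]), '*');
        out << '\n';
    }
    return out.str();
}
```

With the full data set an hourly bar can hold hundreds of bikes, so a real implementation would probably scale the bars (for example one star per 10 bikes); that choice is left open here.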

Task 3

A flow probability distribution (FPD) links flow values (the number of bikes passing through the counter in a given time interval) to their likelihood of occurrence during a given period of time.

Extend the class Data with a function that calculates the flow distribution for a given interval length during a given period of time. Let the function take three input parameters: an integer $\lambda$ representing the number of minutes making up the time interval, and two strings representing the initial time, $\tau_0$, and the final time, $\tau_1$, of the time period. Let the time strings of $\tau_0$ and $\tau_1$ be in the same format as the timestamps described above, yyyy-mm-dd hh:nn:ss. The function must output the empirical probability distribution of the flow values observed in the time period, calculated as the relative frequency of the measurements.

Let $I$ be the set of time intervals of duration $\lambda$ contained in the time period between $\tau_0$ and $\tau_1$ and let $X = [x_1, \ldots, x_{|I|}]$ be the list of corresponding flow values. Then, the relative frequency of a flow value $y$ in the time period can be calculated as:

$$f(y) = \frac{|\{\, i \in \{1, \ldots, |I|\} : x_i = y \,\}|}{|I|}$$

For example for intervals of 5 minutes during the period from 7:00 to 9:00 we can have the following probability distribution of traffic flow:

0: 1, 1: 3, 2: 5, 3: 8, 4: 6, 5: 3, 6: 2, 7: 1, 8: 0 

Represent the distribution as a histogram, in a similar way as in Task 2. For example,

0 *
1 ***
2 *****
3 ********
4 ******
5 ***
6 **
7 *
8
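
A minimal sketch of the two steps, relative frequencies and histogram, is given below. It starts from the list $X$ of flow values (one per interval in $I$) and does not build $X$ from $\lambda$, $\tau_0$ and $\tau_1$; the function names flow_distribution and histogram are hypothetical, and the histogram is assumed to print one star per interval in which the flow value was observed, with a row for every value from 0 up to the largest one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Relative frequency of each observed flow value y:
// (number of intervals with x_i == y) / (total number of intervals).
std::map<int, double> flow_distribution(const std::vector<int>& flows) {
    std::map<int, double> freq;
    for (int x : flows) freq[x] += 1.0;
    for (auto& entry : freq) entry.second /= static_cast<double>(flows.size());
    return freq;
}

// Text histogram: one row per flow value from 0 to the maximum observed,
// one star per interval with that flow value.
std::string histogram(const std::vector<int>& flows) {
    std::map<int, int> occurrences;
    for (int x : flows) {
        assert(x >= 0);  // flow values are counts of bikes
        ++occurrences[x];
    }
    std::ostringstream out;
    if (occurrences.empty()) return out.str();
    const int max_flow = occurrences.rbegin()->first;
    for (int y = 0; y <= max_flow; ++y) {
        out << y;
        const auto it = occurrences.find(y);
        if (it != occurrences.end())
            out << ' ' << std::string(static_cast<std::size_t>(it->second), '*');
        out << '\n';
    }
    return out.str();
}
```

For instance, the flow values 0, 1, 1, 1, 3 give relative frequencies 0.2, 0.6 and 0.2 for the values 0, 1 and 3, and a four-row histogram in which the row for value 2 is empty.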

Task 4

On the basis of the analysis conducted, can you now guess in which direction the counter was measuring?