An Estimation of the Learning Curve Effect on Project Duration with Monte Carlo Simulation

The aim of this paper is to estimate the learning curve effect on project duration by means of project scheduling techniques. To measure this effect, only one assumption is made: the time individuals or groups take to perform an activity decreases at a given rate as experience is gained with the activity. Unfortunately, this effect is not taken into account directly by project management software. In some software, after scheduling, the supervisor can manually switch on the "as soon as possible" or "as late as possible" buttons for an activity. Monte Carlo simulation was used to model the risks in the total project durations. It is assumed that the (normal) durations of the activities vary according to the beta distribution. The minimum estimate is 95 % of the original (normal) duration, and the maximum estimate is 140 % of the original (normal) duration. The most likely value is assumed to be the (normal) duration of each activity. The learning effect on project duration was investigated with the help of test problems and real problems. In the test problems, the learning effect can occur between two consecutive activities. These pairs are chosen randomly. After the project duration is calculated, the activities in each pair are moved closer to each other using the predecessor's total float time. It is assumed that the duration of an impending repetitive activity is shorter due to the learning curve effect if the gap between the consecutive activities is small enough. This iteration is carried out until it is no longer possible to shorten the successor's activity time in a pair. It is shown that this effect shortens the project duration by 2-3 %, while the variance remains in a 1-2 % range. Numerical tests were implemented in XPRESS-Mosel Optimization Software.


Introduction
In the literature, some researchers, depending on the level at which the phenomenon occurs (individual, group, firm, industry), distinguish learning, progress and experience curves for the same phenomenon. We use the term learning curve to encompass the terms "progress curve" and "experience curve" as well.
In this paper, project scheduling in mathematical terms means finding the longest path in a directed graph, where vertices and directed edges are given. An integer number is also assigned to each edge. In engineering terms, directed edges represent activities or connections between activities, vertices are nodes or events, and the integer numbers represent activity times or time lags between activities. It is assumed that the learning effect can occur and result in a smaller activity time of a given activity if the same group of workers performs a similar activity as an immediate predecessor of the given activity. The question is: what is, or what can be, the cumulative effect of the reductions of activity times in the project scheduling network? This effect is calculated using calendar days, which leads to a more complex mathematical model and algorithm than calculating using only working days (Mályusz and Varga, 2016).
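The longest-path view of scheduling described above can be illustrated with a minimal sketch. The function name `longest_path_lengths`, the node labels and the edge weights below are illustrative assumptions rather than part of the paper, and the sketch ignores calendar days and learning relations.

```python
from collections import defaultdict, deque


def longest_path_lengths(nodes, edges):
    """Return the longest-path distance from the source nodes to every node.

    edges is a list of (tail, head, weight) triples of an acyclic digraph.
    """
    successors = defaultdict(list)
    indegree = {v: 0 for v in nodes}
    for tail, head, weight in edges:
        successors[tail].append((head, weight))
        indegree[head] += 1

    dist = {v: 0 for v in nodes}
    queue = deque(v for v in nodes if indegree[v] == 0)
    visited = 0
    while queue:  # process nodes in topological order
        v = queue.popleft()
        visited += 1
        for head, weight in successors[v]:
            dist[head] = max(dist[head], dist[v] + weight)
            indegree[head] -= 1
            if indegree[head] == 0:
                queue.append(head)
    if visited != len(nodes):
        raise ValueError("graph contains a cycle")
    return dist


# Hypothetical four-event network; weights are activity times in days.
example_edges = [("s", "a", 3), ("s", "b", 2), ("a", "t", 4), ("b", "t", 6)]
example_dist = longest_path_lengths(["s", "a", "b", "t"], example_edges)
print(example_dist["t"])  # length of the longest path, here s -> b -> t
```

In this toy network the project duration is the longest-path distance to the final event, so shortening an activity only matters if it lies on that path.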
In a construction project, the general contractor distributes the work among subcontractors. Normally, the general contractor organizes its work based on consecutive technology steps. As a result, the subcontractors often have to carry out their activities with interruptions.
It is obvious that they can reduce their costs if they work continuously. There are two main reasons for this: first, they can reduce their construction costs; second, the work will be completed sooner because of the learning effect. Unfortunately, in the early phase of scheduling, this effect is not considered and is not supported by project management software.
In a multi-project environment, the learning effect of staff was considered when periodically scheduling the tasks for each project and assigning staff to the tasks (Wu and Sun, 2006). The solution leads to a mixed nonlinear program for project scheduling and staff allocation problems, which considers the learning effect of staff. A genetic algorithm (GA) is proposed to solve the problem. Zha and Zhang (2014) investigate the project scheduling problem with multiskill learning effect, where both autonomous and induced learning are considered.
In practice, project scheduling methods suffer from a lack of precision; consequently, it is a significant challenge to create a realistic and usable project schedule. It is difficult and time-consuming to estimate time, assign resources, determine interdependencies between tasks, and manage changes. It is, therefore, important to identify and investigate the differences between the practice and theory of scheduling methods (Francis et al., 2013).
In construction project management, the appropriate scheduling of a project is an essential problem. Estimation of an activity's time is a crucial part of the schedule. There is little information in the literature about the use of learning curves in scheduling, although it seems that the principle of learning curves is gaining ground in the scheduling of repetitive construction operations (Hinze and Olbina, 2009; Zahran et al., 2016). In (Hajdu, 2015), the learning curve effect on the linear scheduling method is discussed. However, it should be noted that the impact of learning curves is not calculated in recent management software (Fini et al., 2016).

Learning Curve
Psychology researchers at the turn of the 19th century focused on the behavior of individuals. They found that the time individuals took to perform a task and the number of errors they made decreased at a certain rate after repetitions (Ebbinghaus, 1964; Thorndike, 1898; Thurstone, 1919).
Researchers also found that the errors made by groups performing repetitive activities decreased to a certain level as the groups gained experience (Guetzkow and Simon, 1955; Leavitt, 1951).
Researchers debate whether organizational learning is a consequence of changes in behavior or changes in cognition. It is obvious that when a firm or group does repetitive work, its members might learn who is good at what and how to organize their work better. Presumably, in the construction industry the following two changes have bigger effects: members learn how to lay out the working site and how to fit their current work to the local regulations and conditions.
How this phenomenon is reflected in changes of the project scheduling network is an open question that needs further research. The learning curve formulation is:
$$y = a_0 + a\,(x + B)^{-b}$$
In the learning curve formulation, the standard measure of experience is the cumulative number of units produced, calculated by summing the total number of products from the start through the end of each time period. The cycle number is denoted by x; y is the time required to complete cycle x in labor hours per output; a is the time required to complete the first cycle; a_0 is the minimum time required to complete a cycle; and b is a learning coefficient. B expresses the number of units produced before the first unit, so it is an experience factor. The value of B is in the range 0-10 (Gottlieb and Haugbølle; Hasanzadeh et al., 2016; Kara and Kayis, 2005: 209).
Here y can represent not only time or cost but also a wide range of production outcomes, for instance defects per unit or accidents per unit (Greenberg, 1971).
When B and a_0 are 0, we get back Wright's original formula:
$$y = a\,x^{-b}, \qquad b = -\log_2 r,$$
where r is the rate of learning. Wright discovered that when the production (the number of cycles) doubles, the cumulative average labor time or cost decreases at a constant rate, that is, the learning rate. The learning rate is thus the constant rate at which the cumulative average labor time or cost decreases when the production doubles in a linear log x, log y model. This property follows from the nature of logarithms and holds only in the linear log x, log y model.
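A short numerical sketch may help to make the doubling property concrete. It assumes the sign convention y = a x^(-b) with b = -log2(r), and it shows two possible readings of y: as the time of cycle x itself, and as the cumulative average time over the first x cycles (the reading usually attributed to Wright). The function names and the ten-day, 90 % example values are illustrative assumptions.

```python
import math


def wright(x, a, r):
    """Wright's formula y = a * x ** (-b), with b = -log2(r)."""
    b = -math.log2(r)
    return a * x ** (-b)


def unit_time_from_cumulative_average(x, a, r):
    """Time of cycle x when y is read as the cumulative average over x cycles."""
    if x == 1:
        return a
    total_x = x * wright(x, a, r)                 # total time of the first x cycles
    total_prev = (x - 1) * wright(x - 1, a, r)    # total time of the first x - 1 cycles
    return total_x - total_prev


# Hypothetical job: first cycle takes 10 days, learning rate 90 %.
for x in (1, 2, 4):
    print(x, round(wright(x, 10.0, 0.9), 2),
          round(unit_time_from_cumulative_average(x, 10.0, 0.9), 2))
```

Under the first reading the value drops by the factor r at every doubling (10, 9, 8.1 days). Under the cumulative-average reading the second cycle of a ten-day job takes about eight days, because the average of the first two cycles is nine days.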
Several researchers have suggested that Wright's model is the best model available for describing the future performance of repetitive work (Everett and Farghal, 1994; Couto and Teixeira, 2005). In the exponential average method (Mályusz and Pém, 2013), α = 0.5 yielded the most accurate predictions. Of course, there is no consensus on which model provides the best fit and predictability for construction data (Srour et al., 2016). Consequently, more theoretical and experimental investigations are necessary to adjust a model to real problems.
In the construction industry, the learning rate is between 85 % and 95 %. Both practice and theory indicate that after a certain number of produced "units" there is no further significant reduction in time and cost.
As an example, with a 90 % learning rate, if a job takes ten days, a repetition of it takes eight days if the working conditions are similar. The learning curve effect does not always apply, of course. It flourishes where certain conditions are present; in particular, the process must be a repetitive one. Additionally, there needs to be a continuity of workers without any abrupt stops during the production process. When the learning curve effect occasionally comes to an abrupt stop, the curve jumps up graphically (Ferivanto et al., 2015).

Project Scheduling Model
In this paper, the activity-on-node network concept is followed. The relationships between the activities can be represented by a directed graph G = (V, E), where V is the set of nodes (|V| = n) and E ⊆ V × V denotes the set of directed edges. Each node corresponds to an activity, and the relations are represented by the directed edges.
In the case of the exponential time algorithm, the relations can be of both maximal and minimal type, but the heuristic algorithm is only able to handle minimal-type finish-start relations. Although four relationships can be defined between activities, for the sake of simplicity only the finish-start relationship and its variants are used from now on, namely FSk, max FS1 and max FS0. The maximal precedence relationship describes the maximum allowable time between the start / finish point of the preceding activity and the start / finish point of the succeeding activity (in calendar days).
A weight w_e is assigned to each directed edge e ∈ E; it determines the length of the relation given by the directed edge e (in calendar days). It is also assumed that G is acyclic.
The calendar vector c is given in advance (it is determined by the time period used in the calculations). c is a binary vector: c(i) = 1 if day i is a working day, and 0 otherwise. It is supposed that c is longer than the maximal possible total project duration.
Two positive integer variables, x_{2v-1} and x_{2v}, are assigned to each node v (the starting and finishing time of the corresponding activity). A predefined positive integer d_v is also assigned to each node v (the duration of the corresponding activity).
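The role of the calendar vector can be sketched as follows. The sketch assumes 0-based day indices, counts the starting day as the first possible working day of the activity, and reads the finishing time as the day on which the last of the d_v working days falls; these indexing conventions and the function name `finish_day` are assumptions for illustration, not the paper's definitions.

```python
def finish_day(start, duration, calendar):
    """Return the day index on which the activity's last working day falls.

    calendar[i] == 1 if day i is a working day and 0 otherwise.
    """
    if duration < 1:
        raise ValueError("duration must be a positive integer")
    remaining = duration
    day = start
    while True:
        if day >= len(calendar):
            raise ValueError("calendar is shorter than the schedule")
        remaining -= calendar[day]  # only working days consume duration
        if remaining == 0:
            return day
        day += 1


# Hypothetical calendar: five working days followed by a two-day weekend.
calendar = [1, 1, 1, 1, 1, 0, 0] * 4
print(finish_day(3, 4, calendar))  # starts on day 3, spans one weekend
```

A four-day activity that starts on day 3 uses days 3, 4, 7 and 8, so in calendar days it is longer than its working-day duration.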
To satisfy the precedence constraints defined by the directed edges and their weights, one constraint is added to the model for every directed edge e = (i, j) ∈ E.
To model the learning curve effect, it is supposed that there are special learning relations (edges). LR denotes the set of these edges, and K denotes their number, K = |LR|. To make the calculations easier, it is also supposed that the two endpoints of a learning edge have equal durations. For ease of understanding, in this section it is supposed that d_j = 10 if j is an endpoint of a learning edge, and that this duration can be shortened to 9 or 8 days. These values can be rescaled according to the learning curve defined in the previous section.
If node j is not an endpoint of a learning edge (so its duration cannot be shortened by the learning curve effect), a constraint is added that fixes the duration of activity j at d_j.
For a learning relation (i, j) ∈ LR, constraints are added to the model to ensure the shortening of the duration of activity j due to the learning curve effect. If there is no working day between the two activities (so the sum of the calendar vector between the two indices is 0), then the length of the second activity is eight days. If one working day passes between the finishing time of i and the starting time of j, then the length is nine days. If there are at least two working days between the activities, then the duration of the second one is ten days.
The objective function is equivalent to minimizing the total project duration.
To sum up, the mathematical model is the following.
Given: an acyclic directed graph G = (V, E); an integer w_e for each directed edge e ∈ E; a binary vector c with c(i) = 1 if day i is a working day, and 0 otherwise; and a positive integer d_v for each node v.
Find: positive integers x_{2v-1} and x_{2v} for each node v that satisfy the precedence, duration and learning constraints described above and minimize the total project duration.
This integer programming model is similar to the one presented in (Mályusz and Varga, 2016) for the working-day calculation. In that case, the constraints defining the learning relations can be linearized. In this model, unfortunately, that is not possible.
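The three-case rule for a learning relation (i, j) can be written down as a small sketch. It assumes 0-based day indices, takes the finish day of i and the start day of j as given, and counts the working days strictly between them; the function names and the example calendar are hypothetical, and the sketch only evaluates the rule for fixed times rather than solving the integer program.

```python
def working_days_between(finish_i, start_j, calendar):
    """Count working days strictly between the finish day of i and the start day of j."""
    return sum(calendar[finish_i + 1:start_j])


def successor_duration(finish_i, start_j, calendar, base=10):
    """Duration of activity j for a learning relation (i, j)."""
    gap = working_days_between(finish_i, start_j, calendar)
    if gap == 0:
        return base - 2   # no working day in between: 8 days
    if gap == 1:
        return base - 1   # one working day in between: 9 days
    return base           # two or more working days: no learning effect


# Hypothetical calendar: five working days followed by a two-day weekend.
calendar = [1, 1, 1, 1, 1, 0, 0] * 4
print(successor_duration(4, 7, calendar))  # only a weekend in between
print(successor_duration(4, 8, calendar))  # one working day (day 7) in between
print(successor_duration(4, 9, calendar))  # two working days in between
```

Because the duration of j depends on a sum of calendar entries whose limits are themselves decision variables, the rule is a conditional on the schedule, which is in line with the remark that it cannot be linearized in the calendar-day model.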

Numerical Test
To examine the learning curve effect on artificial projects and to be able to compare the exact results with those from the heuristic algorithm, different test problems were randomly generated.
It is assumed that the activities can be arranged in rows of equal length; the numbers of rows and columns are denoted by M and S, respectively. A starting and a finishing activity are added; the first element of each row is connected with the starting activity, and the last element with the finishing activity. Therefore, each graph contains M*S+2 nodes. The adjacent nodes in each row are connected with finish-start relations. Edges (connections) were also added between different rows: K denotes the number of learning edges, and L is the number of edges between different rows without learning effect. An example can be seen in Fig. 2. The learning edges are highlighted in red.
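The structure of such a test network can be sketched in a few lines. The node numbering, the function name `generate_test_network` and, in particular, the rule that cross-row edges always point to a later column (used here only to keep the graph acyclic) are assumptions; the paper does not state how the random edges were drawn.

```python
import random


def generate_test_network(M, S, K, L, seed=0):
    """Build a test network with M rows and S columns of activities.

    Node 0 is the starting activity, node M*S + 1 the finishing activity,
    and node 1 + r*S + c is the activity in row r, column c.
    """
    rng = random.Random(seed)
    start, finish = 0, M * S + 1

    def node(r, c):
        return 1 + r * S + c

    edges = []
    for r in range(M):
        edges.append((start, node(r, 0)))
        edges.extend((node(r, c), node(r, c + 1)) for c in range(S - 1))
        edges.append((node(r, S - 1), finish))

    # Assumption: cross-row edges go from an earlier to a later column,
    # so every edge increases the column index and no cycle can occur.
    candidates = [
        (node(r1, c1), node(r2, c2))
        for r1 in range(M) for r2 in range(M) if r1 != r2
        for c1 in range(S) for c2 in range(c1 + 1, S)
    ]
    chosen = rng.sample(candidates, K + L)
    learning_edges, plain_edges = chosen[:K], chosen[K:]
    return edges + plain_edges, learning_edges, M * S + 2


edges, learning_edges, n_nodes = generate_test_network(M=4, S=5, K=3, L=2)
print(n_nodes, len(edges), len(learning_edges))
```

The learning edges are returned separately from the ordinary finish-start edges so that a scheduling routine can treat them differently.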
To examine the changes in the total project durations, we generated 2500 problem instances according to the beta distribution, based on an example project. Our example project consists of 102 activities and was generated with the same method that we used in (Wright, 1936). The project contains five learning edges, meaning that there are five pairs of repetitive activities where the second activity's duration can be shortened. Using the integer programming model from (Wright, 1936), the average project duration of the example project is 238 days without the learning curve effect, and it can be shortened to 232 days using the learning effect, while the variance went from 2.5 to 2.7 days. The results of the simulation can be seen in Fig. 1. The horizontal axis shows the total project duration, and the height of each column gives the frequency of that duration in the Monte Carlo simulation. The blue columns show the results of the integer programming model that takes the learning curve effect into consideration (Wright, 1936), while the red columns in the background illustrate the results without the learning curve effect.
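The sampling step of such a Monte Carlo experiment can be sketched as follows. The bounds (95 % and 140 % of the normal duration, with the normal duration as the most likely value) follow the abstract; the PERT-style shape parameters with weight 4, the two-path toy project and the function names are assumptions, and the sketch does not include the learning relations or the integer programming model.

```python
import random
import statistics


def sample_duration(normal, rng, lam=4.0):
    """Draw a duration from a PERT-style beta distribution.

    Minimum 95 %, maximum 140 % and mode 100 % of the normal duration;
    the shape weight lam = 4 is an assumption, not taken from the paper.
    """
    low, high = 0.95 * normal, 1.40 * normal
    alpha = 1.0 + lam * (normal - low) / (high - low)
    beta = 1.0 + lam * (high - normal) / (high - low)
    return low + (high - low) * rng.betavariate(alpha, beta)


def simulate(paths, normal_durations, runs=2500, seed=1):
    """Return simulated project durations; each path lists activities from start to finish."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        d = {k: sample_duration(v, rng) for k, v in normal_durations.items()}
        results.append(max(sum(d[a] for a in path) for path in paths))
    return results


# Hypothetical project with two parallel paths of two activities each.
normal = {"A": 10, "B": 12, "C": 8, "D": 15}
durations = simulate([["A", "B"], ["C", "D"]], normal)
print(statistics.mean(durations), statistics.stdev(durations))
```

Running the same sampled instances once with and once without the learning relations would give two histograms of the kind compared in the text.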

Concluding Remarks
In this paper, the learning effect on project duration is investigated with the help of Monte Carlo simulation. It is shown that the cumulative learning effects of activities in a project can cause a 2-3 % reduction in project duration, while the variance is about 1 % of the project duration. Since the exponential time algorithm is very slow for real problems, further development of the heuristic algorithm is necessary. An investigation of the learning effect on project cost is also an interesting research topic and deserves further consideration.
How the following phenomena are reflected in changes of the project scheduling network of consecutive projects is an open question that needs further research: how workers learn to lay out the working site, how they fit their current work to the local regulations and conditions, and how they organize their work better.

Fig. 2 A randomly generated network with 202 activities and