Time-estimation of design process based on patterns

The paper gives an overview of a new approach to estimating the time required by a design process. The product structure is identified using a Design Structure Matrix (DSM), from which not only the components but also the design activities can be derived. To make experience data from previous design processes usable in the current situation, similarity matching and weighting are applied. Exponential smoothing is used to calculate the time according to the weights. A network graph method based on PERT serves as the basis of the graphical and mathematical representation.


Introduction
In the present economic situation, in which technical development has also accelerated, the time needed to create a new product has led to strong competition in industry. The time factor is of crucial importance alongside innovation, meeting the growing demands of customers, and financial and quality aspects. Only products launched on the market at the right time can be successful; for this, a product development process and methodology has to be worked out which ensures that the developed product reaches the market at the right time. It is essential that the time and cost requirements of the product development process, and the possible risks to the development time, can be estimated with sufficient accuracy already in the early phases of development.
The basic product requirements and parameters are already known when the need for a new product development arises.
Only a small proportion of products are new constructions; most are adapted, improved or variant constructions [3]. The development processes that created them are therefore rarely unique: they are mostly similar in product structure and elements, and this similarity can be studied. In the case of an adapted construction, certain structural elements or groups may need to be created as new constructions while the others are kept as they were. In variant constructions the size and/or the composition changes within the limits of the planned system. Most frequently, existing smaller or larger product units can be used as they are, or with some tailoring, for the given construction.
It is expedient to use the knowledge and information from previous and existing similar products and product components as much as possible in order to reduce the development time and minimize the uncertainty of the process. This can be done at the product level, when a part of a previously designed product is carried over to the current product development, and also at the process level, when the previous development process partly repeats and data are therefore available on how the process ran.
Nevertheless, uncertainty and risk are constantly present because the future is unpredictable. Searching for formally similar processes, optimizing the process, and using the information from the associated knowledge base greatly reduce the uncertainty, and the risks become measurable.

Product structure
Some kind of structured build-up and visualization of the building elements, parts and components is indispensable when developing a product. Based on this, the time, cost and risk of the product development can be estimated with different methods.
The conceptual product structure does not consist of parts but of components, each of which gives a generic representation of a part with suitable parameters. The product structure is the model representing the product, usually in hierarchical form; according to certain criteria it describes the relations between the product's sub-assemblies and parts and their impact on each other (e.g. functions, production, assembly, configuration, budget, design parameters, material choice, loads, requirements, production data, packaging, etc.) [DIN199].
There are two ways to structure the information in product data management (PDM). One is to put all the components into virtual folders which, linked with all the documents regarding the component, are structured according to the build-up of the product. The other is to create part master data for each component and connect these to each other in a way that reflects the product structure.
The product structure is made up of the connections between the product components. It has to be noted, however, that the product structure is not an unambiguous mapping in the PDM system: there is no document that describes the whole structure of the product. Only the connections between the components are stored, and the structure is generated from these, just as when creating the bill of materials.
The product decomposition gives the solid frame of the product structure. First the product is divided into parts, which gives the frame; the individual parts or components within a group of the structure are then variable [4].
The conceptual product structure is considered a potentially stable representation because, ideally, the generic structure does not change, or changes only slightly, during product development. The conceptual product structure can follow different approaches, so it can be adapted to other scopes. For example, if it follows the functional scope, it can also support complex and/or differing product versions, which makes it possible to map the requirements in a functional system. The strict build-up rules of the generic parts of the product structure can be considered an ideal tool for relating the information needed during the early stages of creating the product; for example, the requirements can easily be connected to the constructional bill of materials [6].
A Design Structure Matrix (DSM) represents the relationships between the system elements in a compact, visual and analytical form.
Two types of DSM need to be highlighted with regard to design processes.
The component-based type describes the relations between the structural elements of the product in terms of geometry, energy, information and material flow. The activity-based matrix can be derived from the component-based matrix.
The activity-based matrix illustrates the structural elements and the dependencies of the work steps creating them, also considering time scheduling.
The component-based representation makes it possible to record the conceptual product structure and can be uniquely converted to an activity-based DSM for the elements of the product structure to be created. Since this paper is mainly concerned with the optimization of design processes, we focus on the activity-based model hereinafter.
The structural elements of the product, $A_i$ ($i = 1, 2, \ldots, n$), in order, determine the matrix on the left side of Fig. 1. The diagonal elements relate each element to itself, so $a_{ij} = 0$ for $i = j$. The other elements of the matrix illustrate the relationships between the main structural elements. If structural element $A_i$ gives information to $A_j$, then $a_{ij} = 1$; otherwise $a_{ij} = 0$, which means that there is no relationship between $A_i$ and $A_j$. If $a_{ij} = 1$ and $i < j$, the element lies above the diagonal and denotes a feed-forward relationship; if $i > j$, the element lies below the diagonal and denotes feedback, a cycle or an iteration.
Further information can be associated with the elements of the matrix, which widens the application field of the DSM method, e.g. to managing cost and time [6].
The product structure gives a sound basis for the time estimation of the decomposed activities. The activity-based matrix (DSM method) derived from the product structure makes it possible for project and design planning managers to model sequential and parallel activities and to describe the relationships between interdependent tasks. In practice it is equivalent to the adjacency matrix used to create the network graphs applied in process modelling.
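The reading of the matrix described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `classify_dsm` and the 4-activity example matrix are hypothetical, and indices are 0-based here whereas the text counts from 1.

```python
from typing import List, Tuple

Pair = Tuple[int, int]

def classify_dsm(dsm: List[List[int]]) -> Tuple[List[Pair], List[Pair]]:
    """Split the non-zero off-diagonal entries of a binary DSM into
    feed-forward (i < j) and feedback (i > j) relationships."""
    feed_forward: List[Pair] = []
    feedback: List[Pair] = []
    for i, row in enumerate(dsm):
        for j, value in enumerate(row):
            if i == j or value == 0:
                continue  # diagonal entries and empty cells carry no relation
            if i < j:
                feed_forward.append((i, j))  # above the diagonal
            else:
                feedback.append((i, j))      # below the diagonal
    return feed_forward, feedback

# Hypothetical example: dsm[i][j] = 1 means A_i gives information to A_j.
dsm = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [1, 0, 0, 0],
]
forward, back = classify_dsm(dsm)
```

In this example the entries (2, 1) and (3, 0) lie below the diagonal and would therefore indicate iterations in the process.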

Time estimation
The time required by the design process is usually calculated from the time required by the elementary tasks. These intervals are small enough to be estimated accurately. The times of the individual elements can be optimized, and they add up to the time of the whole process.
A company must be able to reckon with the time and cost requirements and the related risks of the design process already at the stage of strategic planning, even at the point of deciding whether to undertake the project. At this early stage it is neither sensible nor possible to decompose the design process into elementary activities. For time estimation we use activity units of a size that can be considered separable sub-tasks. When comparing historical processes, the size of the chunks chosen makes no difference. Several time estimation methods are used in project management: intuitive, experience-based and calculation-based. Nowadays company management systems record a great deal of data on the timing of the design process, for example how long an engineer worked on preparing a model, which can later be used to estimate the time needed to design a similar product.
In this paper we present a calculation method that uses previous experience and data. The starting assumption is that the activities of the current process can be estimated from the design times of previous related products and their sub-units. A projective method is used to estimate the time of the design process; it analyzes patterns in past times and extrapolates them to the future. Moving averages and exponential smoothing are such methods. Classical exponential smoothing obtains the forecast value as the weighted average of the previous period's forecast and the actual value. The methods presented in this paper determine the weights not only from time but also from similarity.

Similarity matching
We identify, classify and systematize the similarity between previous design processes and the present process according to given aspects. By similarity we mean the matching of the features of the processes, which are determined by the requirements placed on the result of the development process.
It is not possible to determine directly how similar two objects are; in our case the objects are the design processes of a product, a component or even a part. Although similarity and distance are opposite concepts, their measures can be converted into each other, so the similarity $h_{ij}$ of objects $i$ and $j$ can be calculated from their distance $d_{ij}$, expressed as a percentage:

$$h_{ij} = \left(1 - \frac{d_{ij}}{d_{\max}}\right) \cdot 100\% \qquad (1)$$

where $d_{\max}$ is the largest element of the distance matrix.
In our research, the first step in obtaining the similarity of the processes is to determine the degree of distance between the features (degree of difference) on a scale of 1 to 10, based on expert evaluation. In the next step the Pearson distance is used, which is based on the Euclidean distance but standardizes the distance between the objects by relating the given attributes to their standard deviation. The specific distances thereby become comparable and scale-free. The starting point of every distance calculation method is the raw data matrix, which sums up the objects ($i = 1, 2, \ldots, n$), their attributes ($k = 1, 2, \ldots, p$) and the corresponding values ($x_{ik}$).
The Pearson-distance formula:

$$d_{ij} = \sqrt{\sum_{k=1}^{p} \frac{(x_{ik} - x_{jk})^2}{s_k^2}} \qquad (2)$$

where $s_k$ is the standard deviation of the distances recorded for attribute $k$. The distance values $d_{ij}$ identified throughout the process are summarized in a distance matrix.
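A minimal sketch of the two steps, assuming the Pearson distance is the standardized Euclidean distance and that similarity is obtained by linear normalization with the largest distance. The function names, the use of the population standard deviation and the example scores are assumptions made for illustration only.

```python
import math
from statistics import pstdev
from typing import List

Matrix = List[List[float]]

def pearson_distance_matrix(raw: Matrix) -> Matrix:
    """Standardized Euclidean distance between every pair of rows (objects)."""
    n, p = len(raw), len(raw[0])
    # Standard deviation of each attribute (column); population form is assumed.
    stdevs = [pstdev([row[k] for row in raw]) for k in range(p)]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            total = sum(
                ((raw[i][k] - raw[j][k]) / stdevs[k]) ** 2
                for k in range(p)
                if stdevs[k] > 0  # a constant attribute cannot separate objects
            )
            dist[i][j] = dist[j][i] = math.sqrt(total)
    return dist

def similarity_percent(dist: Matrix) -> Matrix:
    """Convert distances to similarities in percent, relative to d_max."""
    d_max = max(max(row) for row in dist)
    if d_max == 0:
        return [[100.0] * len(dist) for _ in dist]
    return [[(1 - d / d_max) * 100 for d in row] for row in dist]

# Hypothetical expert scores (1 to 10) of three processes on three features.
raw = [
    [2, 7, 4],
    [3, 6, 4],
    [9, 1, 8],
]
dist = pearson_distance_matrix(raw)
sim = similarity_percent(dist)
```

With these scores the first two processes come out as the most similar pair, while the most distant pair receives a similarity of 0%.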

Estimation by exponential smoothing
A characteristic of the exponential function is that an additive unit change in the independent variable changes the dependent variable in proportion to the current value of the function. It is well suited to describing phenomena in time and to differentiating the deviations implied by the similarities.
The point of the method is that the time of a given activity is estimated so that it includes the times of similar activities in the past, with the weight decreasing as similarity decreases. A factor alpha, valued between 0 and 1, has to be chosen for the weighting. If the chosen weight is close to 1, the times of similar activities are smoothed only to a small extent: the values of the most similar activities get large weights and the less similar ones get small weights. If the chosen weight is close to 0, strong smoothing is applied to the activities. The most similar activities then get small weights and the less similar ones get comparatively larger weights than before, although the weights still decrease following a geometric progression.
Expected time of a given activity:

$$F_{t+1} = \alpha D_t + (1 - \alpha) F_t \qquad (3)$$

so

$$F_{t+1} = \alpha D_t + \alpha (1 - \alpha) D_{t-1} + \alpha (1 - \alpha)^2 D_{t-2} + \ldots$$

The forecast error and the variance of the activity time are obtained from the deviation between the actual and the estimated values, $e_t = D_t - F_t$, where

$D_t$ = the actual time of the activity in case $t$,
$F_t$ = the estimated value of activity $t$,
$\alpha$ = weight, $0 \le \alpha \le 1$.

An infinite number of data points could in principle be used for the smoothing, but this is not feasible in practice. Before the calculation, therefore, not only the value of the weight has to be determined but also the point where the progression is cut off, i.e. from which similarity level the calculation starts and how much data is used. This is called initialization. In our method the "first" estimated and actual values are considered equal, $F_0 = D_0$. When defining the initialization point it is worth taking the value of alpha into account: the larger alpha is, the fewer members of the sequence of previous forecasts need to be considered, because after a few members their weights are practically zero.
If alpha is small, more members have a noticeable weight in calculating the next forecast. (For example, if α = 0.9 there is no point in working with more than two or three members; if α = 0.1 it makes sense to calculate with twenty, thirty or even forty members) [2].
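The effect of alpha on the weights and on the cut-off point can be illustrated with a short sketch. It assumes the classical recursive form of exponential smoothing with the activities ordered from least to most similar; the function names and the 0.01 weight threshold used for the cut-off are hypothetical choices, not values taken from the method.

```python
from typing import Sequence

def members_needed(alpha: float, threshold: float = 0.01) -> int:
    """Number of leading members whose weight alpha*(1-alpha)^k is still
    at least `threshold` (an assumed cut-off criterion)."""
    count = 0
    while alpha * (1 - alpha) ** count >= threshold:
        count += 1
    return count

def smoothed_estimate(times: Sequence[float], alpha: float) -> float:
    """Exponentially smoothed time estimate.

    `times` holds the actual times of past activities ordered from the
    LEAST to the MOST similar, so the last entry gets the largest weight.
    Initialization follows F_0 = D_0.
    """
    forecast = times[0]  # F_0 = D_0
    for actual in times:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast

# Hypothetical design times (hours) of three past activities.
estimate = smoothed_estimate([40.0, 30.0, 20.0], alpha=0.5)
few = members_needed(0.9)
many = members_needed(0.1)
```

Under the assumed threshold only two members matter for α = 0.9, while about twenty are still relevant for α = 0.1, which agrees with the orders of magnitude quoted above.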

Process modelling
A process is a chain of events (activities) that takes place in space and time and fulfils some kind of need, aiming at a goal. Studying the relations and connections within the process makes it possible to separate and systematize them.
A system is a group of things or parts that work together as a whole, and a set of concepts, theories and principles on the basis of which something can be done [7]. The totality of the state changes of the system over time is the process; here, the constructional design process.
To describe a system as concretely as possible, the parts and the relationships between them need to be given. The description of the parts is given by the product structure, and the DSM contains the nature of the relationships between the parts.
A dynamic graph model is used to illustrate the process, in which time is one of the factors and which uses different symbols, directed arrows and connecting lines.
Activity-time diagrams are used to describe the design processes. Block diagrams and process plans are used to present the connections of the product structure, and the network graph illustrates these in time. These methods are mathematically based on graph theory.

Process graph methods
The tools for process graphs are methods that make it possible to design, structure, control and supervise complicated, extended processes. Besides the systemic form of project management, ever larger projects make it necessary to use process graph tools.
The network graph methods can be divided into two major groups by the structure of the activities [5]:
• deterministic methods,
• stochastic methods.
In the case of deterministic methods the project can be defined in advance, meaning that all branches of the designed network will have been run through by the time the project is completed. CPM (Critical Path Method, and its variants CPPS, LESS) and MPM (Metra Potential Method, and its variants PDM, PPS) are such methods.
The stochastic group contains methods for cases in which the course of the project cannot be predicted with certainty. In these systems the results of certain work steps or activities affect the future direction and duration of the project. Two main groups may be distinguished, depending on whether the direction or the duration is uncertain. The PERT (Program Evaluation and Review Technique) method models the task times as random variables, mostly with a beta distribution. In the other group of stochastic networks, decision points follow the work steps and the further path of the project is given by random variables. Unlike with the deterministic methods, not all previously designed project paths are necessarily traversed. These stochastic network graph methods are also called decision nets. GERT (Graphical Evaluation and Review Technique), GAN (Generalized Activity Networks) and the methods based on Petri nets are such procedures [1].

Modelling based on PERT (Program Evaluation and Review Technique)
PERT is an activity-on-arrow method that was developed during a project in 1958. It is intended for describing research and development processes in which the cost and time requirements can only be estimated approximately. The times of the activities, and hence the final deadline, are handled as random variables. In the classical PERT method experts give estimates of the time requirements of the activities.
In our method the required activity time is estimated by smoothing based on similar activities from previously determined and ranked similar projects. The expected value and the standard deviation of the activity time result from the calculation. The probability distribution in PERT is a beta distribution, whose validity is confirmed by experience. We also rely on this distribution, which can be calculated by the known formula from two parameters (α, β). The expected time and the standard deviation of the activity can likewise be calculated from the α and β parameters. The method permits an objective estimation of the time interval for the finishing point of the activities, which represents the uncertainty of the process. The uncertainty of the activities and the probability that the events happen as scheduled can be determined from the probability distribution.
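For reference, the standard textbook forms of the beta distribution on the standardized interval $0 \le x \le 1$ are sketched below. That the method uses exactly this parametrization is an assumption; $\alpha, \beta > 0$ are the shape parameters and $B(\alpha, \beta)$ is the beta function.

```latex
f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}, \qquad
B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}

E[X] = \frac{\alpha}{\alpha+\beta}, \qquad
\sigma_X = \sqrt{\frac{\alpha\beta}{(\alpha+\beta)^{2}(\alpha+\beta+1)}}
```

If the activity time $T$ lies between a shortest time $a$ and a longest time $b$, the standardized values can be rescaled as $E[T] = a + (b - a)\,E[X]$ and $\sigma_T = (b - a)\,\sigma_X$.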

Network graphs
Arrow networks are used for adequate modelling of the process. Network diagrams are built from two basic elements: arrows, representing time-consuming activities, and nodes (circles), representing events, each of which is either a starting or a finishing point. When all activities and events are linked together in the proper order and in a logically correct manner, they form a network.
The DSM includes the activities and their relations and is in effect an adjacency matrix, so it is easy to convert it to an incidence matrix containing the relationships from which the characteristic graph of the process can be mapped. A starting and an ending node are added to the graph; together with the edges, this gives the skeleton of the network graph describing the process. Afterwards, the activity arrows are loaded with time data.
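The conversion can be sketched as follows. For brevity the sketch treats the activities as nodes and the DSM relations as edges, which is a simplification of the activity-on-arrow form; it also assumes a purely feed-forward DSM. The function names, the node labels "start" and "end", and the sign convention of the incidence matrix are illustrative assumptions.

```python
from typing import List, Tuple, Union

Node = Union[int, str]
Edge = Tuple[Node, Node]

def dsm_to_skeleton(dsm: List[List[int]]) -> Tuple[List[Node], List[Edge]]:
    """Read the DSM as an adjacency matrix and add a start and an end node."""
    n = len(dsm)
    edges: List[Edge] = [
        (i, j) for i in range(n) for j in range(n) if i != j and dsm[i][j]
    ]
    with_predecessor = {j for _, j in edges}
    with_successor = {i for i, _ in edges}
    # Activities without a predecessor hang on "start", those without a
    # successor lead to "end".
    first: List[Edge] = [("start", j) for j in range(n) if j not in with_predecessor]
    last: List[Edge] = [(i, "end") for i in range(n) if i not in with_successor]
    nodes: List[Node] = ["start", *range(n), "end"]
    return nodes, first + edges + last

def incidence_matrix(nodes: List[Node], edges: List[Edge]) -> List[List[int]]:
    """Rows are nodes, columns are edges: -1 where an edge leaves, +1 where it enters."""
    row_of = {node: r for r, node in enumerate(nodes)}
    matrix = [[0] * len(edges) for _ in nodes]
    for c, (tail, head) in enumerate(edges):
        matrix[row_of[tail]][c] = -1
        matrix[row_of[head]][c] = 1
    return matrix

# Hypothetical feed-forward DSM of four activities.
dsm = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
nodes, edges = dsm_to_skeleton(dsm)
incidence = incidence_matrix(nodes, edges)
```

Each column of the resulting incidence matrix contains exactly one -1 and one +1, so the graph can be rebuilt from it without ambiguity.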
Activity-on-arrow diagrams can handle overlapping in time. To achieve this, the activities have to be made interruptible and pseudo-activities have to be introduced. Since the activities used to describe the processes are not elementary, it may happen that an activity does not have to finish completely before another, partly dependent activity starts. In the matrix this is represented as follows: for total dependency the value in the matrix is "1", while for partial dependency the matrix contains the completion percentage that has to be reached before the partly dependent activity can start. For example, if 65% completion must be reached before the next activity can start, the matrix contains 0.65, which results in an overlap when converted into a graph.
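How a partial dependency produces an overlap can be shown with a small sketch. It assumes fixed (not random) durations, linear progress within an activity and a feed-forward DSM whose entries are completion shares between 0 and 1; the function name and the numbers are hypothetical.

```python
from typing import List

def earliest_starts(dsm: List[List[float]], durations: List[float]) -> List[float]:
    """Earliest start of each activity when dsm[i][j] is the share of A_i
    that must be complete before A_j may begin (1 = total dependency)."""
    n = len(dsm)
    start = [0.0] * n
    for j in range(n):
        for i in range(j):  # feed-forward entries only (above the diagonal)
            if dsm[i][j] > 0:
                # A_j is released once A_i has reached the required share.
                release = start[i] + dsm[i][j] * durations[i]
                start[j] = max(start[j], release)
    return start

# Hypothetical example: A_1 fully precedes A_2, but A_3 may begin
# when A_1 is 65% complete.
dsm = [
    [0, 1, 0.65],
    [0, 0, 0],
    [0, 0, 0],
]
durations = [10.0, 4.0, 6.0]
starts = earliest_starts(dsm, durations)
```

Here the third activity can begin at 6.5 time units and therefore overlaps with the last 3.5 units of the first one.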
Calculating the time of the critical path, the separate paths, etc. of such graphs poses no problem. The only difficulty is that the arrows carry not discrete values but probability distributions.

The sum of cumulative distribution functions of the activity times
To determine the total process time, the separate activity times have to be summed. A distinction has to be made between summing serial and parallel activities. Mathematical operations are available for both, namely the convolution and the addition of distribution functions, but these can be calculated by integrating distribution functions only over standardized periods; moreover, the function with the appropriate values can be obtained from the standardized distribution function only in a complicated way (since determining the total process length requires measuring the length of the activities in actual time, not in standardized time). Our method circumvents this mathematical problem by cumulating the distribution functions as follows: we determine the shortest (a), the longest (b) and the most likely (m) times, and further calculations are made with these three discrete values.
The PERT distribution is a special form of the beta distribution, so the expected value of the finish of the activity can be expressed from these three values with the weight α = 4 (an empirical value) [1], and from Eqs. (8) and (9) the parameters of the related beta distribution follow.
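A hedged sketch of these relations, using the commonly quoted PERT forms rather than the exact equations of the method: the expected value is taken as (a + 4m + b)/6 and the shape parameters follow from matching it to a beta distribution scaled to the interval [a, b]. The names `ThreePoint`, `pert_mean`, `pert_beta_shapes` and `pert_std` are hypothetical, and the weight of the most likely value is called `weight` to avoid confusion with the shape parameter α.

```python
import math
from typing import NamedTuple, Tuple

class ThreePoint(NamedTuple):
    a: float  # shortest time
    m: float  # most likely time
    b: float  # longest time

def pert_mean(t: ThreePoint, weight: float = 4.0) -> float:
    """Expected activity time with the empirical weight on the most likely value."""
    return (t.a + weight * t.m + t.b) / (weight + 2.0)

def pert_beta_shapes(t: ThreePoint, weight: float = 4.0) -> Tuple[float, float]:
    """Shape parameters of the beta distribution on [a, b] with that mean."""
    span = t.b - t.a
    if span <= 0:
        raise ValueError("the longest time must exceed the shortest time")
    alpha = 1.0 + weight * (t.m - t.a) / span
    beta = 1.0 + weight * (t.b - t.m) / span
    return alpha, beta

def pert_std(t: ThreePoint, weight: float = 4.0) -> float:
    """Standard deviation of the scaled beta distribution with the shapes above."""
    alpha, beta = pert_beta_shapes(t, weight)
    variance_01 = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))
    return (t.b - t.a) * math.sqrt(variance_01)

# Hypothetical activity: between 2 and 12 days, most likely 4 days.
activity = ThreePoint(a=2.0, m=4.0, b=12.0)
mean = pert_mean(activity)
shapes = pert_beta_shapes(activity)
std = pert_std(activity)
```

Classical PERT often replaces the standard deviation by the shortcut (b - a)/6; the sketch uses the variance of the fitted beta distribution instead, so the two values differ slightly.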

Summing serial activities
In a sequential process the duration of the development is the sum of the activity times. The time of consecutive activities is calculated by summing the shortest ($a_i$), the longest ($b_i$) and the most likely ($m_i$) times separately; the expected value and the standard deviation are then calculated according to the beta distribution.
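The serial rule amounts to a component-wise sum, as the following sketch shows. The tuple layout (a, m, b), the function name and the sample values are illustrative assumptions; the expected value at the end uses the common PERT form (a + 4m + b)/6.

```python
from typing import Iterable, Tuple

Triple = Tuple[float, float, float]  # (shortest a, most likely m, longest b)

def sum_serial(activities: Iterable[Triple]) -> Triple:
    """Add the shortest, most likely and longest times separately."""
    a_total = m_total = b_total = 0.0
    for a, m, b in activities:
        a_total += a
        m_total += m
        b_total += b
    return a_total, m_total, b_total

# Three hypothetical consecutive activities.
chain = [(2.0, 4.0, 12.0), (1.0, 2.0, 3.0), (3.0, 5.0, 7.0)]
a, m, b = sum_serial(chain)
expected = (a + 4.0 * m + b) / 6.0
```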

Summing parallel activities
To calculate the total duration of concurrent activities, the critical path has to be taken into consideration, while the rest of the activities "run free". Knowing only the PERT probability distribution values, however, it is not possible to identify a single critical path, because on one arrow the activity branch may have the determining minimum value ($a_x$), forming the optimistic critical path, while on another arrow the activity branch has the determining maximum value ($b_y$), forming the pessimistic critical path. Similarly, yet another path may be the one belonging to the most probable value. For this reason the time of concurrent activities is calculated by taking the maximum of each of the three values of the probability distributions; the expected value and the standard deviation of the distribution function are then calculated from these maxima, and the expected value characterizes the time of the parallel activities.
To determine the total process time, we calculate the shortest (a), the longest (b) and the most likely (m) times from the distribution function of each activity, then aggregate the purely serial activities in the way described above. Next, the parallel components (which may contain aggregated activity times as well) are maximized in the manner explained, and the serial method is applied again. This iteration continues until the whole process network is reduced to a single element characterized by three values. These three values, namely the shortest ($a_x$), the longest ($b_y$) and the most likely ($m_z$) times of the total process, are used to calculate the expected value and the standard deviation that characterize the process time. The result is the pattern-based time estimate of the design process.
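The iterative reduction can be sketched recursively, assuming the network can be written as a nesting of serial and parallel groups. The nested-tuple description, the labels "serial" and "parallel" and the sample values are hypothetical; networks that are not series-parallel would need a different treatment.

```python
from typing import Tuple

Triple = Tuple[float, float, float]  # (shortest a, most likely m, longest b)

def reduce_network(node) -> Triple:
    """Collapse a nested serial/parallel description into one (a, m, b) triple.

    A node is either a plain (a, m, b) triple or a pair
    ("serial" | "parallel", [child nodes]).
    """
    if isinstance(node[0], str):
        kind, children = node
        triples = [reduce_network(child) for child in children]
        if kind == "serial":
            # Serial activities: sum each of the three values separately.
            return tuple(sum(values) for values in zip(*triples))
        if kind == "parallel":
            # Parallel branches: take the maximum of each value separately.
            return tuple(max(values) for values in zip(*triples))
        raise ValueError(f"unknown group type: {kind}")
    return tuple(float(value) for value in node)

# Hypothetical process: one activity, then two concurrent branches, then a final activity.
network = (
    "serial",
    [
        (2, 4, 12),
        ("parallel", [("serial", [(1, 2, 3), (3, 5, 7)]), (2, 8, 9)]),
        (1, 1, 2),
    ],
)
a, m, b = reduce_network(network)
expected = (a + 4.0 * m + b) / 6.0
```

In the parallel group of this example the shortest and longest values come from the two-step branch while the most likely value comes from the other branch, which illustrates why no single critical path can be named.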

Summary, further considerations
The faster technology develops and customer requirements change, the greater the uncertainty. The time estimator needs to balance time and cost probabilities, competition, workload and available resources. Considering patterns in the time estimation of the design process leads to more precise and faster results. With the pattern method, the most similar processes and activities are used to estimate the duration of any activity. These times are characterized by their probability distributions, which according to experience follow a beta distribution. To aggregate the times as random variables, a method based on design process networks was developed. The result is the estimated time of the design process. Furthermore, the design process network can be used, for example, to calculate and register float. Consequently, it can be measured to what extent the delay of an activity endangers the project deadlines. On this basis the time risk factors can be estimated. Supporting design processes with such a methodology enhances the flexibility and productivity of projects.

Fig. 1. Example of DSM representation and graphic exposition

Fig. 2. Example of a network graph