Enabling Automatic LCA at Any Stage of the Building Based on Its BIM Model

The Life Cycle Assessment (LCA) of a whole building is a well-known process used to assess its environmental impact. The construction domain does not use this process at this time because it requires too much information and collecting it is very labor intensive. This paper identifies the information needed to perform an LCA at any level of development of a Building Information Modelling (BIM) model and proposes some solutions to fill the information gap of an early stage BIM model. After the required information is identified, the interoperability strategy is analyzed to propose a framework introducing a way to organize the LCA of a whole building, as well as a new file format to share information between BIM and LCA software. The proposed framework enables an LCA to be performed, without manual input, at every iteration of the BIM model. This framework was previously presented at the Creative Construction Conference 2019 and this paper is an extended version of that paper.


Introduction
The construction industry has been evolving at a fast pace for some years now. One of the most impactful changes in the industry is the adoption of the Building Information Modelling (BIM) process (Eastman et al., 2011). The BIM process proposes centralizing data to share the information needed by all stakeholders of a building for its design, construction, maintenance and demolition. All this aggregated data is called the BIM model. Co-locating the data allows anyone to analyze it in order to detect a problem, or clash, between an electrical conduit and a ventilation duct, for example. These types of analyses, conducted at an early stage of the project, avoid problems and save time and costs. While the cost of a project is an important concern for clients, new concerns are emerging, such as its potential environmental impact (Diaz and Anton, 2014; Olinzock et al., 2015; Shin and Cho, 2015). The building sector has been reported to contribute more than 39% of greenhouse gas emissions in the US (Russell-Smith and Lepech, 2012; Shin and Cho, 2015). Consequently, there is room for improvement, and upcoming generations are increasingly concerned about the environment. One method to assess the environmental impact of a proposed building is to perform a Life Cycle Assessment (LCA). This paper focuses on the current difficulties associated with conducting this assessment and proposes solutions to perform an automated LCA of a whole building at every stage of its life cycle.

Literature review

LCA of a whole building
The LCA process is well known in the construction community but is currently rarely used (Olinzock et al., 2015). The first barrier to its use is that the client does not require it as part of a construction project. Generally, clients do not understand the benefit of performing an LCA. Nevertheless, this is changing gradually with upcoming generations that are increasingly concerned about the environment. Another incentive that makes LCA more popular is its impact on the price of the project. LCA helps decision makers choose more sustainable materials that tend to reduce the price of the project (Shin and Cho, 2015). An LCA, by nature, requires a lot of data (Dupuis et al., 2017; Iris et al., 2011; Olinzock et al., 2015; Shin and Cho, 2015; Wang et al., 2011) and can be both time consuming and costly (Diaz and Anton, 2014; Dupuis et al., 2017; Iris et al., 2011; Olinzock et al., 2015; Shin and Cho, 2015). The LCA of a whole building focuses heavily on the ecological impacts produced by the materials used in the construction phase (Shin and Cho, 2015). After the material inventory is obtained, the environmental impact of each of the materials needs to be described; this step is called the Life Cycle Inventory (LCI) (ISO, 2006a; 2006b). To complete the LCI step, the use of a specialized database (LCI database) created and validated by a community of environmental experts is required. These databases contain the impact of basic materials that can be used to assess a building. What remains is to obtain the quantity of each material used in the building. This information can be extracted from the BIM model.

BIM
To be able to use BIM data efficiently for an LCA of a whole building, there is a need to understand what a BIM model is and how it typically evolves at each stage of a construction project. A BIM model describes a building using objects to represent elements of the building instead of simple graphical lines in a CAD plan (Eastman et al., 2011). The BIM objects contain data elements that can be used as a data source to perform an LCA (Dupuis et al., 2017; Iris et al., 2011; Russell-Smith and Lepech, 2012; Shin and Cho, 2015). In actual construction projects, it was observed that at various project phases, the BIM model did not always include all the data required to perform an LCA without manual intervention (Russell-Smith and Lepech, 2012). This is to be expected, since the data present varies with the stage of the project, and the level of detail of each data element entered into the BIM model increases as the project progresses (Eastman et al., 2011). At the start of a project, a BIM model typically contains only what is needed for bidding and demonstrations to clients. If the bid is accepted, the BIM model gradually evolves to include more and more details as well as data elements. This level of detail is characterized by what is called a Level Of Development (LOD), a term suggested by the BIM Forum (BIMForum, 2014). Based on the BIM Forum description of each LOD, Dupuis et al. (2017) summarized the important information that is available to automatically perform an LCA (see Table 1).
In reality, the BIM model data availability situation is more complex than suggested by the BIM Forum because each of the data elements, or groups of data elements, in a BIM model can have a different LOD at any one time (Boton et al., 2015; Boton et al., 2018), and the information contained evolves with the project (Boton et al., 2018). To achieve an easily automated LCA for a whole building using BIM as a data source, the LCA framework needs to handle multi-LOD models and be able to extract all the required data directly from these models.

BIM model interoperability
One of the prerequisites for the automation of an LCA using BIM is to ensure access to the data needed to perform an LCA. This is not always possible because current BIM models lack the ability to easily exchange information with other systems (Diaz and Anton, 2014; Eastman et al., 2011; Russell-Smith and Lepech, 2012). To address this challenge, a data entry strategy needs to be put in place early in the project in order to avoid manually entering information in the BIM that is provided by one of the stakeholders/sub-contractors using different software. Sabol (2008) proposes three strategies to improve the interoperability of the data for everyone involved in the construction project using a BIM model: (1) using a specialized file format strategy; (2) using an Open Database Connectivity (ODBC) technology strategy; or (3) designing software that uses an existing Application Program Interface (API) strategy.
The first strategy proposes using a simple file to transfer data (e.g. using an IFC file format). The format of this file needs to be chosen based on its capacity to be processed easily and on the quality of the information it contains.
To check the quality of the information, the file needs to contain at least the object, object type and individual material quantity. After that, the information needed is defined by the LCA framework used, which will be explained later.

Table 1. Information available for an automated LCA at each LOD (rows for LOD 300 to 500)
LOD 300: with a specific object without a specific assembly. Information available: size and shape.
LOD 350: with a specific object with a specific assembly. Information available: size, shape and assembly detail.
LOD 400: with a specific object with a specific assembly and with the installation detail. Information available: size, shape, assembly detail and installation detail.
LOD 500: with a field verified representation. Information available: size, shape, assembly detail and installation detail.
Proprietary formats such as the Autodesk Revit file format (i.e., files with the ".rvt" extension) are avoided because they cannot be used by other software, as the data fields are not documented. Other BIM software does not support such proprietary formats, and the Autodesk license agreement prohibits their use in other software without prior agreement. To avoid problems with such usage terms, the use of an open file format becomes an interesting and safe alternative. At the time of writing this paper, many open file formats are available, such as: AgcXML, BIMXML, CityGML, COBie, gbXML and IFC. Each of these formats has a different use, focus and purpose (see Table 2). After studying each of these formats in detail, it was concluded that none of them, with the exception of the IFC format, contains enough data elements to produce the material inventory needed for an LCA.
The IFC format is known to be one of the best file formats to ultimately standardize the BIM data exchange (Sabol, 2008), and is reported to allow for the exchange of a wide variety of information. To allow this flexibility, IFC relies on what is called a STEP format and defines more than a thousand data entities to encode complex building information (BuildingSMART, 2016). Today, the majority of commercially available BIM tools support the import and export of a BIM model using the IFC file format. One issue with IFC is that its import and export functionalities vary greatly from one BIM tool to another (Eastman et al., 2011). Even if the IFC format is described by an international standard, the ISO 16739-1 (ISO, 2018), the resulting IFC file can vary depending on the BIM tool used.
The IFC format offers an XML variant called ifcXML. However, the XML format contains only a subset of the IFC data schema. Consequently, some information will only be available in the STEP-based IFC format (Eastman et al., 2011). Yan et al. (2011) reported successfully using the IFC format to import a building model into a game engine. To achieve this import, the IFC file had to be converted into the VRML file format. VRML is specialized for 3D computer graphics, and this additional conversion step would hinder and complicate an LCA process. In summary, to extract the data required for the automation of the LCA from an IFC file, customized software needs to be designed and coded specifically. The software developer will need to consider the many varieties of data elements at each LOD as well as the many implementation issues of IFC export utilities.
The second data entry strategy proposed is to use an ODBC technology to solve the file transfer issue between the BIM and the LCA software. This strategy would extract data from the BIM model and store it in a database such as a Microsoft Access database. ODBC is a well-known technology that easily allows for the transfer of data to a database. Once the data is transferred to a single database, it becomes easy to query and extract it with the Structured Query Language (SQL), without any programming skills. The drawback of using an ODBC technology is that the resulting database schema will not be standardized, as it will vary for each BIM tool on the market.
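To illustrate the kind of query this strategy makes possible, the following is a minimal sketch, not taken from any BIM tool. It uses Python's built-in sqlite3 module as a stand-in for an ODBC-connected database, and the table name, column names and sample values are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per material layer of a BIM object.
# A real ODBC export would have a tool-specific, non-standardized schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE material_layer ("
    " object_id TEXT, object_type TEXT, material TEXT, volume_m3 REAL)"
)
rows = [
    ("w1", "Exterior wall", "Concrete", 4.0),
    ("w1", "Exterior wall", "Mineral wool", 1.5),
    ("w2", "Exterior wall", "Concrete", 2.0),
    ("s1", "Floor slab", "Concrete", 10.0),
]
conn.executemany("INSERT INTO material_layer VALUES (?, ?, ?, ?)", rows)


def material_inventory(connection):
    """Return (material, total volume) pairs, largest total first."""
    cursor = connection.execute(
        "SELECT material, SUM(volume_m3) AS total"
        " FROM material_layer GROUP BY material ORDER BY total DESC"
    )
    return cursor.fetchall()


inventory = material_inventory(conn)
print(inventory)
```

The single SELECT statement is the whole material inventory step; the drawback noted above still applies, since the query would have to be rewritten for the schema produced by each BIM tool.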
The third and last strategy proposed is designing specific software that uses an existing API to transfer the data. An API is an interface exposed by an application that software developers can use to access the data within that application. The use of an API is required to extend the capability of BIM tools (Eastman et al., 2011).

New LCA framework
At LOD 350 or more, all the data required for an automated LCA should be present in the BIM object type and its related material. At this LOD, the BIM model can contain many thousands of objects and it becomes a challenge to display this amount of information efficiently in a way that architects, engineers and LCA experts can use it. To organize the information into logical groups, the most common classification systems used in North America are: Uniformat, MasterFormat and OmniClass. Only Uniformat and OmniClass are used at the start of a project (Boton et al., 2018; Sabol, 2008). In this research project, Uniformat was chosen over OmniClass because it is used by default in the North American construction industry and is easily supported in Autodesk Revit. For these reasons, this LCA framework uses an LCA process tree (see Fig. 1) with the same hierarchy as the one used by Uniformat II to organize BIM objects, BIM object types and materials.
Another challenge is to link the data elements extracted from BIM to a process in the LCI database. Either the BIM object's type or material can be linked to a process in the LCI database by matching the names. To simplify the linking step, the LCA framework includes several processes describing common material and object type names found in BIM models.
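The name-matching step can be sketched as follows. This is only an illustration under stated assumptions: the catalogue entries and process identifiers are invented placeholders, not entries from a real LCI database, and the normalization rule (lowercase, letters only, exact match) is one possible choice rather than the rule used by the framework.

```python
import re

# Hypothetical catalogue: normalized name -> placeholder LCI process identifier.
LCI_PROCESSES = {
    "concrete": "process-001",
    "mineral wool": "process-002",
    "gypsum board": "process-003",
}


def normalize(name):
    """Lowercase the name and replace every run of non-letters with one space."""
    return re.sub(r"[^a-z]+", " ", name.lower()).strip()


def link_to_process(material_name, object_type_name):
    """Try the material name first, then the object type name; None if no match."""
    for candidate in (material_name, object_type_name):
        if candidate is None:
            continue
        key = normalize(candidate)
        if key in LCI_PROCESSES:
            return LCI_PROCESSES[key]
    return None


print(link_to_process("Mineral_Wool", None))
print(link_to_process("Unknown", "Gypsum  Board "))
```

Exact matching after normalization is deliberately strict: a name such as "Concrete - 30 MPa" would not match in this sketch, which is why a catalogue of common names helps.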
As mentioned earlier, LOD350, 400 and 500 models are good candidates for an automated LCA. In the early stages of a construction project (i.e., LOD100, LOD200 and LOD300) there is a data gap for performing an LCA, since not all of the information needed for the purpose of an LCA is entered in the BIM model.

Managing the data gap
To achieve an automated LCA at an early stage in a construction project, the BIM model data gap needs to be addressed. Russell-Smith and Lepech (2012) suggest that more detailed data be added to the BIM model by the expert. However, adding data just for the purpose of an LCA will not be easy in practice, and few companies will expend the effort and time to do this. To avoid entering the missing data manually, we impute them. Imputation techniques allow us to estimate the missing values, at the cost of some uncertainty. To be able to choose between two designs, the LCA result needs to be accurate enough to differentiate them. Instead of requiring all the missing data to be entered, this approach shifts the effort of the LCA by directing the user to concentrate on the data elements in the BIM model that generate the most uncertainty in the final results. Refining these elements reduces the uncertainty of the imputed data until the results of the LCA are accurate enough to make a decision. Using imputed data thus reduces the manual effort associated with overdetailing data elements that have little or no impact on the LCA results.
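One way to picture how imputed data can direct the user's effort is the sketch below. It is an assumption-laden illustration, not the method implemented in the framework: it treats each element's impact as quantity times an impact factor, assumes independent quantity uncertainties expressed as standard deviations, and uses made-up numbers throughout.

```python
from dataclasses import dataclass
import math


@dataclass
class Element:
    name: str
    quantity: float       # m3, either read from the BIM model or imputed
    quantity_sd: float    # standard deviation; 0.0 when read from the model
    impact_factor: float  # illustrative impact per m3


def variance_contribution(element):
    """Variance this element adds to the total impact (independence assumed)."""
    return (element.quantity_sd * element.impact_factor) ** 2


def total_impact(elements):
    """Return the mean total impact and its standard deviation."""
    mean = sum(e.quantity * e.impact_factor for e in elements)
    variance = sum(variance_contribution(e) for e in elements)
    return mean, math.sqrt(variance)


def rank_by_uncertainty(elements):
    """Elements whose imputed values matter most for the result come first."""
    return sorted(elements, key=variance_contribution, reverse=True)


elements = [
    Element("wall concrete", 10.0, 0.0, 300.0),   # known quantity
    Element("wall insulation", 5.0, 2.0, 50.0),   # imputed layer
    Element("generic roof", 20.0, 4.0, 100.0),    # imputed generic object
]
mean, sd = total_impact(elements)
ranking = [e.name for e in rank_by_uncertainty(elements)]
print(mean, sd, ranking)
```

In this example the generic roof dominates the uncertainty, so detailing it first would narrow the result far more than detailing the insulation layer.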
At LOD300, BIM elements are described as a specific object but can be missing information for some material in the assembly (e.g. the insulation layer for a wall). The function assigned to each material layer in Autodesk Revit is used to detect a missing layer in the assembly. The Revit functions available to do this are: Structure, Substrate, Thermal / Air layer, Finish and Membrane layer. To assess the impacts of a missing layer, the BIM object type is linked to a market process based on statistics for this material's function (see Fig. 2). This process is similar for LOD200, where BIM data elements are described as a general object (e.g. Generic wall) and the assembly information is unknown. To assess the impacts of an LOD200 object, the object is linked to a market process for its Uniformat code (see Fig. 2). Since the Uniformat code is used at the start of a project, it is always available.
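The fallback logic described above can be sketched as follows. The class names, the process identifier strings and, in particular, the set of layer functions expected for a wall are assumptions made for illustration; the paper does not specify them.

```python
from dataclasses import dataclass, field

# Assumption: functions a complete exterior wall assembly is expected to have.
EXPECTED_WALL_FUNCTIONS = {"Structure", "Thermal/Air layer", "Finish"}


@dataclass
class Layer:
    material: str
    function: str


@dataclass
class ObjectType:
    name: str
    uniformat_code: str
    lod: int
    layers: list = field(default_factory=list)


def missing_functions(object_type, expected):
    """Layer functions expected for the assembly but absent from the model."""
    present = {layer.function for layer in object_type.layers}
    return sorted(expected - present)


def processes_for(object_type, expected):
    """Pick placeholder process links according to the LOD of the object type."""
    if object_type.lod <= 200:
        # Generic object: fall back to a market process for the Uniformat code.
        return [f"market:uniformat:{object_type.uniformat_code}"]
    links = [f"material:{layer.material}" for layer in object_type.layers]
    # Each missing layer is covered by a market process for its function.
    links += [f"market:function:{fn}"
              for fn in missing_functions(object_type, expected)]
    return links


wall_300 = ObjectType("Exterior wall", "B2010", 300,
                      [Layer("Concrete", "Structure"),
                       Layer("Gypsum board", "Finish")])
wall_200 = ObjectType("Generic wall", "B2010", 200)
print(processes_for(wall_300, EXPECTED_WALL_FUNCTIONS))
print(processes_for(wall_200, EXPECTED_WALL_FUNCTIONS))
```

Here the LOD300 wall has no thermal layer, so a function-based market process is added for it, while the LOD200 wall is covered entirely by the process attached to its Uniformat code.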
At LOD100, BIM elements contain no information at all, except for some size and volume (e.g. massing form). Berg (2014) proposes that the BIM data at this level is not detailed enough to perform an LCA. However, by using the metadata and other high-level data included in the BIM model, the quantity of generic objects could be imputed. At this level, the imputed data will carry a certain amount of uncertainty, but this may not be a major problem, since the result would still help in comparing the environmental impact of two high-level building designs.
In summary, to automatically assess the environmental impact of a building design at each stage of its design and construction life cycle, the BIM information data gap needs to be addressed. The data needed is the quantity of material for each assembly, for objects, for object types (e.g. generic or specific) as well as their Uniformat classification. For a design at LOD300, the function of each material is required to detect any missing material layer in an assembly. In the case of very early design at LOD100, some metadata is also required.

Analyzing IFC for LCA purposes
Among the first data elements required for an automated LCA are the quantities for every object, whether of generic or specific type. These quantities can be retrieved in many ways, as the content of an IFC file varies based on many factors that cannot be easily controlled (i.e., normalized). For example, the export options of a BIM tool, the data elements contained in the BIM tool and the contents of the resulting IFC file vary depending on the BIM tool software version used. After a detailed analysis of different IFC file contents, we found two reliable methods to extract the quantity of an object. The first method, which is the most reliable in our opinion, is to calculate a quantity (such as the volume and the area) from the 3D geometry of each object. The IFC format describes the 3D geometry using a combination of one or multiple entities based on 24 types of representations (BuildingSMART, 2016). During a case study, we found that processing all of these entities was too complex and took too long for the scope of our research project. As an example, the "Brep" type of representation describes any 3D object by its bounding sides (faces). The object can be completely closed or can have one or more open sides. The challenge in calculating its volume is first to detect whether the object is open and then to determine whether the object can be represented as a volume and how to calculate it. The other method is to rely on the BIM tool to include the base quantity in the IFC file. The quantity of an object is contained in property sets linked to the object; however, some variations in the file structure were detected between the exports tested on different BIM tools, as shown in the results presented in Table 3. This variation could be managed by mapping property names and property set types to each BIM tool and BIM tool version.
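To make the Brep difficulty concrete, the sketch below shows one common way to test closedness and compute a volume. It is a simplification under explicit assumptions: the faces are already triangulated and consistently oriented, vertices are shared by index, and the function names are hypothetical. Real IFC Brep faces are general polygons and would need triangulation first.

```python
from collections import Counter


def is_closed(triangles):
    """True when every undirected edge is shared by exactly two triangles."""
    edges = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            edges[(min(u, v), max(u, v))] += 1
    return all(count == 2 for count in edges.values())


def signed_volume(vertices, triangles):
    """Sum of signed tetrahedron volumes between the origin and each triangle."""
    total = 0.0
    for a, b, c in triangles:
        (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = (
            vertices[a], vertices[b], vertices[c])
        total += (x1 * (y2 * z3 - z2 * y3)
                  - y1 * (x2 * z3 - z2 * x3)
                  + z1 * (x2 * y3 - y2 * x3))
    return total / 6.0


def brep_volume(vertices, triangles):
    """Volume of a closed triangulated shell, or None for an open shell."""
    if not is_closed(triangles):
        return None
    return abs(signed_volume(vertices, triangles))


# Unit tetrahedron with outward-facing triangles.
TETRA_VERTICES = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
TETRA_FACES = [(0, 2, 1), (0, 1, 3), (0, 3, 2), (1, 2, 3)]
closed_volume = brep_volume(TETRA_VERTICES, TETRA_FACES)
open_volume = brep_volume(TETRA_VERTICES, TETRA_FACES[:3])
print(closed_volume, open_volume)
```

Removing one face makes the shell open, and the function then declines to return a volume, which mirrors the two-step check described in the text.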
Next, in order to extract the quantity of material, the material information could be found directly in the object but most of the time, in our observations, it was described in the object's type. The object's type and object are linked by the entity "IFCRELDEFINESBYTYPE" and the list of materials is linked in the entity "IFCRELASSOCIATESMATERIAL". In the list, each layer of material is described by two entities: "IFCMATERIAL", containing the name of the material, and "IFCMATERIALLAYER", containing the thickness of this layer of material. The volume of each material can be calculated by multiplying the thickness of the material by the area of the object. At this point, the IFC format contains almost every data element required, but after trying several export options and analyzing the resulting IFC files in detail, we found nowhere to describe the function of a material layer, which is needed for an LOD300 element. We then proceeded to analyze the IFC specification described by ISO 16739-1 (ISO, 2018) to verify whether the format has a means to add user information about a material. The commonly recommended way to add user information concerning an element in the IFC format is to use an "IFCPROPERTYSET" linked to that element through an "IFCRELDEFINESBYPROPERTIES". To be linked to an "IFCPROPERTYSET", the element needs to be an "IFCPRODUCT", like a BIM object or an object's type. The material is described only in "IFCMATERIAL" and "IFCMATERIALLAYER", and neither is an "IFCPRODUCT". We therefore concluded that the IFC format could not contain the material's function needed to detect a missing material layer.
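The thickness-times-area calculation mentioned above can be written as a short sketch. The function name and sample values are hypothetical, and the calculation assumes a flat layered element whose area already accounts for any openings.

```python
def layer_volumes(area_m2, layers):
    """Return a dict of material name -> volume in m3.

    layers is a list of (material_name, thickness_m) pairs, as would be read
    from IFCMATERIAL and IFCMATERIALLAYER; repeated materials are summed.
    """
    volumes = {}
    for material, thickness_m in layers:
        volumes[material] = volumes.get(material, 0.0) + thickness_m * area_m2
    return volumes


wall_layers = [("Concrete", 0.25), ("Mineral wool", 0.125), ("Concrete", 0.25)]
wall_volumes = layer_volumes(20.0, wall_layers)
print(wall_volumes)
```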
Other information that was not found during our testing is the metadata required by an LOD100 element. The strategy to add metadata in a BIM model using Autodesk Revit is to add a global parameter (e.g. Building Lifetime: 75). Global parameters, even when different IFC export options were used, were not transferred to the resulting IFC file. After a complete analysis of the IFC specification, we found that the IFC format offers a global element named "IFCBUILDING" and this element is an "IFCPRODUCT". That means that any global parameter can be contained in an "IFCPROPERTYSET" and be linked to an "IFCBUILDING". This finding confirmed that the metadata can be present in an IFC file but currently Autodesk Revit does not support it.
This leads to the problem that data elements can be absent from an IFC file extract because of the specific implementation of the IFC file export feature in each BIM tool. The problem is compounded because the data elements exported also vary with the export options chosen by the user at the time of the export (Sabol, 2008). In conclusion, new analyses, such as automatically calculating the environmental impact of a whole building using only the BIM model information, would benefit from a specific user-defined standard format (Eastman et al., 2011) to consistently obtain all the data needed.

Extracting data from BIM
As we have seen, using an IFC file extract places some burden on the user and some on the software developer to design and implement a repeatable and stable process to extract the data needed to perform an automated LCA. The user will be required to manually enter any metadata missing in the BIM model using a specific and consistent syntax, and to select the right options within the IFC export function of the BIM tool. As for the software developer, even if the user performs this part of the process perfectly, the developer will still need to process each of the different data elements exported and calculate the quantity of material from 3D objects described by complex forms.
To avoid having the user enter data in the BIM model, as well as having a software developer perform the calculations on the extracted data, creating custom software that relies on APIs is, in our opinion, a better option than using an IFC file extract process.
Using an API allows for a flexible level of customization and can produce a specialized piece of software to automate the environmental impact assessment using data from the BIM or to automate some data entry. APIs are different for each BIM tool and are not at all standardized. Sabol (2008) suggests avoiding APIs because they tend to change from version to version as the BIM software evolves. This means that it is a good idea to minimize the dependency on a specific BIM tool API in order to be more flexible and less affected by version changes. Instead of integrating LCA software into the BIM tool, we chose to create a small custom application to export the data needed to a file and use it in external LCA software. The exported file is based on a custom format focused on LCA needs; we named it Intelligent Ecological Data (IED). With the data needs of the automated LCA previously identified, the first design step was to define the data schema (see Fig. 3). The resulting schema contains four entities: BIMProject, BIMObject, BIMObjectType and BIMMaterial.
The first version of the IED format was based on a comma-separated value format where the comma was replaced by a control character. This first attempt to establish a standard format for the LCA was evaluated and rejected by experts because it was judged to be too different from the standard IFC format.
To address this remark, the second version of the IED format shifted to XML syntax, to align with IFC and, more precisely, the ifcXML variant. The ultimate goal is for the IED format to become an ifcXML subset, but this must wait until the IFC format contains the missing information on the material. Another advantage of using XML is that it can be imported by several software packages, such as Microsoft Excel, which is used in several research studies (Peuportier, 1998; Shin and Cho, 2015). At this moment, only our experimental UBUBI software and Revit, with an add-in from UBUBI, use the IED format to its full extent to successfully perform an LCA of a whole building.
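As an illustration of what an XML file organized around the four entities might look like, the sketch below builds and reads back a small document with Python's standard xml.etree.ElementTree module. Only the four entity names come from the schema described above; every attribute name, the nested "Layer" element and all values are assumptions made for this example and do not reproduce the actual IED schema.

```python
import xml.etree.ElementTree as ET


def build_ied_like_document():
    """Build a tiny XML tree using the four entity names (attributes invented)."""
    project = ET.Element("BIMProject",
                         {"name": "Demo building", "lifetimeYears": "75"})
    ET.SubElement(project, "BIMMaterial", {"id": "m1", "name": "Concrete"})
    obj_type = ET.SubElement(
        project, "BIMObjectType",
        {"id": "t1", "name": "Generic wall", "uniformat": "B2010"})
    # Hypothetical helper element describing one material layer of the type.
    ET.SubElement(obj_type, "Layer",
                  {"materialRef": "m1", "thickness": "0.25",
                   "function": "Structure"})
    ET.SubElement(project, "BIMObject",
                  {"id": "o1", "typeRef": "t1", "area": "20.0"})
    return project


def material_volumes(project):
    """Sum area x thickness per material name across all objects."""
    names = {m.get("id"): m.get("name") for m in project.findall("BIMMaterial")}
    types = {t.get("id"): t for t in project.findall("BIMObjectType")}
    totals = {}
    for obj in project.findall("BIMObject"):
        area = float(obj.get("area"))
        for layer in types[obj.get("typeRef")].findall("Layer"):
            name = names[layer.get("materialRef")]
            totals[name] = totals.get(name, 0.0) + area * float(layer.get("thickness"))
    return totals


xml_text = ET.tostring(build_ied_like_document(), encoding="unicode")
parsed = ET.fromstring(xml_text)
volumes = material_volumes(parsed)
print(xml_text)
print(volumes)
```

Because the layer function and the project-level metadata are plain attributes here, the two pieces of information that could not be carried in the tested IFC exports have an obvious place in such a file.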

Conclusion
This paper presents a new framework to automate the LCA and a new file format to support it. The framework introduces uncertainty to fill the data gap induced by the level of development of the BIM model. The framework uses multiple strategies to evaluate the environmental impact of an element such as linking elements to market processes based on statistics. To support these strategies, multiple data are required from the BIM model and no standard format contains the data required. The best candidate was the IFC format but it lacks data for the materials used in assembly. To share the data needed to perform an automatic LCA, we created the IED format. This format is less flexible but much simpler than the IFC format. The new format describes all the objects, object types and material in a BIM model, allowing us to analyze the BIM model at any LOD.
In the future, we will need to reassess the IFC format when a new version is released to check whether it supports more information on BIM materials. In the meantime, we will continue to develop the IED format. The next step is to include the uncertainty factor for the quantities retrieved from the BIM model to provide a more accurate LCA.