Data mining is used in a lot of fields, but I could not find any applications in civil engineering. What are the applications of data mining in civil engineering?
Data sensing and analysis can help in monitoring the conditions of infrastructure both above and below the ground. Data from these sensors can be stored and data mining can be used to predict the health from the available data. Possible correction mechanisms can also be associated.
Traffic engineering
Data sensing, analysis and mining can be used to facilitate decision making and intelligent transportation systems.
You may apply data mining and machine learning to understand data collected from sensors. In civil engineering this is useful in applications including the regulation and control of electricity grids (which appears in the literature as Smart Grids) and the control and maintenance of water systems. In both systems, for example, you aim to collect readings from a sensor network distributed across the whole system and use them to build a probabilistic representation of consumption, demand, quality and status. With this model you can predict all those quantities and then use the predictions to control the system.
You may find theory on distributed data fusion, graphical models, and distributed control useful for such applications.
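A minimal sketch of the "build a probabilistic representation from sensor readings, then predict" idea, assuming the simplest possible model: one independent Gaussian per sensor. The sensor names, the readings and the threshold of 3 standard deviations are all made up for illustration; a real smart grid or water network would more likely use the graphical models and data fusion methods mentioned above.

```python
from statistics import mean, stdev

# Hypothetical readings per sensor (invented values, not real data).
HISTORY = {
    "sensor_a": [10.0, 12.0, 11.0, 9.0, 13.0],
    "sensor_b": [50.0, 52.0, 48.0, 51.0, 49.0],
}


def fit_models(history):
    """Fit one independent Gaussian (mean, sample std) per sensor."""
    return {name: (mean(vals), stdev(vals)) for name, vals in history.items()}


def z_score(models, name, value):
    """Distance of a new reading from the sensor's mean, in standard deviations."""
    mu, sigma = models[name]
    return (value - mu) / sigma


def is_unusual(models, name, value, threshold=3.0):
    """Flag a reading lying more than `threshold` std devs from the mean (assumed rule)."""
    return abs(z_score(models, name, value)) > threshold


MODELS = fit_models(HISTORY)
print(is_unusual(MODELS, "sensor_a", 11.0))  # a reading at the historical mean
print(is_unusual(MODELS, "sensor_a", 20.0))  # a reading far above the history
```

The fitted mean is the model's prediction for the next reading, and the flag is one simple way such a prediction could feed a control or maintenance decision.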
It all depends on how strong your data mining is. If it is based only on AI expert rules, then the applications are as in the books. I'm not a civil engineer; Tariq, on the other hand, mentions a number of advanced and exciting applications...
If your data mining is very good, you then have the freedom to use any data collection that deals with construction, planning, maintenance, accidents, or whatever else is within reach. Working with such a mix of data simply means asking: what phenomena exist there that can be formulated and used to improve the engineering, be it the use, quality, cost, business or other objectives that one may have?
If you have a good data mining tool, the data always tells you new, unexpected things, some of which were not asked about but rather popped out as discoveries.
Anyway, as an engineer it would be better to do what Tariq suggests: start with existing knowledge and see if the data falls in line with it or not.
You could study transportation aspects, such as patterns in the flow of people when they move from home to work. You can get data from public transportation companies, or maybe Google, and try to discover emerging patterns or problems in the transportation system.
Actually, applications in traffic are restricted by public companies. But in civil engineering and buildings, private companies work too. So I think buildings offer more opportunities for data mining work.
Vahid, quote: "I do not know the problems in this field about buildings."
Big data is funny in the way it turns our usual "linear" thinking upside down, the thinking in which we build a solution starting from the target at the top end.
What it offers instead is extending our options to choose a target that is worth digging in. It is confusing but worth exploring in my view.
My suggestion is to ask first "what information do I have that is easily accessible?".
Then list the many variables/attributes that are in the data and ask "what are the areas described in my data?".
The third question would be "what do professionals in these areas deem the hardest restrictions on their current solutions?" -- this, I think, is where the answer to your question starts.
After that, see "which restriction has the most information in the data"; this would be my choice if I were in your place.
You can use spatial data mining to find the best possible location for construction, or to find the best-quality soil or cement depending on certain attributes...
Dear Vahid Nouri, yes, data mining has many applications in the specializations of civil engineering. A few of them are transportation, rainfall prediction and finding extremes, remote sensing, ...
Here you have an example of a data set for predicting building energy efficiency: http://archive.ics.uci.edu/ml/datasets/Energy+efficiency
Data Set Information:
We perform energy analysis using 12 different building shapes simulated in Ecotect. The buildings differ with respect to the glazing area, the glazing area distribution, and the orientation, amongst other parameters. We simulate various settings as functions of the afore-mentioned characteristics to obtain 768 building shapes. The dataset comprises 768 samples and 8 features, aiming to predict two real valued responses. It can also be used as a multi-class classification problem if the response is rounded to the nearest integer.
An example in transportation geotechnics is shown in the publication by Professor Antonio Gomes Correia. If you check his publications you'll find other examples.
Come along to the 2nd International Conference on IT in Geo-Engineering
Dear Mario Gonzalez, I downloaded that data set. It has 12 columns. X1 to X8 show the features, but I cannot understand what Y1 and Y2 are. Also, which column shows the class? Could you help me, please?
There are some very important applications that you can think of:
- Complaint databases of municipalities with respect to malfunctioning infrastructure (e.g. flooding due to problems with urban drainage systems).
- Insurance databases, which contain a lot of information on damage caused by, e.g., failure of underground infrastructure (water mains, sewers and the like).
At TU Delft we are conducting several PhD projects in which data mining is an important tool.
Mario, energy conservation is an interesting application. Yet wouldn't it be even more interesting to collect real data and discover something from the greatest simulator of all - the real world?
What I do is use free data captured from related activities (say, maintenance and energy bills), which include many indirect variables, and use them all. My view is that INDIRECT VARIABLES ARE AS RELEVANT A PRIORI AS DIRECT ONES.
Vahid, the aim is to predict two real-valued responses, Y1: Heating Load and Y2: Cooling Load. That is, you have a regression problem. The authors and maintainers of the data set also suggest a multi-class classification problem if the response (Y1 or Y2) is rounded to the nearest integer (the classes). The aim is to use the eight features (X1 to X8) to predict each of the two responses: you use the X features to predict Y1, and likewise you can use the X features to predict Y2.
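To make the column layout concrete, here is a small Python sketch. It assumes each record has already been read into a list of ten numbers in the order X1..X8, Y1, Y2, with any other columns dropped. The row values are invented for illustration and are not taken from the data set, and rounding ties (x.5) upward is my own assumption about "nearest integer".

```python
import math


def split_row(row):
    """Split one record into the eight features and the two responses."""
    if len(row) != 10:
        raise ValueError("expected 8 features followed by Y1 and Y2")
    features = list(row[:8])                 # X1..X8
    heating_load, cooling_load = row[8], row[9]  # Y1, Y2
    return features, heating_load, cooling_load


def to_class(response):
    """Turn a real-valued response into a class label by rounding.

    Ties are rounded up (assumption); Python's built-in round() would
    round ties to the nearest even integer instead.
    """
    return math.floor(response + 0.5)


# Hypothetical record: X1..X8 followed by Y1 and Y2 (made-up values).
EXAMPLE_ROW = [0.9, 560.0, 300.0, 130.0, 7.0, 2, 0.1, 1, 15.4, 21.6]

features, y1, y2 = split_row(EXAMPLE_ROW)
y1_class = to_class(y1)  # class label for the heating load
y2_class = to_class(y2)  # class label for the cooling load
print(len(features), y1_class, y2_class)
```

For regression you would train on `features` against `y1` (or `y2`) directly; for classification you would train against `y1_class` (or `y2_class`) instead.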
Edith, your comment is very fruitful, thanks. I think it is very interesting to approach a problem from indirect variables, as you suggest. Googling, I got to http://en.wikipedia.org/wiki/Mediation_(statistics), which I find very appealing.
Mario, the Mediation term is interesting. Surely many of the indications that we see are only a small part of some larger phenomenon. It is similar to asking what the root cause is. I've just given a presentation at the local quality society about hidden patterns and how misleading correlations can be.
For example, suppose a specific machine is correlated with most manufacturing rejects; should we replace it? - Well, not necessarily. If the root cause is a failing purchasing procedure, it is much more effective to correct this procedure.
The difficult part is to discover the Mediation factor among an extensive number of combinations of possible variables.
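A toy illustration of the machine example, using counts I made up so that the effect is easy to see. It assumes the hidden factor (here, whether a part came from a bad purchasing batch) is already known, which is exactly the hard part in practice; the code only shows why the raw correlation is misleading once you stratify by that factor.

```python
from collections import defaultdict

# Hypothetical counts: (machine, batch quality, parts produced, parts rejected).
COUNTS = [
    ("A", "bad", 80, 40),
    ("A", "good", 20, 1),
    ("B", "bad", 20, 10),
    ("B", "good", 80, 4),
]


def reject_rates(counts, key):
    """Reject rate per group, where key(machine, batch) names the group."""
    parts = defaultdict(int)
    rejects = defaultdict(int)
    for machine, batch, n_parts, n_rejects in counts:
        group = key(machine, batch)
        parts[group] += n_parts
        rejects[group] += n_rejects
    return {group: rejects[group] / parts[group] for group in parts}


# Looking at machines only: machine A seems far worse than machine B.
by_machine = reject_rates(COUNTS, lambda machine, batch: machine)

# Stratified by batch quality: both machines have the same reject rate.
by_machine_and_batch = reject_rates(COUNTS, lambda machine, batch: (machine, batch))

print(by_machine)
print(by_machine_and_batch)
```

In these invented numbers machine A simply receives most of the bad batches, so replacing it would not reduce rejects, while fixing purchasing would.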
There are many applications for data mining in civil engineering, especially when you are dealing with large datasets. We have recently used the concept to develop decision trees for damage prediction in RC structures.
If you have enough data about some environment, you may discover through data mining relationships between certain variables (for instance soil/ground properties and type of structure) that may influence the design of the required structures to build.
If you were to decide what type of structure you want to put up in an already built environment, mining data related to the different activities in that urban area may help you decide what to invest in. For instance, data from retail shops in the area may point to the type of customer that may transit through the area over a certain length of time and may suggest that putting up a hotel would be good business.
Data mining and machine learning have endless applications in civil engineering, and in any other area really. The problem is how good the data you are mining is and how good the mining tools you are using are.
Dear Vahid Nouri, I am providing the link below for an article entitled "DATA MINING IN RESERVOIR OPERATION AND FLOOD CONTROL USING ARTIFICIAL NEURAL NETWORKS". It may be of interest to you.
Several data mining research works focus on diagnosis, using classification methods to find patterns in structural failures, deviations from accepted quality boundaries, or critical points.
But, as in any other area, data mining could be even more useful in civil engineering for uncovering unknown, hidden information. I mean there are several facts that won't be obvious until some data sheds light on them.
Think about answers to questions like: What's the weakest link of a structure? What's the least useful piece of a building? What are the most critical combinations for an electrical installation?
I do believe there are really a lot of applications! Try searching with a better i-net search program! BTW: If you are interested in environmental engineering bordering on civil eng., i.e. water treatment, flood prediction, drinking water supply optimization … you may just refer to papers in my library on RG … Cheers, Boris
Data mining is used to analyze a large amount of data to study their trends. The terminology is relatively new, but it has been used by various researchers and engineers for years without any reference to its trendy name! It can be used for performance evaluation of various civil engineering structures.
In addition to the applications mentioned above by the other commenters, it has been extensively used for performance evaluation of various paving materials, as well as pavement management and asset management as a whole. If you search "Data mining in pavement management", you will encounter many interesting applications of that. Below are a few examples of them:
Hi Nouri, please check my web page (in my web page you have links to the publications), I have several Data Mining applications in the Civil Engineering domain:
A.G. Correia, P. Cortez and J. Tinoco. Artificial Intelligence Applications in Transportations Geotechnics. In Geotechnical and Geological Engineering, Springer, 31(3):861-879, June 2013, ISSN 0960-3182.
J. Tinoco, A.G. Correia and P. Cortez. Application of Data Mining Techniques in the Estimation of the Uniaxial Compressive Strength of Jet Grouting Columns over Time. In Construction and Building Materials, Elsevier, 25(3):1257-1262, March 2011, ISSN 0950-0618.
T. Miranda, A.G. Correia, M.F. Santos, L.R. Sousa and P. Cortez. New Models for Strength and Deformability Parameters Calculation in Rock Masses using Data Mining Techniques. In International Journal of Geomechanics, ASCE, 11(1):44-58, January/February 2011, ISSN 1532-3641.
The applications are therefore numerous. As long as there is a dataset collected from multiple sources within civil engineering areas, you as the miner may determine what to apply and how to apply it.
Language: English ISBN-10: 3639207564 ISBN-13: 978-3639207569
Data alone are worth almost nothing. While data collection is increasing exponentially worldwide, a clear distinction between retrieving data and obtaining knowledge has to be made. Data are retrieved while measuring phenomena or gathering facts. Knowledge refers to data patterns and trends that are useful for decision making. Data interpretation creates a challenge that is particularly present in system identification, where thousands of models may explain a given set of measurements. Manually interpreting such data is not reliable. One solution is to use data mining. This book thus proposes an integration of techniques from data mining, a field of research where the aim is to find knowledge from data, into an existing multiple-model system identification methodology. In addition to providing information about the candidate model space, data mining is found to be a valuable tool for supporting decisions related to subsequent sensor placement.
An important use of data mining in civil engineering is the analysis of the wind potential of microsites with regard to the technical feasibility of wind farms.
Systematic measurement of the wind speed, using a scientific methodology, is the key to analysing the wind tendency.
The methodology basically employs a tall tower with two ANEMOMETERS (devices that record the average wind speed every 10 minutes, 24 hours a day) placed at two different heights, plus WIND VANES (equipment that identifies the wind direction) as well as thermometers and barometers. Operating the monitoring station for a minimum period of two years makes it possible to evaluate the wind energy potential of the site.
As mentioned before, data mining and machine learning can be applied to a wide range of topics. We at Microsoft apply them to software development and code bases. But the most important part is to understand the data in the first place. It sounds trivial, but it is not always that easy, although it is key. Why does the data look the way it does, and how should the results of recommendation or prediction models be interpreted? Is the data clean or is it biased? Do the entities represent the general population of entities or only a subset? If you know how representative your data is and how to interpret the raw data as well as the modelled or predicted data, you can apply data mining to almost anything.
At Microsoft, we had quite some success modeling dependency graphs (e.g. social networks) and organizational structure to predict and estimate their impact. How should we restructure the organization to reach goal X, or how does collaboration or missing communication influence quality? I guess the very same should hold for civil engineering as well.
As mentioned before, data science has tremendous application in infrastructure health monitoring. The future of IoT seems feasible as the capabilities of large-scale, distributed data mining grow.
I had the opportunity to work on street light infrastructure monitoring and it involved a good amount of data mining.
Article Urban Street Lighting Infrastructure Monitoring Using a Mobi...
I have worked in several Civil Engineering applications where Data Mining techniques provided interesting results. A few examples:
- Jet Grouting Column Diameter Prediction Based on a Data-Driven Approach, http://dx.doi.org/10.1080/19648189.2016.1194329;
- An evolutionary multi-objective optimization system for earthworks, http://authors.elsevier.com/sd/article/S0957417415002936
- Modelling tyre-road noise with data mining techniques, http://acoustics.ippt.gov.pl/index.php/aa/article/view/1483/pdf_127
- Artificial Intelligence Applications in Transportation Geotechnics, http://dx.doi.org/10.1007/s10706-012-9585-3
- Application of Data Mining Techniques in the Estimation of the Uniaxial Compressive Strength of Jet Grouting Columns over Time, http://dx.doi.org/10.1016/j.conbuildmat.2010.09.027
- New Models for Strength and Deformability Parameters Calculation in Rock Masses using Data Mining Techniques,
I have used data mining techniques to define Level of Service criteria for Indian roads. I used a large amount of traffic data from the city of Mumbai, collected using a speed probe vehicle and GPS.
I concur with all colleagues above. Given the growing number of structures, e.g., bridges, that are being instrumented with a wider range of sensors, and the increase in the number of WSSN's, big data combined with AI is becoming one of the most challenging and promising fields in SHM. These data come from a wide range of sensors or other sources. The questions are how to fuse them, process them, synchronize them, ..., then how to extract the necessary features, and then how to use the right algorithms to interpret and detect information from this data. I should emphasize that data mining in itself is one challenge (Big Data, and ever-increasing Big Data), but more important is the extraction of the information, which can be done only by algorithms that are based on AI. Moreover, the uncertainty associated with the data, whether due to the fusion process, noise, the reliability of sensors and sensor networks, etc., is another important and growing area of research.
I enjoyed this discussion very much. In the next 20 years the most important buildings, infrastructural systems and heritage buildings will be self-diagnosing systems that provide automatic warnings when a fault or damage occurs, allowing cost-effective maintenance and retrofitting strategies. This will be a key achievement in highly seismic regions, made possible by new developments in sensing networks, self-sensing materials, data analysis, IoT and machine learning. This will be the main field of scientific and technical development in civil structural engineering and earthquake engineering in the coming years.