Hello everybody,
I am currently working on a PhD project for a car manufacturing company. The goal is to build a predictive maintenance application for the machines that fill the air conditioning circuits of vehicles. Each cycle consists of two phases that check the circuit, followed by a last phase in which the refrigerant gas is charged. In the first phase the circuit is pressurized to detect leaks from the inside to the outside; in the second phase a vacuum is drawn on the circuit to detect leaks from the outside to the inside; finally, if no leak is detected, the circuit is filled. As for the data collected, several pressure readings are taken inside the circuit during each of the first two phases, while the charging phase records only the amount of gas charged:
- First phase (pressurization): A total of three pressure readings are taken at different times (pressurization, stabilization and control).
- Second phase (vacuum): A total of four pressure readings are taken at different times (release of the circuit pressure to atmospheric pressure, vacuum, vacuum stabilization and control).
- Third phase (charge): Grams of gas charged in the circuit.
The attached FillingCurve.png file shows the typical theoretical curve of a filling cycle with the three phases described above, and the attached SampleDataTable.png table shows a small sample of the data.
The objective I have been given is to model and monitor these variables so that, following a predictive maintenance strategy, their trends can be predicted and anomalies detected in real time, making it possible to anticipate failures in the machines' pressurization or in the vacuum pump. Regarding the cycle results, only those NOKs attributable to the filling console have been kept; cycles that were NOK because the vehicle circuit itself was defective (leaks, bad connections, etc.) have been discarded. In any case, the factory does not fully trust the assigned NOK labels, so it may be better to consider only the OK samples.
As far as I understand, these data constitute time series, a completely new field for me. I have some experience with supervised and unsupervised classification problems using classical machine learning algorithms, and with computer vision using deep learning, but none with time series. One of the problems I have encountered is that the classical techniques for this kind of data, such as ARIMA and its variants, assume equally spaced observations. My data do not satisfy this assumption because of the industrial context they come from: the machine is not filling continuously, and there are line stops, breaks, vacations, maintenance stops, etc.
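To make the irregular spacing concrete, here is a minimal sketch of how I look at the gaps between consecutive cycles. It assumes pandas; the timestamps are made up for illustration, and the column names (`timestamp`, `gap_s`, `long_gap`) and the "10 times the median" threshold are my own arbitrary choices, not something taken from the real data.

```python
import pandas as pd

# Hypothetical cycle start times (illustrative only, not real machine data).
cycles = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2023-03-06 08:00:10",
        "2023-03-06 08:01:25",
        "2023-03-06 08:02:31",
        "2023-03-06 08:25:02",  # after a short line stop
        "2023-03-06 08:26:15",
        "2023-03-07 06:00:40",  # after an overnight stop
    ])
})

cycles = cycles.sort_values("timestamp").reset_index(drop=True)

# Seconds elapsed since the previous cycle (NaN for the first cycle).
cycles["gap_s"] = cycles["timestamp"].diff().dt.total_seconds()

# Flag gaps much longer than the typical cycle-to-cycle interval.
# The factor 10 is an arbitrary assumption for this sketch.
typical_gap = cycles["gap_s"].median()
cycles["long_gap"] = cycles["gap_s"] > 10 * typical_gap

print(cycles)
```

Within a production run the gaps are roughly one cycle time, but the stops introduce gaps that are orders of magnitude longer, which is why I do not think the series can be treated as equally spaced in clock time.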
Can anyone point me in the right direction? Does anyone know which techniques can be applied to this type of time series? I would appreciate any kind of help, idea or suggestion. I thought this would not be so complicated and that time series modeling was a very mature field, but the truth is that I am quite lost.
I believe that in order to apply the classical techniques, one option would be to summarize the data in new time intervals (hourly, for example), although this is not an alternative with which I feel very comfortable.
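As a minimal sketch of what I mean by hourly aggregation, assuming pandas: the values below are invented for illustration, and `control_pressure` is a hypothetical column name standing in for any of the pressure readings.

```python
import pandas as pd

# Hypothetical per-cycle readings indexed by cycle time (illustrative only).
cycles = pd.DataFrame(
    {"control_pressure": [10.1, 10.3, 9.9, 10.0, 10.2]},
    index=pd.to_datetime([
        "2023-03-06 08:05",
        "2023-03-06 08:40",
        "2023-03-06 09:10",
        "2023-03-06 11:20",  # no cycles between 10:00 and 11:00
        "2023-03-06 11:50",
    ]),
)

# Aggregate into fixed 60-minute bins: mean reading and number of cycles per bin.
hourly = cycles["control_pressure"].resample("60min").agg(["mean", "count"])

print(hourly)
```

This gives an equally spaced series, but the hours with no cycles end up with a missing mean and a count of zero, and each hourly mean is based on a different number of cycles. Both would have to be handled somehow (imputation, dropping bins, weighting), which is part of why I am not comfortable with this option.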
Thank you so much in advance.