...

Using AI Tools for Measuring Atmospheric Methane

TLDR: A dataset of simulated absorbance scans and a 1D convolutional neural net were used to train a model to predict methane mole fraction to within 0.25% of actual mole fraction. This provides the potential to rapidly improve data processing of TDLAS-based sensors.

1. Initial Absorbance Simulation

We start by simulating methane absorbance. Using the HITRAN API, we can download all existing lines for the wavenumber range of interest. I have a script and associated files available for download on my GitHub (davidvs-rwl). The Python script can be modified to simulate absorbance spectra for any of the molecules available in the HITRAN database.


First, download hapi.py from https://hitran.org/hapi/ and save it to your working directory. The latest simulation script (CH4_V3.py) generates a CSV file named in the format TTTP_P-LLLL-XXXXX.csv. TTT is the three-digit temperature; P_P is the pressure, with the decimal point replaced by an underscore; LLLL is the four-digit pathlength in cm, with leading zeros if necessary; and XXXXX is the mole fraction in ppm, with leading zeros if necessary.
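As a rough illustration of that naming scheme, here is a minimal sketch of a helper that builds such a file name. The function name `scan_filename` is hypothetical and is not taken from CH4_V3.py. I am also assuming that the temperature is a non-negative whole number and that the pressure is written with one decimal place; the actual script may differ.

```python
def scan_filename(temperature, pressure, pathlength_cm, mole_fraction_ppm):
    """Build a name of the form TTTP_P-LLLL-XXXXX.csv (hypothetical helper)."""
    ttt = f"{round(temperature):03d}"            # three-digit temperature
    p_p = f"{pressure:.1f}".replace(".", "_")    # assumed one decimal, '.' -> '_'
    llll = f"{round(pathlength_cm):04d}"         # pathlength in cm, zero padded
    xxxxx = f"{round(mole_fraction_ppm):05d}"    # mole fraction in ppm, zero padded
    return f"{ttt}{p_p}-{llll}-{xxxxx}.csv"


# Example with made-up inputs: a 30 m (3000 cm) path at 10,000 ppm
print(scan_filename(296, 1.0, 3000, 10000))
```

Zero padding keeps the names the same length, so the files sort in a predictable order and the fields can be recovered by position.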

2. Line Selection and Detector Simulation

For line selection purposes, I simply chose a region in the common methane absorbance bands that was free of spectral interference from the common atmospheric and combustion-related molecular species. Limiting the choice to isolated lines is typical for a scanned-direct absorption approach, but I intentionally did not do so. This led me to three lines near 2,281 nm. I should note that I confirmed with our laser manufacturer that the laser would be capable of scanning across all three peaks of the target lines in a single scan.

Next I simulated the detector signal that would be generated by a non-absorbing scan of the laser, i.e., the non-absorbing baseline, Io (see Figure 1). Triangular waveforms are typically used when implementing a scanned-direct absorption approach. The reason for this is that fitting a baseline intensity to a transmitted intensity signal, It, is usually easier for triangular or sawtooth waveforms than for sinusoids. Since the overarching goal of this effort is to reduce or eliminate manual fitting of baselines, a sinusoid was used instead, with the added benefit of reducing the abrupt changes in laser output wavenumber that are sometimes associated with triangular or sawtooth waveforms.

Figure 1 – Incident laser intensity, Io

Figure 2 shows the output wavenumber of the laser with respect to time. Until we have a real laser in hand to characterize the actual wavenumber vs time response, we simulate the change in wavenumber vs time as proportional to the change in output intensity of the laser. This is typical of most tunable diode lasers I have worked with in the past.

Figure 2 – Wavenumber vs Time
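The baseline and tuning model described above can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions: the function name `simulate_scan`, the scan frequency, the detector voltage levels, and the wavenumber limits are all placeholder values chosen so the scan spans the three target lines. None of them come from the original simulation script.

```python
import numpy as np


def simulate_scan(n_points=1000, scan_freq_hz=1000.0,
                  v_mean=2.0, v_amp=1.0,
                  nu_min=4384.0, nu_max=4385.8):
    """Return time, baseline Io, and wavenumber for one sinusoidal scan period.

    All default values are hypothetical placeholders.
    """
    # One full period of the scan, sampled uniformly in time
    t = np.linspace(0.0, 1.0 / scan_freq_hz, n_points, endpoint=False)
    # Non-absorbing baseline: a sinusoid riding on a DC offset (stays positive)
    io = v_mean + v_amp * np.sin(2.0 * np.pi * scan_freq_hz * t)
    # Wavenumber assumed proportional to intensity: map Io linearly onto
    # the interval [nu_min, nu_max]
    frac = (io - io.min()) / (io.max() - io.min())
    nu = nu_min + frac * (nu_max - nu_min)
    return t, io, nu


t, io, nu = simulate_scan()
print(nu.min(), nu.max())
```

Once a real laser is characterized, the linear mapping would be replaced by the measured wavenumber vs time response.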

Next we simulate transmitted intensity by using Beer’s Law, 𝛼 = -ln(It/Io), and solving for It, which gives It = Io·exp(-𝛼). Here 𝛼 represents absorbance. Figure 3 shows what the simulated scan would look like after calculating absorbance and plotting the signal vs wavenumber. For those of us who are quick to forget, wavenumber in cm-1 can be converted to wavelength in nm by using nm = 10^7/𝜈, where 𝜈 is wavenumber (shoutout to Dr. Marcel Nations for teaching me this quick conversion).

Figure 3 – Absorbance vs Wavelength (nm)
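For readers who prefer code to algebra, here is a minimal sketch of the two relations used in this step: Beer's Law rearranged to It = Io·exp(-𝛼), and the wavenumber-to-wavelength conversion nm = 10^7/𝜈 with 𝜈 in cm-1. The function names are my own and are not taken from the simulation script.

```python
import numpy as np


def transmitted_intensity(io, absorbance):
    """Beer's Law solved for It: It = Io * exp(-absorbance)."""
    return np.asarray(io) * np.exp(-np.asarray(absorbance))


def absorbance_from_intensities(it, io):
    """Beer's Law in its original form: absorbance = -ln(It / Io)."""
    return -np.log(np.asarray(it) / np.asarray(io))


def wavenumber_to_nm(nu_cm1):
    """Convert wavenumber in cm-1 to wavelength in nm."""
    return 1.0e7 / np.asarray(nu_cm1)


# The strongest of the three target transitions, expressed as a wavelength
print(wavenumber_to_nm(4384.3712))
```

The conversion puts the 4,384.3712 cm-1 transition at about 2,280.8 nm, consistent with the "near 2,281 nm" figure quoted earlier.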

A peak absorbance near 20 is quite absurd. So why target this region of methane absorbance? Apart from non-interference from common atmospheric or combustion gases (e.g., H2O, CO2, CO), these three particular transitions were chosen for their varying peak absorbance values. For example, at a 10,000 ppm (1%) methane mole fraction, the peak values of the 4,384.3712 cm-1, 4,384.8199 cm-1, and 4,385.3870 cm-1 transitions are approximately 19.7, 5.51, and 2.1, respectively. They span nearly an order of magnitude in peak absorbance, which offers the potential for a wide dynamic range in a methane sensor. Normally these absorbance values would be wildly high and impractical, most notably because so little light reaches the detector and the signal-to-noise ratio suffers.
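To put those peak values in perspective, Beer's Law gives the fraction of light that survives at each line center as It/Io = exp(-𝛼). The short calculation below is only an illustration using the three approximate peak absorbances quoted above. The variable names are mine.

```python
import math

# Approximate peak absorbance at 10,000 ppm, keyed by line center in cm-1
peak_absorbance = {4384.3712: 19.7, 4384.8199: 5.51, 4385.3870: 2.1}

# Fraction of the incident light that reaches the detector at each line center
transmitted_fraction = {nu: math.exp(-a) for nu, a in peak_absorbance.items()}

for nu, frac in transmitted_fraction.items():
    print(f"{nu:.4f} cm-1: It/Io = {frac:.3e}")
```

At the strongest line only a few parts per billion of the incident light get through, while the weakest line still passes roughly 12%. That contrast is why having all three lines in one scan may help cover both low and high mole fractions.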

This sensor development demonstration seeks to understand the limits of machine learning in absorbance scan recognition. Specifically, we aim to understand how well a 1D convolutional neural network can be trained on simulated detector scans, and how well the resulting predictor can take T, P, and a simulated detector scan as input and output an estimated mole fraction. These results will guide development efforts for use on a real methane sensor and ultimately provide guidelines for mole fraction detection requirements.


For reference, Figure 4 shows what the transmitted intensity, It, looks like as a detector signal in the presence of methane (10,000 ppm). The pathlength, L, is 30 m.

Figure 4 – Transmitted intensity as a voltage signal from the detector vs time.

Now that we have a simulated absorbance scan, we can iterate through a wide range of mole fractions to generate a dataset large enough to train a neural network that predicts the mole fraction for an unknown absorbance scan like the one shown in Figure 3. Because we were unsure how well a neural network would predict unknown scans, we decided to generate a dataset covering a wide range of mole fractions, from 1 ppm to 50,000 ppm (5%).

Temperature was varied between -20 and 50 °C, and pressure between 0.8 and 1.1 atm. The dataset is made up of 50,000 scans, each with a unique mole fraction value. For each scan, the temperature and pressure are sampled randomly from the stated ranges.
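One way to draw the conditions for each scan is sketched below. This is an assumption-laden illustration, not the original generation script: the function name `sample_scan_conditions` is hypothetical, I assume one scan per integer ppm value from 1 to 50,000, and I assume uniform sampling of temperature and pressure (the text above only says they are sampled randomly).

```python
import numpy as np


def sample_scan_conditions(n_scans=50_000, seed=0):
    """Return mole fraction (ppm), temperature (deg C), pressure (atm) per scan."""
    rng = np.random.default_rng(seed)
    # One scan per integer ppm value, in shuffled order, so every value is unique
    mole_fraction_ppm = rng.permutation(np.arange(1, n_scans + 1))
    # Assumed uniform sampling over the stated ranges
    temperature_c = rng.uniform(-20.0, 50.0, size=n_scans)
    pressure_atm = rng.uniform(0.8, 1.1, size=n_scans)
    return mole_fraction_ppm, temperature_c, pressure_atm


x_ppm, temp_c, p_atm = sample_scan_conditions()
print(x_ppm[:5], temp_c[:5], p_atm[:5])
```

Fixing the seed makes the set of conditions reproducible, which is handy when the dataset has to be regenerated several times.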

3. Data Generation and CNN Model Training

I chose 50,000 scans for the initial dataset so that it would have at least one representative scan for every ppm value between 1 and 50,000 ppm. I did not know in advance how long it would take to generate such a dataset, so I simply picked a number, and 50,000 seemed like a good start. On an Intel Core i5 XPS Linux machine, it took 83 min to generate the first dataset. I needed to run through this process a few times, so I decided to experiment with AWS resources to see if I could speed it up. I ended up using 32 CPUs (my AWS limit at the time) in an EC2 instance, and I can now generate the dataset in under 3 min.

Now that the dataset is generated, it can be used to train a model that will predict mole fraction based on an absorbance scan, temperature, and pressure. The Python tools I used for model training are TensorFlow/Keras and scikit-learn. For training purposes, the dataset is split into 70% training, 10% validation, and 20% test. During the training process, the model adjusts its internal convolutional filters and weights to minimize the mean squared error (MSE) between predicted and actual mole fraction values.
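The split and the loss are simple enough to sketch with NumPy alone. This is not the training code itself. The helper names `split_indices` and `mse` are hypothetical. scikit-learn's `train_test_split` can produce the same kind of partition in two calls, and Keras computes the MSE internally during training.

```python
import numpy as np


def split_indices(n_samples, train_frac=0.70, val_frac=0.10, seed=0):
    """Shuffle sample indices and cut them into train / validation / test."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    n_val = int(round(val_frac * n_samples))
    train = order[:n_train]
    val = order[n_train:n_train + n_val]
    test = order[n_train + n_val:]   # whatever remains (20% by default)
    return train, val, test


def mse(y_true, y_pred):
    """Mean squared error between actual and predicted mole fractions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


train_idx, val_idx, test_idx = split_indices(50_000)
print(len(train_idx), len(val_idx), len(test_idx))
```

For 50,000 scans this gives 35,000 training, 5,000 validation, and 10,000 test scans. The test portion is held back until training is finished, so it gives an honest estimate of performance on unseen scans.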

4. Results

To confirm the performance of the model in predicting mole fraction, I gave it an unknown absorbance scan along with the known T and P values for that particular scan. This is similar to what you would collect with a real TDLAS sensor: you know the ambient temperature and pressure, but you cannot immediately tell the mole fraction from the detector voltage scan alone. The unknown absorbance scan, with an actual value of 10,000 ppm, was predicted to be 9,975 ppm. That prediction is within 0.25% of the actual value (US Provisional Patent 63/887,353).
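The 0.25% figure is simply the relative error of that single prediction. A minimal check, with a helper name of my own choosing:

```python
def percent_error(predicted, actual):
    """Relative error of a prediction, expressed in percent."""
    return 100.0 * abs(predicted - actual) / actual


# Values quoted in the text: predicted 9,975 ppm against an actual 10,000 ppm
print(percent_error(9975, 10000))
```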

This is quite exciting given the speed of the prediction script, which runs in just a couple of seconds. For the sake of this exercise, I chose to use the absorbance scan for training and limit the spectroscopy to a scanned-direct absorption method to keep things relatively simple. Given the performance and simplicity of this approach, it could easily be translated and applied to many other species and TDLAS methods, e.g., wavelength modulation spectroscopy.

I’ll be exploring this method further in the near future with real laser and detector scans. Reach out to me with any questions and/or subscribe to our newsletter.
