...

Modeling Fine-grained spatio-temporal pollution maps with low-cost sensors

TLDR

Air quality data was collected in Delhi, India over 2 years using a combination of low-cost (Kaiterra) sensors and existing government sensors. Using a neural-network-based model, researchers were able to predict pollution levels up to one hour ahead to within about 10% error.

Abbreviations

  • RMSE: Root Mean Square Error
  • MAPE: Mean Absolute Percentage Error
  • MPRNN: Message Passing Recurrent Neural Network

Motivation and Approach

Pollution prediction in densely populated cities is critical for guiding public policy and issuing health warnings. Achieving the granularity needed for useful guidance can be costly with reference-grade air quality monitoring systems. In this study, researchers explore using noisy low-cost sensors to augment reference-grade systems and to take advantage of a larger monitoring network.

The two major aims of the study are:

  1. Use a network of 28 low-cost portable sensors (Kaiterra) over a 1,480 sq km area in south Delhi to build a detailed heatmap that provides accurate pollution predictions
  2. Determine whether existing local-government monitoring efforts are improved by augmentation with low-cost sensor data. It turns out they are.

Researchers collected PM2.5 data over 2 years and were able to track seasonal changes. A message passing recurrent neural network (MPRNN) is used to model pollution at any point in Delhi. The MPRNN also allows individual sensors to be modeled.
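To give a feel for what "message passing" between sensors means, here is a toy sketch of one recurrent update over a small sensor graph. It is not the architecture from the study: the scalar states, the fixed weights (w_self, w_msg, w_in), the sensor names, and the neighbor graph are all assumptions made for illustration. A real model would learn its weights and use vector-valued states.

```python
import math

def message_passing_step(hidden, neighbors, reading,
                         w_self=0.5, w_msg=0.3, w_in=0.2):
    """One recurrent update: each sensor mixes its own previous state,
    the mean of its neighbors' previous states, and its new reading."""
    updated = {}
    for sensor, state in hidden.items():
        nbrs = neighbors.get(sensor, [])
        message = sum(hidden[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
        updated[sensor] = math.tanh(
            w_self * state + w_msg * message + w_in * reading[sensor]
        )
    return updated

# Hypothetical chain of three sensors: A - B - C
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
hidden = {"A": 0.0, "B": 0.0, "C": 0.0}

# Only sensor A sees a (normalized) pollution spike in the first hour.
spike = {"A": 1.0, "B": 0.0, "C": 0.0}
quiet = {"A": 0.0, "B": 0.0, "C": 0.0}

step1 = message_passing_step(hidden, neighbors, spike)
step2 = message_passing_step(step1, neighbors, quiet)
step3 = message_passing_step(step2, neighbors, quiet)

for name, states in (("step1", step1), ("step2", step2), ("step3", step3)):
    print(name, {s: round(v, 4) for s, v in states.items()})
```

In this toy setup the spike at A reaches B after the second step and C after the third, which is the basic idea: information from one sensor spreads to its neighbors over successive time steps.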

They achieve <10% root mean square error (RMSE) when predicting pollution levels up to one hour in advance, compared with the 30% RMSE of baseline modeling approaches. They also report a mean absolute percentage error (MAPE) of 10% or less. Additionally, they were able to augment the high-quality government monitoring stations and create a sensor network with 60 sensors spread across 700 sq km of Delhi.
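For readers who want to see how these two metrics are computed, here is a minimal sketch. The sample readings are made up, and expressing RMSE as a percentage by dividing by the mean observed value is my assumption; this summary does not state which normalization the study used.

```python
import math

def rmse(actual, predicted):
    """Root mean square error, in the same units as the data."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def rmse_percent(actual, predicted):
    """RMSE as a percentage of the mean observed value (assumed normalization)."""
    mean_actual = sum(actual) / len(actual)
    return 100.0 * rmse(actual, predicted) / mean_actual

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no observed value is zero."""
    n = len(actual)
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n

# Made-up hourly PM2.5 readings (ug/m3) and predictions, for illustration only.
observed = [100.0, 200.0, 150.0, 50.0]
forecast = [110.0, 180.0, 150.0, 55.0]

print(f"RMSE:   {rmse(observed, forecast):.2f}")
print(f"RMSE %: {rmse_percent(observed, forecast):.1f}")
print(f"MAPE %: {mape(observed, forecast):.1f}")
```

RMSE squares each error before averaging, so it penalizes a few large misses more heavily than MAPE does. That is one reason the two numbers can differ for the same model.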

Data Collection

  • PM2.5 data averaged to the hour
  • 28 Kaiterra sensors
  • 32 government monitors
  • Collected over 24 months, 2018/05/01 to 2020/05/01
  • 75% of data was used for training, up to 2019/10/30
  • Remaining 25% of data was used for testing
  • RMSE and MAPE are reported
  • Model is evaluated on the two separate networks of 28 and 32, then on the combined network
  • Predictions are then compared with measurements
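The hourly averaging and the time-ordered 75/25 split described above could look roughly like the sketch below. The function names, the 15-minute sampling interval, and the sample values are assumptions for illustration, not details from the study.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def hourly_means(readings):
    """Average raw (timestamp, value) readings into one value per hour."""
    buckets = defaultdict(list)
    for ts, value in readings:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    return sorted((hour, sum(v) / len(v)) for hour, v in buckets.items())

def chronological_split(records, train_fraction=0.75):
    """Split time-stamped records by time, so that training data always
    precedes test data (no random shuffling)."""
    ordered = sorted(records, key=lambda r: r[0])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]

# Hypothetical raw readings every 15 minutes over 8 hours.
start = datetime(2018, 5, 1)
raw = [(start + timedelta(minutes=15 * i), float(i)) for i in range(32)]

hourly = hourly_means(raw)
train, test = chronological_split(hourly)

print(len(hourly), len(train), len(test))
print(train[-1][0], "<", test[0][0])
```

Splitting by time rather than at random matters for forecasting: a random split would let the model train on hours that come after the ones it is tested on.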

The model with the best results was the per-sensor spline with spatio-temporal hierarchical model (STHM) imputation + MPRNN. It yielded RMSE and MAPE values of 10.1% and 9.6%, respectively.

Key Findings

  • Air quality could be predicted to within roughly 10% error
  • Per-sensor spline with STHM imputation + MPRNN provided the best RMSE and MAPE, 10.1% and 9.6%, respectively
  • Using an average spline only increases RMSE and MAPE by 1.1% and 0.7%, respectively
  • Prediction error drops as sensors are added and levels off at about 10% with 30 sensors
  • Adding sensors beyond that provides little further improvement in predictive performance
  • RMSE is lower than the observed variance in the Kaiterra PM2.5 data (the actual sensor variance is not reported)

Opinion

Perhaps if Kaiterra were to implement the neural network model and allow some level of crowdsourcing for data, then I could see local governments or institutions actually using these sorts of results to provide guidance to local communities. In California, where I'm based, I could see the crowdsourcing aspect being helpful during fire season. I don't see how the predictive aspect would be helpful for predicting fires without more complex inputs to the model.

In the study, researchers found that pollution levels exceeded local and World Health Organization air quality standards on 371 of the 641 days for which data were collected. That's 58% of the days monitored. I could see these types of data sets being used to bring attention to persistent air quality problems in historically marginalized communities.
