India Air Quality Analysis


Description

This project performs exploratory data analysis on India’s Air Quality dataset to understand patterns and trends. It also uses ensemble models (random forests and boosting variants) and LSTM and CNN architectures to forecast future air quality levels. The aim is to contribute to the understanding and management of air pollution in India.

Predictions

Predictions Ensemble

Exploratory Data Analysis

First, I grouped the data at several frequencies (daily, monthly, yearly) to identify possible trends:

Particulate Matter

Nitrogen Compounds
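
The grouping step can be sketched roughly as follows. This is a minimal illustration with pandas, not the project's actual code: the hourly readings are synthetic, and the column names `PM2.5` and `NO2` are assumptions chosen to match the two figures above.

```python
import numpy as np
import pandas as pd

# Synthetic hourly readings for 2019-2020 (assumption: the real dataset
# has a timestamp column and one column per pollutant).
rng = np.random.default_rng(0)
n_hours = (365 + 366) * 24
index = pd.to_datetime("2019-01-01") + pd.to_timedelta(np.arange(n_hours), unit="h")
df = pd.DataFrame(
    {
        "PM2.5": rng.gamma(shape=4.0, scale=15.0, size=n_hours),  # hypothetical column
        "NO2": rng.gamma(shape=3.0, scale=10.0, size=n_hours),    # hypothetical column
    },
    index=index,
)

# Average the readings at each frequency; "MS" and "YS" label each
# bucket by the start of its month or year.
daily = df.resample("D").mean()
monthly = df.resample("MS").mean()
yearly = df.resample("YS").mean()

print(monthly.head())
```

Plotting `daily`, `monthly`, and `yearly` side by side makes it easier to separate short-term noise from seasonal and long-term trends.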

Next, I plotted the pairwise relationships between features to get a clearer picture of how the variables relate to each other:

Pair Plot

Finally, a correlation matrix makes it easy to see the degree of correlation between the variables.

Correlation
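
As a rough sketch of how such a matrix can be computed, the snippet below uses pandas' `DataFrame.corr` on synthetic data. The column names and the built-in dependence between `PM2.5` and `PM10` are assumptions made for illustration, not results from the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic pollutant readings (assumption): PM10 is constructed to
# depend on PM2.5, while NO2 is generated independently.
rng = np.random.default_rng(1)
n = 500
pm25 = rng.normal(60, 15, n)
df = pd.DataFrame(
    {
        "PM2.5": pm25,
        "PM10": 1.5 * pm25 + rng.normal(0, 5, n),
        "NO2": rng.normal(30, 8, n),
    }
)

# Pearson correlation between every pair of columns.
corr = df.corr(method="pearson")

# Keep only the upper triangle so each pair appears once, then rank
# the pairs by absolute correlation.
upper = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(upper).stack().dropna()
ranked = pairs.reindex(pairs.abs().sort_values(ascending=False).index)

print(corr.round(2))
print(ranked.round(2))
```

The `corr` frame is what a heatmap would display; `ranked` is a compact way to read off the strongest relationships.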

Time Series Forecasting

Ensemble Models

  • Random Forest
  • Gradient Boosting
  • AdaBoost
  • Histogram Gradient Boosting
  • XGBoost

Predictions Ensemble

Deep Learning Models

  • Long short-term memory (LSTM)
  • Stacked LSTM
  • Bidirectional LSTM
  • Bidirectional Stacked LSTM
  • 1D Convolutional Neural Network (CNN)
  • 1D Convolutional LSTM

Predictions DNN
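
Recurrent and 1D convolutional layers generally expect input shaped as (samples, time steps, features), so the series has to be cut into overlapping windows before training. The following is a minimal NumPy sketch of that step only; the helper `make_windows` and the window length are hypothetical, and the source does not say which deep learning framework or window size was used.

```python
import numpy as np


def make_windows(values, n_steps):
    """Hypothetical helper: slice a series into (samples, n_steps, features)
    windows, each paired with the value that immediately follows it."""
    values = np.asarray(values, dtype=np.float32)
    if values.ndim == 1:
        values = values[:, None]  # treat a 1D series as a single feature
    n_samples = len(values) - n_steps
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window length")
    X = np.stack([values[i : i + n_steps] for i in range(n_samples)])
    y = values[n_steps:, 0]  # next-step value of the first feature
    return X, y


# Toy series 0..9 with a window of 3 time steps.
series = np.arange(10)
X, y = make_windows(series, n_steps=3)

print(X.shape, y.shape)  # (7, 3, 1) (7,)
print(X[0].ravel(), y[0])  # [0. 1. 2.] 3.0
```

The resulting `X` can be fed to any of the architectures listed above; only the layers that consume it differ between them.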