Industrialization and Urbanization of Indian Society
In recent years, the industrialization and urbanization of Indian society have led to an increase in the concentration of pollutants in the atmosphere. Air pollution is a mixture of solid particles and gases in the air that has harmful and poisonous effects. Various experiments and studies have shown that long-term exposure to such pollution can lead to serious health issues, including aggravated cardiovascular and respiratory illness, accelerated aging of the lungs, diseases such as asthma, bronchitis and cancer, and a shortened life span.
According to the World Health Organization (WHO), over 12 million people die from environmental health risks annually. Air pollution has become the fourth-highest risk factor for premature death. This degradation in air quality has made air pollution a serious global threat to the sustainability of mankind, especially in developing countries, and has drawn the attention of the public as well as government agencies.
An air quality index (AQI) is a parameter used by government agencies to communicate to the public how polluted the air currently is and how polluted it is forecast to become. As the AQI of a region increases, an increasingly large percentage of its population will experience adverse health effects. Several projects have been launched in major countries worldwide to combat air pollution.
Examples include (1) the Hebei Air Pollution Prevention and Control Program (HAP, 2016-18) in China, which aims to reduce the emissions of specific pollutants in Hebei, and (2) the Odd-Even Scheme implemented by the Indian Government in the national capital, Delhi (2016). Efforts to reduce air pollution continue all around the world.
As a contribution to machine-learning-based air quality forecasting, this paper presents the approach and algorithmic details of several statistical models applied to this challenging problem. The machine learning models used in this paper to predict pollutant concentrations include:
Linear Regression
Logistic Regression
Polynomial Regression
Random Forest Classification
Decision Tree Regression
Decision Tree Classification
Support Vector Regression
Support Vector Classification
KNN Classification
We target our air pollution forecast at the city of Delhi, India, as it is at the forefront of the battle against air pollution. We focus on predicting the Air Quality Index (AQI) level of Delhi, as it is a quantitative measure of the air pollution level. To help reduce pollution levels in Delhi, we analyze 5 pollutants and 5 other environmental parameters responsible for the increase in AQI levels. Fixed-station data is taken from 3 stations: NSIT (Dwarka), RK Puram and Shadipur. The objectives of this study are to:
Compare the Air Quality Index (AQI) values obtained by different regression models and propose the best model.
Classify the dataset into 5 AQI categories, and then use classification models to forecast the pollution category for the next month.
Use Back Propagation to analyze the most prominent pollutant responsible for air pollution, and suggest methods to control it.
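The first two objectives can be illustrated with a minimal sketch. It is not the implementation used in this study: the data is synthetic (one made-up pollutant feature scaled to [0, 1]), the least-squares fit is written in plain Python rather than taken from a library, and the names `fit_polynomial`, `aqi_category`, `AQI_EDGES` and `AQI_LABELS`, along with the bin edges and labels themselves, are hypothetical placeholders. The sketch compares a linear and a quadratic regression by root-mean-square error (RMSE) on held-out data, then bins AQI values into 5 categories.

```python
import bisect
import random


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c[0..degree] so that y ~ sum(c[i] * x**i).
    """
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def predict(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


# Hypothetical bin edges and labels for 5 AQI categories (assumption:
# replace with the category table actually used for the dataset).
AQI_EDGES = [100, 200, 300, 400]
AQI_LABELS = ["Good", "Moderate", "Poor", "Very Poor", "Severe"]


def aqi_category(aqi):
    """Map a numeric AQI value to one of the 5 illustrative categories."""
    return AQI_LABELS[bisect.bisect_right(AQI_EDGES, aqi)]


# Synthetic data: AQI depends non-linearly on one scaled pollutant feature.
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [50 + 100 * x + 200 * x * x + rng.gauss(0, 5) for x in xs]
train_x, test_x = xs[:150], xs[150:]
train_y, test_y = ys[:150], ys[150:]

# Objective 1: compare regression models by held-out RMSE and pick the best.
results = {}
for name, degree in [("linear", 1), ("quadratic", 2)]:
    coeffs = fit_polynomial(train_x, train_y, degree)
    preds = [predict(coeffs, x) for x in test_x]
    results[name] = rmse(test_y, preds)
best = min(results, key=results.get)
print(results, "best:", best)

# Objective 2: turn numeric AQI values into category labels for classification.
print([aqi_category(v) for v in (50, 150, 450)])
```

Scoring every model on the same held-out split keeps the comparison fair. In practice the split for time-ordered station data would likely be chronological rather than the arbitrary one used here.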
The rest of this paper is organized as follows:
Section II describes related work. Section III provides background on data sources and participatory sensing systems, and details the 5 regression and 5 classification models used in this study.
Section IV describes the steps in our model, while model implementation and estimation accuracy are studied in Section V. The paper concludes in Section VI.
