MapKnitter Aerial Image labelling to detect Environmental issues using machine vision.

About me

Name: Saurabh Dubey

Affiliation: JIIT Noida

Place: Noida

Why?

MapKnitter is a visionary project, and integrating machine vision into it could make it revolutionary.

There has never been a more important time to observe human activities and their impact on the Earth. We can use MapKnitter as a tool to understand and evaluate how much damage we have done to the ecosystem. Sooner or later, we will have to recognise that the Earth has rights too, including the right to live without pollution.

We can help environmentalists, journalists, citizen scientists, humanitarian agencies, social justice activists, archaeologists, and other researchers to discover such patterns using MapKnitter. With its help, they will be able to identify, characterize, and track impacts that have not been detected earlier, and mobilize to make significant changes.

The MapKnitter project will democratize aerial image intelligence. By providing a means for interested parties to quickly and easily scan very large sets of aerial imagery for specific environmental issues, we hope to give people a tool for doing their own research and for holding big organizations accountable for damage to the Earth that would otherwise remain in the dark.

For example, companies like Hunt Oil commit grave violations against the ecological balance by risking the destruction of rainforest, threatening its indigenous peoples, and endangering rare species on the coast, as was evident in the mining operations. Cases like this highlight the need for private citizens to step up and take charge.

This is one of the driving points for the project because of the unique convenience aerial imagery provides in such cases. When mapping the damage done to an area as big and harsh as the Amazon rainforest, ground teams prove severely inadequate. This is where aerial imagery shines, thanks to its convenience and sheer scope.

It is high time for humans to introspect, and one line sums it up:

If you really think the economic growth is more important than the environment, then try holding your breath while you count your money.

This project has the potential to outshine all the environmental reconnaissance we have done to date by showing people how their lives affect the planet and what their role in it is, while empowering everyone to be the change they want to see. There are many important stories about the exploitation of the Earth, and with the help of aerial imagery and computer vision we are trying to make such stories discoverable.

How?

Implementing the machine learning model in simple steps:

Collect pairs of images and labels (data).
Write a program that predicts labels for given images (model).
Test the model against our performance metric (testing).

Real World Objectives/Goals and Constraints:

Predict as many labels correctly as possible.
Ensure the accuracy of the model is high, even if training takes a little more time.
The cost of errors is a bad user experience.

Flowchart

So how will we get the dataset and turn it into a training set?

A deep convolutional neural network (DCNN) needs to be trained on a set of tagged images, so the most important part is the data. A larger and more accurately labelled sample will lead to better results.

OpenStreetMap is an astonishing crowdsourced project with a huge amount of data. It is an open-source organization, so we can use its data, which generously categorizes large parts of the world with its Nominatim taxonomy. We will train our DCNN using 466 of the Nominatim categories (such as "airport", "marsh", "gas station", "prison", "monument", "church", etc.), with approximately 1000 satellite images per category. Without OpenStreetMap it would be almost impossible for us to build and train our model. The tagged data is available at OpenStreetMap_tag; it is stored in SQLite format and can be downloaded from this link: download
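Since the tagged data ships as an SQLite file, a first sanity check could be to count how many examples each category has. The sketch below is only an illustration: the table name `tagged_points` and its columns are hypothetical (the real schema of the download is not described here), and an in-memory database stands in for the downloaded file.

```python
import sqlite3


def count_per_category(conn, table="tagged_points"):
    """Return {category: row_count} for a table that has a `category` column.

    The table and column names are assumptions made for this sketch.
    """
    # The table name is interpolated into the SQL, so only allow identifiers.
    if not table.isidentifier():
        raise ValueError("unsafe table name")
    cursor = conn.execute(
        f"SELECT category, COUNT(*) FROM {table} "
        "GROUP BY category ORDER BY category"
    )
    return dict(cursor.fetchall())


# Demo: an in-memory database standing in for the downloaded SQLite file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tagged_points (category TEXT, lat REAL, lon REAL)")
conn.executemany(
    "INSERT INTO tagged_points VALUES (?, ?, ?)",
    [
        ("airport", 28.56, 77.10),
        ("marsh", 29.90, -90.10),
        ("airport", 40.64, -73.78),
    ],
)
counts = count_per_category(conn)
print(counts)
```

Counts like these would make it easy to spot categories that fall well short of the roughly 1000 examples per category we are aiming for.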

To show what we get after downloading the data, these statistics illustrate its diversity and power.

An overview of how the images were tagged will eventually help us during data preprocessing.

So, we have the data we need. Now we have to preprocess it so that it can be fed to our deep convolutional neural network (DCNN) model.

Data preprocessing

Steps needed to accomplish this are:

Find many lists of categories.
Generate a list of our 1000 categories.
Enter these into some sort of regular format/database/spreadsheet/csv/tsv/key-value store.
Extract/Scrape/Download many examples of each category as either address or lat/lng.
Enter these into our regular format/database/spreadsheet/csv/tsv/key-value store.
Convert each of these that is an address into lat/lng.
Convert each of these lat/lng pairs into a URL for an image.
Scrape/API/Download all of the images.
Batch-process the URLs into the correct folders (with the correct names).
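The step that turns a lat/lng pair into an image URL can be sketched with the standard "slippy map" tile arithmetic used by web map tile servers. This is a minimal sketch under assumptions: the URL template below points at a placeholder host (`tiles.example.org`), not a real imagery provider, and the zoom level would have to be chosen to match the imagery we end up using.

```python
import math


def latlng_to_tile(lat, lng, zoom):
    """Convert WGS84 lat/lng to slippy-map tile indices (x, y) at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lng + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge cases (lng = 180, extreme latitudes) stay in range.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_url(lat, lng, zoom,
             template="https://tiles.example.org/{z}/{x}/{y}.png"):
    """Build a tile URL; the default template is a hypothetical placeholder."""
    x, y = latlng_to_tile(lat, lng, zoom)
    return template.format(z=zoom, x=x, y=y)


print(tile_url(0.0, 0.0, 1))
```

Keeping the template as a parameter means the same function can be pointed at whichever tile source is finally selected.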

Which convolutional neural network will we use, and how will we train it?

Step 1: Selecting the appropriate model: ResNet

Residual networks (ResNets) were a major breakthrough for CNNs because deep feed-forward convolutional networks tend to suffer from optimization difficulties. Beyond a certain depth, adding extra layers results in higher training error and higher validation error, even when batch normalization is used.

As the image above shows, error rises as the number of layers increases. The authors of the ResNet paper argue that this underfitting is unlikely to be caused by vanishing gradients, since the difficulty occurs even with batch-normalized networks. The residual network architecture addresses it by adding shortcut connections whose outputs are summed with the output of the convolution layers.
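To make the shortcut idea concrete, here is a toy residual block in plain Python. It is a simplification for illustration only: small dense (matrix) layers stand in for the convolution layers of a real ResNet, batch normalization is omitted, and all function names are made up for this sketch. The block computes y = relu(F(x) + x), so when the learned part F contributes nothing, the input passes through unchanged.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]


def linear(v, weights):
    """Multiply vector v by a square weight matrix given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU between."""
    f = linear(relu(linear(x, w1)), w2)
    # Shortcut connection: add the block's input to the output of F.
    return relu([fi + xi for fi, xi in zip(f, x)])


zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
# With all-zero weights F(x) = 0, so the block reduces to the identity here.
y = residual_block(x, zeros, zeros)
print(y)
```

This is the property that lets extra layers be added without forcing them to relearn what earlier layers already pass along.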

Why does ResNet work so well?

ResNet works well for deep CNNs: as the graph shows, error decreases as the number of layers increases.
It allows training models with hundreds of layers for greater accuracy.
Each layer has less work to do (no copying).
Adding a new layer should not hurt previous performance, because the shortcut connection lets the network skip over a layer that does not help.
If new layers are useful, performance can increase; otherwise it should not decrease.

Step 2: Transfer learning using a pretrained ResNet-34 model

Transfer learning is a very common trick in ML: instead of training a model from random initialization, we initialize it with the parameters of a similar model that has already been trained on a different dataset, which gives us a great head start. Simply put, a pre-trained model is a model created by someone else to solve a similar problem. Instead of building a model from scratch, you use the model trained on the other problem as a starting point.

For example, suppose you want to build a self-driving car. You can spend years building a decent image recognition algorithm from scratch, or you can take the Inception model (a pre-trained model) from Google, which was built on ImageNet data, to identify objects in images. A pre-trained model may not be 100% accurate in your application, but it saves the huge effort required to reinvent the wheel.

Where do we get the pretrained ResNet-34 model? Terrapattern also uses a pretrained ResNet-34 model, which we can find at this GitHub link and download easily from download_link; it is working very well for Terrapattern. Our pretrained model is trained on the ImageNet dataset. In ImageNet, image categories are called "synsets". In our dataset, each category folder will contain 1000 satellite photographs of the entity described by the tag. In terms of resolution and format, for reference, ImageNet is distributed as JPGs averaging 482x415 pixels, with the full dataset of 1.2M images coming to 138GB. This is about 4.3 bits per pixel.

Why a pretrained ResNet-34? As the image below shows, ResNet-34 has a higher error rate than ResNet-50. So why not use ResNet-50?

The answer is that ResNet-34 takes much less time to train and needs much less computational power than the deeper ResNet variants.

Step 3: Fine-tuning on our dataset

What is fine-tuning a network?

Fine-tuning is the process of taking an existing network (ResNet-34) that is already trained on a larger dataset such as ImageNet (1.2M labelled images) and continuing to train it (i.e. running back-propagation) on the OpenStreetMap dataset we have. Provided that our dataset is not drastically different in context from the original dataset (e.g. ImageNet), the pre-trained model will already have learned features that are relevant to our own classification problem.

Why fine-tune a network?

When we are given a deep learning task, say, one that involves training a convolutional neural network (ConvNet) on a dataset of images, our first instinct would be to train the network from scratch. In practice, however, deep neural networks like ConvNets have a huge number of parameters, often in the range of millions, so training a model takes a lot of time. Hence, we should always try to find an existing trained neural network that accomplishes a task similar to the one we are trying to tackle, and then reuse the lower layers of that network: this is called transfer learning. It will not only speed up training considerably but will also require much less training data.

Secondly, fine-tuning eases the need for large computational resources. Even if we have a lot of data, training generally needs many iterations and takes a toll on computing resources. Since we will freeze the initial layers of the architecture, fewer parameters need to be updated, and less time and fewer resources are needed.

How fine-tuning a network works

The goal of fine-tuning is to tweak the parameters of an already trained network so that it adapts to the new task at hand. The initial layers learn universal features (like edges and curves), and as we go higher up the network, the layers tend to learn patterns more specific to the task the network is being trained on. Therefore, for fine-tuning, we want to keep the initial layers as they are (i.e. freeze them) and retrain the later layers for our task.
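The freeze-early, retrain-late rule can be written down as a small, framework-independent plan. The sketch below is an assumption-laden illustration: the stage names follow the naming commonly used for ResNet implementations (for example in torchvision), and how many stages to retrain is a hypothetical choice that would be tuned during the project.

```python
# Stage names in input-to-output order (assumed naming, for illustration).
RESNET34_STAGES = ["conv1", "bn1", "layer1", "layer2", "layer3", "layer4", "fc"]


def trainable_flags(stage_names, n_trainable):
    """Mark the last n_trainable stages as trainable and freeze the rest.

    Returns a dict mapping stage name -> True (retrain) or False (frozen).
    """
    if not 0 <= n_trainable <= len(stage_names):
        raise ValueError("n_trainable must be between 0 and the stage count")
    cutoff = len(stage_names) - n_trainable
    return {name: index >= cutoff for index, name in enumerate(stage_names)}


# Retrain only the last residual stage and the classifier head.
flags = trainable_flags(RESNET34_STAGES, n_trainable=2)
for name, trainable in flags.items():
    print(f"{name}: {'train' if trainable else 'frozen'}")
```

In a deep learning framework, a plan like this would typically be applied by disabling gradient updates for the parameters of every frozen stage before training starts.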

Which performance metric will we use to test our model?

The primary obstacle is the imbalance in the dataset, which makes detecting rare labels difficult, and rare labels are very important for classifying environmental issues. So the micro-averaged F1 score works best in our case.

Performance metric: micro-averaged F1 score (mean F score). The F1 score can be interpreted as the harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0. Precision and recall contribute equally to the F1 score.

The formula for the F1 score is:

F1 = 2 * (precision * recall) / (precision + recall)

In the multi-class and multi-label case, the per-class results are combined into a single number, and the averaging method determines how:

'micro f1 score': Calculate metrics globally by counting the total true positives, false negatives and false positives.

'macro f1 score': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
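The difference between the two averages is easiest to see on a tiny multi-label example. The sketch below computes both from scratch in plain Python; the labels and predictions are invented for illustration, and each image's labels are represented as a set. It uses the equivalent form F1 = 2TP / (2TP + FP + FN).

```python
def f1_from_counts(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred, labels):
    """Return (micro_f1, macro_f1) for lists of label sets, one set per image."""
    counts = {label: [0, 0, 0] for label in labels}  # [tp, fp, fn] per label
    for truth, pred in zip(y_true, y_pred):
        for label in labels:
            if label in pred and label in truth:
                counts[label][0] += 1
            elif label in pred:
                counts[label][1] += 1
            elif label in truth:
                counts[label][2] += 1
    # Micro: pool the counts over all labels, then compute one F1.
    tp, fp, fn = (sum(c[i] for c in counts.values()) for i in range(3))
    micro = f1_from_counts(tp, fp, fn)
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(labels)
    return micro, macro


labels = ["forest", "mine"]  # "mine" is the rare label in this toy data
y_true = [{"forest"}, {"forest"}, {"forest", "mine"}, {"forest"}]
y_pred = [{"forest"}, {"forest"}, {"forest"}, {"forest", "mine"}]
micro, macro = micro_macro_f1(y_true, y_pred, labels)
print(micro, macro)
```

In this toy case the common label is always right and the rare label is always wrong, so the pooled (micro) score stays high while the per-label (macro) mean drops; reporting both during evaluation would show how rare labels are faring.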

Our goal is to maximize the micro-averaged F1 score as far as possible; it is the metric we will use to evaluate our model.

Timeline

Community Bonding period: I will use the community bonding period to:

Learn more about the Public Lab community.
Discuss the project in detail with my mentor and the interested maintainers.
Research suggested changes and make any necessary modifications.

27th May to 23rd June:

Week 1:

Setting up my working environment, revising the required technologies, and revising the coding standards.
Exploratory Data Analysis (EDA): after obtaining the dataset from OpenStreetMap, perform initial investigations on the data to discover patterns and isolate anomalies, then test hypotheses and check assumptions with the help of summary statistics and graphical representations.

Week 2:

Data preprocessing: data preparation involves rationalizing and validating the data to make sure it is formatted consistently and remains coherent after extraction from the source material.
Deleting duplicate files, removing special characters, and changing tags to lowercase.
Find many lists of categories, generate a list of our 1000 categories, enter these into some sort of regular format/database/spreadsheet/csv/tsv/key-value store, and extract/scrape/download many examples of each category as either address or lat/lng.
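The tag clean-up planned for this week (removing special characters, lowercasing, and dropping duplicates) could look roughly like the sketch below. The exact rules are assumptions: here any run of characters other than lowercase letters and digits becomes a single underscore, and duplicates are detected after normalization. Duplicate image files would need a separate check, which is not shown.

```python
import re


def normalize_tag(tag):
    """Lowercase a tag and collapse runs of special characters into '_'."""
    cleaned = re.sub(r"[^a-z0-9]+", "_", tag.lower())
    return cleaned.strip("_")


def dedupe_tags(tags):
    """Normalize tags and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        norm = normalize_tag(tag)
        if norm and norm not in seen:
            seen.add(norm)
            result.append(norm)
    return result


clean = dedupe_tags(["Gas Station!", "gas_station", "Marsh", "  "])
print(clean)
```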

Week 3:

Enter these into our regular format/database/spreadsheet/csv/tsv/key-value store, and convert each of these that is an address into lat/lng.
Convert each of these lat/lng pairs into a URL for an image, and scrape/API/download all of the images.
Batch-process the URLs into the correct folders (with the correct names).

Week 4:

Buffer time for data preprocessing.
Analyze and visualize the dataset using graphical means.
Reach conclusions on a front line of attack, revealing intricate structure in data that cannot be absorbed in any other way. We will discover unimagined effects, and we will challenge imagined ones.

24th June to 28th June: Evaluations

Evaluation goal: Converting the OpenStreetMap dataset into a training set to aid in the training of our model.

29th June to 21st July:

Week 1 :

Obtain the pretrained ResNet-34 model.
Understand the architecture of ResNet-34.
Understand the codebase of ResNet-34.

Week 2 :

Code to augment pretrained ResNet-34 model to suit our needs.

Week 3 :

Begin fine tuning of the existing model on our Dataset.
Optimize the tag threshold to maximize F2 score.

22nd July to 26th July: Evaluations

Evaluation goal: The pretrained ResNet-34 model is established as the baseline and fine-tuning is under way.

27th July to 25th August:

Week 1 :

Tune the learning rate (LR) manually to identify the LR with the best performance score.
Complete fine-tuning of the model on our dataset.

Week 2 :

Buffer time to complete fine tuning.
Buffer time to complete training.

Week 3 :

Test our model using the performance metric (micro-averaged F1).
Apply some experimental machine learning approaches to obtain the highest possible micro-averaged F1 score.
Consolidate the micro-averaged F1 score by further fine-tuning.

Week 4 :

Begin documentation process.
Initiate the final debugging process.
Conclude with a well-documented, bug free DCNN for MapKnitter.

Final Evaluation

Evaluation goal: The creation of the deep CNN model is wrapped up and the model is ready to be deployed for further application.

Future Deliverables

I will continue to be associated with the Public Lab community. For a start, I will try to train our model, through batch processing, on the dataset we are creating in the Public Lab test project on Zooniverse.

An insight into me

During my fifth semester, we took Environmental Studies as a subject. At that point, I wondered about the relevance of Environmental Studies to an engineering course. By the end of the semester, after going through multiple case studies, I realised how giant organizations exploit our resources to mint money, and that pollution is perhaps the biggest threat to our existence and wellbeing. I came to the realization that we as a collective must take responsibility for making the change that matters, and that a subject like Environmental Studies is the need of the hour.

So I asked myself what I could do as an engineering student. Then I found Public Lab and thought that, following in the organization's footsteps, I could use my technical knowledge to contribute as an environmentalist. I created an issue on the GitHub plots2 repository, where I have been discussing ideas with the Public Lab community, and Jeffrey Warren suggested that I work on AERIAL IMAGE TAGGING FOR MAPKNITTER. He has mentored me throughout my research and keeps motivating me by providing useful links and resources. We have been working on a Zooniverse project, the Public Lab test project, in which we are building our own dataset for classifying environmental issues. The project is still in the development stage and we hope it will be ready soon. In the near future, we can train this model on the dataset we are creating on Zooniverse. During my research, I also realized how important this project is and how much it is needed by all of us. By enhancing MapKnitter, we are giving people the power to do research easily and reliably. This makes me highly motivated to work on this project.

Archimedes once said,

Give me a lever long enough and a fulcrum on which to place it, and I shall move the world.

As for myself, although this project looks small, it has the ability to make a great impact on the world.

I have been working on machine learning and deep learning for the past year. These are the projects and case studies I have done:

Benchmarking of machine learning classification algorithms on the dataset provided by https://www.kaggle.com/datafiniti/consumer-reviews-of-amazon-products
Chatbot using deep natural language processing in my university's minor project.
Sentiment analysis on the dataset provided by https://www.kaggle.com/bittlingmayer/amazonreviews
Object detection classifier on CNN

I have been working on this project for the past 2 months, and during my research I have read many research papers and articles on aerial image labelling; the most insightful ones are mentioned below.

And a few case studies I have done on multi-label classification and aerial image labelling:

Planet: Understanding the Amazon from Space, on the dataset provided by Kaggle https://www.kaggle.com/c/planet-understanding-the-amazon-from-space/data, which is a task very similar to the one we are trying to do.
StackOverflow tag prediction https://www.kaggle.com/miljan/predicting-tags-for-stackoverflow, an almost identical task; the only difference is that the inputs are text instead of images.

After detailed research, I have devised a plan of action that I can pull off within the academic environment of Google Summer of Code.

Thank you!

5 Comments

I really, really appreciate this proposal SO MUCH

Thank you @liz for your valuable feedback.

@warren @bansal_sidharth2996 please review it

Hi! This is a big ambitious project, and thank you so much for posting it! I have some input which doesn't address the technical components directly but rather the logistics around it and also the community involvement side.

re-usability: I'm really interested in how each component might be re-usable by other groups, or how good modularity might mean that future contributors could swap out a part that has been improved on by a new approach. Can you describe a little bit about how this might be possible? Would the connection points between each step, or the input-output formats be well documented and-or based on common standards?
community input will be really valuable, in many areas, but especially on helping contribute and classify imagery (say in zooniverse). How would people be able to see how their help improves the system? Is there some kind of feedback loop where, for example, they could see that 10, or 100 people have contributed classifications in the past week, and this was fed into the model and produced better matches by some percent? Is there any way to track that kind of progress over time that could help to close the feedback loop with people helping out by contributing images or classifications?
Is there a way that adding more metadata to OSM would be helpful? Would a "map-a-thon" of landfills, or spills, in OSM be a useful way to contribute, and potentially result in more re-usable data than Zooniverse, or at least a different kind of data?
Could there be a pathway to regularly import new contributed data in OSM or Zooniverse and see its effects on the system, and how might that be documented?
Actually maybe it's a good idea to update the flowchart above to show who can help with what portions -- like, where does community input help and how often would new community training data or OSM tagging be incorporated into that flowchart? This may help people who don't do machine learning coding understand the role they can play in improving this system
What might the UI for using this system look like? How might people use the system to find a type of pollution?
Re-usability: would it be possible to run this in a Docker container, and so to easily start up the project anywhere and know that the system will run smoothly? We have Google Cloud credits that could make this easier. I think it's really critical that the complex work that you do be somehow easy to boot up and run by other people. Thanks!

Just a note also - here are some good sources of tiled image data that could be helpful, either for training, or for analysis: https://github.com/publiclab/leaflet-tile-filter#multispectral-tms-tile-sources

Thanks so much! Lots of questions here but it's because it's a really interesting proposal! Thanks for your initiative and your passion!

Thanks, @warren, for your important feedback. Here are my answers:

Re-usability: DATA: The dataset we are going to create will be independent, and it can be reused for any similar task. MODEL: The model can be integrated into any website with aerial images without changes. FOR IMPROVEMENT: Machine learning and deep learning are very fast-paced fields, with new research coming out almost every month; even the ResNet model is only 3 years old and is already widely used. Anyone who wants to improve our model can use the dataset we are creating in this project to test different algorithms and models. DOCUMENTATION: This is a very important part, and I have dedicated almost one week of my timeline to documentation alone. I will document the input-output formats thoroughly, based on common standards, so that anyone can swap out a component and try other algorithms or models to improve our system.
I have updated the flowchart and added a community involvement section to it. So, how will people be able to see that their contribution is really making a difference? The performance metric we are using is the micro-averaged F1 score, which should increase as the dataset grows. It should not decrease, because we are using a ResNet model, which can skip over a layer during training if the new data is irrelevant. I suggest retraining our model monthly, because we need about 5000-10000 tagged images to observe significant changes in our micro-averaged F1 score and accuracy. At the end of each month, we will be able to analyse the change.
Yes, adding more metadata to OSM will be really helpful, because it will increase the quality of the dataset, which will help us make our model better. Starting a map-a-thon would be great. Zooniverse data is very important because we can transform it as per our requirements.
I don't think such a pathway exists yet, because we have to preprocess the data first, so importing a dataset directly is not yet possible. But it is a very good idea, and we can work on it later.
I have updated the flowchart with the community input section.
There is no need for a separate interface, as the project will be completely integrated into the existing MapKnitter website interface using a Flask server, as we have discussed earlier. For testing, we could hack up a quick test interface for playing with the features. The model will analyse the images for different forms of pollution and label them appropriately, and people can use the tags to find similar patterns and pollution on the MapKnitter website.
Both the Flask server and the model can run inside a Docker container that could be added to our current compose setup.

Thanks for the questions. If I have left any doubts unaddressed, feel free to ask.
