Photo from Kevin Ku
Dear friends,
In this newsletter, we would like to try out something new and give you an update on the status of all our currently running projects.
We hope you enjoy this new format!
And as always, put your skills into action for the Public Good, and let us together make the world a better place for all of us!
Your MI4People Team
General Computer Vision for Healthcare Project
In this project, we are currently focusing on identifying pathologies in chest x-ray images and making this system available to medical staff in developing countries. Our interdisciplinary team of volunteers is working in two directions:
Our domain experts from the medical field are working on the concept for a corresponding application, on how to integrate it into existing hospital processes, and on collecting feedback on the concept from doctors in developing countries (currently mainly from Ghana).
Our data scientists and engineers are working on data pipelines, data augmentation, and AI model training. We started with the NIH Chest X-Ray Dataset and ran research and experiments on how to augment this data and which model architectures to train. We are now investigating the TorchXRayVision library, which gives us access to even more training examples and to some already pretrained models.
We would also like to mention that the major challenge of this project is that the existing public data is biased towards the population of developed countries: a system trained on this data would have rather low accuracy if applied in developing regions. Therefore, the main goal of our current efforts is to produce an MVP that will convince hospitals, doctors, researchers, and authorities in developing countries that our system provides substantial added value, and motivate them to help us collect x-ray data that represents the population of each country considered. We will then use this data for transfer learning on our pretrained model, so that the resulting system covers the relevant population in developing regions. To ensure data privacy, we are also currently discussing the option of applying federated learning in this project. This technique trains an algorithm across multiple decentralized servers that each hold their data samples locally, so the data itself never has to leave its site. This allows us to address critical issues such as data privacy, data security, data access rights, and access to heterogeneous data.
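For readers who like to see the idea in code, here is a minimal sketch of federated averaging, one common way to do federated learning. It is an illustration under stated assumptions, not the project's implementation: the "model" is a toy straight-line fit, and the names `local_update`, `federated_average`, `train_federated`, and the three hospital data sets are hypothetical. Each site trains on its own data, and only the model weights are sent back and averaged.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (feature x, target y)


def local_update(weights: List[float], data: List[Sample],
                 lr: float = 0.05, epochs: int = 20) -> List[float]:
    """Train y = w*x + b on one site's private data; only weights leave the site."""
    w, b = weights
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[List[float]], sizes: List[int]) -> List[float]:
    """Average the site models, weighting each by its number of samples."""
    total = sum(sizes)
    return [
        sum(update[i] * n for update, n in zip(updates, sizes)) / total
        for i in range(len(updates[0]))
    ]


def train_federated(sites: List[List[Sample]], rounds: int = 50) -> List[float]:
    """Repeat: send the global model out, train locally, average the results."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in sites]
        global_weights = federated_average(updates, [len(d) for d in sites])
    return global_weights


# Hypothetical toy data: three "hospitals" each hold samples of y = 2x + 1.
hospital_sites = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(0.5, 2.0), (1.5, 4.0)],
]
federated_model = train_federated(hospital_sites)  # approaches [2.0, 1.0]
```

The server in this sketch only ever sees weight vectors and sample counts, never the raw samples. A real deployment would additionally need secure communication and, typically, further privacy safeguards.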
If you are interested in reading more about the background of this project, please check our website.
CoVision Project
In our CoVision project, we are working together with the Bavarian Federation of the Blind and Visually Impaired on a free-of-charge and easily accessible open-source Computer Vision app that can classify CoVid rapid test results using mobile devices. This system will enable blind and visually impaired people to perform rapid antigen CoVid tests by themselves, increasing convenience and privacy for this group and making the tests more accessible.
We are currently resolving the last few bugs and expect to release the beta version within 1-2 months. But that will not be the end of the story: we want to improve the CoVision app beyond beta! Therefore, we are already thinking about building automated pipelines for collecting and preparing new training data, continual learning for our existing models, automated model monitoring, and, as required, improving existing models and experimenting with new architectures.
For these interesting tasks, we are currently searching for one or two additional volunteers with experience in data science and/or machine learning engineering who would help us advance this important open-source project for the inclusion of blind people.
So, if you would like to apply your AI/ML skills to something important and support us on this project, or know somebody who would, please contact our MD Dr. Paul Springer on LinkedIn or via email at paul.springer@mi4people.org
To get more background details on the CoVision project, please visit our website.
Soil Quality Evaluation System
In this project, our volunteers are creating a free-of-charge Machine Intelligence (MI) system that can predict the most important soil quality indicators at particular locations based on satellite images, without performing expensive chemical lab tests. Nonprofits and governments will be able to use this system to better direct their resources to promote agriculture, increase yields, and better secure food chains. Small farmers can also use it to better understand their soil and increase yields. Overall, the project aims to help reduce hunger in the world and increase wealth in developing regions.
We have already performed many experiments, including the application of classical machine learning algorithms and computer vision algorithms, and developed a Proof-of-Concept for organic carbon content. Classical machine learning algorithms are very common for the evaluation of satellite images and process a given satellite image pixel by pixel. These algorithms are fast and robust, and additional data such as weather data can be included very easily. However, they miss an important point: they do not evaluate the surroundings of the pixel under consideration and thus miss a lot of important information.
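To make the difference concrete, here is a small illustrative sketch. The two 3x3 single-band tiles and the function names are hypothetical, and the hand-written patch statistics merely stand in for the kind of context that a computer vision model learns on its own. Both tiles have the same center pixel, so a pixel-wise model cannot tell them apart, while features that look at the surroundings can.

```python
from typing import List

Image = List[List[float]]


def pixel_feature(image: Image, row: int, col: int) -> List[float]:
    """Pixel-wise view: the model sees only the value at (row, col)."""
    return [image[row][col]]


def neighborhood_features(image: Image, row: int, col: int,
                          radius: int = 1) -> List[float]:
    """Patch view: the pixel value plus the mean and range of its surroundings."""
    height, width = len(image), len(image[0])
    patch = [
        image[r][c]
        for r in range(max(0, row - radius), min(height, row + radius + 1))
        for c in range(max(0, col - radius), min(width, col + radius + 1))
    ]
    return [image[row][col], sum(patch) / len(patch), max(patch) - min(patch)]


# Hypothetical tiles: a uniform field and the border between two fields.
uniform_field = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
field_edge = [[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]]

# The center pixels are identical, so the pixel-wise features match ...
same_pixel = pixel_feature(uniform_field, 1, 1) == pixel_feature(field_edge, 1, 1)
# ... but the patch features differ because the surroundings differ.
same_patch = (neighborhood_features(uniform_field, 1, 1)
              == neighborhood_features(field_edge, 1, 1))
```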
Computer Vision algorithms are designed to overcome this problem, which is why we are currently focusing on them. However, standard computer vision algorithms require hundreds of thousands of images for training, and our available data set is much smaller. Therefore, we must use transfer learning, meaning that we take a computer vision model pretrained on another data set and retrain it using our data. As a result, relationships learned during the initial training can be reused during retraining, and the model can perform well with only a few training examples. Our first tests in this direction were promising, but the results were not as good as expected. The problem is that publicly available pretrained computer vision models were pretrained on images from the internet and not on satellite images, and these kinds of images are very different. Therefore, we decided to pretrain our own model, with an architecture that suits our purpose, on the large BigEarthNet data set. This data was originally collected and labeled for a very different task than soil evaluation, but it can be used for pretraining a model. Later, we will retrain it with data relevant for soil quality evaluation.
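The following toy sketch illustrates the principle of transfer learning in its simplest form: the pretrained part is kept frozen and only a small new "head" is trained on the few available examples. Everything here is a simplifying assumption for illustration. `pretrained_features` is a hypothetical stand-in for a real pretrained network, and the five data points stand in for a small labeled data set; the actual project works with deep computer vision models.

```python
from typing import Callable, List, Tuple


def pretrained_features(x: float) -> List[float]:
    """Stand-in for a frozen, pretrained backbone (hypothetical fixed features)."""
    return [x, x * x]


def train_head(features: Callable[[float], List[float]],
               data: List[Tuple[float, float]],
               lr: float = 0.1, steps: int = 2000) -> List[float]:
    """Fit only a linear head (weights plus bias) on top of the frozen features."""
    n_features = len(features(data[0][0]))
    head = [0.0] * (n_features + 1)  # last entry is the bias
    for _ in range(steps):
        grads = [0.0] * len(head)
        for x, y in data:
            inputs = features(x) + [1.0]
            error = sum(w * v for w, v in zip(head, inputs)) - y
            for i, v in enumerate(inputs):
                grads[i] += 2 * error * v / len(data)
        head = [w - lr * g for w, g in zip(head, grads)]
    return head


def predict(features: Callable[[float], List[float]],
            head: List[float], x: float) -> float:
    """Apply the frozen features, then the trained head."""
    return sum(w * v for w, v in zip(head, features(x) + [1.0]))


# Hypothetical small target data set: five samples of y = 3x^2 + 1.
small_dataset = [(-1.0, 4.0), (-0.5, 1.75), (0.0, 1.0), (0.5, 1.75), (1.0, 4.0)]
head = train_head(pretrained_features, small_dataset)
prediction = predict(pretrained_features, head, 2.0)  # true value is 13.0
```

Because the frozen features already capture the relevant structure, five examples are enough here. If the pretrained features do not match the new kind of data, the same small data set gives much poorer results.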
The initial training of this model will take a lot of computational resources and training time, and will therefore be quite costly. To reduce the costs as much as possible, we plan to use the so-called Spot Instances of our cloud provider AWS. This means that we will use AWS servers only at times when nobody else wants to use them. For us, it means that our calculations can be interrupted from time to time, so our code must be considerably more complex to handle this, and the training will take much longer. But we will be able to pretrain our model with a fraction of the required budget! This is what the team is currently working on.
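A common way to cope with such interruptions is to save a checkpoint regularly and to resume from the latest one after a restart. Here is a minimal sketch of that pattern, not the team's actual training code: the "training" is a toy update of a single number, and the file name and the `interrupt_at` parameter (which simulates the server being taken away) are hypothetical.

```python
import json
import os
import tempfile
from typing import Optional


def save_checkpoint(path: str, state: dict) -> None:
    """Write to a temporary file, then rename it, so an interruption
    cannot leave a half-written checkpoint behind."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path: str) -> dict:
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"epoch": 0, "weight": 0.0}


def train(path: str, total_epochs: int,
          interrupt_at: Optional[int] = None) -> dict:
    """Toy training loop that resumes from the last checkpoint on disk."""
    state = load_checkpoint(path)
    while state["epoch"] < total_epochs:
        if interrupt_at is not None and state["epoch"] == interrupt_at:
            return state  # simulated interruption; progress is already saved
        state["weight"] += 0.5 * (1.0 - state["weight"])  # stand-in for one epoch
        state["epoch"] += 1
        save_checkpoint(path, state)
    return state


checkpoint_path = os.path.join(tempfile.mkdtemp(), "pretrain_checkpoint.json")
interrupted = train(checkpoint_path, total_epochs=10, interrupt_at=4)
resumed = train(checkpoint_path, total_epochs=10)  # continues from epoch 4
```

The second call picks up at epoch 4 instead of starting over, so an interruption costs at most the work done since the last checkpoint. For a real model, the checkpoint would also need to contain the optimizer state and be stored somewhere that survives the server being shut down.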
To read more about the idea of this project, check out this article.
Marine Litter Detection via Satellites Project
This is our youngest project and the corresponding team of volunteers from Alexander Thamm is currently performing a lot of research work, understanding and replicating some of the published results, and becoming familiar with the public data we want to use for this project. We will report more on the progress of this project in the course of the year!
Concluding Remark
At this point, we want to give a big round of applause to all present and former volunteers of all our projects! Without their work and dedication, MI4People would not be possible! We also want to encourage you to show your appreciation for these great people! And if you also want to support us, you are always welcome. Please visit our website (https://www.mi4people.org/) to check out the ways you can support us! 😊