Malaria Detection
Classifying malaria parasites in blood slide images to enable early-stage diagnosis
About the competition
Spotting the early stages of malaria parasites in blood samples is a challenge that can make the difference between life and death. Early and accurate diagnosis is essential for effective treatment and management of the disease.
The goal of this competition is to train an AI model to identify malaria parasites in blood slide images using a dataset of annotated samples from Uganda and Ghana. Because the data was collected locally, the model is tailored to blood samples from these regions, which reduces diagnostic bias and improves detection accuracy.
Lacuna Fund
Lacuna Fund mobilizes funding to create high-quality labeled datasets that address challenges in low- and middle-income countries worldwide. By supporting data scientists, researchers, and social entrepreneurs, Lacuna Fund provides the necessary resources to either create new datasets that address underserved populations or critical issues, enhance existing datasets to ensure greater representation, or update outdated datasets.
Lacuna Fund datasets are locally developed and owned, and openly accessible to the international community, while upholding sound practices regarding ethics and privacy.
Relevance
Malaria is a major public health challenge in Africa, causing hundreds of thousands of deaths each year, particularly affecting pregnant women and children under five. Early detection and treatment are essential to prevent severe health consequences. In regions like Uganda and Ghana, traditional diagnostic methods are resource-intensive and demand skilled technicians, both of which are often in short supply.
This competition aims to develop an AI model that automates and simplifies malaria detection and enables early diagnosis, so that patients receive timely treatment, potentially saving lives.
“ Early detection of malaria is key to safeguarding the health of vulnerable populations ”
Technical details
The dataset for this competition consists of blood sample images annotated for detecting and classifying malaria parasites. The images were captured using phone cameras, which adds to the challenge. The goal is to develop a multiclass object detection model that can accurately identify the trophozoite stage of malaria and differentiate between infected and uninfected blood cells. Other elements in the images, such as larger white blood cells that also need to be annotated and artifacts such as smudging, further complicate the task and increase the risk of false positives.
The team is researching multiple types of object detection models. Since the dataset contains images of different resolutions and many other variations, it is important that the resulting model generalizes well.
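As a rough illustration of how predictions from such a detection model can be checked against annotations, the sketch below computes intersection over union (IoU) between a predicted and an annotated bounding box and counts a prediction as correct only when the class label matches and the overlap is high enough. The box format, the label strings, and the 0.5 threshold are assumptions made for this example; the competition's actual scoring rule is not described here.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box in pixel coordinates (assumed format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def area(box: Box) -> float:
    # A box with non-positive width or height has zero area.
    return max(0.0, box.x_max - box.x_min) * max(0.0, box.y_max - box.y_min)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    overlap = Box(
        max(a.x_min, b.x_min),
        max(a.y_min, b.y_min),
        min(a.x_max, b.x_max),
        min(a.y_max, b.y_max),
    )
    overlap_area = area(overlap)
    union_area = area(a) + area(b) - overlap_area
    return overlap_area / union_area if union_area > 0 else 0.0


def is_true_positive(pred_label: str, pred_box: Box,
                     true_label: str, true_box: Box,
                     threshold: float = 0.5) -> bool:
    # The threshold of 0.5 is a common convention, assumed here.
    return pred_label == true_label and iou(pred_box, true_box) >= threshold


# Hypothetical labels: a smudge mistaken for a parasite far from any
# annotation would fail the overlap check and count as a false positive.
annotation = Box(10, 10, 50, 50)
good_hit = is_true_positive("trophozoite", Box(12, 12, 52, 52),
                            "trophozoite", annotation)
wrong_class = is_true_positive("trophozoite", Box(12, 12, 52, 52),
                               "white_blood_cell", annotation)
```

A check like this makes the false-positive risk concrete: a detection only counts when both the location and the class agree with the annotation.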
UN Sustainable Development Goals
This competition contributes to the UN Sustainable Development Goal #3: Good Health and Well-Being. By participating, we are advancing early-stage malaria detection to ensure timely and accurate diagnoses, particularly in underserved regions, improving access to quality healthcare.
Status: Ongoing
Identifying Bird Calls
Recognizing birds from audio recordings to assist researchers in monitoring bird populations
About the competition
Recognizing a bird solely by its sound is a skill that only a few experts possess. It is more than a fun Sunday morning activity: it plays a crucial role for bioacoustics researchers and conservationists in assessing and monitoring threats to bird species and understanding their impact on biodiversity.
The goal of this competition is to train an AI model to recognize birds by their sound using a large dataset of soundscapes from the Western Ghats, a mountain range in India that is home to a diverse set of ecosystems. The large human population living there depends heavily on the forests and their natural resources. Unfortunately, the mountain range’s biodiversity is suffering from the effects of climate change.
Cornell Lab of Ornithology
The Cornell Lab of Ornithology studies the Earth’s biological diversity through research and education, with a focus on birds and nature. The organization is supported by Cornell University in Ithaca, New York.
Relevance
Birds are a great indicator of environmental health: their presence or absence helps gauge the health of an ecosystem. The Western Ghats is a vulnerable area that needs monitoring, and detecting and classifying its birds provides a measure on which conservation action can be based.
Conducting traditional observer-based bird surveys frequently over large areas is expensive and time-consuming. Reliable AI models developed during this competition can streamline these practices and help researchers gain new insights.
“ Reliable AI models can streamline researchers’ practices and help them gain new insights. ”
Technical details
The competition dataset includes short bird call recordings from Xeno-canto.org, a global wildlife sound repository. These recordings are converted into spectrograms, visual representations of sound frequencies over time. Convolutional Neural Networks (CNNs) then analyze these images to identify bird species by their unique call patterns visible in the spectrograms.
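To make the conversion from audio to spectrogram concrete, here is a minimal sketch using NumPy: the signal is cut into overlapping frames, each frame is windowed, and the magnitude of its Fourier transform gives one column of the image. The frame size, hop length, sample rate, and the synthetic test tone are all assumptions chosen for illustration; real pipelines often add mel scaling and a logarithm, which are not shown here.

```python
import numpy as np


def spectrogram(signal, frame_size=512, hop=256):
    """Magnitude spectrogram with shape (n_frames, frame_size // 2 + 1)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_size:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_size)  # reduces spectral leakage at frame edges
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop:i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1))


# Synthetic stand-in for a recording: one second of a 2 kHz tone.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 2000 * t)

spec = spectrogram(tone)
freqs = np.fft.rfftfreq(512, d=1 / sample_rate)
peak_hz = freqs[spec.mean(axis=0).argmax()]  # strongest frequency on average
```

Each row of `spec` is one time step and each column one frequency band, so the array can be treated like a grayscale image and passed to a CNN.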
A key challenge is that predictions must be completed within two hours on a CPU, ensuring the models can run on less powerful field computers. Understanding and innovating within these limitations is key to excelling in the competition, encouraging participants to develop streamlined algorithms that do not sacrifice performance for efficiency. This approach not only advances the field of bioacoustic research but also enhances the practical deployment of these technologies in natural settings.
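One simple way to keep the two-hour limit in view during development is to time inference on a handful of clips and project the total. The sketch below does that; the function names, the 80% safety margin, and the clip counts in the example are hypothetical, and only the two-hour budget comes from the text above.

```python
import time

BUDGET_SECONDS = 2 * 60 * 60  # the two-hour CPU limit


def time_per_clip(predict, sample_clips):
    """Average wall-clock seconds that `predict` takes per clip."""
    if not sample_clips:
        raise ValueError("need at least one clip to measure")
    start = time.perf_counter()
    for clip in sample_clips:
        predict(clip)
    return (time.perf_counter() - start) / len(sample_clips)


def projected_runtime(seconds_per_clip, total_clips):
    """Estimated total inference time in seconds."""
    return seconds_per_clip * total_clips


def fits_budget(seconds_per_clip, total_clips, safety_margin=0.8):
    # Leave headroom (assumed 20%) for data loading and slower hardware.
    projected = projected_runtime(seconds_per_clip, total_clips)
    return projected <= BUDGET_SECONDS * safety_margin


# Hypothetical numbers: 10,000 clips at 0.5 s each is about 83 minutes.
ok = fits_budget(0.5, 10_000)
too_slow = fits_budget(0.6, 10_000)
```

Measuring on the same kind of CPU as the target environment matters, since timings from a development machine with a GPU say little about the actual limit.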
UN Sustainable Development Goals
This competition contributes to the UN Sustainable Development Goal #15: Life on Land. By participating, we’re helping revolutionize researchers’ understanding of biodiversity and ecosystems, contributing to global conservation efforts.
Status: Finished
Predict New Medicine
Predicting small molecule-protein interactions for drug development
About the project
Small molecule drugs (ligands) can interact with proteins in the human body. As a result of these interactions, the protein can change structurally, and this structural change can initiate a reaction cascade. The classic approach to testing the binding affinity of a ligand (how well the molecule binds to a protein) is to bring the two together physically in the lab. To revolutionize this process, Leash Biosciences created a dataset, the Big Encoded Library for Chemical Assessment (BELKA), and now hosts a competition that asks engineers to use machine learning to predict protein-ligand binding.
Leash Biosciences
Leash Biosciences’ catchphrase is “Unleashing machine learning to solve medicinal chemistry”. It is a biotechnology company that is building a large dataset of protein-molecule interactions for machine learning purposes.
Relevance
Cancer is one of the most common diseases globally. Chemotherapy is often applied as treatment. Sadly, it lacks target selectivity and comes with several severe side effects. What’s necessary is a drug delivery system where drugs are delivered and taken up at specific target sites, improving the patient’s quality of life and decreasing exposure to toxicity. An effective approach for tumor-selective drug delivery is the use of functional ligands, small molecules that interact with specific proteins, e.g. overexpressed receptor proteins in malignant cancer cells. Ligands have to be tested for their binding affinity with the protein. Classically this is done in the lab, which is labor-intensive and time-consuming.
To revolutionize this process, Leash Biosciences has tested 133 million ligands for binding affinity with three target proteins. The resulting dataset, BELKA, is intended to encourage AI engineers on the competition platform Kaggle to build predictive models that estimate the binding affinity of unknown chemical compounds to protein targets.
“ Using machine learning to search for the perfect ligand will revolutionize the way medicine is developed. ”
Technical details
The competition’s main challenge is predicting whether a ligand (described by a SMILES string that encodes its chemical structure) binds to each of three specific proteins (sEH, HSA, BRD4). Although Leash Biosciences provides a vast dataset, it includes only ligands with a certain structure, complicating predictions for ligands with different structures.
The two key challenges are developing a model that identifies the ligand characteristics essential for protein binding, and removing training set bias so that the model generalizes beyond the set. While model choice (CNN, GNN, transformer) impacts generalizability, experts emphasize that molecule representation is crucial. The strings can be encoded into fingerprints, embeddings, atom graphs, or pharmacophore graphs, each presenting unique challenges and opportunities for model learning and generalization.
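To illustrate what encoding a string into a fingerprint means, the sketch below hashes overlapping character n-grams of a molecule string into a fixed-size set of bits and compares two molecules with Tanimoto similarity. This is a deliberately simplified stand-in that works on text only: chemically aware fingerprints (for example, Morgan fingerprints from a cheminformatics toolkit such as RDKit) are built from the molecular graph instead. The function names, n-gram length, and bit count are assumptions for this example.

```python
import zlib
from typing import FrozenSet


def ngram_fingerprint(smiles: str, n: int = 3,
                      n_bits: int = 256) -> FrozenSet[int]:
    """Indices of the bits set by hashing each character n-gram.

    Toy text-based fingerprint, not a chemically meaningful one.
    """
    bits = set()
    for i in range(len(smiles) - n + 1):
        fragment = smiles[i:i + n].encode("utf-8")
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        bits.add(zlib.crc32(fragment) % n_bits)
    return frozenset(bits)


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Shared bits divided by all set bits; 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two well-known molecules written as SMILES strings.
aspirin = ngram_fingerprint("CC(=O)OC1=CC=CC=C1C(=O)O")
benzoic_acid = ngram_fingerprint("C1=CC=C(C=C1)C(=O)O")

similarity = tanimoto(aspirin, benzoic_acid)
```

Whatever the representation, the result is a fixed-size input a model can learn from, and the choice of representation determines which structural similarities the model is able to see.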
UN Sustainable Development Goals
The competition was chosen based on the UN Sustainable Development Goals. This competition links to Goal #3: Good Health and Well-Being. Using machine learning to search for the perfect ligand among the estimated 10^60 chemicals in the drug-like space will revolutionize the way medicine is developed and could speed up the process of curing diseases.