
CryoET Object Identification
Annotating protein complexes in 3D cellular images to accelerate biomedical discoveries and disease treatment
About the competition
Protein complexes are essential for cell function, and understanding their interactions can advance health and disease treatments. Cryo-electron tomography (cryoET) generates 3D images of proteins in their natural environments at near-atomic resolution, offering detailed insights into cellular function.
However, standardized cryoET tomograms remain underexplored due to the difficulty of automating protein identification. Manual annotation is time-consuming and limited by human capabilities, but automating this process could reveal cellular "dark matter", leading to discoveries that improve human health.
This competition aims to train an AI model to automatically annotate five types of protein complexes using a real-world cryoET dataset, accelerating discoveries and unlocking the mysteries of the cell.
CZ Imaging Institute
The competition is organized by the Chan Zuckerberg Imaging Institute, which develops advanced imaging technologies that enable new insights into health and disease. By creating cutting-edge tools and biological probes, the institute empowers researchers to visualize cellular structures with unmatched resolution.
Focused on collaboration, the institute drives innovation through open-source tools, groundbreaking research, and accessible imaging systems, advancing biology and biomedicine.
Relevance
Understanding protein complexes is crucial for addressing global health challenges. Diseases like cancer, neurodegenerative disorders, and infections are rooted in the interactions of these cellular components. CryoET imaging provides new insights, but without automated annotation tools, these images remain largely underexplored.
This competition directly supports advancements in healthcare by developing AI models that make cryoET data more accessible and interpretable. These tools will empower scientists to identify molecular targets, design effective treatments, and improve patient outcomes on a global scale.
“Revealing the hidden complexities of the cell to drive breakthroughs in health and disease”
Technical details
The dataset for this competition consists of cubic 3D volumes (tomograms) containing protein samples. The goal is to develop a multiclass localization model capable of identifying the center of each protein complex present in the data.
One of the key challenges lies in the extremely low signal-to-noise ratio (SNR). This is inherent to electron tomography, the technique used to image the proteins: a high electron dose cannot be used during acquisition, as it would damage the delicate protein samples. Navigating the low SNR while maintaining accuracy remains a significant hurdle.
The team is currently focused on researching and implementing segmentation models, whose outputs are converted into localization coordinates during post-processing. YOLO and U-Net models have proven the most effective for this use case.
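As a sketch of that post-processing step, assuming the segmentation network emits a binary 3D mask per protein class, each connected component in the mask can be reduced to a predicted particle center via its center of mass. The function name `masks_to_centroids` and the toy volume are illustrative, not part of the team's pipeline:

```python
import numpy as np
from scipy import ndimage

def masks_to_centroids(mask: np.ndarray) -> list[tuple[float, ...]]:
    """Convert a binary 3D segmentation mask into particle center coordinates.

    Each connected component in the mask is treated as one particle;
    its center of mass becomes the predicted (z, y, x) location.
    """
    labeled, num_particles = ndimage.label(mask)
    return ndimage.center_of_mass(mask, labeled, range(1, num_particles + 1))

# Toy volume with two well-separated "particles".
volume = np.zeros((32, 32, 32), dtype=np.uint8)
volume[4:7, 4:7, 4:7] = 1
volume[20:25, 20:25, 20:25] = 1

centers = masks_to_centroids(volume)
print(centers)  # two centroids, roughly (5, 5, 5) and (22, 22, 22)
```

In practice the mask would come from the segmentation model's thresholded output, and small components would typically be filtered out as noise before computing centroids.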
UN Sustainable Development Goals
This competition supports UN Sustainable Development Goal #3: Good Health and Well-Being. The AI model aims to enhance understanding of cellular processes, enabling earlier and more accurate diagnostics and advancing biomedical science.
Status: Finished

Literacy Screening
Scoring audio clips from literacy screeners, helping educators provide effective early literacy intervention
About the competition
Literacy is a fundamental skill that shapes personal growth, academic success, career opportunities, and societal participation. Yet, 250 million children worldwide struggle to meet basic reading standards. Addressing literacy gaps early, starting in preschool, has proven to be a promising approach.
Teachers need reliable tools to identify students who need support, but current manual literacy assessments are slow and inconsistent. Machine learning can improve this process, providing faster, more accurate results.
This competition aims to develop an AI model to score audio recordings from literacy screener exercises for children in kindergarten through 3rd grade, helping educators pinpoint students who would benefit from early literacy intervention.
Reach Every Reader
The competition is organized by “Reach Every Reader,” a collaboration between Harvard, MIT, and Florida State University. This initiative unites educators, researchers, and technologists to address literacy challenges at scale.
Focused on early identification and personalized support, it combines research, evidence-based practices, and technology to empower teachers, engage families, and inspire students, aiming for a world where every child learns to read.
Relevance
Literacy is a critical skill for personal and academic growth, yet many children in the U.S. face significant challenges in developing strong language abilities.
By automating literacy screening with machine learning, we can provide teachers with fast, reliable insights to identify students who need support. This has the potential to improve early literacy intervention, reduce disparities in education, and set children on a path to greater success in school and beyond.
“Transforming early literacy through innovative teaching tools to unlock children's potential”
Technical details
The objective of this competition is to develop an automated scoring model that evaluates audio recordings from literacy assessments given to students in kindergarten through 3rd grade.
A key challenge in this competition is developing models capable of performing Automatic Speech Recognition (ASR) on children's speech. Due to factors like smaller vocal tracts and unpredictable pronunciations, children’s speech introduces greater acoustic variability, making it difficult for ASR systems designed for adult voices to perform effectively. Additionally, the lack of diverse datasets for children's speech further complicates the development of effective ASR systems.
The team is exploring various machine learning techniques for audio processing, such as encoder-decoder architectures, Transformer models, and contrastive models. Data augmentation strategies are also being used to enhance the model's robustness and generalization capabilities.
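The specific augmentations are not detailed; as an illustration, a few common waveform-level augmentations for speech (noise injection at a target SNR, time shifting, and speed perturbation) could be sketched as follows. All function names and parameters here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(wave: np.ndarray, snr_db: float = 20.0) -> np.ndarray:
    """Mix in Gaussian noise at a target signal-to-noise ratio (dB)."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise

def time_shift(wave: np.ndarray, max_frac: float = 0.1) -> np.ndarray:
    """Randomly shift the waveform in time (circular shift as a simple stand-in)."""
    limit = int(len(wave) * max_frac)
    shift = rng.integers(-limit, limit)
    return np.roll(wave, shift)

def speed_perturb(wave: np.ndarray, factor: float = 1.1) -> np.ndarray:
    """Resample the waveform by linear interpolation to change its speed."""
    new_len = int(len(wave) / factor)
    old_idx = np.linspace(0, len(wave) - 1, new_len)
    return np.interp(old_idx, np.arange(len(wave)), wave)

# Example: augment a 1-second synthetic 16 kHz tone.
sr = 16_000
t = np.arange(sr) / sr
wave = np.sin(2 * np.pi * 440 * t)
augmented = speed_perturb(time_shift(add_noise(wave)), factor=1.1)
print(augmented.shape)
```

Such perturbations are particularly relevant for children's speech, where acoustic variability is high and labeled data is scarce; synthetic variation helps the model generalize beyond the recorded conditions.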
UN Sustainable Development Goals
This competition supports UN Sustainable Development Goal #4: Quality Education. By improving early literacy screening, the AI model helps more children succeed academically, promoting educational equity and opportunity.
Status: Finished

Malaria Detection
Classifying malaria parasites in blood slide images to enable early-stage diagnosis
About the competition
Spotting the early stages of malaria parasites in blood samples is a challenge that can make the difference between life and death. Early and accurate diagnosis is essential for effective treatment and management of the disease.
The goal of this competition is to train an AI model to identify malaria parasites in blood slide images using a dataset of annotated samples from Uganda and Ghana. These local datasets ensure that the model is tailored to blood samples from these regions, minimizing diagnostic biases and improving detection accuracy.
Lacuna Fund
Lacuna Fund mobilizes funding to create high-quality labeled datasets that address challenges in low- and middle-income countries worldwide. By supporting data scientists, researchers, and social entrepreneurs, Lacuna Fund provides the resources needed to create new datasets for underserved populations or critical issues, enhance existing datasets to ensure greater representation, or update outdated datasets.
Lacuna Fund datasets are locally developed, owned, and openly accessible to the international community, while upholding ethical and privacy standards.
Relevance
Malaria is a major public health challenge in Africa, causing hundreds of thousands of deaths each year, particularly affecting pregnant women and children under five. Early detection and treatment are essential to prevent severe health consequences. In regions like Uganda and Ghana, traditional diagnostic methods are resource-intensive and demand skilled technicians, both of which are often in short supply.
This competition aims to develop an AI model that can automate and ease malaria detection, enabling early diagnosis so that patients receive timely treatment, potentially saving lives.
“Early detection of malaria is key to safeguarding the health of vulnerable populations”
Technical details
The dataset for this competition focuses on detecting and classifying malaria parasites in blood samples captured with phone cameras, which adds to the challenge. The goal is to develop a multiclass object detection model that can accurately identify the trophozoite stage of malaria and differentiate between infected and uninfected blood cells. The presence of other elements, such as larger white blood cells that also need to be annotated, and image artifacts such as smudging further complicate the task and increase the risk of false positives.
The team is researching multiple types of object detection models. Since the dataset contains images of varying resolutions and many other sources of variation, it is important to develop a model that generalizes well across them.
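One common way to handle varying input resolutions for a detector is letterbox preprocessing: resizing each image to a fixed square canvas while preserving its aspect ratio, and keeping the scale and padding so predicted boxes can be mapped back to the original image. The sketch below uses a nearest-neighbour resize in plain NumPy in place of an image library; the function name and target size are illustrative:

```python
import numpy as np

def letterbox(image: np.ndarray, target: int = 640):
    """Resize an image to a square canvas while preserving aspect ratio.

    Returns the padded image, the scale factor, and the (top, left) padding,
    which are needed to map predicted boxes back to original coordinates.
    """
    h, w = image.shape[:2]
    scale = target / max(h, w)
    new_h, new_w = int(round(h * scale)), int(round(w * scale))
    # Nearest-neighbour resize via index arrays (a stand-in for cv2.resize).
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[rows][:, cols]
    # Gray padding (value 114 is a common choice in YOLO pipelines).
    canvas = np.full((target, target) + image.shape[2:], 114, dtype=image.dtype)
    top, left = (target - new_h) // 2, (target - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, scale, (top, left)

# Phone-camera images arrive at many resolutions; all map to one square input.
img = np.zeros((3000, 4000, 3), dtype=np.uint8)
padded, scale, (top, left) = letterbox(img)
print(padded.shape, round(scale, 3), top, left)
```

A detector trained on such normalized inputs, ideally combined with scale and photometric augmentations, is less sensitive to the resolution differences present in the dataset.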
UN Sustainable Development Goals
This competition contributes to the UN Sustainable Development Goal #3: Good Health and Well-Being. By participating, we are advancing early-stage malaria detection to ensure timely and accurate diagnoses, particularly in underserved regions, improving access to quality healthcare.