Identifying Bird Calls
Rank: 247th out of 975
Recognising birds from audio recordings to assist researchers in monitoring bird populations
About the competition
Recognizing a bird solely by their sound is a skill that only a few experts possess. This is not just a fun Sunday morning activity; it plays a crucial role for Bioacoustics researchers and conservationists in assessing and monitoring threats to bird species and understanding their impact on biodiversity.
The goal of this competition is to train an AI model to recognize birds by their sound using a large dataset of soundscapes from the Western Ghats, a mountain range in India. It serves as a home to a diverse set of ecosystems. The large human population that resides here depends heavily on the forests and their natural resources. Unfortunately, the mountain range’s biodiversity is suffering due to the effects of climate change.
Cornell Lab of Ornithology
Cornell Lab of Ornithology conducts research on the Earth’s biological diversity through research and education with a focus on birds and nature. This organization is supported by Cornell University in Ithaca, New York.
Relevance
Birds are a great indicator of environmental health. The absence or presence of them can quantify the health of an ecosystem. The Western Ghats is a vulnerable area that needs monitoring. Detecting and classifying birds are a point of measure for taking action in conservation measures.
The frequent conduct of traditional observer-based bird surveys over large areas is expensive and time-consuming. The reliable AI models developed during this competition can streamline these practices and help researchers gain new insights.
Technical details
The competition’s main challenge is predicting whether a ligand (described by a string of chemical structures) binds with three specific proteins (seH, Hsa, brd4). Although LEASH provides a vast dataset, it includes only ligands with a certain structure, complicating predictions for ligands with different structures.
The two key challenges are: developing a model that identifies essential ligand characteristics for protein binding and removing training set bias to generalize beyond the set. While model choice (CNN, GNN, transformer) impacts generalizability, experts emphasize that molecule representation is crucial. The strings can be encoded into fingerprints, embeddings, atom graphs, or pharmacophore graphs, each presenting unique challenges and opportunities for model learning and generalisation.
UN Sustainable Development Goals
The competition was chosen based on the UN sustainable development goals. This competition links to goal #3: Good health and well-being. Using machine learning to search for the perfect ligand in the 1060 chemicals in the drug-like space will revolutionize the way medicine is developed and could speed up the process of curing diseases.