Bridging the AI application gap

The focus of the Concept to Clinic challenge is to make AI advances useful — not just for data scientists interested in cutting-edge methods, but for clinicians working on the front lines of lung cancer detection and the patients they serve.

Greg Lipstein
Co-founder

Bridging the AI application gap between developers, radiologists and patients to detect lung cancer earlier

In early 2017, the Data Science Bowl produced a set of cutting-edge algorithms for detecting lung cancer in CT scans. These algorithms have demonstrated the potential for artificial intelligence to help radiologists catch lung cancer earlier and more reliably separate out false positive results. Through the Concept to Clinic challenge, we’ve engaged a global community of participants to collaboratively develop the functionality and experience that matter most in a clinical setting. Through this process, we are building an open source application that starts from cutting-edge detection algorithms and turns them into artificial intelligence that is useful to clinicians and patients.

Building the end-to-end application takes a range of technical abilities from data scientists, software developers, and frontend engineers: it loads images from patient scans, applies machine learning algorithms to find and interpret potential threats, and gives radiologists a clear user interface for their clinical work. Meanwhile, the experience of practicing radiologists must inform the design of the functionality and user flow from the very beginning, and their ongoing feedback keeps development focused on the priorities that matter most for real clinical value.

This challenge brings people together from across the globe to improve the tools we have to fight the world’s deadliest cancer: the more than 500 developers using thousands of lung CT scans to build out the application, the radiologists informing the most useful design decisions, and the lung cancer patients they ultimately serve. Get to know some of them below!


Meet some of the participants in the global fight against lung cancer

Serhiy Shekhovtsov

Full Stack Software Engineer
Lviv, Ukraine

Serhiy was a top backend contributor in the first milestone and is currently among the top five overall contributors to the project.

Who are you (briefly) and what do you do professionally?

I am a self-taught programmer from Lviv, Ukraine. I have been working as a full stack software developer for more than 8 years. I am a co-founder of ShortPoint, a successful startup supported by the Silicon Oasis Founders and 500 Startups accelerators.

What got you interested in this Concept to Clinic project?

  • An opportunity to contribute to something that really matters on a global scale.
  • Getting hands-on experience with a cutting-edge technology stack.
  • Tasks you implement and code you write will (mostly) be useful, no matter what your position on the leaderboard is, unlike in the vast majority of other online competitions.
  • A chance to get a prize even if you are not at the top of the global leaderboard.

For more from Serhiy check out his e-chat with DrivenData.

Martin Musale

Software Developer
Nairobi, Kenya

Martin was a top community contributor in the first milestone and is currently among the top five overall contributors to the project.

Who are you (briefly) and what do you do professionally?

I am Martin Musale, a Software Developer based in Nairobi. I am proficient in Python and JavaScript, and currently I'm tinkering with Golang. I enjoy the OSS world and everything it offers. Professionally, I work at Gravity as a Software Developer.

What got you interested in the Concept to Clinic project? How did you find the challenge and why did you choose to participate?

I love challenges! But what really got me interested was the scope of the project, to build a cutting-edge open source tool for clinics. It also involved meddling in some interesting technology i.e. ML, AI and algorithms. The technology stack was also very appealing. I loved the setup.

For more from Martin check out his profile on Concept to Clinic.

Jason Hostetter

Clinical Fellow, Neuroradiology
Baltimore, MD

Dr. Hostetter has been practicing radiology for four and a half years in Baltimore, after working as a full-time software developer before starting medical school. He is a member of the Technical Advisory Panel and has been instrumental to the design of this project.

Why did you choose to participate as an advisor for this project?

I believe in the power of like-minded groups of people, and have personally benefited from hundreds of open source projects. I loved the idea of leveraging the power of programmers across the world to tackle one of the biggest needs in medical imaging right now: data collection. Machine learning and AI are exploding right now in the world of medical imaging, but we have precious few reliable datasets to train algorithms, and fewer still good ways to collect them.

What do you see as the biggest areas where better uses of machine learning can help radiologists with their work?

Automating or streamlining mundane but important tasks. For instance, finding lung nodules and tracking their size over time. Same thing with lymph nodes or tumors in cancer follow-up studies. These tasks are tedious and time consuming for humans, but play a very important role in treatment. They also tend to be quite routine and with well-described parameters for reporting, which is perfect for a computer algorithm.

What are the benefits of an open source challenge like this? In what ways is this project useful?

Open source can not only drive the production of great software, it also brings together a community of people who believe in the same end goal. Newcomers can be brought into the conversation and learn about what is being created. Medical imaging is a relatively small world compared to the wider world of software development. This project helps skilled programmers see what problems our industry is struggling with, and gives them an opportunity to dive into our world. With the ultimate goal of better health care for real people, I hope contributors can see the potential impact of what they produce.

For more from Dr. Hostetter check out his short interview with DrivenData.

Sally Samuels

Lung Cancer Survivor and Advocate
San Francisco, CA

Sally is a lung cancer survivor whose cancer was detected early. Her life and advocacy offer a beautiful example of why early scans are so important.

Sally serves on the Addario Patient and Caregiver Advisory Board (APCAB) at the Bonnie J. Addario Lung Cancer Foundation.

Hear Sally’s story firsthand and why, in her words, it’s so important to demand a scan.


Get involved

Open development in Concept to Clinic runs through January 25. A version of the in-progress application is now being used to collect feedback from clinicians and inform priorities for continued development.

Looking for ways to contribute? Visit the challenge home and sign up!

Like the project? Please help spread the word #concepttoclinic!


The Concept to Clinic challenge is a collaboration between DrivenData and the Bonnie J. Addario Lung Cancer Foundation.
