Tech Qs Flashcards by Lauren Tracy

Is user uploaded data directly useable by AI? It hasn’t been diagnosed.

We’re reviewing the uploaded images together with a dermatologist to get a diagnosis for each one.

Is user uploaded data directly useable by AI? It hasn’t been diagnosed.

The digital scans are 1200 dpi, which is good quality. The slides were originally created by detail-oriented dermatologists, so the source material itself is excellent.

What is the quality of the user uploaded images?

Getting a good photo requires two things: 1) giving users appropriate guidance and 2) giving them direct feedback on the quality of the image. A telemedicine company we know has reached the point where 99% of uploaded images are usable, so teaching people to take a good photo is a solved problem.

Are the digital scans useable by AI as is?

It depends on the image and the disease. We have internal tools that enable rapid data sorting and general cleanup. For AI usage, some images need additional steps such as segmentation and background subtraction, which we are working on.

How do you make sure the images you’re getting from doctors are correctly labeled and organized?

We have customized tools to ensure that the appropriate supporting information is associated with each image, and we use the standardized SNOMED taxonomy to organize and label each disease.

Why do you have a consumer tool? It seems like having fully diagnosed images from doctors is really the highest quality data, so this consumer product seems like a distraction.

The consumer tool allows us to build our own dataset of primarily healthy skin, so we know what “healthy” looks like compared with the diseased images we get from doctors.

What’s special about your AI model?

AI accuracy requires the right algorithms and the right data to drive high quality. There’s a new algorithm every 6 months, so we have the people on the team to ensure we’re using the right algorithm for our application, but it’s really all about the data.

If your AI isn’t special, what is protectable about what you’re doing?

We have exclusive commercial rights to the digitized data we license from hospitals, in addition to the images uploaded by people at home. Our data is our protectable IP.

Does the AI work?

Yes, we have built an AI model resulting in 80% top 5 accuracy across 23 diseases, which means we identify the right disease in the top 5 matches 4 of 5 times. More data makes this better. Right now our accuracy is between a primary care doctor and a dermatologist.
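
To make the metric concrete, here is a minimal sketch of how top-k accuracy can be computed. The scores and labels are made-up toy values (3 images, 6 classes, k=2), not our results, and the function name is only illustrative.

```python
def top_k_accuracy(scores, true_labels, k=5):
    """Fraction of images whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, truth in zip(scores, true_labels):
        # Class indices ordered from highest to lowest score, cut to the first k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += truth in top_k
    return hits / len(true_labels)


# Hypothetical per-class scores for three images.
scores = [
    [0.10, 0.50, 0.20, 0.10, 0.05, 0.05],  # top 2 classes: 1, 2
    [0.30, 0.10, 0.10, 0.40, 0.05, 0.05],  # top 2 classes: 3, 0
    [0.25, 0.15, 0.10, 0.10, 0.30, 0.10],  # top 2 classes: 4, 0
]
labels = [2, 0, 5]  # the third image's true class is not in its top 2

print(top_k_accuracy(scores, labels, k=2))
```

With k=5 and 23 diseases, an 80% result means the true disease appears in the five best matches for 4 out of 5 images.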

What AI model are you using?

Inception-ResNet-V2 in TensorFlow, with transfer learning.

How were you able to build a high accuracy model on only 13,000 images?

We augmented the dataset by rotating, shifting, flipping and zooming the images, which increased top 5 accuracy by 5-10%.
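
The four augmentations can be sketched on a tiny grayscale image stored as a list of rows. This is a simplified illustration only: a real pipeline would normally use an image library and random small angles and offsets, and every function name here is hypothetical.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate clockwise: reverse the row order, then transpose."""
    return [list(col) for col in zip(*img[::-1])]


def shift_right(img, pixels, fill=0):
    """Move content right by `pixels`, padding the left edge with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in img]


def zoom_center(img, factor=2):
    """Crop the central region and scale it back up with nearest-neighbour sampling."""
    h, w = len(img), len(img[0])
    top, left = (h - h // factor) // 2, (w - w // factor) // 2
    return [[img[top + r // factor][left + c // factor] for c in range(w)]
            for r in range(h)]


def augment(img):
    """Return the original image plus one variant per transformation."""
    return [img, flip_horizontal(img), rotate_90(img),
            shift_right(img, 1), zoom_center(img)]


image = [[1, 2], [3, 4]]
for variant in augment(image):
    print(variant)
```

Each variant keeps the original label, so one diagnosed image yields several training examples.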

What off-the-shelf model structure are you using?

We manually compared VGG16, VGG19, ResNet50, Xception, InceptionV3, and InceptionResNetV2 model structures and found InceptionResNetV2 to have the highest performance.

How do you know the AI is training on the right data if you’re mixing user pics with dermatologist pics?

Each image in our database is tagged with its original source, so we can track its level of quality and make sure we only test on images with high-quality verification.
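
A minimal sketch of what source tagging could look like. The record fields, source names and the `verified` flag are assumptions made for illustration, not our actual database schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    image_id: str
    source: str     # hypothetical values: "dermatologist_slide", "user_upload"
    verified: bool  # True once an expert has confirmed the diagnosis
    label: str


def eligible_for_test(records):
    """Only images with a verified diagnosis may enter the test set."""
    return [r for r in records if r.verified]


records = [
    ImageRecord("img-001", "dermatologist_slide", True, "psoriasis"),
    ImageRecord("img-002", "user_upload", False, "unknown"),
    ImageRecord("img-003", "user_upload", True, "healthy"),
]

print([r.image_id for r in eligible_for_test(records)])
```

Keeping the source on every record also makes it possible to report accuracy separately for each source.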

What if someone uploads a picture of a disease your algorithm hasn’t been trained on?

The AI model includes an “other” class, trained on a variety of conditions outside the diseases the model covers. As we add diseases over time, the number of conditions in this category will shrink.

Can you improve the AI to be better than 80% top 5?

The accuracy relies on the quality, volume and variety of the dataset. That’s the main reason we are focusing on getting more data.

How do you know your tool works prospectively, if you’ve only done retrospective data evaluation?

A dermatologist is currently reviewing the user-uploaded images and building a differential diagnosis for each. We will then compare those to the diagnoses generated by the AI model.

How many images and what accuracy do you need to go into primary care?

1 million, across a certain variety and quality. We estimated this by projecting our earlier, smaller-scale findings out to a tool that’s 95% accurate across 100 diseases, 6 body parts, 2 sexes and 6 skin colors. It will take 18-24 months to bring in this data.
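
As a rough sanity check on the scale of that target, the arithmetic below works out how many images 1 million gives per combination. It assumes an even spread across every combination, which is a simplification for illustration and not how the estimate was actually derived.

```python
diseases, body_parts, sexes, skin_colors = 100, 6, 2, 6

# Every disease / body part / sex / skin color combination.
combinations = diseases * body_parts * sexes * skin_colors

# Average images per combination if 1 million images were spread evenly.
images_per_combination = 1_000_000 / combinations

print(combinations)                   # 7200
print(round(images_per_combination))  # 139
```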

Aren’t there a lot of variables in taking a picture of skin (shadows, lighting, distance)? How can you build a highly accurate AI model?

Actually, we use the variables involved in taking a picture (shadows, lighting, distance) to make the system more robust against them. Handling this kind of variation used to be a barrier to progress in machine learning on images, and it is one of the benefits of Convolutional Neural Networks. Because we introduce these variables during training, our system learns the features of the disease rather than other features in the image that are “noise”.

Your background isn’t in AI. How can you do this work, if it often requires experts with PhDs?

AI has come a long way in the last 5 years, and off-the-shelf solutions are now sufficient for skin disease AI, provided they are trained on the right data, which is the difficult part in medicine. What matters more is that the CTO can work across multiple domains and is focused on speed-to-market.

Are you diagnosing?

We’re conducting visual-based search, which means we care about identifying the right disease in the top 3 or 5. This reduces barriers to adoption by doctors but is still very useful for them. We may do a diagnostic tool down the road, but it’s not the focus now.

How do you troubleshoot errors in the model if AI works as a black box?

We have images that the computer never sees that we test on, which is how we make sure we’re not overfitting to the data.
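
One common way to keep a held-out set that training never touches is to assign each image to training or test from a hash of its ID, so the assignment stays the same on every run. This is a sketch under that assumption; the ID format and the 20% fraction are hypothetical.

```python
import hashlib


def is_held_out(image_id, test_fraction=0.2):
    """Deterministically assign an image to the held-out test set by hashing its ID."""
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return bucket < test_fraction


image_ids = [f"img-{n:05d}" for n in range(1000)]
test_ids = [i for i in image_ids if is_held_out(i)]
train_ids = [i for i in image_ids if not is_held_out(i)]

print(len(train_ids), len(test_ids))
```

If accuracy on the training images is much higher than on the held-out images, the model is overfitting.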

What’s your sensitivity and specificity?

Our current accuracy figure addresses sensitivity, and we plan to add a measure of specificity (what are the other 4 of the 5 most visually similar conditions).
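
For reference, the standard per-disease definitions are sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP), computed one disease at a time against all the others. The sketch below uses those textbook definitions with made-up labels; it is not the specificity measure described above and not our evaluation code.

```python
def one_vs_rest_counts(y_true, y_pred, disease):
    """Count TP, FN, FP, TN treating `disease` as positive and everything else as negative."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == disease:
            if pred == disease:
                tp += 1
            else:
                fn += 1
        elif pred == disease:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    return tp / (tp + fn)


def specificity(tn, fp):
    return tn / (tn + fp)


# Hypothetical labels for six images.
y_true = ["eczema", "eczema", "psoriasis", "acne", "acne", "eczema"]
y_pred = ["eczema", "psoriasis", "psoriasis", "acne", "eczema", "eczema"]

tp, fn, fp, tn = one_vs_rest_counts(y_true, y_pred, "eczema")
print(sensitivity(tp, fn), specificity(tn, fp))
```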

What’s your performance on critical diseases that are potentially precancerous/cancerous or highly infectious?

Our measures of accuracy are across both prevalent and critical diseases, not just critical, but we’re forming a clinical advisory board of infectious disease experts and dermatologists to expand our accuracy and focus more on critical diseases.

Aren’t there things you can do to improve the model’s performance that don’t include adding more data?

Yes, but more data is really the only thing that can improve our accuracy by more than 5%.

What if the AI is wrong and gives the wrong output?

We have a feedback loop, combining trained experts with software checks and balances, to verify that the AI is performing correctly.

Do you really need AI for this application? Shouldn't you be able to get far with more basic algorithms?

You need AI because of the variance in pictures of the body. AI is better than people at reading image data.

How is your data cleaning pipeline?

We have multiple pipelines to do data cleanup/sorting based on the source of the data.

How do you bring new datasets into your model?

We sort incoming images by genital/non-genital and skin color, and run duplicate detection and similarity checks using kNN. Cleaning involves segmentation and background subtraction.
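
Duplicate detection with kNN can be sketched as follows: each image is represented by a feature vector, and an image is flagged when one of its k nearest neighbours lies within a distance threshold. The 2-D vectors, k=1 and the threshold are toy assumptions; real features would come from a model or an image hash.

```python
import math


def k_nearest(features, index, k=1):
    """Return the k closest other vectors to features[index] as (distance, index) pairs."""
    distances = [
        (math.dist(features[index], other), j)
        for j, other in enumerate(features)
        if j != index
    ]
    return sorted(distances)[:k]


def near_duplicates(features, k=1, threshold=0.1):
    """Flag pairs where a k-nearest neighbour is closer than the threshold."""
    pairs = set()
    for i in range(len(features)):
        for distance, j in k_nearest(features, i, k):
            if distance <= threshold:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)


# Hypothetical feature vectors: 0/1 and 2/3 are near-identical, 4 is unique.
features = [(0.0, 0.0), (0.05, 0.0), (1.0, 1.0), (1.0, 1.02), (5.0, 5.0)]
print(near_duplicates(features))
```

This brute-force search compares every pair, which is fine for a sketch but would need an index structure for a large dataset.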

Are you verifying the quality of the images uploaded by users?

One dermatologist is going over them now and diagnosing them.

What if someone takes a bad picture?

We're investigating the quality of the pictures captured by users now and will add filters (blurriness, bad light, etc.) as needed in the future.
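
One widely used blurriness check is the variance of the Laplacian: sharp images have strong edges, so the Laplacian response varies a lot, while blurry or flat images give a low variance. The sketch below is only one possible filter, not a decision we have made, and the threshold value is a placeholder that would need tuning on real photos.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels of a grayscale image."""
    h, w = len(gray), len(gray[0])
    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def is_blurry(gray, threshold=100.0):
    """Flag the image when its edge response is below a (hypothetical) threshold."""
    return laplacian_variance(gray) < threshold


# Toy images: a high-contrast checkerboard and a completely flat gray patch.
sharp = [[255 if (x + y) % 2 else 0 for x in range(5)] for y in range(5)]
flat = [[128] * 5 for _ in range(5)]

print(is_blurry(sharp), is_blurry(flat))
```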

Who are you planning to hire (technical) with the money you're raising?

A senior machine learning engineer.