High-fidelity simulations generate enormous amounts of data that, with modern tools, can mimic real-world values with excellent accuracy, yet much of it is discarded as "too sanitized" for use in deep learning. But in cases where datasets are heavily imbalanced, such as when a true positive is hard to obtain or rare to observe while true negatives are abundant, isn't sanitized synthetic positive data better than nothing?

In my current project, we are using simulations of head impacts from an FEM model of the brain/skull system to supplement data from athletes who wear smart mouth guards, in an attempt to gather statistics that matter for early concussion detection. This extra source of true positives has improved the overall performance of the ML platform that analyses the signals coming from the mouth guards, but it isn't a perfect solution.
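To make the question concrete, here is a minimal sketch (in no way our actual pipeline) of the augmentation strategy: simulated positives are added to the training split only, and evaluation is done on held-out real data. The feature dimensions, the Gaussian stand-ins for real and simulated signals, and the choice of a random-forest classifier are all illustrative assumptions, not details from our project.

# Sketch: augmenting a lopsided real dataset with simulated positives.
# All data here is synthetic Gaussian noise standing in for signal features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_features = 16  # e.g. summary statistics of an impact kinematics signal

# Real data: abundant negatives, very few positives (the "lopsided" case).
X_real_neg = rng.normal(0.0, 1.0, size=(5000, n_features))
X_real_pos = rng.normal(1.5, 1.0, size=(25, n_features))

# Simulated positives: same feature space, but "sanitized" (modeled here
# as lower-variance draws around the positive-class mean).
X_sim_pos = rng.normal(1.5, 0.5, size=(500, n_features))

X = np.vstack([X_real_neg, X_real_pos])
y = np.concatenate([np.zeros(5000), np.ones(25)])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Baseline: train on real data only.
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print("real only:   AP =", average_precision_score(y_te, clf.predict_proba(X_te)[:, 1]))

# Augmented: add simulated positives to the training set only; the test
# set remains purely real so the evaluation is not contaminated.
X_aug = np.vstack([X_tr, X_sim_pos])
y_aug = np.concatenate([y_tr, np.ones(len(X_sim_pos))])
clf_aug = RandomForestClassifier(random_state=0).fit(X_aug, y_aug)
print("real + sim:  AP =", average_precision_score(y_te, clf_aug.predict_proba(X_te)[:, 1]))

The key design point is the last comment: if simulated samples leak into the evaluation set, the metrics measure how well the model detects simulations rather than real events, which can mask exactly the "too sanitized" failure mode the question raises.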
