Abstract
Today there are approximately 85,000 chemicals regulated under the Toxic
Substances Control Act, with around 2,000 new chemicals introduced each year.
It is impossible to screen all of these chemicals for potential toxic effects,
whether through whole-organism in vivo studies or in vitro high-throughput
screening (HTS) programs. Toxicologists therefore face two challenges: choosing
which chemicals to screen and predicting the toxicity of as-yet-unscreened
chemicals. Our goal is
to describe how variation in chemical structure relates to variation in
toxicological response, enabling in silico toxicity characterization designed
to meet both of these challenges. With our Bayesian partially Supervised Sparse
and Smooth Factor Analysis ($\text{BS}^3\text{FA}$) model, we learn a distance
between chemicals targeted to toxicity, rather than one based on molecular
structure alone. Our model also enables the prediction of chemical
dose-response profiles based on chemical structure (that is, without in vivo or
in vitro testing) by taking advantage of a large database of chemicals that
have already been tested for toxicity in HTS programs. In simulations, our
model shows superior performance in distance learning and modest to large gains
in predictive ability compared to existing methods. Results from an application
to high-throughput screening data elucidate the relationship between
chemical structure and a toxicity-relevant high-throughput assay. An R package
for $\text{BS}^3\text{FA}$ is available online at
https://github.com/kelrenmor/bs3fa.
Citation
ID: 281876
Ref Key: herring2019bayesian