THEME: "Experimental Challenges in Studies of Drug Discovery, Development and Lead Optimization"
Korea Institute of Toxicology, South Korea
Title: Beyond structural diversity of organic molecules: electron configuration fingerprint for inorganic bulk materials and engineered nanomaterials
Dr. Hyun Kil Shin
is an expert in cheminformatics, particularly in the development of machine learning
(ML) and deep learning (DL) models based on molecular structure datasets. His
strong support for the safe-by-design concept has led him to participate in diverse
research projects, such as the development of a drug-induced liver toxicity prediction
model, a neurotoxicity prediction model for biocidal active substances, and an AI
model for designing safe compounds. He is currently
a researcher at the Korea Institute of Toxicology (KIT). Because image data are among
the most abundant types of data, he also works with images in research projects such
as the development of a smartphone-deployable model for diagnosing animal skin
disease and a model for detecting atopic dermatitis regions.
Artificial intelligence (AI) models have been
broadly applied in drug discovery; however, the applicability domain (AD) of AI
models has so far been focused mainly on organic molecules because 1) the majority of available
databases are composed of organic molecules, and 2) cheminformatics tools mainly
handle organic molecules alone. In particular, the lack of appropriate
cheminformatics tools for inorganic molecules is a significant technical
obstacle that must be overcome in order to expand the AD of AI models over a wider
range of chemical space, beyond the structural diversity of organic molecules. In
order to provide more cheminformatics tools for inorganic compounds, the electron
configuration fingerprint (EC FP) was developed as the first fingerprint designed
for inorganic compounds. Furthermore, the size-dependent EC FP (SDEC FP) was designed
by incorporating particle size into the EC FP calculation, yielding a fingerprint for engineered
nanomaterials (ENMs), whose structural diversity is much more complicated than that of bulk inorganic
materials because of compositional complexity in the core, doping, and coating of
ENMs of different sizes. By applying EC FP, artificial neural network (ANN)
models for predicting physicochemical properties of inorganic compounds were
developed based on the composition of the inorganic compounds alone. The models were
developed with a dataset containing almost all elements in the periodic table in order to
make reliable predictions for inorganic compounds with diverse elemental
compositions. ANN models with EC FP outperformed models built with other possible descriptors
calculated from the composition of inorganic compounds. SDEC FP was applied to
develop prediction models for the cytotoxicity and zeta potential of ENMs of
diverse compositions and shapes. Whereas previous studies developed models applicable
to one specific type of ENM, such as metal oxide, metal, coated, or
carbon-based ENMs, the models developed with SDEC FP achieved a breakthrough by
providing general models applicable to ENMs of any composition and shape.
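
To make the idea of a composition-only, electron-configuration-based descriptor concrete, the following is a minimal illustrative sketch in Python. It is not the published EC FP algorithm: the abstract does not specify how the fingerprint is encoded, so the encoding used here (subshell occupancies summed over the atoms in a formula unit) is an assumption, and the names `SUBSHELLS`, `ELECTRON_CONFIG`, `parse_formula`, and `ec_fingerprint` are hypothetical. The element table covers only six elements for brevity, and particle size (as used by SDEC FP) is not modeled.

```python
import re

# Subshells covered by this toy example; each one is a slot in the vector.
SUBSHELLS = ["1s", "2s", "2p", "3s", "3p", "3d", "4s"]

# Ground-state electron configurations for a few elements only.
# A real descriptor would need a table covering the whole periodic table.
ELECTRON_CONFIG = {
    "H": {"1s": 1},
    "O": {"1s": 2, "2s": 2, "2p": 4},
    "Na": {"1s": 2, "2s": 2, "2p": 6, "3s": 1},
    "Cl": {"1s": 2, "2s": 2, "2p": 6, "3s": 2, "3p": 5},
    "Ti": {"1s": 2, "2s": 2, "2p": 6, "3s": 2, "3p": 6, "3d": 2, "4s": 2},
    "Zn": {"1s": 2, "2s": 2, "2p": 6, "3s": 2, "3p": 6, "3d": 10, "4s": 2},
}


def parse_formula(formula):
    """Parse a flat formula such as 'TiO2' into {element: count}.

    Parentheses and hydrates are not supported in this sketch.
    """
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(number) if number else 1.0)
    return counts


def ec_fingerprint(formula):
    """Sum subshell occupancies over all atoms in one formula unit.

    ASSUMPTION: this weighting scheme is illustrative, not the authors' method.
    Raises KeyError for elements missing from the toy table.
    """
    vector = [0.0] * len(SUBSHELLS)
    for symbol, count in parse_formula(formula).items():
        config = ELECTRON_CONFIG[symbol]
        for i, subshell in enumerate(SUBSHELLS):
            vector[i] += count * config.get(subshell, 0)
    return vector


print(ec_fingerprint("TiO2"))  # [6.0, 6.0, 14.0, 2.0, 6.0, 2.0, 2.0]
```

Because the vector depends only on the chemical formula, it can be computed for any inorganic composition whose elements appear in the table, which is the property that lets a model be trained from composition alone.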