Highlights
● Insilico published a new study unveiling Target Identification Pro (TargetPro), a superior disease-specific model, and TargetBench 1.0, the first standardized benchmarking framework for target discovery.
● TargetPro achieved 71.6% retrieval of known clinical targets, a 2–3x improvement over large language models (LLMs) such as GPT-4o, Grok3, DeepSeek-R1, Claude-Opus-4, BioGPT, and public platforms like Open Targets.
● TargetPro’s predicted novel targets demonstrated 95.7% structure availability, 86.5% druggability, and 46% repurposing potential, outperforming competing systems on all measures.
● TargetBench’s explainable AI models revealed disease-specific feature importance patterns, emphasizing the value of disease-specific target identification models.
● New gold standard approaches improve accuracy, reliability, and transparency in AI-driven drug discovery.
Cambridge, MA – October 3, 2025 – Insilico Medicine (“Insilico”), a clinical-stage biotechnology company driven by generative artificial intelligence (AI), today announced the release of a major scientific paper, “Advancing Target Discovery Through Disease-Specific Integration of Multi-Modal Target Identification Models and Comprehensive Target Benchmarking System.” The study details the development of Target Identification Pro (TargetPro), a machine learning workflow trained on clinical-stage targets across 38 diseases spanning oncology, neurological, immune, fibrotic, and metabolic disorders. The study also reveals TargetBench 1.0, a benchmarking system designed to evaluate target identification models, including large language models (LLMs).
The new AI-powered model training framework for therapeutic target discovery was unveiled in a paper published on bioRxiv, introducing TargetPro and TargetBench 1.0 as complementary systems designed to accelerate and validate the earliest and most critical stage of drug development. By tailoring predictive AI models to individual diseases and creating the first standardized benchmarking system for evaluating target discovery platforms, Insilico is establishing a new gold standard of accuracy and transparency in the field.
Drug discovery has long been plagued by weak target selection, with nearly 90% of candidates failing in clinical trials, often because the biological targets prove unreliable or lack translational potential. TargetPro addresses this challenge by integrating 22 multi-modal data sources, including genomics, transcriptomics, proteomics, pathways, clinical trial records, and scientific literature, into disease-specific models that learn the biological and clinical characteristics of the disease targets most likely to progress to clinical testing. TargetPro identifies context-dependent predictive patterns unique to each disease area, resulting in significantly higher accuracy in nominating viable drug targets compared to other target identification models.
In head-to-head benchmarking, TargetPro achieved a clinical target retrieval rate of 71.6%, a two- to three-fold improvement over state-of-the-art LLMs such as GPT-4o, DeepSeek-R1, and BioGPT, which ranged between 15% and 40%, and public platforms like Open Targets, which scored just under 20%. Importantly, while the performance of LLMs drops when prompted to generate longer target lists, TargetPro maintained consistently high performance across therapeutic areas, underscoring its robustness for real-world applications.
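For readers unfamiliar with retrieval-style metrics, the following is a minimal sketch of how a clinical target retrieval rate could be computed for a ranked list of nominated targets. This release does not give the exact formula used in TargetBench 1.0, so the two definitions below (a recall-style and a precision-style reading) are assumptions, and the function names and gene symbols are hypothetical placeholders rather than anything from the study.

```python
def recall_at_k(ranked_predictions, known_targets, k):
    """Share of known clinical targets that appear in the top-k predictions."""
    known = set(known_targets)
    if not known or k <= 0:
        raise ValueError("need at least one known target and k > 0")
    hits = set(ranked_predictions[:k]) & known
    return len(hits) / len(known)


def precision_at_k(ranked_predictions, known_targets, k):
    """Share of the top-k predictions that are known clinical targets."""
    top_k = ranked_predictions[:k]
    if not top_k or k <= 0:
        raise ValueError("need at least one prediction and k > 0")
    known = set(known_targets)
    return sum(1 for target in top_k if target in known) / len(top_k)


# Made-up placeholder symbols for illustration only; not data from the study.
ranked = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
known = {"GENE_B", "GENE_E", "GENE_X"}

recall_top3 = recall_at_k(ranked, known, k=3)
recall_top5 = recall_at_k(ranked, known, k=5)
precision_top5 = precision_at_k(ranked, known, k=5)
print(f"recall@3={recall_top3:.3f} recall@5={recall_top5:.3f} precision@5={precision_top5:.3f}")
```

Under either reading, the score depends on the list length k, which is one reason a benchmark would need to report results at several list lengths.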
SHAP (SHapley Additive exPlanations) analysis was applied to understand the predictive drivers of TargetPro models across the five disease groups. It revealed that TargetPro’s decision-making is nuanced and context-dependent, with feature importance patterns varying across disease groups. Matrix factorization and attention scores were universally the most impactful features, highlighting their core contribution to the trained disease-specific TargetPro models, while information from omics datasets was particularly predictive of oncology target success. These results indicate that the model does not rely on simple, fixed rules but instead learns biologically relevant, disease-specific patterns.
Beyond rediscovering known clinical targets, TargetPro excelled in nominating novel candidates with strong translational potential. Of the predicted novel targets, 95.7% had resolved 3D protein structures compared with 60–91% for LLMs, 86.5% were classified as druggable versus 39–70% for other platforms, and 46% overlapped with approved drugs in other indications, significantly outperforming the other platforms compared on repurposing potential. TargetPro’s targets also demonstrated superior readiness for experimental validation, with an average of more than 500 associated published bioassay datasets, 1.4 times higher than for competing systems, and a greater number of available modulators, ensuring that nominated targets are actionable in laboratory settings.
Alex Zhavoronkov, PhD, Founder, CEO & CBO of Insilico Medicine, said, "These results validate our long-standing vision that AI in drug discovery must be disease-specific, evidence-driven, and rigorously benchmarked. TargetPro and TargetBench give the pharmaceutical industry unprecedented confidence in AI predictions, helping de-risk research pipelines and prioritize targets for costly clinical investment." Dr. Zhavoronkov added, "Most failures in drug discovery start with weak targets. By building an AI system that outperforms general-purpose models and widely used public resources, we are generating actionable intelligence, not just predictions. Combined with TargetBench 1.0, this framework sets a new standard for reliability and transparency, enabling biopharma companies to move faster and smarter toward breakthrough therapies."
TargetPro and TargetBench 1.0 are already being applied internally across Insilico’s therapeutic programs, and TargetBench 1.0 is publicly available. The study represents a milestone in Insilico’s mission to harness generative AI and machine learning for drug discovery, advancing the industry beyond traditional inefficiencies and laying a foundation for more reliable target identification.
In 2016, Insilico first described the concept of using generative AI for the design of novel molecules in a peer-reviewed journal, which laid the foundation for the commercially available Pharma.AI platform. Since then, Insilico has continued to integrate technical breakthroughs into the Pharma.AI platform, which is currently a generative AI-powered solution spanning biology, chemistry, medicine development and science research.
Powered by Pharma.AI, Insilico has nominated 22 developmental/preclinical candidates (DC/PCC) in its comprehensive portfolio of over 30 assets since 2021, received IND clearance for 10 molecules, and completed multiple human clinical trials for two of the most advanced pipelines, with positive results announced.
By integrating AI and automation technologies, Insilico has demonstrated a significant efficiency boost compared to traditional drug discovery methods (often requiring 2.5-4 years), as announced in the recent key timeline benchmarks for internal DC programs from 2021 to 2024: the average time to DC is 12-18 months, with 60-200 molecules synthesized and tested per program.
About Insilico Medicine
Insilico Medicine, a global clinical-stage biotechnology company powered by generative AI, is connecting biology, chemistry, medicine and science research using next-generation AI systems. The company has developed AI platforms that utilize deep generative models, reinforcement learning, transformers, and other modern machine learning techniques for novel target discovery and the generation of novel molecular structures with desired properties. Insilico Medicine is developing breakthrough solutions to discover and develop innovative drugs for cancer, fibrosis, central nervous system diseases, infectious diseases, autoimmune diseases, and aging-related diseases.
http://www.insilico.com/