Cambridge, MA – October 3, 2025 – Insilico Medicine (“Insilico”), a clinical-stage biotechnology company driven by generative artificial intelligence (AI), today announced the release of a major scientific paper, “Advancing Target Discovery Through Disease-Specific Integration of Multi-Modal Target Identification Models and Comprehensive Target Benchmarking System.” The study details the development of Target Identification Pro (TargetPro), a machine learning workflow trained on clinical-stage targets across 38 diseases spanning oncology, neurological, immune, fibrotic, and metabolic disorders. The study also reveals TargetBench 1.0, a benchmarking system designed to evaluate target identification models, including large language models (LLMs).

The new AI-powered model training framework for therapeutic target discovery was unveiled in a paper published on bioRxiv, introducing TargetPro and TargetBench 1.0 as complementary systems designed to accelerate and validate the earliest and most critical stage of drug development. By tailoring predictive AI models to individual diseases and creating the first standardized benchmarking system for evaluating target discovery platforms, Insilico is establishing a new gold standard of accuracy and transparency in the field.

Drug discovery has long been plagued by weak target selection, with nearly 90% of candidates failing in clinical trials, often because the biological targets prove unreliable or lack translational potential. TargetPro addresses this challenge by integrating 22 multi-modal data sources, including genomics, transcriptomics, proteomics, pathways, clinical trial records, and scientific literature, into disease-specific models that learn the biological and clinical characteristics of disease targets most likely to progress to clinical testing. TargetPro identifies context-dependent predictive patterns unique to each disease area, resulting in significantly higher accuracy in nominating viable drug targets compared to other target identification models.

In head-to-head benchmarking, TargetPro achieved a clinical target retrieval rate of 71.6%, a two- to three-fold improvement over state-of-the-art LLMs such as GPT-4o, DeepSeek-R1, and BioGPT, which ranged between 15% and 40%, and public platforms like Open Targets, which scored just under 20%. Importantly, while the performance of LLMs drops when prompted to generate longer target lists, TargetPro maintained consistently high performance across therapeutic areas, underscoring its robustness for real-world applications.
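As a rough illustration of the metric quoted above, a clinical target retrieval rate can be read as the share of known clinical-stage targets that appear in a model's top-ranked list. The sketch below assumes that definition, which may differ from the exact protocol used in the paper; the function name, gene lists, and cutoff are hypothetical and are not taken from TargetBench 1.0.

```python
def retrieval_rate(ranked_targets, clinical_targets, k):
    """Fraction of known clinical targets found among the top-k ranked predictions.

    Assumed definition for illustration only; the paper's protocol may differ.
    """
    known = set(clinical_targets)
    if not known:
        raise ValueError("clinical_targets must not be empty")
    top_k = set(ranked_targets[:k])
    return len(top_k & known) / len(known)


# Hypothetical ranked output from a model and a hypothetical reference set.
ranked = ["EGFR", "TP53", "KRAS", "BRAF", "MYC"]
clinical = {"EGFR", "KRAS", "ALK", "MET"}

# Two of the four reference targets (EGFR, KRAS) appear in the top 3.
rate_top3 = retrieval_rate(ranked, clinical, k=3)
print(f"retrieval rate at k=3: {rate_top3:.2f}")
```

Scoring every model against the same reference set and the same cutoff is what makes head-to-head figures like those above comparable.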

SHAP analysis was applied to identify the predictive drivers of TargetPro models across the five disease groups. It revealed that TargetPro’s decision-making is nuanced and context-dependent, with feature importance patterns varying across disease groups. Matrix factorization and attention scores were the most impactful features in every disease group, highlighting their core contribution to the trained disease-specific TargetPro models, while information from omics datasets was particularly predictive of oncology target success. These results indicate that the model does not rely on simple, fixed rules but instead learns biologically relevant, disease-specific patterns.
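For readers unfamiliar with SHAP, the idea behind it is the Shapley value: each feature is credited with its average marginal contribution to a prediction across all feature subsets. The sketch below computes exact Shapley values by brute force for a hypothetical three-feature scoring function. It is not TargetPro's model or feature set, and practical SHAP tooling relies on faster approximations rather than this enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values for one input x, relative to a baseline input."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real values; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy scorer with an interaction between features 0 and 2.
def toy_model(f):
    return 2.0 * f[0] + 1.0 * f[1] + 0.5 * f[0] * f[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved.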