From Target to Therapy: A Comprehensive Overview of the Drug Discovery Workflow

Drug discovery involves a series of complex, interdisciplinary steps that transform biological insights into therapeutic agents. This review explores each phase of the drug discovery process, from target identification through to clinical trials. It emphasizes the integration of medicinal chemistry, pharmacology, and computational biology in the design, optimization, and clinical testing of new drugs. Key stages such as target validation, hit identification, lead optimization, and preclinical development are discussed, with a focus on the strategic decisions that guide the translation of molecular targets into clinically viable therapeutics.

1. Introduction

The path from an initial scientific hypothesis to an approved drug remains long, risky, and costly. Modern estimates put the average timeline for developing a new drug at 12–13 years, with only 1–2 of every 10,000 screened compounds eventually reaching the market. Discovery and preclinical development alone can take 4–7 years, while clinical phases may last another 8–10 years. Costs are equally substantial, with recent estimates ranging from $2.5 billion to $3 billion per approved drug (DiMasi et al., 2016).

Attrition remains the greatest challenge. Roughly 60–70% of compounds advance beyond Phase I trials, only ~30–35% of Phase II candidates succeed, and the overall probability of approval from first-in-human studies is just 10–15%. Despite significant advances in computational biology, machine learning, and high-throughput screening (HTS), this failure rate highlights the difficulty of balancing efficacy, safety, and drug-like properties.

This article provides a comprehensive overview of the drug discovery workflow, structured around traditional stages but expanded to highlight assay development, emerging technologies, and future trends. Medicinal chemistry and pharmacology remain at the core, but their integration with AI, DNA-encoded library drug discovery, and precision medicine is reshaping timelines and strategies.

2. Target Identification

Target identification is the first and most critical step in drug discovery. This phase involves selecting a biological molecule, usually a protein, that plays a significant role in disease. The ideal target should be directly implicated in disease pathogenesis, druggable, and capable of being modulated to produce a therapeutic effect without adverse effects (Overington et al., 2006).

Approaches to Target Identification

Genomic and Transcriptomic Technologies: Genome-wide association studies (GWAS) and RNA sequencing uncover genes or pathways associated with disease, providing insights into genetic mechanisms (Visscher et al., 2017).
Proteomics: Mass spectrometry–based proteomics identifies proteins altered in disease states, helping uncover new molecular targets (Aebersold & Mann, 2016).
Phenotypic Screening: Screening compounds in cellular or animal models allows identification of active molecules first, followed by linking to underlying targets (Vincent et al., 2022).

The challenge at this stage lies in ensuring that the target is druggable and can be selectively modulated by small molecules or biologics, which is essential for therapeutic viability.

3. Target Validation

Once a potential target has been identified, the next step is to validate its role in disease. This phase confirms that modulating the target yields therapeutic benefit.

Methods of Target Validation

Genetic Validation: Tools such as CRISPR/Cas9 and RNA interference (RNAi) allow precise knockouts or knockdowns of candidate genes, providing direct evidence of disease involvement (Moore, 2015).
Pharmacological Validation: Chemical probes, biologics, or small molecules are used to modulate target activity. Demonstrating efficacy in preclinical models strengthens confidence in the target (Swinney & Anthony, 2011).

Validated targets should demonstrate clear disease association, a defined mechanism of action, and the potential for selective pharmacological intervention without off-target effects (Emmerich et al., 2021).

4. Hit Identification

Following validation, researchers seek compounds that modulate the target. This stage often employs high-throughput screening (HTS), where large chemical libraries (10⁴–10⁶ compounds) are tested in automated assays. While hit rates are typically low, identifying even a few promising molecules provides starting points (Macarron et al., 2011).
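As a rough illustration of how primary hits might be called from an HTS plate, the sketch below normalizes each well to percent inhibition against control wells and applies a cutoff. The compound IDs, signal values, and the 50% threshold are hypothetical and chosen only for illustration; real campaigns typically use more elaborate statistics and plate corrections.

```python
from statistics import mean


def percent_inhibition(signal, neg_ctrl_mean, pos_ctrl_mean):
    """Scale a raw signal so that the uninhibited control is 0% and the fully inhibited control is 100%."""
    return 100.0 * (neg_ctrl_mean - signal) / (neg_ctrl_mean - pos_ctrl_mean)


def call_hits(wells, neg_ctrl, pos_ctrl, threshold=50.0):
    """Return {compound_id: percent inhibition} for wells at or above the threshold."""
    neg = mean(neg_ctrl)
    pos = mean(pos_ctrl)
    hits = {}
    for compound_id, signal in wells.items():
        inhibition = percent_inhibition(signal, neg, pos)
        if inhibition >= threshold:
            hits[compound_id] = round(inhibition, 1)
    return hits


# Hypothetical plate data (arbitrary signal units).
neg_ctrl = [1000.0, 980.0, 1020.0]   # no inhibitor: full signal
pos_ctrl = [100.0, 90.0, 110.0]      # reference inhibitor: background signal
wells = {"cmpd-001": 950.0, "cmpd-002": 400.0, "cmpd-003": 120.0}

hits = call_hits(wells, neg_ctrl, pos_ctrl)
hit_rate = len(hits) / len(wells)
print(hits, f"hit rate = {hit_rate:.0%}")
```

In a real library the hit rate would be far lower than in this three-compound toy example, which is why confirmation and counter-screens usually follow the primary screen.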

Alternative Approaches to Hit Identification

Fragment-Based Drug Discovery (FBDD): Screens small fragments that bind weakly to targets but can be optimized iteratively into potent compounds (Bon et al., 2022).
Virtual Screening: Computational docking and machine learning models predict binding affinity, narrowing candidates for experimental testing (Lyu et al., 2023).
DNA-Encoded Library (DEL) Technology: Each compound is tagged with a DNA barcode encoding its structure. DEL enables screening of up to 10¹² molecules in a single tube, requiring minimal protein and time. This approach has become central to DNA-encoded library drug discovery (Satz et al., 2022).
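Because each DEL member carries a DNA barcode, a selection experiment is commonly read out by sequencing and counting barcodes. The sketch below is one simple way to score enrichment, assuming barcode reads are available for both the target selection and the unselected input library. The three-letter barcodes, read counts, and pseudocount are hypothetical, and this is not a description of any specific vendor's analysis pipeline.

```python
from collections import Counter


def enrichment(selected_reads, input_reads, pseudocount=1.0):
    """Ratio of each barcode's read frequency after selection to its frequency in the input library."""
    selected = Counter(selected_reads)
    library = Counter(input_reads)
    barcodes = set(selected) | set(library)
    n = len(barcodes)
    selected_total = sum(selected.values()) + pseudocount * n
    library_total = sum(library.values()) + pseudocount * n
    scores = {}
    for barcode in barcodes:
        # The pseudocount avoids division by zero for barcodes missing from one sample.
        f_selected = (selected[barcode] + pseudocount) / selected_total
        f_library = (library[barcode] + pseudocount) / library_total
        scores[barcode] = f_selected / f_library
    return scores


# Hypothetical reads: four library members, evenly represented before selection.
input_reads = ["AAA", "CCC", "GGG", "TTT"] * 25
selected_reads = ["AAA"] * 70 + ["CCC"] * 10 + ["GGG"] * 10 + ["TTT"] * 10

scores = enrichment(selected_reads, input_reads)
top_barcode = max(scores, key=scores.get)
print(top_barcode, round(scores[top_barcode], 2))
```

Barcodes with a ratio well above 1 are candidates for resynthesis without the DNA tag and confirmation in an orthogonal assay.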

5. Assay Development

Before large-scale screening or optimization, assay development ensures that biological activity and toxicity can be reliably measured. Robust assays improve hit quality and reduce false positives.

Types of Assays: Include biochemical assays (enzyme activity), cell-based assays (signal transduction), and phenotypic assays (functional outcomes).
Key Considerations: Sensitivity, reproducibility, scalability, and physiological relevance.
Impact on Hit Quality: Assay selection influences which hits are identified. For example, highly artificial assays may capture binders with no therapeutic potential, while physiologically relevant assays enrich for clinically translatable molecules.
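Reproducibility and signal window are often summarized with a single statistic. One widely used choice, not named in this article and shown here only as an illustration, is the Z'-factor, which compares the separation between positive and negative control means to their combined variability. The control values below are hypothetical.

```python
from statistics import mean, stdev


def z_prime(pos_ctrl, neg_ctrl):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    separation = abs(mean(pos_ctrl) - mean(neg_ctrl))
    return 1.0 - 3.0 * (stdev(pos_ctrl) + stdev(neg_ctrl)) / separation


# Hypothetical control wells (arbitrary signal units).
pos_ctrl = [100.0, 90.0, 110.0]
neg_ctrl = [1000.0, 980.0, 1020.0]

z = z_prime(pos_ctrl, neg_ctrl)
print(f"Z' = {z:.2f}")
```

Values above roughly 0.5 are conventionally taken to indicate an assay window wide enough for screening, although the acceptable cutoff depends on the assay format.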

According to Danaher Life Sciences, assay development is increasingly critical as drug discovery integrates HTS, DELs, and phenotypic screening.

6. Hit-to-Lead (H2L) Optimization

The hit-to-lead (H2L) stage focuses on refining hits into more potent, selective, and drug-like molecules.

Strategies for H2L Optimization

Structure–Activity Relationship (SAR) Studies: Synthesize analogs to explore how chemical modifications affect activity.
Early ADME/Toxicity Profiling: Screen candidates for solubility, permeability, metabolic stability, and cytotoxicity.
Optimization for Drug-Like Properties: Modify compounds to improve bioavailability, minimize off-target interactions, and enhance selectivity (Campbell et al., 2018).
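Drug-likeness is often triaged with simple property filters before detailed profiling. The sketch below applies Lipinski's rule of five, a common heuristic for oral drug-likeness that this article does not itself prescribe, to precomputed descriptors. The compound descriptors are hypothetical; in practice they would come from a cheminformatics toolkit or measured data.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water partition coefficient (log scale)
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count violations of Lipinski's rule of five."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical hit series.
compounds = {
    "hit-A": Descriptors(350.4, 2.1, 2, 5),
    "hit-B": Descriptors(720.9, 6.3, 6, 12),
}
violations = {name: rule_of_five_violations(d) for name, d in compounds.items()}
print(violations)
```

A common convention is to flag compounds with more than one violation for closer review rather than discarding them outright, since many approved drugs fall outside these limits.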

Vipergen highlights how DEL hit identification can be paired with rational H2L optimization, accelerating time-to-lead compared with traditional HTS approaches.

7. Lead Optimization

Lead optimization further refines molecules for potency, safety, and pharmacokinetics.

Key Aspects

Stereochemistry: Enantiomeric differences can significantly impact potency or toxicity.
Scaffold Hopping: Replacing core molecular scaffolds while retaining binding to improve solubility or reduce off-target effects (Hu et al., 2017).
Pharmacokinetic Optimization: Adjust lipophilicity, design prodrugs, or shield metabolic hot spots to enhance bioavailability and reduce clearance (Ballard et al., 2013).
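To make the link between clearance and exposure concrete, the sketch below uses the standard one-compartment relationship t½ = ln(2) · Vd / CL. The one-compartment assumption and the numerical values are illustrative only and do not describe any particular compound.

```python
import math


def half_life_hours(volume_of_distribution_l, clearance_l_per_h):
    """Elimination half-life for a one-compartment model: ln(2) * Vd / CL."""
    return math.log(2) * volume_of_distribution_l / clearance_l_per_h


# Hypothetical lead before and after blocking a metabolic hot spot.
before = half_life_hours(volume_of_distribution_l=50.0, clearance_l_per_h=10.0)
after = half_life_hours(volume_of_distribution_l=50.0, clearance_l_per_h=5.0)
print(f"before: {before:.2f} h, after: {after:.2f} h")
```

With the volume of distribution held constant, halving clearance doubles the half-life, which is why shielding metabolic hot spots can have a large effect on dosing frequency.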

At this stage, medicinal chemistry and pharmacology converge to deliver candidates suitable for preclinical development.

8. Preclinical Development

Before human trials, candidates undergo extensive testing in vitro and in vivo. This stage typically lasts 3–6 years.

Components

Toxicology Studies: Evaluate acute/chronic toxicity, carcinogenicity, genotoxicity, and reproductive effects.
Pharmacokinetic Studies: Assess ADME properties to confirm expected behavior in animals.
Formulation Development: Optimize delivery, stability, and bioavailability.

If successful, researchers submit an Investigational New Drug (IND) application to the FDA or, in the European Union, a Clinical Trial Application (CTA). Regulatory clearance is required before clinical trials can begin.

9. Clinical Trials

Clinical development remains the most resource-intensive phase, often spanning 8–10 years. Trials progress through three major phases:

Phase I

Participants: 20–100 healthy volunteers.
Focus: Safety, dosage, pharmacokinetics.
Success Rate: ~60–70% advance to Phase II.

Phase II

Participants: 100–500 patients with the target disease.
Focus: Efficacy, dose ranging, short-term safety.
Success Rate: ~30–35% advance to Phase III.

Phase III

Participants: Thousands of patients across multiple sites.
Focus: Confirm efficacy, monitor long-term safety.
Success Rate: ~50–60% achieve endpoints and support regulatory submission.

If Phase III is successful, a New Drug Application (NDA) is filed with the FDA, or a Marketing Authorisation Application (MAA) with the EMA, for approval.
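Multiplying the per-phase success rates quoted above gives a rough sense of overall clinical attrition. The sketch below does that arithmetic with the low and high ends of each range. It assumes the phases are independent and ignores the regulatory review step, so it is an approximation rather than a formal estimate.

```python
# Per-phase success ranges as quoted in this section (low, high).
phase_success = {
    "Phase I": (0.60, 0.70),
    "Phase II": (0.30, 0.35),
    "Phase III": (0.50, 0.60),
}


def cumulative_range(phases):
    """Multiply the low ends and the high ends to bound the overall success probability."""
    low = high = 1.0
    for phase_low, phase_high in phases.values():
        low *= phase_low
        high *= phase_high
    return low, high


low, high = cumulative_range(phase_success)
print(f"Phase I entry to successful Phase III: {low:.1%} to {high:.1%}")
```

The result, roughly 9% to 15%, is in line with the 10–15% overall probability cited in the introduction.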

10. Emerging Technologies in Drug Discovery

Artificial Intelligence and Machine Learning

AI/ML can help lower costs, shorten development time, and improve predictive accuracy. Applications include:

Target Identification: Mining omics datasets for druggable targets.
Virtual Screening: Rapidly evaluating millions of compounds.
ADME/Toxicity Prediction: Modeling safety liabilities before costly in vivo work.

Challenges remain in data quality, model interpretability, and integration into regulatory workflows.

High-Throughput Screening and DNA-Encoded Libraries

Traditional HTS can test up to 10⁶ compounds but is expensive. In contrast, DEL screening covers up to 10¹² molecules per experiment, requiring nanogram protein amounts and minimal assay time. Innovations such as in-cell DEL and Vipergen’s YoctoReactor® platform enhance physiological relevance and synthetic diversity.

CRISPR and Gene Editing

CRISPR enables precise genome engineering for target validation and disease modelling. Knockout cell lines and engineered disease models reduce uncertainty about whether target modulation translates to therapeutic benefit (Cytosurge on CRISPR applications).

Precision Medicine, Generative AI, and Regulatory Acceleration

Precision Medicine: Stratifies patients based on genetics/biomarkers to tailor therapies (Danaher Life Sciences).
Generative AI: Accelerates molecular design, especially when powered by GPU-accelerated computing.
Regulatory Innovation: Pathways like priority review, conditional approval, and orphan drug designations shorten timelines for high-need therapies.

11. Conclusion

The drug discovery process is a complex, interdisciplinary journey requiring sustained collaboration between biology, chemistry, and pharmacology. While the path from target identification to clinical approval remains long—12–13 years on average—emerging technologies are beginning to reshape success rates and timelines.

Artificial intelligence, DNA-encoded library drug discovery, and CRISPR gene editing offer powerful ways to accelerate early discovery and reduce risk. Trends such as precision medicine, generative AI, and regulatory acceleration further signal a shift toward more efficient and personalized development.

Companies like Vipergen are at the forefront of these innovations, offering proprietary DEL platforms that open new chemical space and improve hit discovery efficiency. As these tools mature, the industry can expect shorter timelines, lower attrition rates, and more effective therapies for patients worldwide.

References

Aebersold, R., & Mann, M. (2016). Mass-spectrometric exploration of proteome structure and function. Nature, 537(7620), 347–355.
Ballard, P., et al. (2013). Metabolism and pharmacokinetic optimization strategies in drug discovery. In R. G. Hill & H. P. Rang (Eds.), Drug Discovery and Development (2nd ed., pp. 135–155). Churchill Livingstone.
Bon, M., et al. (2022). Fragment-based drug discovery: the importance of high-quality molecule libraries. Molecular Oncology, 16(21), 3761–3777.
Campbell, I. B., et al. (2018). Medicinal chemistry in drug discovery in big pharma: past, present and future. Drug Discovery Today, 23(2), 219–234.
DiMasi, J. A., et al. (2016). Innovation in the pharmaceutical industry: New estimates of R&D costs. Journal of Health Economics, 47, 20–33.
Emmerich, C. H., et al. (2021). Improving target assessment in biomedical research: the GOT-IT recommendations. Nature Reviews Drug Discovery, 20, 64–81.
FDA (2020). Investigational New Drug (IND) Application. U.S. Food and Drug Administration. Retrieved from https://www.fda.gov.
Hu, Y., et al. (2017). Recent advances in scaffold hopping. Journal of Medicinal Chemistry, 60(4), 1238–1246.
Lyu, J., et al. (2023). Modeling the expansion of virtual screening libraries. Nature Chemical Biology, 19, 712–718.
Macarron, R., et al. (2011). Impact of high-throughput screening in biomedical research. Nature Reviews Drug Discovery, 10, 188–195.
Moore, J. (2015). The impact of CRISPR–Cas9 on target identification and validation. Drug Discovery Today, 20(4), 450–457.
Overington, J. P., et al. (2006). How many drug targets are there? Nature Reviews Drug Discovery, 5(12), 993–996.
Satz, A. L., et al. (2022). DNA-encoded chemical libraries. Nature Reviews Methods Primers, 2(3), 1–17.
Swinney, D. C., & Anthony, J. (2011). How were new medicines discovered? Nature Reviews Drug Discovery, 10(7), 507–519.
Vincent, F., et al. (2022). Phenotypic drug discovery: recent successes, lessons learned and new directions. Nature Reviews Drug Discovery, 21, 899–914.
Visscher, P. M., et al. (2017). 10 years of GWAS discovery: Biology, function, and translation. American Journal of Human Genetics, 101(1), 5–22.
