Drug Delivery & Targeting e.g. Nanoparticles – Subtopic Landscape

A synthetic biology perspective

The subset of SynBio – drug delivery & targeting e.g. nanoparticles related patents was further investigated to identify subtopics and assess trending areas. The topic model leverages a hybrid approach based on the optimised extractive summary for each publication, combining topic discovery via fine-tuned transformer based deep learning with ground truth cross referencing via keywords and classification codes. The process enables a patent to belong to more than one topic for accurate multi-classification trends, accounting for multiple invention embodiments. To avoid duplication here, please see the topic model page for further details of the topic model methodology.

Subtopic landscape

The synthetic biology – drug delivery & targeting e.g. nanoparticles topic model is visualised in figure 12.9. The visual is based on the dimensionality reduction of vector embeddings, which maps each patent to a contextually relevant x & y coordinate; the categorical clusters are colour coded to support review. For simplicity, the visual assigns each patent to one key subtopic. However, the trend analysis enables a patent to belong to more than one subtopic, which is consistent with the topic model methodology throughout this project.

Subtopic model – technology cluster totals

The hybrid topic model methodology identified 20 diverse topics which are ranked based on the total number of published applications in figure 12.10. A patent application can be counted more than once as it can belong to multiple topics.
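
The multi-label counting described above can be illustrated with a minimal sketch. The patent identifiers and subtopic assignments below are hypothetical placeholders, not project data, and the function names are assumptions introduced for illustration. The sketch shows why subtopic percentages can sum to more than 100%: each patent adds one count to every subtopic it belongs to, while the denominator is the number of distinct patents.

```python
from collections import Counter

# Hypothetical patent-to-subtopic assignments (illustrative only, not project data).
patent_topics = {
    "EP0000001": ["cancer related", "pegylated"],
    "EP0000002": ["cancer related", "antibody uses/therapeutics"],
    "EP0000003": ["lipid nanoparticles"],
    "EP0000004": ["pegylated", "lipid nanoparticles", "vaccine related"],
}


def subtopic_totals(assignments):
    """Count patents per subtopic; a patent adds one to every subtopic it belongs to."""
    counts = Counter()
    for topics in assignments.values():
        counts.update(set(topics))  # set() guards against a subtopic listed twice
    return counts


def subtopic_shares(assignments):
    """Percentage of distinct patents classified within each subtopic."""
    n_patents = len(assignments)
    totals = subtopic_totals(assignments)
    return {topic: 100.0 * count / n_patents for topic, count in totals.items()}


totals = subtopic_totals(patent_topics)
shares = subtopic_shares(patent_topics)
for topic, count in totals.most_common():
    print(f"{topic}: {count} patents ({shares[topic]:.1f}%)")
```

Because the shares are computed against distinct patents, their sum exceeds 100% whenever at least one patent carries more than one label.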

In figure 12.10, the analysis enables multilabel classification for each patent application, to account for multiple invention embodiments. During the 20 year publication period 2004-2023, nearly 47% of the patents identified were classified within the cancer related subtopic. Nanoparticles are an emerging drug delivery platform for cancer treatment and diagnosis, with improved drug delivery, imaging and enhanced immunotherapy potential. Almost 41% of patents were classified within the antibody uses/therapeutics topic. Antibodies can be conjugated to nanoparticles, drugs, etc. and can also be used to target gene delivery vectors to specific cells or tissue during gene therapy. Nearly 36% of documents were classified in the pegylated topic. Pegylated nanoparticles (coated with polyethylene glycol) are targeted drug delivery vehicles with improved circulation time and reduced immunogenicity. Other technologies such as drug-peptide (34%), liposomes (27.3%) and lipid nanoparticles (6.4%) represent important delivery vehicles.

Therapeutic areas are led by gene therapy (27.5%), neurodegenerative e.g. Alzheimer’s (14.4%), vaccine related (13.1%), metabolic related (11.9%) and immunotherapy peptides and receptors (6.9%). Fusion proteins (16.8%) and antibody drug conjugates (13%) are important therapeutics. There is also a microorganism focus, with the delivery of antivirals (18%) and antimicrobial peptides and compounds (22.6%). Nanoparticles can also be used for gene / genome editing and may be used to treat cancer, where bacteria benefit from enhanced permeability and retention within tumours. Approximately 7% of documents were classified within the genetically modified microorganisms subtopic.

The drug delivery & targeting e.g. nanoparticles subtopic publication year trends are shown in figure 12.11. The publication trends discussed below are based on EP A1/A2 applications. Identified patents can belong to more than one subtopic due to multiple invention embodiments.

In figure 12.11, all subtopics identified had positive compound annual growth rates (CAGR) during 2014-23. Three subtopics were above the 20% threshold: genetically modified microorganisms (33.6%), lipid nanoparticles (21.5%) and immunotherapy peptides & receptors (20.7%). In the genetically modified microorganisms area, UK based PROKARIUM has developed a microbial immunotherapy platform which leverages the anti-tumour mechanism of a proprietary strain of bacteria (ZH9). Functionalised and engineered microorganisms, such as bacteria modified by genetic engineering, could enable enhanced drug delivery. Lipid nanoparticles (LNPs) represent one of the most advanced technologies for efficient in vivo delivery of nucleic acids such as mRNA, exemplified by the COVID-19 vaccines. Cancer immunotherapies via nanoparticles have the potential for increased anti-tumour efficacy and improved drug retention. These technologies are growing rapidly, benefiting from a large portfolio of research and development.

The vaccine related subtopic is just below the 20% threshold at 18.1% CAGR. For example, Pfizer-BioNTech and Moderna developed mRNA lipid nanoparticles for the COVID-19 vaccine, and mRNA vaccine technology is now expanding beyond its initial use during COVID-19. The antiviral (16.3%), fusion proteins (15.9%) and antibody uses/therapeutics (15.6%) subtopics are above the 15% threshold. Gene therapy (14.9%), antibody-drug conjugates (14.5%), antimicrobial peptides & compounds (14.4%) and liposomes (12.8%) are comfortably above the 10% threshold. The larger cancer related topic area is growing at a respectable 10.4% CAGR and represents an important therapeutic area. The pegylated subtopic had one of the largest publication totals in 2023 (428 publications) and grew at 9.4% CAGR during 2014-23, representing an important type of drug delivery vehicle.
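
For readers less familiar with the growth metric quoted above, the following is a minimal sketch of a compound annual growth rate calculation. The yearly counts are hypothetical, and treating 2014-23 as nine year-on-year intervals is an assumption, since the exact convention used for the figures in this report is not stated here.

```python
def cagr(first_count, last_count, n_intervals):
    """Compound annual growth rate between two yearly publication counts.

    n_intervals is the number of year-on-year steps between the two counts.
    """
    if first_count <= 0 or n_intervals <= 0:
        raise ValueError("first_count and n_intervals must be positive")
    return (last_count / first_count) ** (1.0 / n_intervals) - 1.0


# Hypothetical counts (not project data): 100 publications in 2014, 250 in 2023.
# Assumption: 2014 to 2023 is treated as 9 year-on-year intervals.
growth = cagr(100, 250, 2023 - 2014)
print(f"CAGR 2014-23: {growth:.1%}")
```

A subtopic with a small starting count can show a high CAGR from a modest absolute increase, which is why the growth rates are best read alongside the publication totals.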

Subtopic top 20 assignees distributions (2014-23)

The patent portfolios of the top 20 assignees within the SynBio – drug delivery & targeting e.g. nanoparticles dataset are analysed in figure 12.12. The portfolios are restricted to publications during 2014-23 and mapped to the 20 subtopics identified; the counts represent total EPO publications.

The heatmap in figure 12.12 reveals the distribution of the top 20 drug delivery & targeting e.g. nanoparticles assignees during 2014-23. Publications can be assigned to more than one subtopic, reflecting multiple invention embodiments. In figure 12.12, MODERNA is a leading assignee for lipid nanoparticles, followed by TRANSLATE BIO and CUREVAC. All of the top 20 assignees have pegylated drug delivery vehicles such as nanoparticles, led by MODERNA and TRANSLATE BIO. UK headquartered GLAXOSMITHKLINE has a diverse portfolio which is focused towards vaccines, antimicrobial peptides & compounds, liposomes, pegylated drug delivery vehicles, antibodies, drug peptides and gene therapy. A smaller part of the GSK portfolio is within cancer related technologies. Antibody drug conjugates are an important area for HOFFMANN LA ROCHE and MEDIMMUNE, followed by GENENTECH and NOVARTIS.

Genetically modified microorganisms are an important aspect of the UNIVERSITY OF PENNSYLVANIA portfolio. From a UK perspective, GLAXOSMITHKLINE is also active in this area. ALNYLAM PHARMACEUTICALS is active in the gene therapy field; interfering nucleic acids and gene expression regulation are key subtopics within its portfolio. PFIZER is active in the drug peptides subtopic and is focused towards cancer therapeutics, with an emphasis on vaccines and pegylated drug delivery. Liposomes are also an important aspect of many assignees’ portfolios, with GLAXOSMITHKLINE, MODERNA & TRANSLATE BIO having the largest distributions within the top 20 assignees.

The analysis does not account for publications prior to 2014, which may have contributed to companies developing market share, etc., or for potential licensing and acquisitions (subsidiaries). Data cleaning was carried out to clean and consolidate assignee names. The analysis is an informative guide, as some specific subtopics have strict content boundaries to enable differentiation, whilst others are broader to capture more generic areas.

Patent family territory analysis

The INPADOC patent families comprising the identified drug delivery & targeting e.g. nanoparticles related EPO patents were analysed to identify the top 30 territories where patents are filed. Analysing the publication countries alone is insufficient, as major countries such as France, the UK, Germany, etc. may not publish patents going through the European (EPO) route, especially when pending. To supplement the available data, a bespoke analysis was conducted, standardising the publication countries and including ‘protected countries’ to capture patent rights which are pending or granted based on legal status. There are caveats which include:

  • The study methodology is focused on EPO patents and may not capture assignees/applicants that file only in their home territories or do not file in Europe via the EPO.
  • The protected country data may not be fully up to date, due to INPADOC data availability and where EPO patents are recent filings.

The standardisation procedure ensures a territory is only counted once per family. The territory analysis is visualised in figure 12.13; EPO and WO (PCT) patents have been included for reference purposes. Despite the caveats, the analysis provides useful indicators regarding territories where applicants are filing patents within the drug delivery & targeting e.g. nanoparticles field, based on 2014-23 publications for a relatively recent perspective.

In figure 12.13, approximately 91% (91.4%) of the patent families identified had at least one US national filing. Other key territories with at least one national filing include Japan (70.5%), China (66.7%), Canada (62%) and Australia (54.1%). Below the 50% threshold, key territories include the Republic of Korea (44.2%), India (33.9%) and Germany (32.3%).
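
The once-per-family counting behind these percentages can be sketched as follows. The family identifiers, country codes and record layout are hypothetical and chosen only for illustration; the actual INPADOC data structure and standardisation rules used in the study are not reproduced here.

```python
# Hypothetical family records (illustrative only): each family lists publication
# countries and 'protected countries', possibly with repeated entries.
families = {
    "FAM1": {"published": ["EP", "US", "US", "JP"], "protected": ["DE", "GB", "FR"]},
    "FAM2": {"published": ["EP", "WO", "US"], "protected": ["DE"]},
    "FAM3": {"published": ["EP", "CN"], "protected": []},
}


def territory_shares(family_records):
    """Percentage of families with at least one filing or right in each territory."""
    counts = {}
    for record in family_records.values():
        # The set union removes repeats, so a territory counts once per family.
        territories = set(record["published"]) | set(record["protected"])
        for territory in territories:
            counts[territory] = counts.get(territory, 0) + 1
    n_families = len(family_records)
    return {t: 100.0 * c / n_families for t, c in counts.items()}


shares = territory_shares(families)
for territory, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{territory}: {share:.1f}% of families")
```

In this toy data the two US publications in FAM1 contribute a single count, which mirrors the standardisation described above.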

Investigating keyword trends provides a different perspective beyond the drug delivery & targeting subtopic model. The smart summaries used during the topic model stage were data mined for the most contextually important keywords, leveraging transformer based embeddings. Keywords and phrases most similar to the document were identified and then manually audited for relevance to the SynBio project; the results are visualised in figure 12.14. The visualisation indicates how the cumulative publication counts changed between the publication periods 2014-18 and 2019-23. The methodology aims to identify contextually relevant and reliable keywords as a source of ground truth, to signify important keywords within the corpus and to audit the topic model subtrend analysis already carried out.
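
The idea of selecting keywords that are most similar to a document can be illustrated with a minimal sketch based on cosine similarity. The three-dimensional vectors below are hypothetical stand-ins for transformer embeddings, and the function names are assumptions; the embedding model and similarity measure actually used in the project are not specified in this section.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_keywords(document_vector, candidate_vectors, top_n=3):
    """Rank candidate keywords by similarity to the document embedding."""
    scored = [
        (keyword, cosine_similarity(document_vector, vector))
        for keyword, vector in candidate_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical embeddings; real ones would come from a transformer model
# and have hundreds of dimensions.
summary_embedding = [0.9, 0.1, 0.3]
candidates = {
    "lipid nanoparticle": [0.8, 0.2, 0.3],
    "mrna": [0.7, 0.0, 0.6],
    "method": [0.1, 0.9, 0.1],
}

ranked = rank_keywords(summary_embedding, candidates)
for keyword, score in ranked:
    print(f"{keyword}: {score:.3f}")
```

Generic terms such as "method" score low against the document vector in this toy example, which is the behaviour that makes a similarity ranking a useful first filter before manual auditing.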

In figure 12.14, the following key findings are observed and also support the trending areas identified by the subtopic modelling:

  • Cancer is a major therapeutic area of importance, increasing from 484 publications in 2014-18 to 1106 in 2019-23; tumor also grew from 230 to 495 publications. Antibody, a key therapeutic, grew rapidly from 391 to 806 publications.
  • Nanoparticles have become an increasingly important delivery vehicle, increasing from 336 publications in 2014-18 to 584 in 2019-23. Lipid nanoparticles increased from 40 to 208 publications between these periods. There was similar growth for liposomes (187 publications) and PEG (polyethylene glycol, 163 publications), an important component of nanoparticles.
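
The relative size of these increases can be compared with a short calculation. The counts are those quoted in the findings above; reading the first figure of each pair as the 2014-18 count and the second as the 2019-23 count is an assumption based on the two publication periods described for figure 12.14.

```python
# Keyword publication counts quoted in the findings above.
# Assumption: (2014-18 count, 2019-23 count) for each keyword.
keyword_counts = {
    "cancer": (484, 1106),
    "tumor": (230, 495),
    "antibody": (391, 806),
    "nanoparticles": (336, 584),
    "lipid nanoparticles": (40, 208),
}


def period_change(earlier, later):
    """Return the absolute change, percentage change and ratio between two periods."""
    absolute = later - earlier
    percentage = 100.0 * absolute / earlier
    ratio = later / earlier
    return absolute, percentage, ratio


changes = {kw: period_change(*pair) for kw, pair in keyword_counts.items()}
for keyword, (absolute, percentage, ratio) in sorted(
    changes.items(), key=lambda item: item[1][2], reverse=True
):
    print(f"{keyword}: +{absolute} publications ({percentage:.0f}%, x{ratio:.2f})")
```

On these figures lipid nanoparticles show the largest relative increase, while cancer shows the largest absolute increase.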

Subtopic keyword analysis

For a further perspective on contextually important keywords, a statistical procedure was applied to six subtopics selected from the corpus. The analysis contrasts how the usage or frequency of keywords / phrases differs across the subtopics using a weighted log odds ratio. It aims to identify which differences are meaningful by weighting the log odds ratio with a prior, as outlined in Monroe, Colaresi, and Quinn (2008). The prior is estimated from the data itself rather than being an uninformative prior, such as an uninformative Dirichlet prior. The procedure is therefore an empirical Bayes approach, with results shown in figure 12.15. A further motivation is to audit the subtopics for relevance and transparency and to provide insights into their content. As a side note, the transformer based keyword analysis provides powerful methods to review subtopics and extends the analytical power beyond corpus evaluation procedures such as TF-IDF (term frequency-inverse document frequency).
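
A minimal sketch of a weighted log odds calculation in the style of Monroe, Colaresi, and Quinn (2008) is given below. It makes several assumptions that are not confirmed by this report: the prior for each keyword is set to its corpus-wide count (one common empirical Bayes choice), each subtopic is compared against all other subtopics combined, and the keyword counts are hypothetical. It should be read as an illustration of the technique rather than the implementation used for figure 12.15.

```python
import math


def weighted_log_odds(counts):
    """Weighted log odds (z-scores) per (subtopic, keyword).

    counts maps subtopic -> {keyword: positive count}. The prior for each
    keyword is its total count across all subtopics (estimated from the data).
    """
    vocabulary = {kw for keywords in counts.values() for kw in keywords}
    prior = {kw: sum(k.get(kw, 0) for k in counts.values()) for kw in vocabulary}
    prior_total = sum(prior.values())
    group_total = {group: sum(k.values()) for group, k in counts.items()}

    scores = {}
    for group, keywords in counts.items():
        n_in = group_total[group]      # all keyword counts in this subtopic
        n_out = prior_total - n_in     # all keyword counts in the other subtopics
        for kw in vocabulary:
            y_in = keywords.get(kw, 0)
            y_out = prior[kw] - y_in
            alpha = prior[kw]
            # Prior-smoothed odds of the keyword inside and outside the subtopic.
            odds_in = (y_in + alpha) / (n_in + prior_total - y_in - alpha)
            odds_out = (y_out + alpha) / (n_out + prior_total - y_out - alpha)
            # Approximate variance of the log odds difference.
            variance = 1.0 / (y_in + alpha) + 1.0 / (y_out + alpha)
            delta = math.log(odds_in) - math.log(odds_out)
            scores[(group, kw)] = delta / math.sqrt(variance)
    return scores


# Hypothetical keyword counts per subtopic (illustrative only, not project data).
toy_counts = {
    "lipid nanoparticles": {"mrna": 30, "ionizable lipid": 20, "antibody": 5},
    "cancer related": {"mrna": 5, "ionizable lipid": 1, "antibody": 40},
    "gene therapy": {"mrna": 10, "aav": 25, "antibody": 5},
}

scores = weighted_log_odds(toy_counts)
for group in toy_counts:
    top_kw = max(toy_counts[group], key=lambda kw: scores[(group, kw)])
    print(f"{group}: most characteristic keyword = {top_kw}")
```

Dividing by the estimated standard deviation is what makes the score "weighted": a large log odds difference based on very few occurrences is pulled towards zero, so rare keywords do not dominate the ranking.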

In figure 12.15, the keywords shown are the most characteristic of each subtopic based on the weighted log odds score, which is labelled. A higher log odds score indicates that the keyword is more likely to be used within that specific subtopic. Some of the log odds scores are not very high, which is not surprising given the overlap between the multiple subtopics identified within the specific topic landscape.

Some key findings observed are:

  • Lipid nanoparticles – delivery of mRNA and RNA molecules and mRNA therapies plus vaccines. The ionizable lipid keyword reflects an important component of the lipid nanoparticle for nucleic acid delivery. The ionizable lipid is neutral at physiological pH, minimising toxicity, but positively charged in acidic endosomes after cellular uptake. This enables the nucleic acid to be protected and cellular entry to be facilitated.
  • Genetically modified microorganism – used for gene therapy, exemplified by adeno-associated viruses (AAV), and with vaccine applications; lentiviral vectors are also important for drug delivery and targeting.
  • Cancer related – antibody drug conjugates (ADCs) are important drug delivery vehicles and immunoconjugates enable drug delivery. Specific therapeutics include auristatin, used as a payload for ADCs for cancer treatment; camptothecin, with anticancer properties; paclitaxel (chemotherapy drug); and trastuzumab (targeted cancer drug).

It is difficult to distil and characterise the coverage of the subtopics via restricted keywords and phrases. This is complicated further by the weighting not always being frequency led, but reflective of the terminology and context which is more characteristic of one subtopic in relation to others. It is fair to conclude that the subtopic model has successfully captured an extensive set of distinct subtrends; overlap exists, but the trends are accurate once audited. The keywords are relevant to real world applications and suggest the insights identified are a useful tool to examine the specific topic landscape.