Accelerating pancreatic cancer drug screening by leveraging genomics to select better in vitro models.

November 6, 2015 | April 23rd, 2021 | Publications

Cell lines used for preclinical testing of oncology compounds are not always chosen based on how well they model patient tumors. Instead, they are often chosen based on availability and prevalence in the literature. High-throughput genomic profiling has demonstrated a causative relationship between genomic features and drug response, suggesting that cancer drug discovery could be accelerated by using genomics as a criterion for finding ideal cell lines for a given cancer type. The overall success rate of oncology clinical trials is dismally low, especially in pancreatic cancer. Pancreatic cancer has a five-year survival rate of 5-6% and is predicted to become the second leading cause of cancer death by 2030, with a dearth of promising medicines currently in trials. To identify optimal cell lines for drug testing in pancreatic cancer, we used gene expression, mutation, and copy number variation (CNV) data to compare tumors from The Cancer Genome Atlas (TCGA) with cell lines in the Cancer Cell Line Encyclopedia (CCLE). To approximate cell line usage, we combined the number of hits for each cell line in PubMed and Google Scholar. Fewer than 20% of the queried pancreatic cancer cell lines accounted for more than 88% of the total search hits, demonstrating a strong bias towards certain cell lines. We calculated the CNV correlation between each cell line and each tumor. Cell lines that were popular in the literature, such as DAN-G (24% of citations), were often ranked worst by CNV correlation with tumors, while some rarely cited cell lines, such as L33, had among the highest CNV correlations. Next, we filtered mutation data using publicly available mutation scoring algorithms to select the mutations most likely to drive cancer. Hierarchical clustering was applied to the tumor samples and cell lines together, based on the presence or absence of the top-scoring mutations, in order to pinpoint cell lines with mutational spectra similar to those of tumors.
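The CNV comparison described above can be illustrated with a minimal sketch. It assumes gene-level copy-number values per sample, Pearson correlation as the similarity measure, and ranking by the mean correlation across tumors. The abstract does not state which correlation or aggregation was used, so these choices, the function names, and all data values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        return 0.0  # a flat profile carries no correlation signal
    return sum(a * b for a, b in zip(dx, dy)) / denom

def rank_cell_lines(cell_lines, tumors):
    """Rank cell lines by mean CNV correlation across all tumors, best first.

    Averaging over tumors is an assumption made for this sketch.
    """
    scores = {}
    for name, profile in cell_lines.items():
        per_tumor = [pearson(profile, t) for t in tumors.values()]
        scores[name] = sum(per_tumor) / len(per_tumor)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy copy-number values (e.g. log2 ratios) over five genes; all invented.
tumors = {
    "tumor_1": [0.8, -0.5, 0.1, 0.9, -0.7],
    "tumor_2": [0.6, -0.4, 0.0, 1.1, -0.9],
}
cell_lines = {
    "line_A": [0.7, -0.6, 0.2, 1.0, -0.8],  # resembles the tumors
    "line_B": [-0.9, 0.8, 0.1, -0.5, 0.6],  # roughly inverted profile
}

ranking = rank_cell_lines(cell_lines, tumors)
for name, score in ranking:
    print(f"{name}: mean r = {score:.2f}")
```

In this toy data, the cell line whose profile tracks the tumors ranks first and the inverted one receives a negative score, which is the kind of contrast the abstract reports between rarely cited and heavily cited lines.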
In support of the observations made in the CNV data, popular cell lines such as DAN-G clustered with other cell lines, whereas L33 clustered predominantly among tumor samples, providing further evidence that L33 may be an ideal cell line for modeling pancreatic cancer drug response. To leverage all available data types, the selected CNVs and mutations were combined into a pathway-level event matrix based on the number of relevant mutations or CNVs within a given pathway, and the matrix was then clustered. Unsurprisingly, these results show that most cell lines are much more similar to each other than to tumors. However, a few cell lines (including L33) cluster with tumor samples. Overall, our results demonstrate that, across the data types analyzed, L33 is the cell line most similar to pancreatic tumors. We believe that selecting preclinical screening methods that best match relevant tumor biology and genomic drivers could help accelerate the development of new medicines for a variety of cancers.
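As a rough illustration of the pathway-level event matrix, the sketch below counts altered genes per pathway for each sample and then finds each sample's nearest neighbor by Euclidean distance. The nearest-neighbor lookup is a simplified stand-in for the hierarchical clustering used in the study, not the authors' method. The gene-to-pathway map, the sample names, and the alteration sets are hypothetical, and each altered gene is counted once regardless of whether the event is a mutation or a CNV.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Illustrative gene-to-pathway map; an assumption for this sketch only.
PATHWAYS = {
    "RAS signaling": {"KRAS", "BRAF"},
    "Cell cycle": {"CDKN2A", "TP53"},
    "TGF-beta": {"SMAD4", "TGFBR2"},
}

def pathway_counts(altered_genes):
    """Count the selected altered genes that fall in each pathway."""
    return [len(genes & altered_genes) for genes in PATHWAYS.values()]

# Invented sets of genes carrying a selected mutation or CNV per sample.
samples = {
    "tumor_1": {"KRAS", "TP53", "SMAD4"},
    "tumor_2": {"KRAS", "CDKN2A", "TP53"},
    "line_A": {"KRAS", "TP53", "CDKN2A", "SMAD4"},
    "line_B": {"BRAF", "TGFBR2", "SMAD4"},
    "line_C": {"TGFBR2", "SMAD4"},
}

# Rows are samples, columns are pathways, values are event counts.
matrix = {name: pathway_counts(genes) for name, genes in samples.items()}

def nearest_neighbor(name):
    """Return the other sample whose pathway profile is closest to `name`."""
    others = (other for other in matrix if other != name)
    return min(others, key=lambda other: dist(matrix[name], matrix[other]))

for name in samples:
    if name.startswith("line_"):
        print(f"{name} is closest to {nearest_neighbor(name)}")
```

A full analysis would pass the same matrix to a hierarchical clustering routine and inspect which cell lines fall into tumor-dominated branches; the nearest-neighbor check only conveys the idea on a small scale.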