With the large amount of data generated in the field of life sciences, Bioinformatics has greatly helped us to find a needle in the bunch of haystack. Here is a little story of one such successful application of bioinformatics data mining.
Researchers at the University of California, San Diego, have found biomarkers for ovarian tumors by mining two National Institutes of Health (NIH)-sponsored databases. The bioinformatics-enabled research program is believed to help in early diagnosis of ovarian cancer and improve outcomes in a tough therapeutic indication.
Researchers noted that 6 mRNA isoforms are found in ovarian cancer cells but not in healthy tissues by combing through the Cancer Genome Atlas and Genotype-Tissue Expression program. The discovery was made possible by custom bioinformatics algorithms and the two NIH-sponsored, publicly accessible databases. In total, the team analyzed transcriptome sequence data from 296 ovarian cancer samples and 1,839 normal tissues, leading to the identification of 6 mRNA isoforms with sufficient tumor specificity to enable an early diagnosis.
Most of the earlier studies attempted to detect cancer using DNA. However, Christian Barrett, a UCSD bioinformaticist and first author of the paper, said , “But we wondered if we could instead develop an ovarian cancer detection test based on tumor-specific mRNA that has disseminated from cancer cells to the cervix and can be collected during a routine Pap test.” To gather clinical data and to validate the laboratory-based findings will be their next move.
Further work is going on in applying their strategy to the other 30 tumor types for which there are abundant supplies of RNA sequencing data. NIH has enabled the creation of these resources by financing multicenter data-generation programs, the scale of which is beyond the reach of almost any individual research institution. Yet while these high-profile programs have been accumulating for years the data have not been efficiently utilized.
According to Barrett and his co-authors, the raw transcriptome data being produced by these efforts has tremendous discovery potential, but to date they have not been rigorously evaluated for tumor-specific molecules for diagnostic and therapeutic applications.
Hence, with rapid accumulation of raw data, there is a need for bioinformatics algorithms for making sense of these data.
—By Ushanandini Mohanraj