Genome research and cloud computing are advancing the discovery of cures.
When Francis Collins and his team at the National Human Genome Research Institute published the majority of the human genome in 2001, he likened the genome to a book with multiple uses.
“It’s a history book […] a shop manual […] and a transformative textbook of medicine, with insights that will give healthcare providers immense new powers to treat, prevent, and cure disease.”
The story of genome research had just begun.
As biomedical researchers began delving into genome data, and subsequently adding their own chapters to the “book,” a new challenge arose: how to store, access, and share the massive amounts of data being generated. The scientific community needed new tools to enable data sharing and research collaboration.
Thankfully, cloud computing has advanced alongside genome research, enabling an unprecedented level of data sharing in the scientific world. Research-driven cloud computing platforms let scientists worldwide share information in the search for cures for life-threatening diseases, including pediatric cancer.
Each year, about 10,000 children in the U.S. under the age of 14 are diagnosed with cancer. According to the National Cancer Institute, the five-year survival rate among children is 80 percent. It’s a significant improvement from the 1970s, when the five-year survival rate was 50 percent, but there is still a long way to go.
Adding to the challenge, and strengthening the case for easier research collaboration, is the fact that pediatric cancer is less common than adult cancer. Genomics information overlaps by only 50 percent between the two. Some pediatric cancer types are so rare that the only way to accumulate enough samples to identify patterns is through collaboration. Cloud computing becomes the linchpin that enables genome research collaboration and makes it possible to manage and reliably share ever-larger datasets.
Cloud Computing Propels Pediatric Cancer Research
While the origin of the term is murky, cloud computing was built on research from the early 1960s, when MIT professor John McCarthy surfaced the notion that “[c]omputing may someday be organized as a public utility […] each subscriber needs to pay only for the capacity he actually uses, but he has access to all programming languages […]”
The term came into common parlance in 2006, when then-Google CEO Eric Schmidt used it to describe a new way to access software, computing power, and files via the web. With cloud computing, files aren’t copied but accessed via a link to the stored file. Instead of being stored on a hard drive or other local storage device, information is saved to a remote storage location. The cloud, in simple terms, is a metaphor for the Internet: it connects a collection of computers to this storage location.
Cloud computing makes it easier to move large datasets around, which in turn enables the kind of collaboration and data sharing researchers need to advance new discoveries. Instead of spending their time figuring out how to move data from place to place, they can focus on the research itself.
Goodbye, Slow Data Downloads
Cloud computing has freed researchers from the tyranny of slow data downloads, a cumbersome burden for many.
“Scientists always want their data as quickly as possible,” said Scott Newman, Ph.D., the group lead for bioinformatics analysis in the Department of Computational Biology at St. Jude Children’s Research Hospital. “They don’t like waiting around—they’ve got discoveries to make.”
He should know. It was partly his own impatience with slow data downloads that led to the development of St. Jude Cloud, the world’s largest public repository of pediatric cancer genomics data.
St. Jude Cloud provides next-generation sequencing data, analysis tools, and visualizations on a cloud computing platform designed to be fast and easy to use. Developed in partnership with Microsoft and DNAnexus, it aims to help researchers worldwide identify new genetic drivers of childhood cancer and advance cures.
The St. Jude–Washington University Pediatric Cancer Genome Project is a key data source for St. Jude Cloud, with 2,000 matched tumor-normal pairs. Additionally, data from two other St. Jude-supported genomics initiatives—the Genomes for Kids clinical trial and the St. Jude Lifetime Cohort study—are included in St. Jude Cloud. All data is aligned to the latest genome research and is processed through the Microsoft Genomics service.
St. Jude Cloud delights computational biologists like Newman who want rapid access to genomics data.
“The best thing is the instant vending of data,” Newman said. “So, if you wanted, you could get half a petabyte of data delivered to your own project in a few minutes. Compared with nine months to download a few terabytes […] that’s the truly amazing thing.”
Half a petabyte is equivalent to the amount of data that would fit on 750,000 standard CD-ROM discs.
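For readers who want to check the comparison, the arithmetic can be sketched in a few lines of Python. This is a rough back-of-the-envelope calculation, not a figure from St. Jude: it assumes decimal units (1 petabyte = 10^15 bytes) and nominal CD-ROM capacities of 650 MB and 700 MB, and the function name is purely illustrative.

```python
# Back-of-the-envelope check of the CD-ROM comparison.
# Assumptions (not from the article): decimal units (1 PB = 10**15 bytes)
# and nominal CD-ROM capacities of 650 MB and 700 MB.

PETABYTE = 10**15
MEGABYTE = 10**6


def discs_needed(total_bytes: int, disc_capacity_mb: int) -> int:
    """Return the number of discs needed to hold total_bytes, rounded up."""
    capacity_bytes = disc_capacity_mb * MEGABYTE
    return -(-total_bytes // capacity_bytes)  # ceiling division on integers


half_petabyte = PETABYTE // 2

for capacity_mb in (650, 700):
    count = discs_needed(half_petabyte, capacity_mb)
    print(f"{capacity_mb} MB discs: {count:,}")
```

Under these assumptions the answer lands between roughly 714,000 and 769,000 discs, so 750,000 is a reasonable round figure.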
In addition, the pre-packaged analysis pipelines and data visualizations aim to attract biologists who don’t code, but know their way around a cancer cell and can interpret and validate intriguing genomic observations.
Data security is paramount, particularly for genomics data, so it is addressed both by the cloud service and by the platform that supports the data stored in the cloud.
In addition to Microsoft Azure, Google Genomics and Amazon Web Services also provide cloud computing services to store human DNA for researchers who, like St. Jude and the larger scientific community, are working to unlock the secrets of the human genome.
Taking Aim at Childhood Cancer
About 1,790 U.S. children younger than 19 years old are expected to die from cancer this year; globally, that figure climbs to nearly 100,000. With platforms like St. Jude Cloud, we have a much clearer path to collaborative research and discovery that lead to finding cures and saving lives.
Keith Perry is senior vice president and chief information officer at St. Jude Children’s Research Hospital. He provides strategic counsel and leadership for the hospital’s information technology initiatives, and his role supports translational research, clinical operations, technology operations, audit/compliance and strategic planning. Perry joined St. Jude in 2015 from the University of Texas MD Anderson Cancer Center in Houston, where he served as associate vice president and deputy chief information officer.