Solid-State Drives (SSDs) deliver essential storage performance for data-intensive workloads common to medical and drug research companies.
By Gimmy Chen, Project Manager, Phison Electronics Corporation
It’s not exactly big news that the life sciences and pharmaceutical industries are data-intensive. Yet the scale and scope of data analytics in these fields may still have the power to surprise people. The data volumes are immense. Data analytics is central to these companies’ main strategic and profit-making workloads. Stakeholders therefore value technology that can improve analytics performance—driving reductions in time-to-market and costs in the process.
Data analytics in life sciences and pharma
Effective data analytics is critical to success in nearly every aspect of life sciences and pharma. It is integral to research and development, as well as clinical trials. Data analytics is essential for assessing at-risk populations, evaluating genomes, predicting future pandemics and more. In pharmaceutical research and development (R&D), data scientists use analytics tools to glean insights from varied historical and real-time data sources. These might include healthcare statistics, patient data, IoT sensor data streams and even seemingly unrelated data from social media. The resulting new-found knowledge informs the R&D process.
In drug discovery, a process that traditionally involves the manual testing of chemical compounds, data analytics enables researchers to predict factors such as drug interactions, side effects, toxicity and the like—dramatically speeding up the discovery process. Using artificial intelligence (AI) software, researchers can discover patterns in data from the so-called “omics,” including metabolomics, which refers to the study of small molecules; transcriptomics, which is the analysis of RNA molecules; and proteomics, the analysis of proteins. Drug researchers might conduct a genome-wide association study that looks for prediction of diseases contained inside the human body’s more than 20,000 genes. Data about gene mutations can help with the development of drug compounds to treat the disease.
Data analytics is then a key success factor at the clinical trial stage. Uses of analytics in clinical trials include patient recruiting, matching trial results with subjects’ genomic data, lifestyles, disease status and more. Regarding patient recruitment, this is an area where making the wrong choice can cause delay or even ruin a trial. Data analytics help avoid this outcome.
The use of data analytics also helps increase efficiency, cut costs and speed up the process of clinical trials. These outcomes are important because the “clock is ticking” on a drug’s patent, so to speak, and the longer it takes to get through clinical trials, the less time the drug company will have to recoup its sizable investment in R&D. Once the patent has elapsed, the drug is far less profitable. Longer trials compress the period of time when a company can sell the drug at optimal prices.
After a drug has come on the market, data analytics can help refine a drug’s impact and commercial success. This might occur through the mining of electronic health records (EHRs) to detect issues with drug interactions or even discover new uses for a drug. The process involves using specialized text analysis algorithms that can “read” through millions of EHRs.
For a sense of scale, consider that a Phase III clinical trial generates an average of 3.6 million data points, according to research from Tufts Center for the Study of Drug Development. Tufts further noted that this volume of data is triple what it was 10 years earlier. To put the issue in dollar terms, industry analysts predict that biotech ventures featuring AI-driven drug discovery processes will reach a collective valuation of over $100 billion in the next decade.
NAND flash is critical to life sciences/pharma research
Companies in life sciences/pharma benefit from improving the performance of their data analytics workloads. Running analytics faster means speeding up time to market. Higher-performing analytics solutions also enable researchers to run more analytics processes in the same amount of time—expanding the scope and impact of analytics. Both of these outcomes are good for the business.
How does this happen? A number of factors can potentially contribute to higher-performing data analytics. These include faster processors such as graphical processing units (GPUs), along with network designs that optimize the movement of data from storage to servers and back. Among the most relevant factors, however, is the storage itself. In general, the higher performing the storage, the faster the analytics process will run.
Today, high-performance storage means NAND flash memory. NAND flash is a form of non-volatile memory (NVM) that can deliver high-speed data reads and writes. It forms the basis of most solid-state drives (SSDs).
NAND flash is useful—and increasingly essential—for processor- and memory-intensive workloads like AI in life sciences and pharma. Going step by step, the process of “training” the AI model to work with the data set is very resource-intensive. For each of these workloads, GPUs and CPUs can demand a high-volume feed of data from storage. This demand leads to the need for storage solutions that deliver not only high bandwidth but also high capacity and power efficiency.
NVMe SSDs enable efficiency and speed in these types of challenging workloads. System owners can create and manage large data sets because they are able to separate storage from compute nodes. The NVMe controller allows for parallel processing between the GPU and storage media—in which the GPU accesses data in storage for multiple operations simultaneously. This architecture speeds up the analysis and maximizes GPU utilization.
Longevity is helpful in life sciences and pharma because data sets may need to be available for analysis and AI for years. From drug discovery to clinical trials, data analytics and AI are essential pillars in the life sciences and pharma industry. As such, leveraging NAND flash SSDs is key to supporting complex analytics workloads and security necessary to expedite analytics and AI workloads and accelerate time to market.