Data Integration And Management

As scientific breakthroughs in genomics and proteomics and new technologies such as biomedical and molecular imaging are incorporated into R&D processes, the associated experimental activities are producing ever-increasing volumes of data that have to be integrated and managed. There are two major approaches to solving the challenge of enterprise-wide data access.

The first is the creation of data warehouses [15], an effective way to manage large and complex data that have to be queried, analyzed, and mined in order to generate new knowledge. To build such warehouses, data from the various sources have to be extracted, transformed, and loaded (ETL) into repositories built on the principles of relational databases [16]. Warehousing effectively addresses the separation of transactional and analysis/reporting databases and provides a data management architecture that can cope with increased data demands over time. The ETL mechanism provides a means to "clean" the data extracted from the capture databases and thereby ensures data quality. However, data warehouses require significant effort in their implementation.

Alternatively, a virtual, federated model can be employed [17]. Under the federated model, operational databases and other repositories remain intact and independent. Data retrieval and other multiple-database transactions take place at query time, through an integration layer of technology that sits above the operational databases and is often referred to as middleware or a metalayer. Database federation has attractive benefits, an important one being that the individual data sources do not require modification and can continue to function independently. In addition, the architecture of the federated model allows for easy expansion when new data sources become available. Federation requires less effort to implement but may suffer in query performance compared to a centralized data warehouse.
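The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory SQLite databases, the `assay` table, its columns, and the sample values are all hypothetical stand-ins for real capture databases, and the "cleaning" step is reduced to normalising a gene symbol.

```python
import sqlite3

def make_source(rows):
    """Create an in-memory 'operational' database holding assay results (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assay (sample_id TEXT, gene TEXT, value REAL)")
    conn.executemany("INSERT INTO assay VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn

# Two hypothetical capture databases with inconsistent gene labels.
lab_a = make_source([("S1", "brca1", 0.82), ("S2", "TP53", 1.10)])
lab_b = make_source([("S3", " BRCA1 ", 0.79)])

def transform(row):
    """Cleaning step: strip whitespace from the gene symbol and upper-case it."""
    sample_id, gene, value = row
    return sample_id, gene.strip().upper(), value

def load_warehouse(sources):
    """Warehouse approach: extract, transform, and load once, before any query runs."""
    wh = sqlite3.connect(":memory:")
    wh.execute("CREATE TABLE assay (sample_id TEXT, gene TEXT, value REAL)")
    for src in sources:
        extracted = src.execute("SELECT sample_id, gene, value FROM assay").fetchall()
        wh.executemany("INSERT INTO assay VALUES (?, ?, ?)",
                       [transform(r) for r in extracted])
    wh.commit()
    return wh

def federated_query(sources, gene):
    """Federated approach: sources stay unmodified and are each read at query time."""
    hits = []
    for src in sources:
        for row in src.execute("SELECT sample_id, gene, value FROM assay"):
            cleaned = transform(row)
            if cleaned[1] == gene:
                hits.append(cleaned)
    return sorted(hits)

warehouse = load_warehouse([lab_a, lab_b])
warehouse_hits = sorted(warehouse.execute(
    "SELECT sample_id, gene, value FROM assay WHERE gene = ?", ("BRCA1",)).fetchall())
federated_hits = federated_query([lab_a, lab_b], "BRCA1")
```

Both paths return the same rows here. The difference is where the work happens: the warehouse pays the cleaning cost once at load time, while the federated function repeats it on every query, which is one reason federation may be slower at query time.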

Common to both approaches is the need for sorting, cleaning, and assessing the data, making sure they are valid, relevant, and presented in appropriate and compatible formats. The cleaning and validation process eliminates redundant data stores, links data sets, and classifies and organizes the data to enhance their utility. The two approaches can coexist, suggesting a strategy where stable and mature data types are stored in data warehouses and new, dynamic data sources are kept federated.
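As a rough illustration of the cleaning and linking steps described above, the following Python sketch drops incomplete and duplicate records and then joins two data sets on a shared key. The field names (`sample_id`, `gene`, `value`, `subject`) and the records themselves are assumptions made for the example, not part of any particular system.

```python
def clean_records(records, required=("sample_id", "gene", "value")):
    """Drop rows with missing required fields and exact duplicates (first occurrence wins)."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(rec.get(field) in (None, "") for field in required):
            continue  # invalid: a required field is absent or empty
        key = tuple(rec[field] for field in required)
        if key in seen:
            continue  # redundant copy of a record already kept
        seen.add(key)
        cleaned.append(rec)
    return cleaned

def link_by_sample(assays, subjects):
    """Link assay rows to subject rows through the shared sample_id key."""
    by_sample = {s["sample_id"]: s for s in subjects}
    return [{**a, **by_sample[a["sample_id"]]}
            for a in assays if a["sample_id"] in by_sample]

# Hypothetical input data.
raw_assays = [
    {"sample_id": "S1", "gene": "BRCA1", "value": 0.82},
    {"sample_id": "S1", "gene": "BRCA1", "value": 0.82},  # duplicate
    {"sample_id": "S2", "gene": "", "value": 1.10},       # missing gene
    {"sample_id": "S3", "gene": "TP53", "value": 0.40},
]
subjects = [{"sample_id": "S1", "subject": "SUBJ-001"}]

assays = clean_records(raw_assays)
linked = link_by_sample(assays, subjects)
```

In this sketch, records that cannot be linked are silently left out of `linked`; a production process would more likely report them for review.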

Genomic data are a good example of the dynamic data type. Since genomics is a relatively new field in biopharmaceutical R&D, organizations use and define the data in their own way. Only as the science behind genomics is better understood can the business definitions be modified to better represent these new discoveries.

The integration of external (partly unstructured) sources such as GenBank [18], SwissProt [19], and dbSNP [20] can be complicated, especially if the use of these evolving systems does not match actual lab use. Standardized vocabularies (i.e., ontologies) will link these data sources for validation and analysis purposes. External data sources tend to represent the frontier of science, especially since they store genetic biomarkers associated with diseases and best methods of testing that are ever-evolving. Having a reliable link between genetic testing labs, external data sources for innovations in medical science, and clinical data greatly improves the analytical functionality, resulting in more accurate outcome analysis. These links have been designed into the CDISC PG/PR domains to facilitate the analysis and reporting of genetic factors in clinical trial outcomes.
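How a standardized vocabulary can link sources that label the same entity differently is sketched below. The vocabulary table is a small illustrative stand-in, not an excerpt from any real ontology, and the function and variable names are hypothetical.

```python
# Illustrative controlled vocabulary: preferred term -> accepted synonyms.
VOCABULARY = {
    "BRCA1": {"brca1", "brca-1", "breast cancer 1"},
    "TP53": {"tp53", "p53"},
}

# Reverse index from every normalised synonym to its preferred term.
SYNONYM_INDEX = {
    synonym: preferred
    for preferred, synonyms in VOCABULARY.items()
    for synonym in synonyms
}

def to_preferred(term):
    """Return the preferred vocabulary term for a source label, or None if unmapped."""
    return SYNONYM_INDEX.get(term.strip().lower())

def map_labels(labels):
    """Split source labels into mapped terms and labels that need manual review."""
    mapped, unmapped = {}, []
    for label in labels:
        preferred = to_preferred(label)
        if preferred is None:
            unmapped.append(label)
        else:
            mapped[label] = preferred
    return mapped, unmapped

# Labels as they might arrive from a lab system and an external source.
mapped, unmapped = map_labels(["p53", " BRCA-1", "Breast Cancer 1", "GENE-X"])
```

Labels that fall outside the vocabulary are returned separately rather than guessed at, so that they can be validated before being linked to clinical data.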

[Figure not reproduced. Labels recoverable from the diagram: Stakeholder Management; Compliance Interpretation; Storage Policy Definition; Taxonomy Definition; Search; Info Grid; Inbox; Preferences; Reports; Browser; Work Area; Actions; Favorites; Process Flow; Workflow; Business Process Management.]
