IDBS - Products - Discovery Warehouse

Databases grow, company data systems increasingly operate on a global level, data mining requires the integration of all available data, and local operational needs do not always match global data needs. There are many reasons to implement a system that consolidates and integrates the different data sources in an organisation.

Although it is not the only available option, the approach of creating a data warehouse and data marts has been applied very successfully in several industries, including biopharmaceutical research.

- Operational efficiency
    - Archiving operational databases
    - Faster data capture (record insertion)
    - Local customisation possible
- Consolidation
- Integration
- Quality
- Query efficiency
- Discovery Marts
- A custom tailored solution

Operational Efficiency

In many organisations ActivityBase databases are an integral part of HTS or UHTS operations. With millions of data points generated in weeks to months over many years, these ActivityBase databases reach sizes of tens to hundreds of millions of records. At this size the speed of new record insertion during data capture tends to decrease significantly. Although databases can be optimised for capturing data, this usually results in diminished query performance. By transferring data from ActivityBase to a Discovery Warehouse, the capture database (ActivityBase) and the reporting database (Discovery Warehouse or Discovery Mart) are separated and can therefore be optimised separately.

Archiving operational databases

Once the data from the ActivityBase databases has been consolidated into a common data repository (the Discovery Warehouse), the data in the operational databases (ActivityBase) can be archived, since it is now available through the data warehouse.

Faster data capture

Once the ActivityBase data has been archived, it can be deleted from the source database. This keeps the operational ActivityBase databases lean, which restores fast data insertion.

Local customisation

The ETL (extract, transform, load) tools that extract data from ActivityBase databases can be customised. Even if local ActivityBase databases differ from one another, the ETL tools can be adapted to accommodate these differences. Local ActivityBase databases can therefore be customised to better meet local needs without creating compatibility problems for global reporting.

Consolidation

The IDBS Discovery Warehouse is focused on storing structured information only. Non-structured data such as document reports, as well as data from development labs and clinical studies, are usually managed in different data systems or document management systems. With so many data sources, many companies have made it a priority to reduce the number of databases by consolidating information. If you have comparable data sets in databases on multiple sites, or different data types that are stored in different databases, a Discovery Warehouse helps consolidate these data sources. This significantly reduces the number of data repositories that have to be queried to bring all relevant information together at the corporate decision-making level.

Integration

In a large multinational organisation, as well as in a small biotech with multiple sites or collaboration partners, databases are often spread across different locations. This becomes a major hurdle when the need arises to report on data that can only be found in different databases.
You will very likely find that access to the data is slow at best, and that correlating the data is difficult and laborious, if you need to query on:

- Screening metrics across sites
- Historical comparisons
- All screening data for a set of compounds
- Cross-assay data for SAR reports

The Discovery Warehouse can easily solve several of these problems. Data from different ActivityBase databases can be integrated into a single Discovery Warehouse, so that all data can be queried from a single data source.

Quality

Early data management systems were not always set up to cover future data needs, and they did not always systematically store qualifying information about the data. Without this qualifying information, correct interpretation of the data is often impossible. In other words, data from these systems may have become useless over time.

Mergers and acquisitions have brought together data sources that use different data descriptors, languages, units and so on, creating inconsistencies between data systems. Resolving these inconsistencies can be long and painful and requires all parties involved to change their work procedures.

Both of these quality issues can be addressed by implementing a Discovery Warehouse. The ETL tools used to create a Discovery Warehouse can be customised to cleanse the data from the source databases, resulting in a single, high-quality data set.

Query efficiency

Data is often not stored in a database in the form in which it is most easily queried. For example, in a typical relational database structure, result values for a compound may be stored by compound and by result identifier (left table), whereas a more readable report format would have each compound in a separate row and the value of each biological result captured in an assay (Inhibition, IC50 and KEL) in a separate column (right table).
The transformation between these two formats, called "pivoting", is often done on the fly, but for large amounts of data it takes a significant amount of time. In addition, federated data integration approaches have significant difficulty with this rather complicated and demanding pivoting task.

Object Id   Result Id   Result Value        Object Id   Inhib   IC50   KEL
IDBS00034   Inhib       77                  IDBS00034   77      2.3    22
IDBS00034   IC50        2.3                 IDBS00168   46      1.9    56
IDBS00034   KEL         22
IDBS00168   Inhib       46
IDBS00168   IC50        1.9
IDBS00168   KEL         56

"Discovery Mart" makes data more presentable

A "data mart" is a customised database that is focused on presenting the data in an easily queryable way. Data marts, such as the IDBS Discovery Mart, therefore handle cross-assay queries quickly and efficiently: the pivoting step has already been performed off-line as part of the creation of the Discovery Mart. Discovery Marts are built using the data from the Discovery Warehouse rather than from the source ActivityBase database. Because a Discovery Mart is created from the Discovery Warehouse, it can be created, recreated and dropped at will to meet the needs of changing business processes.

A custom tailored solution

Implementing a data warehouse is not an "out of the box" project. It requires detailed analysis, consultation and fine tuning to make sure the final product fully meets all of your functional and technical requirements. Typically the steps include:

- Defining goals for implementing a Discovery Warehouse strategy
- Detailed analysis of existing databases, types of queries, data volumes and end-user requirements
- Test deployment using a limited number of protocols and a subset of source databases
- Implementation
- Training, follow-up and measurement

Read the Drug Discovery World article describing an installation of Discovery Warehouse.
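
As a rough illustration of the pivoting step described under Query efficiency, the following Python sketch turns the stored format (one row per compound and result identifier) into the report format (one row per compound, one column per result). It is not IDBS code: the function name `pivot` and the in-memory list of tuples are assumptions made for this example, which only uses the sample values from the tables in that section.

```python
from collections import defaultdict

# Stored ("long") format: one row per compound and result identifier,
# using the example values from the Query efficiency tables.
LONG_ROWS = [
    ("IDBS00034", "Inhib", 77),
    ("IDBS00034", "IC50", 2.3),
    ("IDBS00034", "KEL", 22),
    ("IDBS00168", "Inhib", 46),
    ("IDBS00168", "IC50", 1.9),
    ("IDBS00168", "KEL", 56),
]


def pivot(rows, result_ids):
    """Hypothetical helper: return one dict per object id,
    with one key per requested result identifier."""
    by_object = defaultdict(dict)
    for object_id, result_id, value in rows:
        by_object[object_id][result_id] = value
    # Missing results become None so that every row has the same columns.
    return [
        {"Object Id": object_id, **{r: results.get(r) for r in result_ids}}
        for object_id, results in sorted(by_object.items())
    ]


# Report ("wide") format: one row per compound.
WIDE_ROWS = pivot(LONG_ROWS, ["Inhib", "IC50", "KEL"])

if __name__ == "__main__":
    for row in WIDE_ROWS:
        print(row)
```

The sketch needs a full pass over all result rows before a single report row is complete, which is one way to see why pivoting on the fly becomes slow for large data volumes and why a Discovery Mart performs it off-line.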