The Concept of Big Data
Big data are collections of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications (Halevi, 2014). The advent of big data brings new challenges in translating datasets of varying volume, variety, and velocity (the 3 Vs of big data) into actionable information and, ultimately, knowledge. Big data are of significant interest to the public health domain because of the size, diversity, and complexity of the varied data sources that could help prevent disease and promote health and wellbeing.
Big data include real-world data such as electronic health records, registry data, claims data, and data from wearable devices and social media platforms, among others. Moreover, integrating and analyzing data of different kinds, such as social and scientific data, can lead to new knowledge and intelligence by exploring new hypotheses and identifying hidden patterns, which would otherwise be difficult or even impossible.
Data can often be collected in real time (e.g., by monitoring patients through wearable devices), which requires specific technology. On the other hand, large amounts of data have already been collected for different purposes over the years. Such secondary data are data that have already been collected for some other purpose (Schlomer & Copp, 2014). This imposes constraints on analyzing and interpreting these data, as their collection was not controlled for the current intended purpose.
Guaranteeing data quality throughout its life cycle requires a robust information system infrastructure. In turn, such an infrastructure must be able not only to provide quality data but also to receive data, so it needs to support bi-directional communication (of alerts, population health statistics, and case or care management) to inform clinicians and decision-makers in real time. Information architecture (IA) refers to the logical configuration of the various elements, including hardware, software, information flow, and technical standards, needed to support the information needs of users. A robust IA can increase the effectiveness and scope of its performance by integrating internal and external information systems. A fundamental component of information architecture is interoperability, defined as “the ability of a system to exchange electronic health information and use information from other systems without additional effort by the user”. Problems with data interoperability (i.e., the ability to send, receive, find, and ultimately use data) restrict data exchange with other interested parties (Janssen & van der Voort, 2020; Magnuson J.A., 2020).
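As a rough illustration of what exchanging and using data "without additional effort by the user" can mean in practice, the following minimal Python sketch shows two systems that agree on a shared record format in advance. It is only a sketch under stated assumptions: the field names, the functions send_record and receive_record, and the sample reading are hypothetical and are not taken from any particular health data standard or from the cited sources.

```python
import json

# Hypothetical shared schema: field names that both systems agree on in advance.
REQUIRED_FIELDS = {"patient_id", "observation", "value", "unit", "recorded_at"}


def send_record(record: dict) -> str:
    """Serialize a record on the sending system into the shared JSON format."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"cannot send, missing fields: {sorted(missing)}")
    return json.dumps(record, sort_keys=True)


def receive_record(message: str) -> dict:
    """Parse and validate a message on the receiving system."""
    record = json.loads(message)
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"cannot use, missing fields: {sorted(missing)}")
    return record


# Made-up sample reading, e.g. from a wearable device (illustrative only).
wearable_reading = {
    "patient_id": "example-001",
    "observation": "heart_rate",
    "value": 72,
    "unit": "beats/min",
    "recorded_at": "2020-01-01T08:00:00Z",
}

message = send_record(wearable_reading)   # system A sends
received = receive_record(message)        # system B receives and can use it
print(received["observation"], received["value"], received["unit"])
```

Because both sides validate against the same agreed set of fields, a record that cannot be used is rejected at the boundary instead of being silently misinterpreted. Real systems would rely on an established exchange standard rather than an ad hoc format like this one.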