A strategy and a rigorous, robust, and reliable framework for optimally designing knowledge for collaborative human and machine learning at scale.
Rare diseases, though individually rare, collectively impact a significant population, posing unique challenges due to limited awareness and minimal incentives for research. Often, these conditions are neglected, leaving patients and their families feeling isolated.
Intelligence.AI, Roche, and Research to the People at Stanford have united to address this challenge. Their mission is to foster a safe, efficient, and effective collaboration between patients and scientists, harnessing the power of collaborative intelligence. This alliance represents a fusion of diverse expertise, leveraging intelligent design principles to amplify the impact of knowledge from a few to benefit many.
Ron Itelman and Juan Cruz Viotti, authors of “Unifying Business, Data, and Code” published by O’Reilly, present a methodology they call unifying. This approach helps organizations identify and eradicate the root causes of inefficiency and failure, which often stem from inadequate collaboration across knowledge, processes, and people.
Just as Louis Pasteur's discovery of germ theory revolutionized the understanding of disease transmission, leading to transformative practices like handwashing, the unifying methodology provides a similar scientific framework for understanding and addressing breakdowns in collaboration. It highlights how invisible flaws in knowledge integration can be as detrimental to organizational health as unseen germs are to human health.
An organization's greatest asset is the capability to convert each individual action into a teachable moment for the whole community, including customers, peers, or networks of collaboration. This approach transforms every objective, choice, and result into the cornerstones of shared intelligence, which can then be disseminated to empower everyone to reach their objectives with unprecedented effectiveness and efficiency.
The root of inefficiency in collaborative systems often lies in misalignment. Consider the analogy of a rowing team: if each member rows at different angles, directions, and speeds, as depicted in Figure 1, their collective effort is diminished. This misaligned team, despite its resources, cannot surpass a team that rows in perfect sync. Similarly, in the business world, even the most talented and well-resourced teams may underperform due to subtle yet critical misalignments in processes, data, language, and knowledge.
Achieving optimal performance necessitates a scientifically grounded methodology to scrutinize and rectify these misalignments. This necessary approach is embodied in the unifying methodology, a scientific process that seeks to harmonize diverse elements of collaborative systems, thereby unlocking their full potential. When we align people, processes, and technology in a way that is optimized for AI, we enable collaborative intelligence.
Figure 1 depicts two contrasting rowing teams: one team is shown struggling with misalignment, where each rower’s efforts are disjointed, leading to a lack of progress despite considerable effort. In contrast, the second team demonstrates perfect synchronization, with each rower's stroke harmoniously aligned with the others, translating the collective effort into a powerful, unified forward motion.
In addressing the significant challenges of a collaborative intelligence approach to Rare Disease research, which aims to optimally align processes, knowledge, and technologies, several pivotal focal points arise:
Identifying key scientific collaboration challenges: How can we quantify misalignments, ambiguities, knowledge gaps, and blind spots at their source?
Decision-making and governance: Aligning people, processes, and technologies effectively through a governance framework.
Technology roadmap: Implementing a cost-effective collaborative intelligence strategy.
Human-centric design: Crafting user experiences for researchers, doctors, and patients in interaction with collaborative systems.
Safety and trust: A system for generating rigorous, robust, and reliable knowledge.
Unifying provides a step-by-step playbook to align strategies and systems towards the mastery of creating knowledge necessary to not only solve issues, but prevent them from ever occurring in the first place.
In traditional scientific approaches, research processes are often linear and compartmentalized, with findings primarily shared within closed academic circles or through peer-reviewed publications that may be behind paywalls.
This model, while ensuring a high standard of rigor and peer validation, can lead to slower dissemination of knowledge and a compartmentalization of data that may hinder collaborative opportunities. This concern is especially pertinent in fields like rare disease research, where data is scarce and highly valuable.
On the other hand, open science champions a more dynamic, collaborative framework. It emphasizes transparency, data sharing, and broader engagement across various stakeholders, from academic researchers to patients themselves. This approach not only accelerates the pace of discovery by removing barriers to information access but also fosters a more inclusive research environment, encouraging diverse inputs and perspectives.
Intelligence.AI's case study aims to scrutinize these two paradigms, assessing their efficiency, quality, and potential for enhancing the healthcare sector's scientific knowledge.
This comprehensive analysis will guide us in recommending more effective collaboration frameworks, crucial for expediting advancements in healthcare and ultimately benefiting patients who depend on the swift progression of scientific research.
Collaborative intelligence in research manifests when organizations form networks to share knowledge and expertise, both from within and outside their walls. This approach aims to function like the map application you use on your phone: it automatically knows your current location, and when you define your goal, it reliably plots the most efficient path.
Artificial Intelligence serves as an accelerator in this journey, swiftly analyzing extensive datasets to uncover patterns that might otherwise evade human detection. Such collaboration leads to the automation of data management, streamlining processes towards the accumulation of knowledge and mastery—an educational principle applied to research.
Drawing an analogy from mountaineering: just as sherpas, with their deep understanding of the Himalayas, guide climbers through treacherous paths, collaborative intelligence enables decision sherpas that can guide research scientists and stakeholders through complex decision-making processes. Much like many parts of the Himalayas, these processes are too vast and difficult for anyone to find optimal paths to success unaided.
The pinnacle of collaborative intelligence lies in leveraging AI to augment human decision-making and rapidly achieve key objectives with unmatched speed and quality. However, this begins with a fundamental element - the data.
Establishing a robust groundwork of knowledge is pivotal, streamlining goals and decision-making processes. Opt for structured knowledge and sound governance over intricate systems prone to failure. The aim should be to lay down precise knowledge frameworks from the start, avoiding the inefficiency of later untangling disorderly data. This proactive approach is not only more effective but also supersedes outdated data management methods.
With this solid base, data scientists can forge intelligent systems that fully leverage your data's inherent value. A prevalent issue in today's data practices is the siloed approach where data is haphazardly passed to data scientists, burdening them with the task of generating quality AI - a flawed expectation. Data scientists should participate in strategic discussions to ensure the acquisition of high-caliber data that translates into tangible business value.
The methodologies outlined in our O’Reilly book, "Unifying Business, Data, and Code," provide a structured blueprint for achieving this. Intelligence.AI is proud to be at the forefront of this shift from conventional to open science, particularly within the challenging yet crucial domain of Rare Disease research.
The first step before starting any transformative process is to understand its purpose and benefit. The reason unifying is important in scientific knowledge is that the most harmful and destructive force in organizations is misalignment, and the impacts of misalignment are revealed in different ways at different scales.
On the organizational management scale, when people have different understandings of objectives, and are working toward unaligned goals, it creates a tremendous amount of redundancy, inefficiency, and waste.
However, this can be addressed with proper business process mapping and by thoroughly vetting strategic priorities across managerial levels and teams. What happens at lower levels of granularity, such as data, is more difficult to see because of the digital divide between business leadership, which is primarily non-technical, and the highly complex and technical world of data, which operates invisibly under the hood in a language and concepts foreign to most project managers and business leaders.
At the micro-levels of data, misalignment introduces problems which are much more complex and difficult to identify. At its worst, misalignment at the business levels combines with misalignment at the data levels, which can truly cripple decision-making. The best analogy for misalignment in data is germs, and the advent of germ theory by Louis Pasteur.
When Ignaz Semmelweis tried to convince doctors to wash their hands in maternity wards, he was mocked and ridiculed. He saw that in the male-run clinics mothers were dying at rates of up to 18%, compared with a 2% mortality rate among mothers attended by the women-led midwives, who did wash their hands.
Louis Pasteur, on the other hand, was able to introduce a formal germ theory of what germs were and how they infected and spread to harm people. Until then, germs were invisible, and there was no real understanding of why people were dying from the infections in the maternity wards.
Just as the invisible threats of germs were potentially fatal to humans, the invisible threats in data to organizations are just as real. They threaten decision-making, and advanced capabilities like AI are only as good as your data.
In order to prevent misalignment in data, the key factors to look for are:
Ambiguity: When a term can have multiple meanings.
Knowledge Gaps: When information is missing.
Blind Spots: When there is no understanding of the risk of ambiguity and knowledge gaps in decision-making and/or data.
Unifying’s core principle is to minimize ambiguity, knowledge gaps, and blind spots at the organizational management level, the data level, and the level that combines them: knowledge.
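The first two factors can be checked mechanically once each team writes down the terms it uses. The following is a minimal sketch, not the authors' implementation: the `glossaries` dictionary, the team names, and the function names are all hypothetical, and the definitions are made up for illustration. Blind spots cannot be detected by code directly; the assumption here is that running such a check routinely is what surfaces them.

```python
from collections import defaultdict

# Hypothetical glossaries: each team maps a term to the definition it uses.
# None marks a term the team uses without having defined it.
glossaries = {
    "clinical": {
        "onset": "age at first symptom",
        "responder": "patient with a 30% score improvement",
    },
    "data_science": {
        "onset": "date of diagnosis",
        "responder": None,
    },
}


def find_ambiguities(glossaries):
    """Return terms that carry more than one distinct definition."""
    definitions = defaultdict(set)
    for terms in glossaries.values():
        for term, meaning in terms.items():
            if meaning:
                definitions[term].add(meaning)
    return {
        term: sorted(meanings)
        for term, meanings in definitions.items()
        if len(meanings) > 1
    }


def find_knowledge_gaps(glossaries):
    """Return (team, term) pairs where a term is used but not defined."""
    return sorted(
        (team, term)
        for team, terms in glossaries.items()
        for term, meaning in terms.items()
        if not meaning
    )


ambiguities = find_ambiguities(glossaries)
gaps = find_knowledge_gaps(glossaries)
print(ambiguities)
print(gaps)
```

In this made-up example, "onset" is flagged as ambiguous because the two teams define it differently, and "responder" is flagged as a knowledge gap for the data science team.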
What exactly is knowledge? It is important to separate technologies, such as knowledge graphs, which may have facts and relationships defined in a database, from the definition of knowledge, and the definition of data. In unifying, we define knowledge as the information needed to solve problems related to a goal.
In other words, you can have lots of data, but what turns it into knowledge is when it is a specific set of data that solves specific problems in achieving specific goals.
When the business strategy layer is aligned with itself, and the data layer is aligned with itself, and the business strategy and data layers are aligned with each other, that is what we define as being unified.
A data product approach means that for every unit of data or knowledge that is shared, we want to bundle it into a complete package, just as we would a product. Our bundled data products have four facets, and the purpose of a data product is to minimize or eliminate ambiguity and knowledge gaps for anyone using the data product. Through the act of creating data products, we also ask the right questions to eliminate or minimize any blind spots we may have had as to whether ambiguity and knowledge gaps existed in the first place.
You can create a data product with spreadsheets, databases, and programming syntax such as JSON and/or JSON Schema. In fact, you could create data products with a whiteboard or a piece of paper. Data products are about asking the right questions to remove ambiguity, knowledge gaps, and blind spots, and creating the ideal environment for anyone using the data product (the customer, or data user). The four facets of a data product are:
Context: Who created the data or knowledge, and why? Are there any compliance, lineage, or other pieces of information we need to know to work effectively with it?
Meaning: What is the specific meaning of every phrase or term, assuming the user of the data product may have similar, overlapping terms?
Structure: Show any hierarchical relationships, constraints, validation logic, or other type of relationships of the data.
Data: The data is any information you communicate and store bundled with the other three facets.
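To make the four facets concrete, here is a minimal sketch of a data product as a small Python structure. It is an illustration under stated assumptions, not the format defined in the book: the `DataProduct` class, its method names, and the example records are hypothetical, and the `required` key only loosely imitates the keyword of the same name in JSON Schema rather than implementing that specification.

```python
from dataclasses import dataclass


@dataclass
class DataProduct:
    """Hypothetical bundle of the four facets described above."""
    context: dict    # who created the data, why, and any compliance notes
    meaning: dict    # term -> its specific definition
    structure: dict  # constraints; here only a list of required fields
    data: list       # the records themselves

    def undefined_terms(self):
        """Fields used in the records that have no entry in `meaning`."""
        used = {key for record in self.data for key in record}
        return sorted(used - set(self.meaning))

    def structure_violations(self):
        """(record index, field) pairs where a required field is missing."""
        required = self.structure.get("required", [])
        return [
            (index, name)
            for index, record in enumerate(self.data)
            for name in required
            if name not in record
        ]


# Made-up example content, for illustration only.
product = DataProduct(
    context={"created_by": "example patient registry", "purpose": "illustration"},
    meaning={
        "patient_id": "pseudonymous identifier",
        "onset_age": "age in years at first symptom",
    },
    structure={"required": ["patient_id", "onset_age"]},
    data=[
        {"patient_id": "p-001", "onset_age": 4},
        {"patient_id": "p-002", "severity": "moderate"},
    ],
)

print(product.undefined_terms())
print(product.structure_violations())
```

The two checks correspond to the questions a data product is meant to force: a field without a definition is a potential ambiguity, and a record missing a required field is a knowledge gap.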
Using this approach, our first step should be to define what exactly we mean by open science. If you were to ask different organizations, and even different people within organizations, what exactly open science means, you would most likely get different answers. Therefore, in our goal to eliminate ambiguity, knowledge gaps, and blind spots, let’s examine open science in terms of a continuum, with one end being closed science, and the other open science. What are the various degrees between them, and how does one know where one is on the spectrum?
The beginning of wisdom is the definition of terms.
An initial hypothesis might be the level of legal restrictions in place, with true open science being truly unrestricted for the public good, with no financial or legal constraints on its use. An analogy in software might be an MIT or Apache 2.0 license. A second continuum might be the level of transparency, such as whether models, data, and research are all revealed, versus only one of those categories.
Figure 2: A continuum between “closed” and “open” science paradigms, with points in between.
For those participating in open science and collaborative networks, how might we understand the various categories they are in? For example, one participant organization might have spent a lot of time curating and creating valuable data, and may make it available for other research teams to train ML algorithms on, which differs from another team that publishes its findings and data for free, but not its algorithms. Figure 3, below, gives a visual representation of how “open” a participant may be.
Figure 3: Using two continuums to represent how open or closed scientific knowledge is; the x-axis represents three levels of transparency: data, models, and findings, and the y-axis represents legal restrictions, with fully closed / patented knowledge at the bottom, and completely unrestricted licenses at the top.
The value of this exercise is that we now have an agreed-upon, quantitative way to measure the “openness” of science, which is important if we want to measure the success of open science.
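One way such a measurement could look in practice is sketched below. This is an assumption-laden illustration, not the scoring used in the case study: the four-step `LICENSE_LEVELS` scale, the `Openness` class, and the choice to score transparency as the fraction of categories shared are all hypothetical.

```python
from dataclasses import dataclass

# The three transparency categories named for the x-axis of Figure 3.
TRANSPARENCY_CATEGORIES = ("data", "models", "findings")

# Hypothetical ordinal scale for the legal axis, most to least restricted.
LICENSE_LEVELS = ("closed", "restricted", "permissive", "unrestricted")


@dataclass(frozen=True)
class Openness:
    shared: frozenset   # which transparency categories the participant shares
    license_level: str  # one of LICENSE_LEVELS

    def transparency(self) -> float:
        """Fraction of the three categories that are shared (0.0 to 1.0)."""
        known = self.shared & set(TRANSPARENCY_CATEGORIES)
        return len(known) / len(TRANSPARENCY_CATEGORIES)

    def legal_freedom(self) -> float:
        """Position on the legal axis: 0.0 is closed, 1.0 is unrestricted."""
        return LICENSE_LEVELS.index(self.license_level) / (len(LICENSE_LEVELS) - 1)

    def position(self):
        """(x, y) coordinates on the two continuums."""
        return (self.transparency(), self.legal_freedom())


# Made-up participants, for illustration only.
data_curator = Openness(frozenset({"data"}), "permissive")
open_publisher = Openness(frozenset({"data", "findings"}), "unrestricted")

print(data_curator.position())
print(open_publisher.position())
```

Placing each participant at a coordinate pair like this is what allows two organizations that both describe themselves as "open" to be compared on the same scale.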
Another area in which we seek alignment is our understanding of how exactly our network of collaboration in scientific knowledge is structured. For this, we may build upon the continuums from the previous figures, as shown in Figure 4.
Figure 4: The participants of a collaborative learning network, with their various degrees of “openness” in sharing knowledge.
Likewise, a similar approach may be used to define goals of each group, so they may be mapped to quantify how aligned they are in certain goals and outcomes.
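As a rough illustration of quantifying goal alignment, the sketch below scores two groups by the overlap of their stated goals. The Jaccard index is a technique swapped in here for illustration, not a measure taken from the unifying methodology, and the group names and goals are made up.

```python
def goal_alignment(goals_a, goals_b):
    """Jaccard overlap of two goal sets: 1.0 is identical, 0.0 is disjoint."""
    a, b = set(goals_a), set(goals_b)
    if not a and not b:
        return 1.0  # two groups with no stated goals are trivially aligned
    return len(a & b) / len(a | b)


# Hypothetical goal sets for two participants in the network.
patient_group = {"faster diagnosis", "access to trials", "symptom relief"}
research_lab = {"faster diagnosis", "publishable findings", "access to trials"}

score = goal_alignment(patient_group, research_lab)
print(score)
```

A score like this only works once goals are phrased in shared terms, which is why the definitions of terms come first.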
It has been previously suggested that a critical way that the scientific community measures success is in the number of citations in publishing, which unlocks grants and credibility. However, the current approach to publishing in scientific journals leaves the act of validating hypotheses and findings to a select few who stamp or reject a paper.
This is very different from validation coming over long periods of time where knowledge is tested in the field against various goals.
In the unifying methodology, we define success in spectrums, or as we call them, success spectrums, where each point is a goal, and there are binary true/false statements that tell us whether we have completed a goal, similar to an OKR perspective.
This is important because it gives us a quantitative measure of all of the conditions that must be true in order to be considered successful, removing blind spots of which knowledge is necessary to go through each stage of success, and making those steps available to all members of the network.
This process reveals the knowledge gaps, and informs us which knowledge is the most important to acquire for success, because the dependencies on that knowledge may propagate across the entire network, and enable the network to focus on critical components, rather than work in isolation.
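The idea of a success spectrum can be sketched as an ordered list of goals, each with binary conditions. The following is a minimal illustration under stated assumptions: the goal names, the conditions, and the `evaluate` function are hypothetical examples, not states defined by the methodology.

```python
# Hypothetical spectrum: goals in order, each with true/false conditions.
spectrum = [
    ("symptoms recognized", {"symptom log exists": True}),
    ("diagnosis confirmed", {
        "genetic test ordered": True,
        "result reviewed by specialist": False,
    }),
    ("treatment plan agreed", {"care team identified": False}),
]


def evaluate(spectrum):
    """Walk the goals in order and stop at the first incomplete one.

    Returns the goals completed so far and the unmet conditions that
    block the next goal (an empty dict if every goal is complete).
    """
    completed = []
    for goal, conditions in spectrum:
        unmet = [name for name, is_true in conditions.items() if not is_true]
        if unmet:
            return completed, {goal: unmet}
        completed.append(goal)
    return completed, {}


completed, blocking = evaluate(spectrum)
print(completed)
print(blocking)
```

The unmet conditions returned for the first incomplete goal are the knowledge gaps the network would need to close next.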
Figure 5: A success spectrum defines a sequential set of states toward success.
Figure 6: Formalizing the processes in a standard enables systems and applications to be built, knowledge to be standardized according to function and purpose with clear boundaries, and the impact of knowledge in accelerating patients towards end goal states to be tracked. Each state can be broken into more granular levels, to minimize ambiguity and knowledge gaps. The states shown are examples only.
Finally, we come to ask the question: how do we take advantage of the knowledge that we have acquired and integrate it into solutions to create value for those in the collaborative network? This usually requires some form of strategic alignment and change management, which requires a degree of innovation economics to provide sound reasoning and risk assessments for leaders to adopt any initiatives.
In order to maximize the potential for BioPharma, government agencies, and the scientific publishing and academic communities to mobilize open science knowledge, all of these proposed standards from the case study will be fully open sourced.
Figure 7: Combining participants, science “openness” measures, and their relationships to goals in a success spectrum enables us to understand, from a holistic viewpoint, the impact of the network relative to any specific set of knowledge. This view shows the value of knowledge to the collaborative network relative to collective goals.
The key to unlocking knowledge mobilization will be to understand the misalignment which prevents the integration of this framework across organizational systems. Gathering data points of friction across strategic (high level) and data (low level) layers will reveal the friction in the transformation required to begin utilizing a collaborative intelligence approach.
As the Stoic philosopher and Roman emperor Marcus Aurelius observed, the obstacle is the way. To address Knowledge Mobilization, we must identify the obstacles and treat them as the transformation opportunities for integrating collaborative intelligence approaches and the fruits of knowledge they deliver.
When we are able to align processes via success spectrums from perspectives of external users (such as patients, providers, researchers, and technologists) with the perspectives of the internal users of collaborative networks, and align concepts via data products, we are pushing beyond traditional collaborative intelligence into what we define as unified intelligence.
When we have aligned data, strategy, and AI at the micro and macro scales, for all members of a collaborative learning network to be contributing knowledge in a flywheel of continuous improvement, we are operating as a symphony, both synchronized and harmonized.
In the unifying methodology, we refer to this as CLEAN Data Governance, where CLEAN stands for Collaborative LEArning Networks.
When a unified intelligence system of human and machine learners is working together to find the most efficient and effective paths for success, we have achieved knowledge mastery. Enabling knowledge mastery to reduce patient suffering is a noble cause indeed.
Figure 8: CLEAN Data Governance introduces a formal approach for network-thinking in order to align processes across human and machine learners in networks using data products and creating a culture that prioritizes data hygiene.
Just as a metronome and sheet music enable a group of musicians to align and play seamlessly, so too are beacons required for effective collaboration and communication in scientific knowledge.
The proposed strategies, from data products and organizational alignment through defining foundational concepts, aim to serve as a beacon to create knowledge which can unite participant efforts to break through barriers in serving those with rare diseases, and unlock therapies to rapidly accelerate scientific advancement.
Intelligence.AI is honored by the opportunity to serve the Rare Disease effort using the unifying methodology from our book, and aims to be a beacon, and to amplify the beacons of others, in improving the lives of patients who suffer from Rare Disease.