Kindle Notes & Highlights
Read between December 4, 2019 and May 22, 2020
Network medicine embraces the complexity of multifactorial influences on disease, which can be driven by nonlinear effects and molecular and statistical interactions. The development of comprehensive and affordable -omics platforms provides the data types for network medicine, and graph theory and statistical physics provide the theoretical framework to analyze networks.
Post-translational modification of histone proteins (histone “marks”) can either facilitate or impair access to the transcriptional machinery. These histone marks, DNA methylation, RNA methylation, and noncoding, regulatory RNAs (vide infra) comprise the elements of epigenetic regulation, viz., determinants of phenotype that do not depend on the intrinsic genetic sequence of DNA itself.
Noncoding RNAs include microRNAs and long-noncoding RNAs (lncRNAs), which have important gene regulatory functions.
To identify a comprehensive set of interactions between genes and proteins, the development of large-scale datasets that capture the biological activities of cells is essential.
All of these -omics approaches require substantial bioinformatics support and sophisticated statistical analyses.
Networks are composed of nodes (also known as vertices) that are connected by edges (also known as links).
Key nodes that include multiple edges are often referred to as “hubs.” Graph theory is used to describe and analyze networks.
A network can be completely specified by the complete list of edges between its nodes; such “edge lists” can be used to improve computational efficiency.
The frequencies of the degree values in the network constitute the “degree distribution” for the network, which corresponds to the probability that a randomly selected node has a specific degree value.
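To make the edge list and the degree distribution concrete, here is a minimal sketch in Python. The toy network and the function names (`degrees`, `degree_distribution`) are illustrative assumptions, not taken from the book. Note that a pure edge list cannot represent isolated nodes, so degree 0 never appears in this sketch.

```python
from collections import Counter, defaultdict

# Hypothetical toy network, stored as an undirected edge list.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]

def degrees(edge_list):
    """Count how many edges touch each node."""
    deg = defaultdict(int)
    for u, v in edge_list:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)

def degree_distribution(edge_list):
    """Return {k: fraction of nodes whose degree equals k}."""
    deg = degrees(edge_list)
    counts = Counter(deg.values())
    n = len(deg)
    return {k: c / n for k, c in sorted(counts.items())}

print(degrees(edges))
print(degree_distribution(edges))
```

Here node A, with three edges, plays the role of a small hub.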
The minimum number of edges that must be traversed to travel between two nodes in a network is referred to as the “shortest path length” or “geodesic path.”
The mean shortest path length among all of the nodes in a connected network is also known as the “characteristic path length.”
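In an unweighted network, shortest path lengths can be found with a breadth-first search. The sketch below is one simple way to compute the characteristic path length, assuming the network is connected and undirected; the helper names and the four-node path graph are made up for illustration.

```python
from collections import deque

def build_adjacency(edge_list):
    """Turn an undirected edge list into {node: set of neighbors}."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def shortest_path_lengths(adj, source):
    """Breadth-first search: number of edges from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def characteristic_path_length(adj):
    """Mean shortest path length over all ordered node pairs (connected graph assumed)."""
    n = len(adj)
    total = sum(sum(shortest_path_lengths(adj, s).values()) for s in adj)
    return total / (n * (n - 1))

# Hypothetical path graph A - B - C - D.
path_graph = build_adjacency([("A", "B"), ("B", "C"), ("C", "D")])
print(shortest_path_lengths(path_graph, "A"))
print(characteristic_path_length(path_graph))
```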
The “betweenness” or “betweenness centrality” of a particular node or edge assesses how often that network component is present within the group of shortest paths in the network; it is calculated as the number of shortest paths in the network that pass through that node or edge divided by the total number of shortest paths in the network.
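The definition above can be followed literally on a small network by enumerating every shortest path. The sketch below does this for nodes only and makes one assumption the passage leaves open: a path "passes through" a node only when the node is an interior point, not an endpoint. It is meant for toy graphs; larger networks are usually handled with faster algorithms such as Brandes' algorithm.

```python
from collections import deque
from itertools import combinations

def all_shortest_paths(adj, s, t):
    """Enumerate every shortest path from s to t using BFS predecessor lists."""
    dist = {s: 0}
    preds = {s: []}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                preds[w] = [u]
                queue.append(w)
            elif dist[w] == dist[u] + 1:
                preds[w].append(u)
    if t not in dist:
        return []
    # Walk the predecessor lists back from t until every path starts at s.
    paths = [[t]]
    while paths[0][0] != s:
        paths = [[p] + path for path in paths for p in preds[path[0]]]
    return paths

def betweenness(adj):
    """Fraction of all shortest paths that pass through each node (interior only)."""
    through = {v: 0 for v in adj}
    total = 0
    for s, t in combinations(sorted(adj), 2):
        for path in all_shortest_paths(adj, s, t):
            total += 1
            for v in path[1:-1]:
                through[v] += 1
    return {v: (c / total if total else 0.0) for v, c in through.items()}

# Hypothetical star network: hub H connected to leaves A, B, and C.
star = {"H": {"A", "B", "C"}, "A": {"H"}, "B": {"H"}, "C": {"H"}}
print(betweenness(star))
```

In the star network the hub lies on every leaf-to-leaf path, so it carries all of the betweenness and the leaves carry none.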
The “clustering coefficient” is another metric frequently used to characterize network structure; it describes the probability that two nodes that are connected to another node are directly connected themselves within the network.
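As a rough illustration, the local clustering coefficient of a node can be computed as the fraction of its neighbor pairs that are themselves linked. The sketch below assumes an undirected network stored as a dictionary of neighbor sets; the example network and function names are hypothetical, and nodes with fewer than two neighbors are assigned a coefficient of zero by convention.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of neighbors of `node` that are directly connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here: no neighbor pairs, so zero
    linked = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return linked / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Hypothetical network: a triangle A-B-C plus a pendant node D attached to A.
net = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
for v in sorted(net):
    print(v, local_clustering(net, v))
print(average_clustering(net))
```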
The earliest network models, developed by Erdős and Rényi, were based on an equal probability of connections between nodes in the network (Steuer and Lopez 2008). Thus, the degree distribution of these random, undirected networks followed a Poisson distribution (Loscalzo 2012).
Many networks that have been studied in biological and other contexts have degree distributions that follow a power law; in such networks, most of the nodes have a low degree value, but a small number of nodes (the hubs mentioned above) have very high degree values—much higher than would be expected with a Poisson distribution of degree values.
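The contrast between the two kinds of degree distribution can be seen in a small simulation. The sketch below compares an Erdős–Rényi random network with a network grown by preferential attachment, used here as an assumed stand-in for a mechanism that yields a power-law tail (the passage itself does not specify one). Network size, seed, and function names are arbitrary choices for illustration.

```python
import random

def erdos_renyi_degrees(n, p, rng):
    """Degrees in a random graph where every node pair is linked with probability p."""
    deg = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                deg[i] += 1
                deg[j] += 1
    return deg

def preferential_attachment_degrees(n, m, rng):
    """Degrees in a graph grown one node at a time; each new node links to m
    existing nodes chosen with probability proportional to their degree."""
    # Start from m + 1 fully connected nodes, each with degree m.
    deg = [m] * (m + 1) + [0] * (n - m - 1)
    # Each node appears in `stubs` once per edge end, so a uniform draw
    # from this list is a degree-proportional draw.
    stubs = [i for i in range(m + 1) for _ in range(m)]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            deg[t] += 1
            deg[new] += 1
            stubs.extend([t, new])
    return deg

rng = random.Random(42)
n, m = 1000, 2
ba_deg = preferential_attachment_degrees(n, m, rng)
er_deg = erdos_renyi_degrees(n, 2 * m / (n - 1), rng)  # matched mean degree of about 4

print("mean degree:", sum(er_deg) / n, "(random) vs", sum(ba_deg) / n, "(preferential)")
print("max degree: ", max(er_deg), "(random) vs", max(ba_deg), "(preferential)")
```

Both networks have roughly the same mean degree, yet the largest degree in the preferential-attachment network is typically several times larger, which is the hub signature described above.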
These networks are described as scale-free because the slope of this power law distribution is invariant with respect to size, a property that represents a key feature of evolving molecular networks.
Scale-free networks are remarkably robust to loss of a random node in the network, while still maintaining their topological characteristics such as average path length. However, scale-free networks can be disabled by losing a small number of hub nodes—so they are susceptible to attack by processes such as infection or malignant transformation that affect these key network components.
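A small numerical experiment can illustrate this difference between random failure and targeted attack. The sketch below grows a network by preferential attachment (an assumed generator, not one prescribed by the passage), removes 5% of the nodes either at random or in order of decreasing degree, and reports the size of the largest connected component that remains. All names and parameters are illustrative.

```python
import random

def preferential_attachment_graph(n, m, rng):
    """Grow an undirected graph; each new node links to m degree-biased targets."""
    adj = {i: set() for i in range(n)}
    stubs = []  # each node listed once per edge end
    for i in range(m + 1):  # seed: m + 1 fully connected nodes
        for j in range(i + 1, m + 1):
            adj[i].add(j)
            adj[j].add(i)
            stubs.extend([i, j])
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend([new, t])
    return adj

def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

rng = random.Random(0)
graph = preferential_attachment_graph(1000, 2, rng)
k = 50  # remove 5% of the nodes
random_loss = rng.sample(sorted(graph), k)
hub_loss = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:k]

after_random = largest_component_size(graph, random_loss)
after_hubs = largest_component_size(graph, hub_loss)
print("largest component after random loss:", after_random)
print("largest component after hub loss:   ", after_hubs)
```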
In a directed network, a closed loop of directed edges can form a cycle. Directed acyclic networks (also known as directed acyclic graphs), which do not include any cycles, admit an ordering of the nodes in which the adjacency matrix has all of its nonzero elements above the diagonal. All of the eigenvalues of the adjacency matrix for directed acyclic graphs are zero.
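These two properties can be checked on a toy example. The sketch below orders the nodes of a small directed acyclic graph topologically (Kahn's algorithm, chosen here for simplicity), builds the adjacency matrix in that order, and verifies that the matrix is strictly upper triangular and nilpotent; a nilpotent matrix has only zero eigenvalues. The graph and function names are hypothetical.

```python
def topological_order(nodes, edges):
    """Kahn's algorithm: repeatedly emit a node with no remaining incoming edges."""
    indeg = {v: 0 for v in nodes}
    out = {v: [] for v in nodes}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    order = []
    while ready:
        u = ready.pop()
        order.append(u)
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(order) != len(nodes):
        raise ValueError("graph contains a cycle")
    return order

def adjacency_matrix(order, edges):
    """A[i][j] = 1 if there is an edge from order[i] to order[j]."""
    idx = {v: i for i, v in enumerate(order)}
    A = [[0] * len(order) for _ in order]
    for u, v in edges:
        A[idx[u]][idx[v]] = 1
    return A

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def is_nilpotent(A):
    """True if the n-th power of the n-by-n matrix A is the zero matrix."""
    P = A
    for _ in range(len(A) - 1):
        P = matmul(P, A)
    return all(x == 0 for row in P for x in row)

# Hypothetical DAG: a -> b, a -> c, b -> d, c -> d (nodes listed in scrambled order).
nodes = ["d", "c", "b", "a"]
dag_edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
order = topological_order(nodes, dag_edges)
A = adjacency_matrix(order, dag_edges)
upper = all(A[i][j] == 0 for i in range(len(A)) for j in range(i + 1))
print(order, "strictly upper triangular:", upper, "nilpotent:", is_nilpotent(A))
```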
By analogy, the network of highways in the United States provides a topological network, while the flow of traffic along the highways denotes a dynamic network.
Although most network models developed to date have been deterministic, ignoring the impact of random biological noise on network behavior, these stochastic events can have important effects on biological processes such as gene expression.
Biological noise can result from intrinsic influences, such as having a limited number of molecules involved in a particular process (e.g., receptors or enzymes), or extrinsic processes, such as environmental exposures. Various stochastic modeling approaches, such as the stochastic simulation algorithm, have been applied in gene regulatory networks.
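To give a feel for the stochastic simulation algorithm (often called the Gillespie algorithm), here is a minimal sketch for the simplest possible gene expression model: a transcript is produced at a constant rate and each copy degrades independently. The model, the rate values, and the function names are assumptions chosen for illustration; they do not come from the book.

```python
import random

def gillespie_birth_death(k_make, k_decay, t_end, rng, n0=0):
    """Simulate production (rate k_make) and decay (rate k_decay * n) of a molecule."""
    t, n = 0.0, n0
    times, counts = [t], [n]
    while True:
        a_make = k_make          # propensity of producing one molecule
        a_decay = k_decay * n    # propensity of losing one molecule
        a_total = a_make + a_decay
        t += rng.expovariate(a_total)  # exponentially distributed waiting time
        if t > t_end:
            break
        # Pick which reaction fires, in proportion to its propensity.
        if rng.random() * a_total < a_make:
            n += 1
        else:
            n -= 1
        times.append(t)
        counts.append(n)
    return times, counts

def time_average(times, counts, t_end):
    """Average copy number, weighting each state by how long it lasted."""
    total = 0.0
    for i, n in enumerate(counts):
        t_next = times[i + 1] if i + 1 < len(times) else t_end
        total += n * (t_next - times[i])
    return total / t_end

rng = random.Random(1)
times, counts = gillespie_birth_death(k_make=10.0, k_decay=1.0, t_end=200.0, rng=rng)
print("events simulated:", len(times) - 1)
print("time-averaged copy number:", time_average(times, counts, 200.0))
```

A deterministic model of the same system would settle at exactly k_make / k_decay = 10 copies; the stochastic trajectory instead fluctuates around that value, which is the intrinsic noise the passage refers to.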
Structural networks depict the topological relationships between biological entities that have a known relationship, such as a protein–protein interaction network. In structural networks, biological entities are represented as nodes and the presence of a biological relationship is indicated by an edge.
Networks that are composed of two different types of nodes are referred to as “bipartite.” In a bipartite network (or graph), edges connect one type of node to another—never two nodes of the same type. Bipartite networks can be directed or undirected.
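Whether an undirected network is bipartite can be tested by trying to assign one of two labels to every node so that no edge joins two nodes with the same label. The sketch below does this with a breadth-first search; the drug and target names are made-up placeholders.

```python
from collections import deque

def two_color(adj):
    """Return {node: 0 or 1} if the graph is bipartite, otherwise None."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in color:
                    color[w] = 1 - color[u]  # neighbors get the opposite label
                    queue.append(w)
                elif color[w] == color[u]:
                    return None  # an edge joins two nodes of the same type
    return color

# Hypothetical drug-target network: edges only run between drugs and targets.
drug_target = {
    "drug1": {"targetA", "targetB"},
    "drug2": {"targetB"},
    "targetA": {"drug1"},
    "targetB": {"drug1", "drug2"},
}
triangle = {"x": {"y", "z"}, "y": {"x", "z"}, "z": {"x", "y"}}

print(two_color(drug_target))
print(two_color(triangle))
```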
Key regulatory motifs, such as feedback loops, can often be identified in genetic regulatory networks (Bulyk and Walhout 2013).
Based on presence/absence or quantitative microbiomics abundance data, microbial networks can be built to capture ecological relationships between species. For example, pairwise relationships can be analyzed by assessing the similarity in co-occurrence of different microbial species across multiple biological samples, and a network of statistically significant pairwise relationships can then be constructed. Alternatively, more complex relationships between the abundance of multiple microbial species can be determined with regression or association rule mining approaches.
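The pairwise approach can be sketched with presence/absence data and the Jaccard index as the similarity measure. This is a simplification: the passage calls for statistically significant relationships, whereas the sketch below uses a fixed similarity threshold as a placeholder for a proper significance test. The species names, sample identifiers, and threshold are invented for illustration.

```python
from itertools import combinations

# Hypothetical presence data: species -> set of sample IDs in which it was detected.
presence = {
    "species1": {1, 2, 3, 4},
    "species2": {1, 2, 3},
    "species3": {5, 6},
}

def jaccard(a, b):
    """Shared samples divided by samples containing either species."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cooccurrence_network(presence, threshold):
    """Return edges (species, species, similarity) with similarity >= threshold."""
    edges = []
    for s1, s2 in combinations(sorted(presence), 2):
        sim = jaccard(presence[s1], presence[s2])
        if sim >= threshold:
            edges.append((s1, s2, sim))
    return edges

print(cooccurrence_network(presence, threshold=0.5))
```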
Environmental factors are key determinants of many human diseases and likely influence all of the -omics measurements described above; however, they are often difficult to identify and measure. Recent efforts to analyze a comprehensive set of environmental exposures inside and outside the body (the “exposome”) have been proposed (Wild 2012).
There is not a single biological network within a cell or an organism; rather, there are multiple interdependent networks that vary over time.
Failure of a key dependent node in one network can lead to additional failures in related networks. Further research will be required to apply this approach to the multiple, dynamic biological networks involved in many human diseases.
Human diseases are influenced by a series of interdependent networks, including social networks, which specify interactions between people; disease networks, which analyze relationships between disease entities; and molecular networks, which include protein–protein interactions, genetic regulatory networks, and metabolic networks.
The mechanisms underlying human disease involve complex interactions across many levels of cellular organization, from protein–DNA interactions to signal transduction and metabolism. Despite the very different nature of the components and the diversity of the interactions between them, they have one important thing in common: they can all be described as networks.
Networks are defined as a collection of components and their interactions. The components are called nodes or vertices, and their interactions are called links or edges.
The number of links a node has (i.e., the number of its direct neighbors) is called its degree k.
A network path refers to a sequence of links that connect two nodes A and B; its length l is simply given by the number of steps.
The diameter dmax of a network is the longest of all shortest paths between any two nodes. Most real networks have a surprisingly small diameter, a property called the “small world” phenomenon, referring to the popular notion that everyone is connected to everyone else by only a small number of intermediate acquaintances.
The interactome represents a comprehensive map of all biologically relevant molecular interactions, for example, binary, regulatory, or signaling interactions.
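Instead, as discovered in 1999 (Albert, Jeong, et al. 1999; Barabási and Albert 1999), many real-world networks are scale-free, exhibiting a power-law degree distribution, P(k) ~ k^(-γ), where P(k) is the probability that a node has degree k and γ is the degree exponent.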
While the vast majority of nodes have only a few connections, there are some nodes in the network with a very large number of links, called hubs.
The presence of hubs impacts many network properties. For example, hubs serve as shortcuts, connecting different parts of a network, which makes scale-free networks not just “small,” but “ultrasmall.”
Different centrality measures of a node generally correlate with each other, and hubs tend to have high centrality, as they are likely to lie on many shortest paths.
Clustering describes the tendency for two neighbors of a node to also be connected to each other.
Small recurrent subgraphs in a network are called motifs.
For example, a particularly simple motif found in the Escherichia coli regulatory network is a single node with an inhibitory self-loop, representing a transcription factor repressing its own expression.
Other motifs observed in regulatory networks include feed-forward loops, feedback loops, and oscillators.
At the same time, the functional interpretation of larger motifs is difficult since their interface with the rest of the network increases, thereby impeding their analysis in isolation from the rest of the network.
A community is loosely defined as a subgraph with high local link density, so that nodes within the community have a higher number of links to each other than to nodes outside the community.
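This loose definition can be checked for a candidate community by counting its internal links and the links that leave it. The sketch below assumes an undirected network stored as a dictionary of neighbor sets; the example (two triangles joined by a single bridge) and the function name are hypothetical.

```python
def community_link_counts(adj, community):
    """Return (links inside the community, links from the community to outside)."""
    community = set(community)
    # Each internal link is seen from both of its ends, hence the division by 2.
    internal = sum(1 for u in community for v in adj[u] if v in community) // 2
    external = sum(1 for u in community for v in adj[u] if v not in community)
    return internal, external

# Hypothetical network: triangles A-B-C and D-E-F joined by the bridge C-D.
net = {
    "A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"},
    "D": {"C", "E", "F"}, "E": {"D", "F"}, "F": {"D", "E"},
}
print(community_link_counts(net, {"A", "B", "C"}))
print(community_link_counts(net, {"B", "C", "D"}))
```

The first node set has three internal links and one external link, so it fits the definition; the second set cuts across both triangles and has more external than internal links.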
Modularity is an attribute of a system that can be decomposed into a set of cohesive entities that are loosely coupled. Many cellular networks can be decomposed into functional modules—each functionally separable from the other modules.
For example, a functional module is defined as a group of genes or their products that are related by one or more genetic or cellular interactions, e.g., co-regulation, co-expression, or membership of a protein complex, of a metabolic or signaling pathway, or of a cellular aggregate (e.g., chaperone, ribosome, protein transport facilitator). An important property of a module is that its function is separable from other modules and that its members have more relations among themselves than with members of other modules, which is reflected in the network topology.
Two quantities allow us to measure the degree of network localization of a given set of nodes:
1. Size of the largest connected component S, that is, the number of nodes in the largest connected subgraph formed by the set
2. Mean shortest distance.
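Both quantities can be computed for a node set on a toy network. The sketch below rests on two assumptions that the highlighted text does not spell out: the largest connected component is measured on the subgraph induced by the set, and the mean shortest distance is taken as the average, over nodes in the set, of the distance to the closest other node in the set (measured in the full network). It also assumes every node in the set can reach at least one other member. All names are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Number of edges from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def largest_component_within(adj, node_set):
    """Size S of the largest connected subgraph formed by nodes in node_set."""
    node_set = set(node_set)
    seen, best = set(), 0
    for start in node_set:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w in node_set and w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

def mean_closest_distance(adj, node_set):
    """Average distance from each node in the set to its nearest other member."""
    node_set = set(node_set)
    total = 0
    for s in node_set:
        dist = bfs_distances(adj, s)
        total += min(dist[t] for t in node_set if t != s and t in dist)
    return total / len(node_set)

# Hypothetical network: a chain 1 - 2 - 3 - 4 - 5 - 6, with "disease" nodes 1, 2, and 5.
chain = {i: set() for i in range(1, 7)}
for i in range(1, 6):
    chain[i].add(i + 1)
    chain[i + 1].add(i)
disease_nodes = {1, 2, 5}

print("S =", largest_component_within(chain, disease_nodes))
print("mean shortest distance =", mean_closest_distance(chain, disease_nodes))
```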
The more significant the localization of a disease module, the more similar are the molecular functions of the proteins involved in it.