Dynamic networks

Dynamic networks reveal key players in aging
Milenkovic Lab

Contact:

Tijana Milenkovic, tmilenko AT nd DOT edu
Fazle E. Faisal, ffaisal AT nd DOT edu

Introduction: Because susceptibility to diseases increases with age, studying aging gains importance. Analyses of gene expression or sequence data, which have been indispensable for investigating aging, have been limited to studying genes and their protein products in isolation, ignoring their connectivities. However, proteins function by interacting with other proteins, and this is exactly what biological networks (BNs) model. Thus, analyzing the proteins' BN topologies could contribute to the understanding of aging. Current methods for analyzing systems-level BNs deal with their static representations, even though cells are dynamic. For this reason, and because different data types can give complementary biological insights, we integrate current static BNs with aging-related gene expression data to construct dynamic age-specific BNs. Then, we apply sensitive measures of topology to the dynamic BNs to study cellular changes with age. Below we provide our software for computing aging-related predictions from dynamic networks.

Reference: Fazle E. Faisal and Tijana Milenkovic (2014), Dynamic networks reveal key players in aging, Bioinformatics, 30(12):1721-1729.

Software: Our Unix implementation for computing aging-related predictions from dynamic networks is available here.

Usage: ./generate-predictions.sh [network] [expression_dir] [detection_pv_threshold] [majority_call] [correlation_type] [rand_run] [min_ages_to_be_expressed] [pv_threshold] [output_dir]

[network] is the file containing the static PPI data in edge list format.
[expression_dir] is the directory containing the age-specific gene expression data. Each file in this directory contains gene expression data specific to a particular age, and each file name must contain a number representing that age. For example, the file "sample-ge-data/s-20.txt" contains gene expression data specific to age 20.
[detection_pv_threshold] is the detection p-value threshold used to determine whether a gene is expressed in the gene expression data. We used a detection p-value threshold of 0.04 in our analysis.
[majority_call] is the parameter specifying a "majority vote rule" for determining whether a gene is expressed when the expression data contains multiple samples per age and/or multiple probes per gene. Let gene g have x probes in a sample and let there be y samples at age a. Then, with a majority call of 0.5, gene g is considered expressed at age a if more than a fraction 0.5 (i.e., 50%) of the x * y probes are found to be expressed at age a. We used a majority call of 0.5 in our analysis.
[correlation_type] indicates the correlation measure used in the program. The choices are P (Pearson correlation) and S (Spearman's correlation). We used P for the analysis in our main paper and both P and S for the analysis in the supplement.
[rand_run] is the number of random permutations for computing the p-value of aging-related predictions from dynamic networks. We performed 999,999 random permutations in our analysis.
[min_ages_to_be_expressed] is the parameter used to filter out genes that are unexpressed at most ages. With a value x for this parameter, a gene is never considered aging-related if it is expressed at fewer than x different ages. To filter out genes that are expressed at fewer than 20% of the ages, we used the value 8 for this parameter (because we had 37 different ages in our analysis, and 20% of 37 is 7.4).
[pv_threshold] is the p-value threshold used to determine aging-related predictions. We used a p-value threshold of 0.01 in our analysis.
[output_dir] is the directory that will contain all the outputs.
Note: Gene IDs in the static PPI data and the gene expression data must be consistent.
The details on the usage of our software are provided in README.txt.
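
To make the expression-related parameters above concrete, the following is a minimal Python sketch of the majority vote rule, the age-from-file-name convention, and the 20% rule of thumb for [min_ages_to_be_expressed]. It is an illustration only and not the code shipped with the software. The function names are hypothetical, and the sketch assumes that a probe counts as expressed when its detection p-value is strictly below the threshold, which the text above does not specify.

```python
import math
import os
import re


def age_from_filename(path):
    """Return the first integer in the file name, e.g. 's-20.txt' -> 20."""
    name = os.path.basename(path)
    match = re.search(r"\d+", name)
    if match is None:
        raise ValueError(f"no age found in file name: {name!r}")
    return int(match.group())


def is_expressed(detection_pvalues, detection_pv_threshold=0.04, majority_call=0.5):
    """Apply the majority vote rule to one gene at one age.

    detection_pvalues holds one detection p-value per probe per sample,
    i.e. x * y values in the notation used above.
    """
    if not detection_pvalues:
        return False
    # Assumption: "expressed" means p-value strictly below the threshold.
    expressed = sum(1 for p in detection_pvalues if p < detection_pv_threshold)
    # The gene is expressed only if MORE than the given fraction of probes agree.
    return expressed / len(detection_pvalues) > majority_call


def min_ages_for_fraction(num_ages, fraction=0.2):
    """Smallest whole number of ages that is at least `fraction` of all ages."""
    return math.ceil(num_ages * fraction)
```

For example, `min_ages_for_fraction(37)` gives 8, which matches the value used in the analysis, and a gene with exactly half of its probes expressed is not considered expressed under a majority call of 0.5.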

Computing detection p-values: Given gene expression data, we used the MAS 5.0 software package in R to compute detection p-values.

Gene ID mapping: We used the DAVID gene ID conversion tool to map gene IDs in the static PPI data and the gene expression data to a common gene ID type.
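
Before running the software, it may help to check how well the gene IDs in the two inputs agree after mapping. The sketch below is a hypothetical helper, not part of the software. It assumes the edge list has two whitespace-separated gene IDs per line and that the gene IDs from the expression data have already been collected into a set, since the expression file layout is described in README.txt rather than here.

```python
def read_edge_list(lines):
    """Collect the gene IDs that appear in whitespace-separated edge-list lines."""
    genes = set()
    for line in lines:
        fields = line.split()
        # Skip blank lines, malformed lines, and lines starting with '#'.
        if len(fields) < 2 or fields[0].startswith("#"):
            continue
        genes.update(fields[:2])
    return genes


def id_overlap(network_genes, expression_genes):
    """Summarize how many gene IDs the two data sets share."""
    shared = network_genes & expression_genes
    return {
        "shared": len(shared),
        "network_only": len(network_genes - expression_genes),
        "expression_only": len(expression_genes - network_genes),
    }
```

A large "network_only" or "expression_only" count relative to "shared" would suggest that the two inputs still use different ID types.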

Example: Given the static PPI data "sample-ppi-data.txt" and the gene expression data directory "sample-ge-data", the aging-related predictions can be computed with the following command:

./generate-predictions.sh sample-ppi-data.txt sample-ge-data 0.04 0.5 P 999999 6 0.1 sample-output
The command generates aging-related predictions using a detection p-value threshold of 0.04, a majority call of 0.5, the Pearson correlation measure, 999,999 random permutations, genes expressed in at least 6 ages, and a p-value threshold of 0.1. The computed aging-related predictions are saved in "sample-output/aging-predictions".
Note: The sample input and output files are provided with our software.
The details on the example usage of our software are provided in README.txt.
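
To illustrate how a permutation count such as [rand_run] and a threshold such as [pv_threshold] interact, here is a minimal Python sketch of a generic permutation test for a Pearson correlation with age. It is not the software's implementation. It assumes that each gene has one numeric value per age (for example, a topological measure in each age-specific network), that the test is two-sided, and that the p-value is estimated as (hits + 1) / (rand_run + 1). All function names are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def permutation_pvalue(values, ages, rand_run=999, seed=0):
    """Estimate a two-sided p-value by shuffling `values` against `ages`."""
    rng = random.Random(seed)
    observed = abs(pearson(values, ages))
    shuffled = list(values)
    hits = 0
    for _ in range(rand_run):
        rng.shuffle(shuffled)
        # Count permutations that correlate with age at least as strongly.
        if abs(pearson(shuffled, ages)) >= observed:
            hits += 1
    # Adding 1 to both counts keeps the estimate strictly above zero.
    return (hits + 1) / (rand_run + 1)
```

Under this estimator, the smallest attainable p-value is 1 / (rand_run + 1), so the number of permutations bounds how small a threshold can be meaningfully applied. The default of 999 permutations here is only to keep the pure-Python sketch fast.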

Data:

Static PPI data from HPRD and BioGRID.
Aging-related gene expression data from Berchtold et al. (2008).
The exact data used in the study is available here.

Please cite our paper if you use our software.