Package 'DysPIA' reference manual

Title:	Dysregulated Pathway Identification Analysis
Description:	Identifies dysregulated pathways from a pre-ranked gene pair list, using a fast algorithm for the enrichment computation. Requires the data in package 'DysPIAData'.
Authors:	Limei Wang [aut, cre], Jin Li [aut, ctb]
Maintainer:	Limei Wang <[email protected]>
License:	GPL (>= 2)
Version:	1.4
Built:	2025-03-07 04:34:13 UTC
Source:	https://github.com/lemonwang2020/dyspia

calcDyspiaStat: Calculates DysPIA statistics

Description

Calculates DysPIA statistics for a given query gene pair set.

Usage

calcDyspiaStat(
  stats,
  selectedStats,
  DyspiaParam = 1,
  returnAllExtremes = FALSE,
  returnLeadingEdge = FALSE
)

Arguments

`stats`	Named numeric vector with gene pair-level statistics sorted in decreasing order (order is not checked).
`selectedStats`	Indexes of selected gene pairs in the 'stats' array.
`DyspiaParam`	DysPIA weight parameter (0 is unweighted, suggested value is 1).
`returnAllExtremes`	If TRUE return not only the most extreme point, but all of them. Can be used for enrichment plot.
`returnLeadingEdge`	If TRUE return also leading edge gene pairs.

Value

Value of the DysPIA statistic if both returnAllExtremes and returnLeadingEdge are FALSE. Otherwise, returns a list with the following elements:

res – value of DysPIA statistic;
tops – vector of top peak values of cumulative enrichment statistic for each gene pair;
bottoms – vector of bottom peak values of cumulative enrichment statistic for each gene pair;
leadingEdge – vector with indexes of leading edge gene pairs that drive the enrichment.
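
The arguments above (a decreasingly sorted score vector, a set of selected indexes, and a weight parameter where 0 means unweighted) suggest a GSEA-style weighted running sum. The following is a minimal sketch under that assumption, not the package's implementation; the helper name `runningSumStat` and the toy scores are hypothetical.

```r
# Hypothetical helper: GSEA-style weighted running-sum statistic.
# Assumes 'stats' is sorted in decreasing order and that at least one
# gene pair lies outside the selected set.
runningSumStat <- function(stats, selectedStats, param = 1) {
  n <- length(stats)
  inSet <- rep(FALSE, n)
  inSet[selectedStats] <- TRUE
  weights <- abs(stats)^param            # param = 0 gives every hit weight 1
  hitStep <- ifelse(inSet, weights / sum(weights[inSet]), 0)
  missStep <- ifelse(inSet, 0, 1 / (n - sum(inSet)))
  running <- cumsum(hitStep - missStep)  # cumulative enrichment curve
  # Most extreme deviation from zero, keeping its sign
  unname(running[which.max(abs(running))])
}

# Toy gene pair scores (illustrative values only)
stats <- c(a_b = 3.0, c_d = 2.0, e_f = 0.5, g_h = -1.0, i_j = -2.5)
esTop <- runningSumStat(stats, selectedStats = c(1, 2), param = 1)
esBottom <- runningSumStat(stats, selectedStats = c(4, 5), param = 1)
```

In this sketch a set concentrated at the top of the ranking yields a positive value and a set concentrated at the bottom yields a negative one.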

Calculates DysPIA statistic values for all the prefixes of a gene pair set

Description

Calculates DysPIA statistic values for all the prefixes of a gene pair set

Usage

calcDyspiaStatCumulative(stats, selectedStats, DyspiaParam)

Arguments

`stats`	Named numeric vector with gene pair-level statistics sorted in decreasing order (order is not checked).
`selectedStats`	Indexes of selected gene pairs in the 'stats' array.
`DyspiaParam`	DysPIA weight parameter (0 is unweighted, suggested value is 1).

Value

Numeric vector of DysPIA statistics for all prefixes of selectedStats.
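
To make "all prefixes" concrete, the sketch below recomputes a statistic for the first k selected indexes, for every k. It is a naive illustration only: the running-sum helper is a hypothetical GSEA-style stand-in for the DysPIA statistic, and the package itself is described as using a fast algorithm rather than this repeated recomputation.

```r
# Hypothetical GSEA-style statistic, repeated so this block is self-contained.
runningSumStat <- function(stats, selected, param = 1) {
  n <- length(stats)
  inSet <- rep(FALSE, n)
  inSet[selected] <- TRUE
  weights <- abs(stats)^param
  hitStep <- ifelse(inSet, weights / sum(weights[inSet]), 0)
  missStep <- ifelse(inSet, 0, 1 / (n - sum(inSet)))
  running <- cumsum(hitStep - missStep)
  unname(running[which.max(abs(running))])
}

# One statistic per prefix selectedStats[1:k], k = 1, ..., length(selectedStats)
prefixStats <- function(stats, selectedStats, param = 1) {
  vapply(seq_along(selectedStats),
         function(k) runningSumStat(stats, selectedStats[seq_len(k)], param),
         numeric(1))
}

stats <- c(a_b = 3.0, c_d = 2.0, e_f = 0.5, g_h = -1.0, i_j = -2.5)
cumulative <- prefixStats(stats, selectedStats = c(1, 2, 4))
```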

Calculates DysPIA statistic values for the gene pair sets

Description

Calculates DysPIA statistic values for the gene pair sets

Usage

calcDyspiaStatCumulativeBatch(
  stats,
  DyspiaParam,
  pathwayScores,
  pathwaysSizes,
  iterations,
  seed
)

Arguments

`stats`	Named numeric vector with gene pair-level statistics sorted in decreasing order (order is not checked).
`DyspiaParam`	DysPIA weight parameter (0 is unweighted, suggested value is 1).
`pathwayScores`	Vector with enrichment scores for the pathways in the database.
`pathwaysSizes`	Vector of pathway sizes.
`iterations`	Number of iterations.
`seed`	Seed vector

Value

List of DysPIA statistics for gene pair sets.

calEdgeCorScore_ESEA

Description

Calculates differential mutual information.

Usage

calEdgeCorScore_ESEA(
  dataset,
  class.labels,
  controlcharacter,
  casecharacter,
  background
)

Arguments

`dataset`	Matrix of gene expression values (row names are genes, column names are samples).
`class.labels`	Vector of binary labels.
`controlcharacter`	Character string denoting the control group in the class labels.
`casecharacter`	Character string denoting the case group in the class labels.
`background`	Matrix of the edges' background.

Value

A vector of the aberrant correlation in phenotype P based on mutual information (MI) for each edge.

Examples

data(gene_expression_p53, class.labels_p53,sample_background)
ESEAscore_p53<-calEdgeCorScore_ESEA(gene_expression_p53, class.labels_p53,
 "WT", "MUT", sample_background)

Example vector of category labels.

Description

The labels for the 50 cell lines in p53 data. Control group's label is 'WT', case group's label is 'MUT'.

Usage

data(class.labels_p53)

DysGPS: Calculates Dysregulated gene pair score (DysGPS) for each gene pair

Description

Calculates the dysregulated gene pair score (DysGPS) for each gene pair, using a two-sample Welch's t-test on gene pairs between case and control samples. The package 'DysPIAData', which includes the background data, must be loaded.

Usage

DysGPS(
  dataset,
  class.labels,
  controlcharacter,
  casecharacter,
  background = combined_background
)

Arguments

`dataset`	Matrix of gene expression values (row names are genes, column names are samples).
`class.labels`	Vector of category labels.
`controlcharacter`	Character string denoting the control group in the class labels.
`casecharacter`	Character string denoting the case group in the class labels.
`background`	Matrix of the gene pairs' background. The default is 'combined_background', which includes real pathway gene pairs and randomly produced gene pairs. The 'combined_background' is included in 'DysPIAData'.

Value

A vector of DysGPS for each gene pair.

Examples

data(gene_expression_p53, class.labels_p53,sample_background)
DysGPS_sample<-DysGPS(gene_expression_p53, class.labels_p53,
 "WT", "MUT", sample_background)
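
For readers unfamiliar with Welch's test, the sketch below runs it on one simulated gene pair with base R's `t.test(..., var.equal = FALSE)`. The per-sample pair score used here (the product of the two genes' standardized expression values) is an assumption made purely for illustration; this manual does not state which per-sample quantity DysGPS() compares. All data and names are hypothetical.

```r
set.seed(1)
# Simulated expression for two genes across 20 samples (illustrative data)
expr <- matrix(rnorm(2 * 20), nrow = 2,
               dimnames = list(c("geneA", "geneB"), paste0("s", 1:20)))
labels <- rep(c("WT", "MUT"), each = 10)

# Standardize each gene across samples
z <- t(scale(t(expr)))
# Assumed per-sample score for the pair geneA-geneB (not confirmed by the manual)
pairScore <- z["geneA", ] * z["geneB", ]

# Welch's two-sample t-test: case ("MUT") versus control ("WT"),
# without assuming equal variances
welch <- t.test(pairScore[labels == "MUT"], pairScore[labels == "WT"],
                var.equal = FALSE)
welchT <- unname(welch$statistic)
```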

Example vector of DysGPS in p53 data.

Description

The score vector of 164923 gene pairs from the p53 dataset. It can be loaded from the example datasets of the R package 'DysPIA', or obtained by running DysGPS(); see DysGPS.R for details.

Usage

data(DysGPS_p53)

DysPIA: Dysregulated Pathway Identification Analysis

Description

Runs Dysregulated Pathway Identification Analysis (DysPIA). The package 'DysPIAData', which includes the background data, must be loaded.

Usage

DysPIA(
  pathwayDB = "kegg",
  stats,
  nperm = 10000,
  minSize = 15,
  maxSize = 1000,
  nproc = 0,
  DyspiaParam = 1,
  BPPARAM = NULL
)

Arguments

`pathwayDB`	Name of the pathway database (8 databases: reactome, kegg, biocarta, panther, pathbank, nci, smpdb, pharmgkb). The default value is "kegg".
`stats`	Named vector of CILP scores for each gene pair. Names should be the same as in pathways.
`nperm`	Number of permutations to do. Minimal possible nominal p-value is about 1/nperm. The default value is 10000.
`minSize`	Minimal size of a gene pair set to test. All pathways below the threshold are excluded. The default value is 15.
`maxSize`	Maximal size of a gene pair set to test. All pathways above the threshold are excluded. The default value is 1000.
`nproc`	If not equal to zero, sets BPPARAM to use nproc workers (default = 0).
`DyspiaParam`	DysPIA parameter value; all gene pair-level statistics are raised to the power of 'DyspiaParam' before calculation of DysPIA enrichment scores.
`BPPARAM`	Parallelization parameter used in bplapply. Can be used to specify the cluster to run on. If not initialized explicitly or by setting 'nproc', the default value 'bpparam()' is used.

Value

A table with DysPIA results. Each row corresponds to a tested pathway. The columns are the following:

pathway – name of the pathway as in 'names(pathway)';
pval – an enrichment p-value;
padj – a BH-adjusted p-value;
DysPS – enrichment score, same as in Broad DysPIA implementation;
NDysPS – enrichment score normalized to mean enrichment of random samples of the same size;
nMoreExtreme – a number of times a random gene pair set had a more extreme enrichment score value;
size – size of the pathway after removing gene pairs not present in 'names(stats)';
leadingEdge – vector with indexes of leading edge gene pairs that drive the enrichment.
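
The relationship between 'size', 'nMoreExtreme', 'pval' and 'padj' can be sketched in base R. The pathway names and counts below are hypothetical, and the (count + 1) / (nperm + 1) estimator is a common permutation-test convention assumed here for illustration; the package's exact formula may differ. The BH adjustment uses the standard `p.adjust()` function.

```r
nperm <- 100
# Hypothetical counts of random sets more extreme than the observed score
nMoreExtreme <- c(pathwayA = 0, pathwayB = 3, pathwayC = 40)
# Hypothetical pathway sizes after matching against names(stats)
size <- c(pathwayA = 25, pathwayB = 60, pathwayC = 12)

# minSize / maxSize filter, mirroring minSize = 20 and maxSize = 100
keep <- size >= 20 & size <= 100

# Assumed empirical p-value; its smallest possible value is about 1/nperm
pval <- (nMoreExtreme[keep] + 1) / (nperm + 1)

# Benjamini-Hochberg adjustment over the tested pathways only
padj <- p.adjust(pval, method = "BH")
```

Because the smallest attainable p-value shrinks with 'nperm', a small 'nperm' (as in the quick examples) limits how significant any pathway can appear after adjustment.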

Examples

data(pathway_list,package="DysPIAData")
data(DysGPS_p53)
DyspiaRes_p53 <- DysPIA("kegg", DysGPS_p53, nperm = 100, minSize = 20, maxSize = 100)

Example list of DysPIA result in p53 data.

Description

The list includes 81 pathway results from 'DysPIA.R', provided as an example input for 'DyspiaSig.R'.

Usage

data(DyspiaRes_p53)

DyspiaSig

Description

Returns a summary of the significant DysPIA results.

Usage

DyspiaSig(DyspiaRes, fdr)

Arguments

`DyspiaRes`	Table with results of running DysPIA().
`fdr`	Significance threshold for 'padj' (a BH-adjusted p-value).

Value

A list of significant DysPIA results, including correlation gain and correlation loss.

Examples

data(pathway_list,package="DysPIAData")
data(DyspiaRes_p53)
summary_p53 <- DyspiaSig(DyspiaRes_p53, 0.05)       # filter with padj<0.05
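
The kind of filtering and splitting described above can be sketched without the package. The table below is hypothetical, and the rule that positive scores mean correlation gain and negative scores mean correlation loss is an assumption made for illustration, not something this manual states.

```r
# Hypothetical result table with a subset of the columns described for DysPIA()
res <- data.frame(
  pathway = c("p1", "p2", "p3", "p4"),
  padj    = c(0.01, 0.20, 0.03, 0.04),
  DysPS   = c(0.6, 0.5, -0.4, -0.7)
)
fdr <- 0.05

# Keep pathways whose BH-adjusted p-value is below the threshold
sig <- res[res$padj < fdr, ]

# Assumption: sign of the enrichment score separates gain from loss
summarySketch <- list(
  gain = sig[sig$DysPS > 0, ],
  loss = sig[sig$DysPS < 0, ]
)
```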

DyspiaSimpleImpl

Description

Runs dysregulated pathway identification analysis for preprocessed input data.

Usage

DyspiaSimpleImpl(
  pathwayScores,
  pathwaysSizes,
  pathwaysFiltered,
  leadingEdges,
  permPerProc,
  seeds,
  toKeepLength,
  stats,
  BPPARAM
)

Arguments

`pathwayScores`	Vector with enrichment scores for the pathways in the database.
`pathwaysSizes`	Vector of pathway sizes.
`pathwaysFiltered`	Filtered pathways.
`leadingEdges`	Leading edge gene pairs.
`permPerProc`	Parallelization parameter for permutations.
`seeds`	Seed vector
`toKeepLength`	Number of 'pathways' that meet the condition for 'minSize' and 'maxSize'.
`stats`	Named vector of gene pair-level scores. Names should be the same as in pathways of 'pathwayDB'.
`BPPARAM`	Parallelization parameter used in bplapply. Can be used to specify the cluster to run on. If not initialized explicitly or by setting 'nproc', the default value 'bpparam()' is used.

Value

A table with DysPIA results. Each row corresponds to a tested pathway. The columns are the following:

pathway – name of the pathway as in 'names(pathway)';
pval – an enrichment p-value;
padj – a BH-adjusted p-value;
DysPS – enrichment score, same as in Broad DysPIA implementation;
NDysPS – enrichment score normalized to mean enrichment of random samples of the same size;
nMoreExtreme – a number of times a random gene pair set had a more extreme enrichment score value;
size – size of the pathway after removing gene pairs not present in 'names(stats)';
leadingEdge – vector with indexes of leading edge gene pairs that drive the enrichment.

Example matrix of gene expression value.

Description

A dataset of transcriptional profiles from p53+ and p53 mutant cancer cell lines. It includes the normalized gene expression for 6385 genes in 50 samples. Row names are genes, column names are samples.

Usage

data(gene_expression_p53)

Example list of gene pair background.

Description

The background list used in 'DysGPS.R' and 'calEdgeCorScore_ESEA.R'. It is a part of the 'combined_background' in 'DysPIAData'.

Usage

data(sample_background)

setUpBPPARAM

Description

Sets up parameter BPPARAM value.

Usage

setUpBPPARAM(nproc = 0, BPPARAM = NULL)

Arguments

`nproc`	If not equal to zero, sets BPPARAM to use nproc workers (default = 0).
`BPPARAM`	Parallelization parameter used in bplapply. Can be used to specify the cluster to run on. If not initialized explicitly or by setting 'nproc', the default value 'bpparam()' is used.

Value

The BPPARAM parameter value.
