Advanced Usage - PBMC + IFNb in sample scenario¶
In this tutorial we’ll be using pyroma to capture sample level variations in the perturbed setting. By querying predefined genesets, such as MSigDb Hallmarks, such approach highlights the largest sources of variance inside the dataset. This could give insights into how such variance relates to the perturbed condition, cell types, cluster or other labels.
[ ]:
import os
import zipfile
import io
import numpy as np
import pandas as pd
import scanpy as sc
import scipy.io
from glob import glob
import time
sc.set_figure_params(figsize=(3, 3), frameon=False )
import pyroma
roma = pyroma.ROMA()
roma
scROMA
load the PBMC dataset with IFNb stimulation from Kang et al. 2018¶
This is a good dataset for our purpose with clear expected biological response upon perturbation
[2]:
adata = pyroma.datasets.pbmc_ifnb()
adata
[2]:
AnnData object with n_obs × n_vars = 13576 × 14053
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'stim', 'seurat_annotations', 'integrated_snn_res.0.5', 'seurat_clusters', 'condition', 'cell_type'
subsmaple to reduce the computation cost
[3]:
sc.pp.subsample(adata, n_obs=3000)
standard preprocessing¶
[4]:
sc.pp.filter_cells(adata, min_genes=100)
sc.pp.filter_genes(adata, min_cells=3)
sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
[5]:
adata
[5]:
AnnData object with n_obs × n_vars = 3000 × 12158
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'stim', 'seurat_annotations', 'integrated_snn_res.0.5', 'seurat_clusters', 'condition', 'cell_type', 'n_genes'
var: 'n_cells'
uns: 'log1p'
first look at the data¶
[6]:
sc.tl.pca(adata)
sc.pp.neighbors(adata)
sc.tl.umap(adata)
[7]:
# "seurat_annotations" contain cell type information
# "condition" - is the IFNb stimulation category
sc.pl.umap(adata, color=['seurat_annotations', 'condition'],
ncols=2,
wspace=0.5)
run pyroma¶
[8]:
hallmarks_gmt_path = pyroma.genesets.use_hallmarks()
[9]:
### change your output directory to store results
output_dir = "/home/az/Downloads/tutorial-test-results/02_pbmc_ifnb_sample/"
sample_name = "kang_sample_all"
roma.gmt = hallmarks_gmt_path
roma.adata = adata
roma.pc_sign_mode = 'UseMeanExpressionAllWeights'
pathways_to_check = 'all'
iters = 100
start = time.time()
roma.compute(pathways_to_check,
parallel=True,
loocv_on=True,
iters=iters,
)
# save the results
roma.save_active_modules_results(os.path.join(output_dir,sample_name), only_active=False)
end = time.time()
minutes, seconds = divmod(end - start, 60)
print(f"CPU Parallel {iters} iterations for shape {adata.shape}, running time (min): " + "{:0>2}:{:05.2f}".format(int(minutes),seconds))
Processing gene set: HALLMARK_NOTCH_SIGNALING | len of subsetlist: 27
Processing gene set: HALLMARK_HEDGEHOG_SIGNALING | len of subsetlist: 17
Processing gene set: HALLMARK_ANGIOGENESIS | len of subsetlist: 20
Processing gene set: HALLMARK_PANCREAS_BETA_CELLS | len of subsetlist: 14
Processing gene set: HALLMARK_WNT_BETA_CATENIN_SIGNALING | len of subsetlist: 33
Processing gene set: HALLMARK_APICAL_SURFACE | len of subsetlist: 23
Processing gene set: HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY | len of subsetlist: 46
Processing gene set: HALLMARK_TGF_BETA_SIGNALING | len of subsetlist: 48
Processing gene set: HALLMARK_MYC_TARGETS_V2 | len of subsetlist: 54
Processing gene set: HALLMARK_CHOLESTEROL_HOMEOSTASIS | len of subsetlist: 61
Processing gene set: HALLMARK_IL6_JAK_STAT3_SIGNALING | len of subsetlist: 74
Processing gene set: HALLMARK_PROTEIN_SECRETION | len of subsetlist: 87
Processing gene set: HALLMARK_INTERFERON_ALPHA_RESPONSE | len of subsetlist: 93
Processing gene set: HALLMARK_ANDROGEN_RESPONSE | len of subsetlist: 78
Processing gene set: HALLMARK_PEROXISOME | len of subsetlist: 82
Processing gene set: HALLMARK_PI3K_AKT_MTOR_SIGNALING | len of subsetlist: 91
Processing gene set: HALLMARK_BILE_ACID_METABOLISM | len of subsetlist: 66
Processing gene set: HALLMARK_UNFOLDED_PROTEIN_RESPONSE | len of subsetlist: 101
Processing gene set: HALLMARK_SPERMATOGENESIS | len of subsetlist: 64
Processing gene set: HALLMARK_COAGULATION | len of subsetlist: 72
Processing gene set: HALLMARK_UV_RESPONSE_DN | len of subsetlist: 101
Processing gene set: HALLMARK_DNA_REPAIR | len of subsetlist: 140
Processing gene set: HALLMARK_FATTY_ACID_METABOLISM | len of subsetlist: 126
Processing gene set: HALLMARK_UV_RESPONSE_UP | len of subsetlist: 125
Processing gene set: HALLMARK_APOPTOSIS | len of subsetlist: 142
Processing gene set: HALLMARK_MITOTIC_SPINDLE | len of subsetlist: 171
Processing gene set: HALLMARK_IL2_STAT5_SIGNALING | len of subsetlist: 167
Processing gene set: HALLMARK_TNFA_SIGNALING_VIA_NFKB | len of subsetlist: 187
Processing gene set: HALLMARK_HYPOXIA | len of subsetlist: 140
Processing gene set: HALLMARK_G2M_CHECKPOINT | len of subsetlist: 151
Processing gene set: HALLMARK_ADIPOGENESIS | len of subsetlist: 157
Processing gene set: HALLMARK_ESTROGEN_RESPONSE_EARLY | len of subsetlist: 116
Processing gene set: HALLMARK_ESTROGEN_RESPONSE_LATE | len of subsetlist: 118
Processing gene set: HALLMARK_MYOGENESIS | len of subsetlist: 90
Processing gene set: HALLMARK_INTERFERON_GAMMA_RESPONSE | len of subsetlist: 190
Processing gene set: HALLMARK_APICAL_JUNCTION | len of subsetlist: 110
Processing gene set: HALLMARK_COMPLEMENT | len of subsetlist: 159
Processing gene set: HALLMARK_MTORC1_SIGNALING | len of subsetlist: 183
Processing gene set: HALLMARK_E2F_TARGETS | len of subsetlist: 168
Processing gene set: HALLMARK_MYC_TARGETS_V1 | len of subsetlist: 188
Processing gene set: HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | len of subsetlist: 96
Processing gene set: HALLMARK_INFLAMMATORY_RESPONSE | len of subsetlist: 163
Processing gene set: HALLMARK_XENOBIOTIC_METABOLISM | len of subsetlist: 129
Processing gene set: HALLMARK_OXIDATIVE_PHOSPHORYLATION | len of subsetlist: 178
Processing gene set: HALLMARK_GLYCOLYSIS | len of subsetlist: 141
Processing gene set: HALLMARK_P53_PATHWAY | len of subsetlist: 161
Processing gene set: HALLMARK_HEME_METABOLISM | len of subsetlist: 148
Processing gene set: HALLMARK_ALLOGRAFT_REJECTION | len of subsetlist: 172
Processing gene set: HALLMARK_KRAS_SIGNALING_UP | len of subsetlist: 131
Processing gene set: HALLMARK_KRAS_SIGNALING_DN | len of subsetlist: 62
Finished: CPU Parallel 100 iterations for shape (3000, 12158), running time (min): 01:10.07
[10]:
roma
[10]:
scROMA: module activities are computed
[11]:
roma.adata
[11]:
AnnData object with n_obs × n_vars = 3000 × 12158
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'stim', 'seurat_annotations', 'integrated_snn_res.0.5', 'seurat_clusters', 'condition', 'cell_type', 'n_genes'
var: 'n_cells'
uns: 'log1p', 'pca', 'neighbors', 'umap', 'seurat_annotations_colors', 'condition_colors', 'ROMA_stats', 'ROMA_active_modules'
obsm: 'X_pca', 'X_umap'
varm: 'PCs'
obsp: 'distances', 'connectivities'
all the pyroma statistics¶
[12]:
roma.adata.uns['ROMA_stats']
[12]:
| L1 | ppv L1 | Median Exp | ppv Med Exp | q L1 | q Med Exp | |
|---|---|---|---|---|---|---|
| HALLMARK_NOTCH_SIGNALING | 0.383079 | 0.35 | -0.032327 | 0.316832 | 0.729167 | 0.465929 |
| HALLMARK_HEDGEHOG_SIGNALING | 0.264958 | 0.63 | 0.014416 | 0.564356 | 0.926471 | 0.667741 |
| HALLMARK_ANGIOGENESIS | 0.793618 | 0.03 | 0.169210 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_PANCREAS_BETA_CELLS | 0.302861 | 0.50 | 0.067470 | 0.089109 | 0.862069 | 0.143724 |
| HALLMARK_WNT_BETA_CATENIN_SIGNALING | 0.279624 | 0.58 | 0.058241 | 0.089109 | 0.926471 | 0.143724 |
| HALLMARK_APICAL_SURFACE | 0.365530 | 0.38 | 0.013935 | 0.574257 | 0.760000 | 0.667741 |
| HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY | 0.543208 | 0.15 | 0.105772 | 0.029703 | 0.500000 | 0.064572 |
| HALLMARK_TGF_BETA_SIGNALING | 0.152006 | 0.89 | 0.113007 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_MYC_TARGETS_V2 | 0.202604 | 0.68 | 0.833996 | 0.009901 | 0.937500 | 0.030941 |
| HALLMARK_CHOLESTEROL_HOMEOSTASIS | 0.412974 | 0.17 | 0.065308 | 0.029703 | 0.500000 | 0.064572 |
| HALLMARK_IL6_JAK_STAT3_SIGNALING | 0.394130 | 0.16 | -0.079453 | 0.019802 | 0.500000 | 0.055006 |
| HALLMARK_PROTEIN_SECRETION | 0.196540 | 0.70 | 0.049578 | 0.049505 | 0.937500 | 0.095202 |
| HALLMARK_INTERFERON_ALPHA_RESPONSE | 0.377076 | 0.18 | 4.424024 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_ANDROGEN_RESPONSE | 0.261892 | 0.40 | 0.000554 | 0.980198 | 0.769231 | 0.980198 |
| HALLMARK_PEROXISOME | 0.362642 | 0.18 | -0.055311 | 0.049505 | 0.500000 | 0.095202 |
| HALLMARK_PI3K_AKT_MTOR_SIGNALING | 0.169255 | 0.74 | 0.043356 | 0.108911 | 0.937500 | 0.170173 |
| HALLMARK_BILE_ACID_METABOLISM | 0.198035 | 0.70 | 0.019200 | 0.524752 | 0.937500 | 0.652016 |
| HALLMARK_UNFOLDED_PROTEIN_RESPONSE | 0.275479 | 0.25 | 0.136769 | 0.009901 | 0.657895 | 0.030941 |
| HALLMARK_SPERMATOGENESIS | 0.284264 | 0.52 | 0.005079 | 0.861386 | 0.866667 | 0.936289 |
| HALLMARK_COAGULATION | 0.451178 | 0.09 | 0.311346 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_UV_RESPONSE_DN | 0.267346 | 0.28 | 0.007185 | 0.792079 | 0.666667 | 0.880088 |
| HALLMARK_DNA_REPAIR | 0.090157 | 0.94 | -0.142698 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_FATTY_ACID_METABOLISM | 0.221426 | 0.42 | -0.070138 | 0.039604 | 0.777778 | 0.082508 |
| HALLMARK_UV_RESPONSE_UP | 0.241909 | 0.33 | 0.049684 | 0.079208 | 0.729167 | 0.136565 |
| HALLMARK_APOPTOSIS | 0.278954 | 0.18 | 0.102191 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_MITOTIC_SPINDLE | 0.049669 | 1.00 | 0.051391 | 0.069307 | 1.000000 | 0.123762 |
| HALLMARK_IL2_STAT5_SIGNALING | 0.361183 | 0.04 | 0.014583 | 0.435644 | 0.500000 | 0.558517 |
| HALLMARK_TNFA_SIGNALING_VIA_NFKB | 0.286159 | 0.13 | 0.346887 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_HYPOXIA | 0.184140 | 0.60 | -0.080034 | 0.029703 | 0.926471 | 0.064572 |
| HALLMARK_G2M_CHECKPOINT | 0.093552 | 0.94 | -0.002816 | 0.900990 | 1.000000 | 0.958500 |
| HALLMARK_ADIPOGENESIS | 0.084591 | 0.97 | -0.018863 | 0.376238 | 1.000000 | 0.508429 |
| HALLMARK_ESTROGEN_RESPONSE_EARLY | 0.115814 | 0.90 | 0.000487 | 0.970297 | 1.000000 | 0.980198 |
| HALLMARK_ESTROGEN_RESPONSE_LATE | 0.248334 | 0.35 | -0.017774 | 0.534653 | 0.729167 | 0.652016 |
| HALLMARK_MYOGENESIS | 0.143026 | 0.81 | 0.070651 | 0.019802 | 0.987805 | 0.055006 |
| HALLMARK_INTERFERON_GAMMA_RESPONSE | 0.331686 | 0.08 | 1.677966 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_APICAL_JUNCTION | 0.267790 | 0.28 | 0.027541 | 0.346535 | 0.666667 | 0.495050 |
| HALLMARK_COMPLEMENT | 0.310541 | 0.12 | 0.385006 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_MTORC1_SIGNALING | 0.099295 | 0.92 | -0.015835 | 0.425743 | 1.000000 | 0.558517 |
| HALLMARK_E2F_TARGETS | 0.064892 | 0.99 | 0.131956 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_MYC_TARGETS_V1 | 0.190448 | 0.47 | 1.139576 | 0.009901 | 0.839286 | 0.030941 |
| HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 0.435304 | 0.03 | 0.116284 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_INFLAMMATORY_RESPONSE | 0.300841 | 0.12 | 0.350755 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_XENOBIOTIC_METABOLISM | 0.154182 | 0.75 | 0.038156 | 0.118812 | 0.937500 | 0.180018 |
| HALLMARK_OXIDATIVE_PHOSPHORYLATION | 0.072015 | 0.98 | -0.475797 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_GLYCOLYSIS | 0.178932 | 0.63 | -0.075843 | 0.029703 | 0.926471 | 0.064572 |
| HALLMARK_P53_PATHWAY | 0.269423 | 0.16 | -0.018395 | 0.356436 | 0.500000 | 0.495050 |
| HALLMARK_HEME_METABOLISM | 0.152306 | 0.75 | -0.007931 | 0.772277 | 0.937500 | 0.877588 |
| HALLMARK_ALLOGRAFT_REJECTION | 0.287760 | 0.13 | -0.063326 | 0.029703 | 0.500000 | 0.064572 |
| HALLMARK_KRAS_SIGNALING_UP | 0.348036 | 0.07 | -0.062998 | 0.069307 | 0.500000 | 0.123762 |
| HALLMARK_KRAS_SIGNALING_DN | 0.520298 | 0.09 | -0.000882 | 0.980198 | 0.500000 | 0.980198 |
active pathways¶
[13]:
roma.adata.uns['ROMA_active_modules'].sort_values(by='Median Exp')
[13]:
| L1 | ppv L1 | Median Exp | ppv Med Exp | q L1 | q Med Exp | |
|---|---|---|---|---|---|---|
| HALLMARK_OXIDATIVE_PHOSPHORYLATION | 0.072015 | 0.98 | -0.475797 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_DNA_REPAIR | 0.090157 | 0.94 | -0.142698 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_APOPTOSIS | 0.278954 | 0.18 | 0.102191 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_TGF_BETA_SIGNALING | 0.152006 | 0.89 | 0.113007 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION | 0.435304 | 0.03 | 0.116284 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_E2F_TARGETS | 0.064892 | 0.99 | 0.131956 | 0.009901 | 1.000000 | 0.030941 |
| HALLMARK_UNFOLDED_PROTEIN_RESPONSE | 0.275479 | 0.25 | 0.136769 | 0.009901 | 0.657895 | 0.030941 |
| HALLMARK_ANGIOGENESIS | 0.793618 | 0.03 | 0.169210 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_COAGULATION | 0.451178 | 0.09 | 0.311346 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_TNFA_SIGNALING_VIA_NFKB | 0.286159 | 0.13 | 0.346887 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_INFLAMMATORY_RESPONSE | 0.300841 | 0.12 | 0.350755 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_COMPLEMENT | 0.310541 | 0.12 | 0.385006 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_MYC_TARGETS_V2 | 0.202604 | 0.68 | 0.833996 | 0.009901 | 0.937500 | 0.030941 |
| HALLMARK_MYC_TARGETS_V1 | 0.190448 | 0.47 | 1.139576 | 0.009901 | 0.839286 | 0.030941 |
| HALLMARK_INTERFERON_GAMMA_RESPONSE | 0.331686 | 0.08 | 1.677966 | 0.009901 | 0.500000 | 0.030941 |
| HALLMARK_INTERFERON_ALPHA_RESPONSE | 0.377076 | 0.18 | 4.424024 | 0.009901 | 0.500000 | 0.030941 |
[14]:
roma.summary()
=== ROMA Analysis Summary ===
Total gene sets analyzed: 50
Active modules (overdispersed): 0
Active modules (shifted): 16
Top 5 overdispersed modules:
HALLMARK_ANGIOGENESIS: L1=0.794, q=5.000e-01
HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY: L1=0.543, q=5.000e-01
HALLMARK_CHOLESTEROL_HOMEOSTASIS: L1=0.413, q=5.000e-01
HALLMARK_IL6_JAK_STAT3_SIGNALING: L1=0.394, q=5.000e-01
HALLMARK_INTERFERON_ALPHA_RESPONSE: L1=0.377, q=5.000e-01
Top 5 shifted modules:
HALLMARK_ANGIOGENESIS: Med=0.169, q=3.094e-02
HALLMARK_TGF_BETA_SIGNALING: Med=0.113, q=3.094e-02
HALLMARK_MYC_TARGETS_V2: Med=0.834, q=3.094e-02
HALLMARK_INTERFERON_ALPHA_RESPONSE: Med=4.424, q=3.094e-02
HALLMARK_UNFOLDED_PROTEIN_RESPONSE: Med=0.137, q=3.094e-02
Potential issues found: 1
missing_genes: 1 modules
visualize results¶
[15]:
save_dir=os.path.join(output_dir,sample_name,'01_figures')
os.makedirs(save_dir, exist_ok=True)
sc.settings.figdir = save_dir
[28]:
roma.load_active_modules_results(path=os.path.join(output_dir,sample_name))
active_pathways = roma.adata.uns['ROMA_active_modules'].index.tolist()
df = pd.DataFrame(index=adata.obs.index)
for k in active_pathways:
v = adata.uns['ROMA'][k]
df[k] = v.svd.components_[0]
adata.obsm['pyroma_scores'] = df.copy()
adata
[28]:
AnnData object with n_obs × n_vars = 3000 × 12158
obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'stim', 'seurat_annotations', 'integrated_snn_res.0.5', 'seurat_clusters', 'condition', 'cell_type', 'n_genes'
var: 'n_cells'
uns: 'log1p', 'pca', 'neighbors', 'umap', 'seurat_annotations_colors', 'condition_colors', 'ROMA_stats', 'ROMA_active_modules', 'ROMA'
obsm: 'X_pca', 'X_umap', 'pyroma_scores'
varm: 'PCs'
obsp: 'distances', 'connectivities'
[29]:
sc.pl.pca(adata, color=['seurat_annotations',
],
frameon=False,
#ncols=2,
wspace=0.4)
sc.pl.umap(adata, color=['seurat_annotations',
],
frameon=False,
#ncols=2,
wspace=0.4)
[30]:
genesets = roma.read_gmt_to_dict(roma.gmt)
genesets.keys()
[30]:
dict_keys(['HALLMARK_TNFA_SIGNALING_VIA_NFKB', 'HALLMARK_HYPOXIA', 'HALLMARK_CHOLESTEROL_HOMEOSTASIS', 'HALLMARK_MITOTIC_SPINDLE', 'HALLMARK_WNT_BETA_CATENIN_SIGNALING', 'HALLMARK_TGF_BETA_SIGNALING', 'HALLMARK_IL6_JAK_STAT3_SIGNALING', 'HALLMARK_DNA_REPAIR', 'HALLMARK_G2M_CHECKPOINT', 'HALLMARK_APOPTOSIS', 'HALLMARK_NOTCH_SIGNALING', 'HALLMARK_ADIPOGENESIS', 'HALLMARK_ESTROGEN_RESPONSE_EARLY', 'HALLMARK_ESTROGEN_RESPONSE_LATE', 'HALLMARK_ANDROGEN_RESPONSE', 'HALLMARK_MYOGENESIS', 'HALLMARK_PROTEIN_SECRETION', 'HALLMARK_INTERFERON_ALPHA_RESPONSE', 'HALLMARK_INTERFERON_GAMMA_RESPONSE', 'HALLMARK_APICAL_JUNCTION', 'HALLMARK_APICAL_SURFACE', 'HALLMARK_HEDGEHOG_SIGNALING', 'HALLMARK_COMPLEMENT', 'HALLMARK_UNFOLDED_PROTEIN_RESPONSE', 'HALLMARK_PI3K_AKT_MTOR_SIGNALING', 'HALLMARK_MTORC1_SIGNALING', 'HALLMARK_E2F_TARGETS', 'HALLMARK_MYC_TARGETS_V1', 'HALLMARK_MYC_TARGETS_V2', 'HALLMARK_EPITHELIAL_MESENCHYMAL_TRANSITION', 'HALLMARK_INFLAMMATORY_RESPONSE', 'HALLMARK_XENOBIOTIC_METABOLISM', 'HALLMARK_FATTY_ACID_METABOLISM', 'HALLMARK_OXIDATIVE_PHOSPHORYLATION', 'HALLMARK_GLYCOLYSIS', 'HALLMARK_REACTIVE_OXYGEN_SPECIES_PATHWAY', 'HALLMARK_P53_PATHWAY', 'HALLMARK_UV_RESPONSE_UP', 'HALLMARK_UV_RESPONSE_DN', 'HALLMARK_ANGIOGENESIS', 'HALLMARK_HEME_METABOLISM', 'HALLMARK_COAGULATION', 'HALLMARK_IL2_STAT5_SIGNALING', 'HALLMARK_BILE_ACID_METABOLISM', 'HALLMARK_PEROXISOME', 'HALLMARK_ALLOGRAFT_REJECTION', 'HALLMARK_SPERMATOGENESIS', 'HALLMARK_KRAS_SIGNALING_UP', 'HALLMARK_KRAS_SIGNALING_DN', 'HALLMARK_PANCREAS_BETA_CELLS'])
[34]:
geneset_name = 'HALLMARK_INTERFERON_GAMMA_RESPONSE'
roma.adata.obs[f'{geneset_name}_projections_1'] = roma.adata.uns['ROMA'][f'{geneset_name}'].svd.components_[0]
roma.adata.obs[f'{geneset_name}_projections_1_reverse'] = -1 * roma.adata.uns['ROMA'][f'{geneset_name}'].svd.components_[0]
GeneScores = pd.DataFrame(roma.adata.uns['ROMA'][f'{geneset_name}'].projections_1,
index=roma.adata.uns['ROMA'][f'{geneset_name}'].subsetlist,
columns=[f'{geneset_name}'])
sc.pl.pca(adata, color=['seurat_annotations',
'condition',
f'{geneset_name}_projections_1'],
ncols=3,
wspace=0.5)
sc.pl.umap(adata, color=['seurat_annotations',
'condition',
f'{geneset_name}_projections_1'],
ncols=3,
wspace=0.5)
top contributing genes¶
[37]:
GeneScores.sort_values(by=geneset_name, ascending=False)
[37]:
| HALLMARK_INTERFERON_GAMMA_RESPONSE | |
|---|---|
| ISG15 | 85.189522 |
| CXCL10 | 72.494965 |
| IFITM3 | 47.420273 |
| ISG20 | 47.004150 |
| CCL2 | 43.896164 |
| ... | ... |
| SOCS3 | -1.331164 |
| PIM1 | -1.750599 |
| CCL5 | -2.078131 |
| VAMP8 | -2.914884 |
| BTG1 | -17.022533 |
190 rows × 1 columns
[41]:
# number of top and bottom genes
n_top = 10
short_name = geneset_name.split('_')[1]
top_n = GeneScores.sort_values(by=f'{geneset_name}', ascending=False).head(n_top).index.tolist()
bottom_n = GeneScores.sort_values(by=f'{geneset_name}', ascending=False).tail(n_top).index.tolist()
top_mean_exp = roma.adata[:, top_n].X.mean(axis=1)
bottom_mean_exp = roma.adata[:, bottom_n].X.mean(axis=1)
roma.adata.obs[f'top_{n_top}_{short_name}_Mean_Exp'] = top_mean_exp
roma.adata.obs[f'bottom_{n_top}_{short_name}_Mean_Exp'] = bottom_mean_exp
sc.pl.pca(roma.adata, color=['seurat_annotations', 'condition',
f'{geneset_name}_projections_1',
f'top_{n_top}_{short_name}_Mean_Exp',
f'bottom_{n_top}_{short_name}_Mean_Exp'],
ncols=3,
wspace=0.5)
sc.pl.umap(roma.adata, color=['seurat_annotations', 'condition',
f'{geneset_name}_projections_1',
f'top_{n_top}_{short_name}_Mean_Exp',
f'bottom_{n_top}_{short_name}_Mean_Exp'],
ncols=3,
wspace=0.5)