Compute a list of clonotypes that are shared between Seurat clusters
Source: R/ApotcClonalNetwork.R, getSharedClones.Rd
This function returns a list of the clonotypes that are shared between clusters, where the clusters are the levels of the active cell identities or of a custom identity given by alt_ident. The names of the returned list are the shared clonotypes, and the values are numeric vectors indicating the indices of the clusters each clonotype is found in. Each index corresponds to the position in the default levels of the factored identities.
If run_id is given, the function will attempt to get the shared clonotypes from the corresponding APackOfTheClones run generated by RunAPOTC(). Otherwise, it will use the filtering / subsetting parameters to generate the shared clones.
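The idea of a "shared" clonotype described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the `cells` data frame, its column names, and its values are made up for illustration, and only the shape of the result (a named list of cluster indices taken from the factor levels) follows the description.

```r
# Toy per-cell metadata (hypothetical values, not taken from combined_pbmc).
cells <- data.frame(
  cluster = factor(c("A", "A", "B", "B", "C", "C", "C")),
  clonotype = c("x", "y", "x", "z", "x", "y", "w"),
  stringsAsFactors = FALSE
)

# Index of each cell's cluster within the factor levels ("A" = 1, "B" = 2, ...).
cluster_index <- as.integer(cells$cluster)

# For every clonotype, collect the distinct cluster indices it occurs in.
clusters_per_clone <- lapply(
  split(cluster_index, cells$clonotype),
  function(idx) sort(unique(idx))
)

# A clonotype counts as shared here if it occurs in at least two clusters.
shared <- clusters_per_clone[lengths(clusters_per_clone) >= 2]
shared
```

In this toy data, clonotypes "w" and "z" each occur in a single cluster and are dropped, while "x" and "y" are kept with the indices of the clusters they span.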
Usage
getSharedClones(
  seurat_obj,
  reduction_base = "umap",
  clonecall = "strict",
  ...,
  extra_filter = NULL,
  alt_ident = NULL,
  run_id = NULL,
  top = NULL,
  top_per_cl = NULL,
  intop = NULL,
  intop_per_cl = NULL,
  publicity = c(2L, Inf)
)
Arguments
- seurat_obj
Seurat object with one or more dimension reductions that has already been integrated with a TCR/BCR library with scRepertoire::combineExpression.
- reduction_base
character. The Seurat reduction to base the clonal expansion plotting on. Defaults to 'umap' but can be any reduction present within the reductions slot of the input Seurat object, including custom ones. If 'pca', the cluster coordinates will be based on PC1 and PC2. However, APackOfTheClones is generally used for displaying UMAP and occasionally t-SNE versions to intuitively highlight clonal expansion.
- clonecall
character. The column name in the Seurat object metadata to use. See the scRepertoire documentation for more information about this parameter, which is central to both packages.
- ...
additional "subsetting" keyword arguments indicating the rows corresponding to elements in the Seurat object metadata that should be filtered by. E.g., seurat_clusters = c(1, 9, 10) will filter the cells to those in the seurat_clusters column with any of the values 1, 9, and 10. Unfortunately, column names in the Seurat object metadata cannot conflict with the keyword arguments. MAJOR NOTE: if any subsetting keyword argument is a prefix of any preceding argument name (e.g. a column named reduction is a prefix of the reduction_base argument), R will interpret it as the same argument unless both arguments are named. Additionally, this means any subsequent arguments must be named.
- extra_filter
character. An additional string, formatted exactly like a statement one would pass into dplyr::filter, that does additional filtering of cells in the Seurat object based on the metadata, on top of the other keyword arguments. This means that it will be logically AND'ed with any keyword argument filters. This is a more flexible alternative / addition to the filtering keyword arguments. For example, to filter by the length of the amino acid sequence of TCRs, one could pass in something like extra_filter = "nchar(CTaa) - 1 > 10". When the statement involves character values, enclose them in single quotes.
- alt_ident
character. By default, cluster identity is assumed to be whatever is in Idents(seurat_obj), and clones will be grouped by the active ident. However, alt_ident could be set to the name of some column in the metadata of the Seurat object to group by instead. This column is meant to have been a product of Seurat::StashIdent or manually added.
- run_id
character. This will be the ID associated with the data of a run, and will be used by other important functions like APOTCPlot() and AdjustAPOTC(). Defaults to NULL, in which case the ID will be generated in the following format: reduction_base;clonecall;keyword_arguments;extra_filter where keyword_arguments and extra_filter are underscore characters if there was no input for the ... and extra_filter parameters.
- top
integer or numeric in (0, 1). If not NULL, filters the output clones so that only the shared clonotypes with sizes in the top top count / proportion (for numeric in (0, 1) input) of shared clones are kept. For cases where several clonotypes tie in size, which clonotype(s) are added is not guaranteed, but is deterministic given that the other arguments are identical.
- top_per_cl
integer or numeric in (0, 1). If not NULL, filters the output clones so that for each Seurat cluster, only the clonotypes with the top top_per_cl frequency / count are preserved when aggregating shared clones, in the same way as the above. Note that if inputted in conjunction with top, it will get the intersection of the clonotypes filtered each way. For cases where several clonotypes tie in size, which clonotype(s) are added is not guaranteed, but is deterministic given that the other arguments are identical.
- intop
integer or numeric in (0, 1). If not NULL, filters the raw clone sizes before computing the shared clonotypes so that only the clonotypes that have their overall size in the top intop largest sizes (if it is an integer, else the top intop proportion) are kept. To emphasize, this argument does not necessarily return the top shared clones and likely returns slightly fewer, because it filters the raw clone sizes, and it is very likely that not all of those clones end up being shared.
- intop_per_cl
integer or numeric in (0, 1). If not NULL, filters the raw clustered clone sizes before computing shared clones, so that within each Seurat cluster, only the top intop_per_cl count / proportion (for numeric in (0, 1) input) of clones are kept.
- publicity
numeric pair. A simple filter range of c(lowerbound, upperbound) to retain only shared clones whose "publicity" (the number of clusters they are present in) is within this range.
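The prefix warning for the ... argument comes from R's partial argument matching, which applies to formal arguments placed before the dots. A minimal sketch of the pitfall follows, using a hypothetical stand-in function `demo_args` that only copies the leading argument order of the usage above (it is not part of the package), and a made-up metadata column named reduction.

```r
# Hypothetical stand-in: same leading argument order as the usage above.
demo_args <- function(seurat_obj, reduction_base = "umap",
                      clonecall = "strict", ...) {
  list(reduction_base = reduction_base, subsetting = list(...))
}

# `reduction` is a prefix of `reduction_base`, so R partially matches it to
# that argument and nothing is passed through `...`.
res1 <- demo_args("obj", reduction = "x")
res1$reduction_base      # "x"
length(res1$subsetting)  # 0

# Naming `reduction_base` in full claims that formal by exact match, so
# `reduction` now falls through to `...` as a subsetting argument.
res2 <- demo_args("obj", reduction_base = "umap", reduction = "x")
res2$reduction_base      # "umap"
res2$subsetting          # list(reduction = "x")
```

Exact matches are resolved before partial ones, which is why spelling out both names in the call is enough to separate them.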
Value
a named list where each name is a clonotype and each element is a numeric vector indicating which Seurat cluster(s) it is in, in no particular order. If no shared clones are present, the output is an empty list.
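A list of this shape can be post-processed with base R. The sketch below uses a small hand-made list with made-up clonotype names rather than real output. It computes each clone's publicity, applies a range filter in the spirit of the publicity argument (treating both bounds as inclusive, which is an assumption here), and inverts the list to show which shared clones each cluster contains.

```r
# Toy list in the documented shape; names and values are hypothetical.
shared <- list(cloneA = c(3, 5), cloneB = c(3, 4), cloneC = c(3, 5, 9))

# Publicity of each clone: the number of clusters it is present in.
pub <- lengths(shared)

# Keep clones whose publicity lies within c(lowerbound, upperbound).
# Inclusive bounds are an assumption made for this sketch.
bounds <- c(3, Inf)
in_range <- shared[pub >= bounds[1] & pub <= bounds[2]]

# Invert the list: for each cluster index, the shared clones found in it.
clone_by_cluster <- split(
  rep(names(shared), lengths(shared)),
  unlist(shared, use.names = FALSE)
)
clone_by_cluster
```

Because the cluster indices within each element come in no particular order, the sketch relies only on their membership, not on their position.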
Examples
data("combined_pbmc")
getSharedClones(combined_pbmc)
#> $`TRAV8-3.TRAJ42.TRAC;TGTGCTGTGGGTGAGAAGGGTTATGGAGGAAGCCAAGGAAATCTCATCTTT_TRBV12-4.None.TRBJ1-6.TRBC1;TRBV7-6.None.TRBJ1-4.TRBC1;TGTGCCAGCAGTTTCCGACCGCCGGGTTCACCCCTCCACTTT;TGTGCCAGCCACGGCGCCAGGGGTGATGGCTTTTGTGAAAAACTGTTTTTT`
#> [1] 3 5
#>
#> $`TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT`
#> [1] 3 4
#>
#> $`TRAV12-2.TRAJ12.TRAC;TGTGCCCGGAAGGTTAGGGATAGCAGCTATAAATTGATCTTC_TRBV6-4.None.TRBJ2-1.TRBC2;TGTGCCAGCAGTGACTCCGGATACAATGAGCAGTTCTTC`
#> [1] 3 5
#>
#> $`TRAV8-6.TRAJ34.TRAC;TGTGCTGTGACCTTCCATTATAACACCGACAAGCTCATCTTT_TRBV4-1.None.TRBJ2-7.TRBC2;TGCGCCAGCAGCCAAGACCGGACGGGACTAGACTACGAGCAGTACTTC`
#> [1] 4 9
#>
#> $`TRAV5.TRAJ15.TRAC;TGTGCGGAGCTAAACCAGGCAGGAACTGCTCTGATCTTT_TRBV4-1.None.TRBJ2-2.TRBC2;TGCGCCAGCAGCCAAGCCCCCTTTTCAACCTCCGGGGAGCTGTTTTTT`
#> [1] 3 5 9
#>
#> $`TRAV16.TRAJ30.TRAC;TGTGCTCTAAGTGGTAGCAGAGATGACAAGATCATCTTT_NA;NA`
#> [1] 3 13
#>
#> $`TRAV24.TRAJ22.TRAC;TGTGCCTCCCTATCTGGTTCTGCAAGGCAACTGACCTTT_TRBV27.None.TRBJ2-7.TRBC2;TGTGCCAGCAGCTCTACAGTTGCTGGCGAGCAGTACTTC`
#> [1] 4 5
#>
#> $`TRAV10.TRAJ48.TRAC;TGTGTGGTGAGCGACTTTGGAAATGAGAAATTAACCTTT_TRBV27.None.TRBJ2-1.TRBC2;TGTGCCAGCAGTTTAGGGTCGGGGGGGACGGGGAATGAGCAGTTCTTC`
#> [1] 3 5
#>
#> $`TRAV24.TRAJ22.TRAC;TGTGCCTCCCTTTCTGGTTCTGCAAGGCAACTGACCTTT_TRBV27.None.TRBJ2-1.TRBC2;TGTGCCAGCAGCCCCACAGTAGCGGGGGAGCAGTTCTTC`
#> [1] 5 9
#>
getSharedClones(
  combined_pbmc,
  orig.ident = c("P17B", "P18B"), # a named subsetting parameter
  clonecall = "aa"
)
#> $CVVSDNTGGFKTIF_CASSVRRERANTGELFF
#> [1] 3 4
#>
#> $CVVSDFGNEKLTF_CASSLGSGGTGNEQFF
#> [1] 3 5
#>
#> $CAVTFHYNTDKLIF_CASSQDRTGLDYEQYF
#> [1] 4 9
#>
# extract shared clones from a past RunAPOTC run
combined_pbmc <- RunAPOTC(
  combined_pbmc, run_id = "foo", verbose = FALSE
)
getSharedClones(
  combined_pbmc, run_id = "foo", top = 5
)
#> $`TRAV8-3.TRAJ42.TRAC;TGTGCTGTGGGTGAGAAGGGTTATGGAGGAAGCCAAGGAAATCTCATCTTT_TRBV12-4.None.TRBJ1-6.TRBC1;TRBV7-6.None.TRBJ1-4.TRBC1;TGTGCCAGCAGTTTCCGACCGCCGGGTTCACCCCTCCACTTT;TGTGCCAGCCACGGCGCCAGGGGTGATGGCTTTTGTGAAAAACTGTTTTTT`
#> [1] 3 5
#>
#> $`TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT`
#> [1] 3 4
#>
#> $`TRAV12-2.TRAJ12.TRAC;TGTGCCCGGAAGGTTAGGGATAGCAGCTATAAATTGATCTTC_TRBV6-4.None.TRBJ2-1.TRBC2;TGTGCCAGCAGTGACTCCGGATACAATGAGCAGTTCTTC`
#> [1] 3 5
#>
#> $`TRAV5.TRAJ15.TRAC;TGTGCGGAGCTAAACCAGGCAGGAACTGCTCTGATCTTT_TRBV4-1.None.TRBJ2-2.TRBC2;TGCGCCAGCAGCCAAGCCCCCTTTTCAACCTCCGGGGAGCTGTTTTTT`
#> [1] 3 5 9
#>
#> $`TRAV24.TRAJ22.TRAC;TGTGCCTCCCTATCTGGTTCTGCAAGGCAACTGACCTTT_TRBV27.None.TRBJ2-7.TRBC2;TGTGCCAGCAGCTCTACAGTTGCTGGCGAGCAGTACTTC`
#> [1] 4 5
#>
# doing a run and then getting the clones works too
combined_pbmc <- RunAPOTC(combined_pbmc, run_id = "run1", verbose = FALSE)
getSharedClones(combined_pbmc, run_id = "run1")
#> $`TRAV8-3.TRAJ42.TRAC;TGTGCTGTGGGTGAGAAGGGTTATGGAGGAAGCCAAGGAAATCTCATCTTT_TRBV12-4.None.TRBJ1-6.TRBC1;TRBV7-6.None.TRBJ1-4.TRBC1;TGTGCCAGCAGTTTCCGACCGCCGGGTTCACCCCTCCACTTT;TGTGCCAGCCACGGCGCCAGGGGTGATGGCTTTTGTGAAAAACTGTTTTTT`
#> [1] 3 5
#>
#> $`TRAV12-1.TRAJ9.TRAC;TGTGTGGTCTCCGATAATACTGGAGGCTTCAAAACTATCTTT_TRBV9.None.TRBJ2-2.TRBC2;TGTGCCAGCAGCGTAAGGAGGGAAAGGGCGAACACCGGGGAGCTGTTTTTT`
#> [1] 3 4
#>
#> $`TRAV12-2.TRAJ12.TRAC;TGTGCCCGGAAGGTTAGGGATAGCAGCTATAAATTGATCTTC_TRBV6-4.None.TRBJ2-1.TRBC2;TGTGCCAGCAGTGACTCCGGATACAATGAGCAGTTCTTC`
#> [1] 3 5
#>
#> $`TRAV8-6.TRAJ34.TRAC;TGTGCTGTGACCTTCCATTATAACACCGACAAGCTCATCTTT_TRBV4-1.None.TRBJ2-7.TRBC2;TGCGCCAGCAGCCAAGACCGGACGGGACTAGACTACGAGCAGTACTTC`
#> [1] 4 9
#>
#> $`TRAV5.TRAJ15.TRAC;TGTGCGGAGCTAAACCAGGCAGGAACTGCTCTGATCTTT_TRBV4-1.None.TRBJ2-2.TRBC2;TGCGCCAGCAGCCAAGCCCCCTTTTCAACCTCCGGGGAGCTGTTTTTT`
#> [1] 3 5 9
#>
#> $`TRAV16.TRAJ30.TRAC;TGTGCTCTAAGTGGTAGCAGAGATGACAAGATCATCTTT_NA;NA`
#> [1] 3 13
#>
#> $`TRAV24.TRAJ22.TRAC;TGTGCCTCCCTATCTGGTTCTGCAAGGCAACTGACCTTT_TRBV27.None.TRBJ2-7.TRBC2;TGTGCCAGCAGCTCTACAGTTGCTGGCGAGCAGTACTTC`
#> [1] 4 5
#>
#> $`TRAV10.TRAJ48.TRAC;TGTGTGGTGAGCGACTTTGGAAATGAGAAATTAACCTTT_TRBV27.None.TRBJ2-1.TRBC2;TGTGCCAGCAGTTTAGGGTCGGGGGGGACGGGGAATGAGCAGTTCTTC`
#> [1] 3 5
#>
#> $`TRAV24.TRAJ22.TRAC;TGTGCCTCCCTTTCTGGTTCTGCAAGGCAACTGACCTTT_TRBV27.None.TRBJ2-1.TRBC2;TGTGCCAGCAGCCCCACAGTAGCGGGGGAGCAGTTCTTC`
#> [1] 5 9
#>