T005 Leukemia
Version: 0.1.0
Last change: May 07, 2022
We observe a separation between acute lymphoblastic leukemia (ALL) samples, which cluster in T119 ALL (n = 334), and acute myeloid leukemia (AML) samples, which cluster in T120 AML (n = 472), at the second hierarchical level. A significant difference in age is expected given the different etiologies (median age 7.16 vs 16.76 y.o., MWU adj. p-val = 2.98e-23) and the presence of both adult and pediatric populations in both groups to different degrees. No significant difference in OS is observed.
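The age comparison above relies on a Mann-Whitney U (MWU) test. As a minimal sketch of how such a comparison works, the block below computes the U statistic and a two-sided p-value from the normal approximation. The ages are hypothetical placeholders, not atlas data, and the function omits the tie and continuity corrections that statistical packages usually apply, so it is an illustration rather than the pipeline used here.

```python
import math

def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (u_a, p). No tie or continuity correction is applied.
    """
    n1, n2 = len(a), len(b)
    # u_a counts the pairs where the value from `a` exceeds the value from `b`;
    # tied pairs contribute one half.
    u_a = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_a - mean) / sd
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return u_a, p

# Hypothetical ages in years, for illustration only.
all_ages = [2.1, 3.4, 5.0, 7.2, 9.8]
aml_ages = [8.5, 12.0, 16.8, 40.2, 61.0]
u_stat, p_value = mann_whitney_u(all_ages, aml_ages)
print(f"U = {u_stat}, p = {p_value:.3g}")
```

With cohorts of several hundred samples, as in T119 ALL and T120 AML, the normal approximation is generally considered adequate; for very small groups an exact test is preferable.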
Acute lymphoblastic leukemia
Within the lymphoblastic branch, we immediately observe the separation of a small group of infant leukemias with KMT2A rearrangements, found in T121 ALL INF KMT2Ar (n = 14), from all other diagnoses, in T122 ALL A (n = 320) (Fig. LEU1).
T121 ALL INF KMT2Ar contains most samples marked as infant (6 vs 1, χ2 p-val < 2.20e-16) and mixed-lineage leukemia (4 vs 0, χ2 p-val < 2.20e-16) and has a significantly younger median age (0.73 vs 7.20 y.o., MWU p-value = 4.37e-02). We confirmed this annotation with gene sets, as T121 ALL INF KMT2Ar is highly enriched for KMT2A downstream targets (medNES = 1.50, MWU adj. p-val = 2.47e-09) [Ross2004] (Fig. LEU2).
T122 ALL A further splits into two subclasses, T123 ALL B (n = 127) and T124 ALL TRG (n = 193) (Fig. LEU1b), the latter containing most of the samples from TARGET. Gene set analysis comparing all TARGET leukemia samples with the remaining cohort shows enrichment (MWU adj. p-val < 1.00e-10) of poly-A RNA binding, ribonucleoprotein complex, RNA processing, ribosomal and mitochondrial pathways, and oxidative phosphorylation [Ashburner2000], [The2019] in T124 ALL TRG. Furthermore, T124 ALL TRG has a lower median age (6.41 vs 13.17 y.o., MWU adj. p-val = 5.04e-08). We could not identify any biological driver behind the split between T123 ALL B and T124 ALL TRG with statistical certainty; stringent removal of low-variance genes and more advanced batch effect removal methods (e.g. ComBat [Lazar2013]) were not enough to ensure complete compatibility between the TARGET cohort and the rest of the dataset without losing information and damaging the subtyping process. We decided to keep the clusters separate, as chosen by the algorithm, and to investigate their subtypes independently, in order to retain tumor subtypes exclusive to one cohort or the other and to increase the classifier range.
Acute lymphoblastic leukemia, non-TARGET cohort
At the next level within T123 ALL B, we observe the separation of
T126 ALL ETV6-RUNX1 (n = 20), a small class of samples marked with
ETV6-RUNX1 fusion (χ2 p-val < 2.20e-16), from the remaining ALL
in T125 ALL C (n = 107) (Fig. LEU1b).
The t(12;21)(p13;q22) translocation that gives rise to this fusion is often accompanied by copy number gains in RUNX1,
which is overexpressed in T126 ALL ETV6-RUNX1
(logFC = 4.17e-01, FDR = 3.33e-03).
Compared to patients in T125 ALL C, those in
T126 ALL ETV6-RUNX1 are significantly younger
(14.5 vs. 4.46 y.o., MWU adj. p-val = 3.29e-08) [Sun2017].
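The adjusted p-values quoted in this section are, according to the abbreviation expansion given further down this page, Benjamini-Hochberg adjusted. The sketch below shows that adjustment in plain Python; the input p-values are hypothetical, and the set of tests that the atlas corrects over together is not stated in the text, so the grouping is an assumption.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values from four tests corrected together.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw values increase.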
T125 ALL C in turn separates into
T128 ALL ERGdel (n = 36) and
T127 ALL Ph-like (n = 71) (Fig. LEU1b).
T128 ALL ERGdel is characterized by tumors carrying
ERG deletions (15 vs. 55, χ2 p-val < 2.20e-16), and exhibits characteristic overexpression of
CHST2 (logFC = -4.48, FDR = 5.742e-33),
PTPRM (logFC = -7.64, FDR = 2.987e-32), and
GPR49/AGAP1 (logFC = -6.23, FDR = 3.201e-31) [Yeoh2002].
The majority of samples in T127 ALL Ph-like are Ph-like tumors of various classes (χ2 p-val < 2.2e-16) [Jain2017].
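The nested splits described so far can be held in a small lookup table. The sketch below is only an illustration of one possible data structure: the class names and sample counts are copied from the text above, while the `CLASSES` layout and the helper names `size_mismatches` and `path_to` are hypothetical. It checks that the child counts add up to each parent and recovers the path from T119 ALL to a given class.

```python
# class name -> (n, child class names); names and counts are taken from the text.
# Children of T124 ALL TRG and T127 ALL Ph-like are omitted in this sketch.
CLASSES = {
    "T119 ALL": (334, ["T121 ALL INF KMT2Ar", "T122 ALL A"]),
    "T121 ALL INF KMT2Ar": (14, []),
    "T122 ALL A": (320, ["T123 ALL B", "T124 ALL TRG"]),
    "T123 ALL B": (127, ["T125 ALL C", "T126 ALL ETV6-RUNX1"]),
    "T124 ALL TRG": (193, []),
    "T125 ALL C": (107, ["T127 ALL Ph-like", "T128 ALL ERGdel"]),
    "T126 ALL ETV6-RUNX1": (20, []),
    "T127 ALL Ph-like": (71, []),
    "T128 ALL ERGdel": (36, []),
}

def size_mismatches(classes):
    """List parents whose listed children do not sum to the parent's n."""
    bad = []
    for name, (n, children) in classes.items():
        if children and sum(classes[c][0] for c in children) != n:
            bad.append(name)
    return bad

def path_to(classes, target, root="T119 ALL"):
    """Return the list of class names from `root` down to `target`, or []."""
    if root == target:
        return [root]
    for child in classes[root][1]:
        sub = path_to(classes, target, child)
        if sub:
            return [root] + sub
    return []

print(size_mismatches(CLASSES))
print(" > ".join(path_to(CLASSES, "T128 ALL ERGdel")))
```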
T127 ALL Ph-like then further subdivides into two child nodes,
T129 ALL Ph-like A (n = 41) and
T130 ALL Ph-like IKZF1/JAK2 (n = 29) (Fig. LEU1b).
Both contain small populations of BCR-ABL1 fusion samples (Ph+) (11 and 5, ns) and Philadelphia-like (Ph-like) samples (13 and 14, ns).
While T129 ALL Ph-like A contains the majority of
Ph-like non-CRLF2 tumors (11/28 vs. 14/19, χ2 p-val = 4.32e-02),
there is no corresponding enrichment of this signature in gene set analysis.
However, the two differ by some specific lesions known to be present in the Ph-like group:
T129 ALL Ph-like A contains 6
JAK2 fusion samples (0/13 vs 6/14, χ2 p-val = 2.69e-02),
while T130 ALL Ph-like IKZF1/JAK2 contains all
EPO fusion samples (4/13 vs. 0/14, FET p-val = 4.07e-02).
Both contain other JAK/STAT alterations (4/13 vs. 3/14, ns), and two other ABL1/2 fusion samples each.
T130 ALL Ph-like IKZF1/JAK2
is also enriched for tumors with concurrent IKZF1 alterations (11/28 vs. 14/19, χ2 p-val = 4.32e-02).
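For small counts such as the 4/13 vs. 0/14 comparison of EPO fusion samples above, a Fisher exact test (FET) is used. The sketch below is a plain-Python two-sided version that sums the hypergeometric probabilities of all tables no more likely than the observed one. The function name is hypothetical and this is not necessarily the software used for the atlas, but it gives the reported value for these counts.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k positives in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in (prob(k) for k in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# 4 of 13 samples with the fusion in one cluster versus 0 of 14 in the other.
p_value = fisher_exact_two_sided(4, 9, 0, 14)
print(f"{p_value:.2e}")  # 4.07e-02
```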
T129 ALL Ph-like A
then divides into two further subtypes, T131 ALL Ph-like JAK/STAT
(n = 23) and T132 ALL Ph+/Ph-like EPOR (n = 12) (Fig. LEU3a).
T132 ALL Ph+/Ph-like EPOR contains the majority of
BCR-ABL1 fusion samples (3/23 vs. 8/12, p-val = 4.23e-03).
Of the Ph-like samples for which we have annotation,
T131 ALL Ph-like JAK/STAT
contains 4 unspecified JAK/STAT mutants along with an additional CRLF2-JAK mutant, a CRLF2-rearranged sample with no
JAK rearrangements, and a RAS mutant (Fig. LEU3a).
T132 ALL Ph+/Ph-like EPOR contains 3
EPOR-IGH fusion samples, while T131 ALL Ph-like JAK/STAT
contains an EPOR-IGK fusion (n.s.).
Both groups contain one ABL fusion without CRLF2 rearrangement, while Ph-like non-CRLF2 samples are evenly
divided between the clusters (7/20 vs. 4/8, n.s.).
Another interesting distinction is that T131 ALL Ph-like JAK/STAT
is enriched for tumors with cell-cycle related lesions, either in TP53,
CDKN2A/B, or RB1 (14/20 vs. 1/8, χ2 p-val = 1.95e-2).
T132 ALL Ph+/Ph-like EPOR,
however, is enriched for samples with concurrent IKZF1 alterations (5/20 vs. 6/8, χ2 p-val = 4.35e-02),
though these are heterogeneous and have some overlap between the two clusters [Harvey2013].
Gene set enrichment analysis demonstrates T131 ALL Ph-like JAK/STAT
to be enriched for non-Ph-like CRLF2-rearranged samples (medNES = 1.57, MWU adj. p-val = 4.70e-05),
while T132 ALL Ph+/Ph-like EPOR
is enriched for Ph-like samples with CRLF2 rearrangements
(medNES = 2.68, MWU adj. p-val = 1.61e-07)
[Sadras2017] (Fig. LEU3b), suggesting that
T132 ALL Ph+/Ph-like EPOR may contain
CRLF2-rearranged samples which have not been annotated as such.
Acute lymphoblastic leukemia, TARGET cohort
The TARGET ALL cluster, T124 ALL TRG, divides into four classes (Fig. LEU1b, LEU4).
T133 ALL TRG A (n = 109) is the largest cluster and contains a
mixture of genomic alterations: ALL with hyperdiploidy without trisomy of
chr4 and chr10 (χ2 p-val = 3.31e-4), ALL with
hyperdiploidy with trisomy chr4 and chr10, samples with iAMP21, plus a number of unspecified samples
(Fig. LEU4).
The cluster is characterized by significant overexpression of CRLF2 (logFC ≤ 7.749e-04).
Indeed, gene set enrichment analysis confirmed this cluster contains a sizeable population of
Ph+ and Ph-like samples (medNES = 79.08,
KW adj. p-val = 7.03e-14,
Dunn adj. p-val < 1.00e-03).
T134 ALL TRG ZNF384 (n = 13)
is the smallest cluster and contains the oldest group of patients (median age 13.23
y.o., KW adj. p-val = 1.13e-03).
Patients with ALL in this cluster display the best overall survival
(lrt p-val < 1e-04).
Gene set enrichment analysis of genes upregulated and downregulated in ZNF384-rearranged
ALL demonstrates a characteristic gene expression pattern of
ZNF384-fusion downstream targets, in both upregulated (medNES ≥ 1.51,
KW adj. p-val < 1.00e-04) and
downregulated targets (medNES ≤ 4.81e-01,
KW adj. p-val < 1.00e-04), respectively [Qian2017], [Hirabayashi2017]
(Fig. LEU5).
T135 ALL TRG TCF3 (n = 30) comprises samples harbouring
TCF3-PBX1 (n = 19, χ2 p-val < 2.2e-16) or TCF3-HLF (n = 3, χ2 p-val = 1.60e-02)
fusions.
Out of all TARGET ALL subgroups,
T135 ALL TRG TCF3 contains the patient group with the worst overall survival,
reaching median OS at 483 days (lrt p-val = 6.30e-22 at 4383 days,
post-hoc pairwise lrt p-val ≤ 1.5e-06).
When comparing patients with each fusion within this class, those with TCF3-HLF
fusions exhibit significantly worse OS (lrt p-val = 4.89e-02),
consistent with literature [Inukai2007].
Though identifying TCF3-HLF outright is important for determining clinical course due to
its negative prognostic indication [Inukai2007], the paucity of these samples prevents us from separating them further.
Due to transcriptional similarities, we also expect MEF2D-mutated samples would be clustered in this group [Ohki2019].
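Median OS values such as the 483 days quoted above are read off a survival curve. The text does not name the estimator, so the sketch below assumes the Kaplan-Meier product-limit estimator and returns the earliest time at which the estimated survival drops to 0.5 or below. The follow-up times and event flags are hypothetical, and `km_median` is an illustrative name.

```python
def km_median(times, events):
    """Median survival from a Kaplan-Meier estimate.

    `events[i]` is 1 for an observed death and 0 for a censored follow-up.
    Returns the earliest time with S(t) <= 0.5, or None if never reached.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            if surv <= 0.5:
                return t
        # Deaths and censored subjects both leave the risk set.
        at_risk -= removed
    return None

# Hypothetical follow-up in days; 0 marks a censored patient.
times = [100, 200, 300, 400, 500, 600]
events = [1, 1, 0, 1, 1, 0]
print(km_median(times, events))
```

A return value of None corresponds to classes in which more than half of the patients are still alive at the end of follow-up, so the median is not reached.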
The final subclass of T124 ALL TRG,
T136 ALL TRG ETV6-RUNX1 (n = 27),
contains samples with ETV6-RUNX1 fusions (n = 20, χ2 p-val < 2.2e-16) (Fig. LEU4)
and comprises the youngest patients (median 3.1 y.o.,
KW adj. p-val = 1.13e-03).
T133 ALL TRG A separates into further components (Fig. LEU4).
T137 ALL TRG Ph+/Ph-like CRLF2 (n=29)
contains all samples labelled as harbouring BCR-ABL1 fusions (n = 3), MLL-rearranged ALL samples (n=3),
and the highest proportion of otherwise unspecified ALL samples (n = 23, χ2 p-val = 2.95e-05).
It shows overexpression of CRLF2 (logFC = 2.99, FDR = 1.48e-02)
and enrichment of CRLF2-rearrangement signatures in Ph-like ALL (Ph+ CRLF2 positive,
medNES = 2.21, KW adj. p-val = 3.05e-03)
[Sadras2017] (Fig. LEU3b).
It also exhibits overexpression of IDH1 (logFC = 1.28, FDR = 3.66e-05)
and JAK1 (logFC = 0.641, FDR = 4.15e-02),
and is enriched for Ph-like gene signatures (medNES = 2.88,
KW adj. p-val = 9.79e-06,
Dunn adj. p-val < 1.00e-03) [Harvey2010], [Harvey2013]
when compared to its siblings (Fig. LEU3b).
T138 ALL TRG HYPERDIP (n=21)
is enriched for tumors with hyperdiploidy without trisomy of both chromosomes 4 and 10
(1/29 vs. 11/20 vs. 7/22, χ2 p-val = 2.66e-04). Patients in
T138 are also significantly younger than those in its siblings (3.59 y.o.,
KW adj. p-val = 1.14e-02).
Furthermore, T138 exhibits the highest DNA index of its siblings, an indicator of hyperdiploidy
(median = 1.17, KW adj. p-val = 3.97e-07,
Dunn adj. p-val ≤ 4.18e-03) [Rachieru-Sourisseau2010].
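Comparisons across the three sibling classes, such as age or DNA index here, are reported with a Kruskal-Wallis (KW) test followed by Dunn's post-hoc test. The sketch below covers only the KW step for exactly three groups, where the chi-squared reference distribution has two degrees of freedom and its tail probability is exp(-H/2). The values are hypothetical, no tie correction is applied, and Dunn's test is not implemented.

```python
import math

def kruskal_wallis_3(groups):
    """Kruskal-Wallis H test for exactly three groups (no tie correction).

    Returns (h, p), with p taken from a chi-squared distribution with 2 d.o.f.
    """
    if len(groups) != 3:
        raise ValueError("this sketch handles exactly three groups")
    pooled = sorted(v for g in groups for v in g)
    # Assign each distinct value the average of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0
        i = j
    n = len(pooled)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)
    # Survival function of chi-squared with 2 degrees of freedom.
    p = math.exp(-h / 2.0)
    return h, p

# Hypothetical DNA index values for three sibling classes.
h_stat, p_value = kruskal_wallis_3([
    [1.00, 1.01, 1.02],
    [1.15, 1.17, 1.20],
    [1.03, 1.04, 1.05],
])
print(f"H = {h_stat:.2f}, p = {p_value:.3g}")
```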
T139 ALL TRG Ph-like EPOR (n = 22)
is characterized by overexpression of EPOR (median logFC = 2.06, FDR ≤ 1.20e-04),
as well as enrichment of erythrocyte developmental gene sets (medNES = 1.22,
KW adj. p-val = 2.06e-06,
Dunn adj. p-val < 5.00e-02) [Ashburner2000], [The2019].
It also exhibits overexpression of IDH2 (median logFC = 1.65, FDR ≤ 3.40e-11).
Acute Myeloid Leukemia
Myeloid malignancies in T120 AML immediately separate into 9 different classes at the following hierarchical level (Fig. LEU6). Similar to ALL, we observe two classes made up exclusively of TARGET samples: T144 AML TRG and T146 AML TRG IDH2low, which are discussed at the end of this section.
Acute Myeloid Leukemia, non-TARGET cohort
T140 AML KMT2Ar (n = 52) has a median age of 60.00 y.o. (KW p-val = 1.54e-48) due to the presence of 46/52 adult patients. It contains a number of samples marked for KMT2A fusions (most of them high risk, χ2 p-val = 4.45e-08), and is highly enriched (medNES > 1.08, KW adj. p-val < 1.00e-40, Dunn adj. p-val < 1.00e-04) for their matching pathways ([Ross2004]; [Mullighan2007]) (Fig. LEU7). It is also enriched for NPM1-mutated pathways (medNES = 1.07, KW adj. p-val < 1.00e-04) [Mullighan2007], suggesting that a large cohort within this class may be NPM1 mutated. Indeed, all samples in this cluster for which we have NPM1 and FLT3 mutation data are mutated for either NPM1 (n = 23) or FLT3 (n = 16). This class displays poor OS (lrt p-val = 6.31e-11 at 4022 days), reaching median OS at 327 days.
T140 AML KMT2Ar splits into two subclasses
(Fig. LEU6b). T149 AML KMT2Ar 11q23
(n = 8) is a very small cluster and is considerably younger (45.00 vs 62.00 y.o.,
MWU adj. p-val = 7.24e-03) than
T150 AML KMT2A NPM1/FLT3 (n = 44);
this is also reflected in the percentage of samples marked as pediatric (50.00% vs. 4.55%, χ2 p-val = 7.25e-03).
While 5 samples are marked as AML,
T149 AML KMT2Ar 11q23 also contains
3 samples marked as mixed lineage leukemias (χ2 p-val = 7.79e-04).
It contains 4 samples from TCGA, all of which are annotated with
KMT2A fusions (two MLLT10-KMT2A, one KMT2A-MLLT3 and one KMT2A-MLLT4), while
T150 AML KMT2A NPM1/FLT3
contains 40 samples from TCGA, 10 of which have reported gene fusions, with seven involving KMT2A.
When compared to T150 AML KMT2A NPM1/FLT3,
T149 AML KMT2Ar 11q23 is significantly
enriched for gene sets involving chr11q23 rearrangement (medNES = 8.46,
KW adj. p-val = 1.06e-08) [Yagi2003]
and AML cluster 16 from Valk et al. 2004 (medNES = 4.03, adj. p-val = 2.66e-09),
which is composed of samples with 11q23 rearrangements [Valk2004].
T150 AML KMT2A NPM1/FLT3
inherits all of the NPM1 and FLT3 mutants found in its parent
T140 AML KMT2Ar [Braoudaki2010],
and is enriched for their corresponding gene sets (medNES = 2.34,
KW adj. p-val = 7.97e-08,
medNES = 1.85,
KW adj. p-val = 1.25e-04, respectively) [Valk2004], [Verhaak2009].
T141 AML BM (n = 30) is a mixed-lineage cluster.
It comprises myeloid, megakaryoblastic, non-specific, and lymphoblastic leukemias along with a few lymphomas and osteosarcomas.
It is not enriched for any leukemia-associated gene sets, suggesting this class may contain samples contaminated by normal blood or bone marrow tissue.
T142 AML MATlow (n = 105) is largely composed of FAB subtypes M1
(n = 33, χ2 p-val = 7.44e-04), AML with minimal maturation,
and M2 (n = 34, χ2 p-val = 1.60e-06), AML with maturation,
and a smaller subpopulation of undifferentiated M0 (n = 15, χ2 p-val = 1.15e-04).
It is composed of older patients, with a median age of 57 y.o., and is enriched for samples classified as intermediate
(n = 54, χ2 p-val = 1.43e-07) and high-risk (n = 37, χ2 p-val = 1.61e-09).
It contains two BCR-ABL1 fusion samples; 24 FLT3 mutants, all of which are from TCGA,
though the mutations themselves are heterogeneous; 24 NPM1 mutants, 21 of which are W288F (χ2 p-val < 2.2e-16);
and 9 WT1 mutants (χ2 p-val = 1.56e-4).
All samples in this cluster for which we have NPM1 and FLT3 mutation data have mutations in either gene.
This cluster displays intermediate-low prognosis, reaching median OS at 417 days
(lrt p-val = 6.31e-11 at 4022 days).
T142 AML MATlow splits into two subclasses,
T151 AML MATlow NPM1mut and
T152 AML MATlow noNPM1 (Fig. S26b),
which are separated by the presence or absence of NPM1 mutations, as well as karyotypic complexity.
T151 AML MATlow NPM1mut (n = 34) has a
higher ratio of FAB M1 samples, AML with minimal maturation,
(16/32 vs 17/62, FET p-val = 4.04e-02) and inherits all NPM1-mutant
samples except for one, a p.K263R (χ2 p-val = 6.67e-13); all samples for which we have
NPM1 data within this cluster (n=25) are NPM1 mutated.
As expected, we confirmed this annotation through significant enrichment (medNES = 1.25,
MWU adj. p-val = 7.83e-16) in NPM1 mutation pathways [Mullighan2007].
Its sibling, T152 AML MATlow noNPM1 (n = 71),
has a higher proportion of FAB M0 samples, undifferentiated AML
(1 vs. 14, FET p-val = 3.21e-02), and possibly contains equivalent samples
without NPM1 mutation.
M2 samples are evenly split between the clusters (χ2 p-val = 6.51e-01), suggesting
maturation is not a critical determinant of this split. Samples with FLT3 and WT1 mutations are more common in
T151 AML MATlow NPM1mut than in
T152 AML MATlow noNPM1, confirmed by gene
sets for FLT3 mutation (medNES = 1.90,
MWU adj. p-val = 2.29e-13) [Valk2004].
We observe no significant separation in survival between the two clusters.
T152 AML MATlow noNPM1 further splits into
T153 AML FLT3-ITD (n = 58) and
T154 AML CEBPA (n = 13) (Fig. S26b). There is a significant age
disparity between patients in these clusters (63 vs 32 y.o.,
MWU adj. p-val = 7.80e-05).
T153 AML FLT3-ITD contains all M0 samples
(n = 14 vs 0) while T154 AML CEBPA is enriched for FAB M2 samples
(n = 12 vs 9, χ2 p-val = 3.81e-03).
T153 AML FLT3-ITD also contains five acute
megakaryoblastic leukemias and two mixed lineage leukemias, and carries more samples with complex cytogenetics
(χ2 p-val < 1.00e-03) and has significantly reduced
OS (lrt p-val = 2.00e-02).
In line with findings described in literature,
T153 AML FLT3-ITD exhibits a
higher mutation burden (median = 17.00 vs. 8.50, MWU adj. p-val = 2.06e-03),
which is largely related to age in AML [Shaver2015].
T153 AML FLT3-ITD contains six FLT3 mutant samples (three of which have in-frame insertions), while T154 AML CEBPA
contains only one. T153 AML FLT3-ITD overexpresses
many of the genes (21/39, FDR < 0.05) known to be upregulated in samples
harbouring FLT3 internal tandem duplications (FLT3-ITD), and shows enrichment of
FLT3-ITD gene sets
(medNES = 3.11,
KW adj. p-val […]). […] T153 AML FLT3-ITD contains
only three CEBPA mutated samples, while T154 AML CEBPA contains
eight (χ2 p-val = 3.28e-06).
The direct subclusters of T120 AML continue here.
All labelled samples in T143 AMKL (n = 49) are
megakaryoblastic (n = 41, χ2 p-val < 2.20e-16), while eight samples
are unlabelled; as expected, the class is enriched for AMKL
pathways (medNES ≥ 1.70,
KW adj. p-val […]) […] at 313 days (lrt p-val = 6.31e-11).
T143 AMKL then splits into T155 AMKL CBFA2T3-GLIS2
(n = 12) and T156 AMKL HOX (n = 37).
Though both are entirely pediatric, the former cluster contains significantly younger patients (median age of 0.97 vs 2.17
y.o., MWU adj. p-val (Mann-Whitney U test, Benjamini-Hochberg adjusted p-value) = 2.08e-02).
All samples in T155 AMKL CBFA2T3-GLIS2 for which genomic
data are available are characterized by a CBFA2T3-GLIS2 fusion (9/9 vs. 0/25, χ2 p-val = 7.03e-08) [deRooij2017].
Patients in T155 AMKL CBFA2T3-GLIS2 have poorer prognosis,
reaching median OS at just 313 days post diagnosis. T156 AMKL HOX is composed of other driver events: two GATA1 mutants, four HOXr (HOX fusion) samples, eight KMT2A-MLLT3/10 fusions, four NUP98-KDM5A fusions, two RBM15-MKL1 fusions, and four samples with other driver mutations.
With a greater sample size it is possible these mutations would form their own clusters as well. When comparing these
two classes, T156 AMKL HOX exhibits overexpression of
HOXA (11/11 genes upregulated, median logFC ≤ -5.67, FDR ≤ 8.47e-03)
and HOXB genes (8/10 upregulated, median logFC = -5.65, FDR ≤ 7.31e-03) [deRooij2017].
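The χ2 p-values for 2x2 comparisons in this section are consistent with a Pearson chi-squared test using Yates' continuity correction. This is an inference from the numbers rather than something the text states. Under that assumption, the sketch below reproduces the value reported above for the 9/9 vs. 0/25 comparison of CBFA2T3-GLIS2 fusions (about 7.03e-08); the function name is hypothetical.

```python
import math

def chi2_yates_2x2(a, b, c, d):
    """Pearson chi-squared test with Yates' correction for [[a, b], [c, d]].

    Returns (statistic, p). Assumes every row and column total is non-zero.
    """
    n = a + b + c + d
    r1, r2, c1, c2 = a + b, c + d, a + c, b + d
    # Yates' correction shrinks |ad - bc| by n/2, never below zero.
    diff = max(abs(a * d - b * c) - n / 2.0, 0.0)
    stat = n * diff ** 2 / (r1 * r2 * c1 * c2)
    # Tail probability of a chi-squared variable with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p

# 9 of 9 annotated samples carry the fusion in one class, 0 of 25 in the other.
stat, p_value = chi2_yates_2x2(9, 0, 0, 25)
print(f"chi2 = {stat:.2f}, p = {p_value:.2e}")
```

The same function applied to the 0/13 vs. 6/14 JAK2 fusion comparison earlier in this section gives a p-value close to the reported 2.69e-02, which supports the assumption about the correction.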
The remaining subclasses of T120 AML are defined by clear fusion events. All samples
within T145 AML CBFB-MYH11 (n = 14) are marked as core binding factor positive, CBFB-MYH11.
As expected, it is enriched (medNES ≥ 1.35,
KW adj. p-val […]) […] (n = 15), except for one, are positive for
PML-RARA fusions (χ2 p-val < 2.20e-16) and marked as FAB M3
(χ2 p-val < 2.20e-16), acute promyelocytic leukemia.
This class also contains 5 samples with FLT3 mutations, four of which are p600 in-frame insertions
(from TCGA); these seem to be exclusive to this cluster.
This class has the best prognosis of the cohort, with >60% of patients surviving at 4022 days post diagnosis.
The final child of T120 AML,
T148 AML RUNX1-RUNX1T1 (n = 13), exclusively contains
RUNX1-RUNX1T1 fusion AML (χ2 p-val < 2.20e-16).
It has moderate-good prognosis, reaching median OS at 2910 days.
Acute Myeloid Leukemia, TARGET cohort
We observe two classes within the AML branch with an exclusive TARGET composition (Fig. LEU6). T146 AML TRG IDH2low (n = 23) is composed of samples from various diagnostic categories: three KMT2A fusions, eight normal karyotypes, and 10 other lesions, including two t(X;10)(p11.2;p11.2), add(17)(p11.2) and two inv(17)(p13.1q11.2), both exclusive to this group. Despite this heterogeneity, it contains the highest proportion of WT1 mutations (7/23, χ2 p-val = 1.39e-3) and FLT3-ITDs (8/23, χ2 p-val = 2.427e-05) amongst the TARGET cohort. It also exhibits the lowest expression of IDH2 (logFC = -0.836, p-val = 2.58e-2 against T155-T159 and T161 AML TRG RUNX-RUNX1T1). This group displays intermediate prognosis, reaching median OS at 1394 days post diagnosis.
T144 AML TRG (n = 163) is the largest subcluster of T120 AML and is composed
largely of unspecified AML (n = 154) and, surprisingly, contains 5
ALL samples. It is an entirely pediatric cluster
(median age 9.36 y.o.) and has excellent prognosis, with >50% of patients surviving at 4022 days post diagnosis.
Diving deeper into this class (Fig. S26b, c), we observe first the singling out of AML
with KMT2A translocations (23/33 vs 12/120, χ2 p-val = 2.623e-12) in
T158 AML TRG KMT2Ar (n = 33) from all
other samples in T157 AML TRG A (n = 130).
As expected, T158 AML TRG KMT2Ar shows
enrichment (MWU adj. p-val ≤ 1.00e-03)
of KMT2A-associated gene sets [Ross2004], [Mullighan2007]. There is no difference in OS between the two subclasses.
We then observe T157 AML TRG A splitting into three small
subclasses characterized by unique molecular aberrations:
T159 AML TRG KMT2Ar/MPAL (n = 65),
T160 AML TRG CFB-MYH11 (n = 36),
and T161 AML TRG RUNX-RUNX1T1 (n = 29).
Aside from myeloid malignancies,
T159 AML TRG KMT2Ar/MPAL
contains 4 ALL samples, one unspecified leukemia and one lymphoma.
It has the highest proportion of intermediate risk samples (n = 36, χ2 p-val = 1.581e-06)
and patients within it exhibit a significantly worse OS than either of its siblings
(lrt p-val = 2.20e-04). This cluster also inherits all NPM1 mutant samples, while
FLT3-ITD and WT1 mutants are spread across all three clusters.
This class also contains samples labelled as KMT2A-rearranged (n = 11/56, χ2 p-val = 4.103e-03).
It shows overexpression of a wide variety of HOX genes (24/39 HOX genes with median
logFC > 0 & FDR < 0.05, 22/39 FDR < 1e-04, median
logFC = 4.62), a phenotype previously described in AMLs with KMT2A partial internal tandem duplication (KMT2A-PTD) [Dorrance2006].
The characteristic expression patterns of KMT2A-PTD
could explain the inclusion of a handful of ALL samples, which may also
harbour non-canonical KMT2A aberrations.
Indeed, manual inspection of a subsample of eight mRNA sequences (five labelled as AML,
three as ALL) from TARGET revealed that most of these samples harbour either
complex lesions in KMT2A (4/8) or rearrangements of exons 7 and 8 associated with KMT2A-PTD (2/8).
The transcriptional profile of KMT2A lesions in this class departs from that most commonly described in the literature,
as most gene sets involving KMT2A-mutated leukemias are depleted in this class when compared to the
bona fide KMT2A-rearranged AML class
T158 AML TRG KMT2Ar
(medNES ≥ 1.27 for positive signatures in T158 AML TRG KMT2Ar,
≥ 1.61 for negative signatures in T159 AML TRG KMT2Ar/MPAL, MWU adj. p-val ≤ 3.32e-15) ([Ross2004]; [Mullighan2007]).
A single sample harbours a BSG-CDC34 fusion. While no KMT2A mutation was reported, CDC34 is known to mediate stability and degradation of
KMT2A ([Meyer2018]; [Sugeedha2021]), supporting the idea that
T159 AML TRG KMT2Ar/MPAL
is composed of tumors with various lesions which converge upon KMT2A pathway perturbation.
KMT2A rearrangements are also common in mixed phenotype acute leukemias (MPAL) [Winters2017], [Yang2017];
to assess whether some of these samples are MPAL, we interrogated a number of gene sets (Fig. LEU9).
Indeed, MPAL expression sets were significantly upregulated in
AML within
T159 AML TRG KMT2Ar/MPAL
when compared to all other AML in T120 AML
(medNES = 1.20, MWU adj. p-val = 2.94e-12),
which in turn have higher markers of AML vs MPAL
(medNES = 1.04, MWU adj. p-val = 5.55e-05) [Bian2018].
Furthermore, these samples carry higher lymphocyte differentiation expression than AML
from their family class (T159 AML TRG KMT2Ar/MPAL
vs T120 AML, medNES = 2.63,
MWU adj. p-val = 1.36e-03) [The2019], [Ashburner2000].
In turn, the four ALL samples within this same class have significant enrichment
for myeloid differentiation when compared to all other ALL in
T119 ALL (medNES = 1.25,
MWU adj. p-val = 9.01e-04) [The2019], [Ashburner2000].
Furthermore, we report enrichment of T-cell development and differentiation gene sets when comparing
samples of matching reported lineage to either T120 AML
(medNES ≥ 1.10, MWU adj. p-val ≤ 9.82e-08)
or T119 ALL (medNES ≥ 1.27,
MWU adj. p-val ≤ 2.77e-02) [The2019], [Ashburner2000],
composed exclusively of B-cell ALL (Fig. LEU9).
These results support the hypothetical presence of T-cell MPAL within
T159 AML TRG KMT2Ar/MPAL.
While limited information is given by the labelling of these samples, we can reasonably speculate that this class includes
KMT2A-rearranged B-cell and/or T-cell MPAL, or at the very least samples of either
lineage expressing both myeloid and lymphoid markers.
Finally, a more straightforward annotation allows us to determine that T160 AML TRG CFB-MYH11 harbours core binding factor-mutated samples, as the majority of its samples are labelled as CBFB-MYH11 fusion positive (n = 26/35, χ2 p-val = 9.70e-15), and it furthermore shows enrichment (medNES ≥ 1.84, KW adj. p-val = 9.376e-16, Dunn adj. p-val < 1.00e-04) of the associated gene sets [Ross2004]. Similarly, T161 AML TRG RUNX-RUNX1T1 is largely composed of samples labelled as harbouring RUNX1-RUNX1T1 fusions (n = 18/29, χ2 p-val = 1.77e-11) and is enriched for the respective gene sets (medNES ≥ 1.01, KW adj. p-val = 5.83e-04) [Tonks2007]. It also contains 6 CEBPA mutants (χ2 p-val = 8.21e-3).
Bibliography
- Ashburner2000
Ashburner, M., Ball, C.A., Blake, J.A., et al. 2000. Gene Ontology: tool for the unification of biology. Nature Genetics 25(1), pp. 25–29.
- Bian2018
Bian, S., Hou, Y., Zhou, X., et al. 2018. Single-cell multiomics sequencing and analyses of human colorectal cancer. Science 362(6418), pp. 1060–1063.
- Braoudaki2010
Braoudaki, M., Papathanassiou, C., Katsibardi, K., Tourkadoni, N., Karamolegou, K. and Tzortzatou-Stathopoulou, F. 2010. The frequency of NPM1 mutations in childhood acute myeloid leukemia. Journal of hematology & oncology 3, p. 41.
- Dorrance2006
Dorrance, A.M., Liu, S., Yuan, W., et al. 2006. Mll partial tandem duplication induces aberrant Hox expression in vivo via specific epigenetic alterations. The Journal of Clinical Investigation 116(10), pp. 2707–2716.
- Harvey2010
Harvey, R.C., Mullighan, C.G., Wang, X., et al. 2010. Identification of novel cluster groups in pediatric high-risk B-precursor acute lymphoblastic leukemia with gene expression profiling: correlation with genome-wide DNA copy number alterations, clinical characteristics, and outcome. Blood 116(23), pp. 4874–4884.
- Harvey2013
Harvey, R.C., Kang, H., Roberts, K.G., et al. 2013. Development and Validation Of a Highly Sensitive and Specific Gene Expression Classifier To Prospectively Screen and Identify B-Precursor Acute Lymphoblastic Leukemia (ALL) Patients With a Philadelphia Chromosome-Like (“Ph-like” or “BCR-ABL1-Like”) Signature For Therapeutic Targeting and Clinical Intervention. Blood 122(21), pp. 826–826.
- Hirabayashi2017
Hirabayashi, S., Ohki, K., Nakabayashi, K., et al. 2017. ZNF384-related fusion genes define a subgroup of childhood B-cell precursor acute lymphoblastic leukemia with a characteristic immunotype. Haematologica 102(1), pp. 118–129.
- Inukai2007
Inukai, T., Hirose, K., Inaba, T., et al. 2007. Hypercalcemia in childhood acute lymphoblastic leukemia: frequent implication of parathyroid hormone-related peptide and E2A-HLF from translocation 17;19. Leukemia 21(2), pp. 288–296.
- Jain2017
Jain, N., Roberts, K.G., Jabbour, E., et al. 2017. Ph-like acute lymphoblastic leukemia: a high-risk subtype in adults. Blood 129(5), pp. 572–581.
- Lazar2013
Lazar, C., Meganck, S., Taminau, J., et al. 2013. Batch effect removal methods for microarray gene expression data integration: a survey. Briefings in Bioinformatics 14(4), pp. 469–490.
- Meyer2018
Meyer, C., Burmeister, T., Gröger, D., et al. 2018. The MLL recombinome of acute leukemias in 2017. Leukemia 32(2), pp. 273–284.
- Mullighan2007
Mullighan, C.G., Kennedy, A., Zhou, X., et al. 2007. Pediatric acute myeloid leukemia with NPM1 mutations is characterized by a gene expression profile with dysregulated HOX gene expression distinct from MLL-rearranged leukemias. Leukemia 21(9), pp. 2000–2009.
- Ohki2019
Ohki, K., Kiyokawa, N., Saito, Y., et al. 2019. Clinical and molecular characteristics of MEF2D fusion-positive B-cell precursor acute lymphoblastic leukemia in childhood, including a novel translocation resulting in MEF2D-HNRNPH1 gene fusion. Haematologica 104(1), pp. 128–137.
- Qian2017
Qian, M., Zhang, H., Kham, S.K.-Y., et al. 2017. Whole-transcriptome sequencing identifies a distinct subtype of acute lymphoblastic leukemia with predominant genomic abnormalities of EP300 and CREBBP. Genome Research 27(2), pp. 185–195.
- Rachieru-Sourisseau2010
Rachieru-Sourisseau, P., Baranger, L., Dastugue, N., et al. 2010. DNA Index in childhood acute lymphoblastic leukaemia: a karyotypic method to validate the flow cytometric measurement. International Journal of Laboratory Hematology 32(3), pp. 288–298.
- deRooij2017
de Rooij, J.D.E., Branstetter, C., Ma, J., et al. 2017. Pediatric non-Down syndrome acute megakaryoblastic leukemia is characterized by distinct genomic subsets with varying outcomes. Nature Genetics 49(3), pp. 451–456.
- Ross2004
Ross, M.E., Mahfouz, R., Onciu, M., et al. 2004. Gene expression profiling of pediatric acute myelogenous leukemia. Blood 104(12), pp. 3679–3687.
- Sadras2017
Sadras, T., Heatley, S.L., Kok, C.H., et al. 2017. Differential expression of MUC4, GPR110 and IL2RA defines two groups of CRLF2-rearranged acute lymphoblastic leukemia patients with distinct secondary lesions. Cancer Letters 408, pp. 92–101.
- Shaver2015
Shaver, A.C., Seegmiller, A.C., Strickland, S.A., et al. 2015. Mutational burden in acute myeloid leukemia is largely age dependent. Blood 126(23), pp. 2605–2605.
- Sugeedha2021
Sugeedha, J., Gautam, J. and Tyagi, S. 2021. SET1/MLL family of proteins: functions beyond histone methylation. Epigenetics 16(5), pp. 469–487.
- Sun2017
Sun, C., Chang, L. and Zhu, X. 2017. Pathogenesis of ETV6/RUNX1-positive childhood acute lymphoblastic leukemia and mechanisms underlying its relapse. Oncotarget 8(21), pp. 35445–35459.
- The2019
The Gene Ontology Consortium 2019. The Gene Ontology Resource: 20 years and still GOing strong. Nucleic Acids Research 47(D1), pp. D330–D338.
- Tonks2007
Tonks, A., Pearn, L., Musson, M., et al. 2007. Transcriptional dysregulation mediated by RUNX1-RUNX1T1 in normal human progenitor cells and in acute myeloid leukaemia. Leukemia 21(12), pp. 2495–2505.
- Valk2004
Valk, P.J.M., Verhaak, R.G.W., Beijen, M.A., et al. 2004. Prognostically useful gene-expression profiles in acute myeloid leukemia. The New England Journal of Medicine 350(16), pp. 1617–1628.
- Verhaak2009
Verhaak, R.G.W., Wouters, B.J., Erpelinck, C.A.J., et al. 2009. Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling. Haematologica 94(1), pp. 131–134.
- Winters2017
Winters, A.C. and Bernt, K.M. 2017. MLL-Rearranged Leukemias-An Update on Science and Clinical Approaches. Frontiers in pediatrics 5, p. 4.
- Yagi2003
Yagi, T., Morimoto, A., Eguchi, M., et al. 2003. Identification of a gene expression signature associated with pediatric AML prognosis. Blood 102(5), pp. 1849–1856.
- Yang2017
Yang, W., Tran, P., Khan, Z., Rezk, S. and O’Brien, S. 2017. MLL-rearranged mixed phenotype acute leukemia masquerading as B-cell ALL. Leukemia & Lymphoma 58(6), pp. 1498–1501.
- Yeoh2002
Yeoh, E.-J., Ross, M.E., Shurtleff, S.A., et al. 2002. Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell 1(2), pp. 133–143.