Frequent copy number gains of SLC2A3 and ETV1 in testicular embryonal carcinomas

Testicular germ cell tumours (TGCTs) appear as different histological subtypes or mixtures of these. They show similar, multiple DNA copy number changes, where gain of 12p is pathognomonic. However, few high-resolution analyses have been performed and focal DNA copy number changes with corresponding candidate target genes remain poorly described for individual subtypes. We present the first high-resolution DNA copy number aberration (CNA) analysis on the subtype embryonal carcinomas (ECs), including 13 primary ECs and 5 EC cell lines. We identified recurrent gains and losses and allele-specific CNAs. Within these regions, we nominate 30 genes that may be of interest to the EC subtype. By in silico analysis of data from 150 TGCTs from The Cancer Genome Atlas (TCGA), we further investigated CNAs, RNA expression, somatic mutations and fusion transcripts of these genes. Among primary ECs, ploidy ranged between 2.3 and 5.0, and the most common aberrations were DNA copy number gains at chromosome (arm) 7, 8, 12p, and 17, and losses at 4, 10, 11, and 18, replicating known TGCT genome characteristics. Gain of whole or parts of 12p was found in all samples, including a highly amplified 100 kbp segment at 12p13.31, containing SLC2A3. Gain at 7p21, encompassing ETV1, was the second most frequent aberration. In conclusion, we present novel CNAs and the genes located within these regions, where the copy number gains of SLC2A3 and ETV1 are of particular interest, as their copy number levels also correlate with expression in TGCTs.


A M Hoff, S M Kraggerud, S Alagaratnam et al. DNA copy numbers in embryonal carcinomas

27:9
Endocrine-Related Cancer

[...] over the last 50 years, especially in industrialized countries. Genome-wide association studies (GWAS) and large-scale meta-analyses (Chung et al. 2013, Rajpert-De Meyts et al. 2016, Litchfield et al. 2017, Wang et al. 2017) have identified more than 40 susceptibility loci. The mutation load has been found to be low, comparable to paediatric cancer types (Brabrand et al. 2015, Litchfield et al. 2015), but TGCTs are characterized by aneuploidy and a high degree of DNA copy number changes (Oosterhuis et al. 1989, Lothe et al. 1995, Taylor-Weiner et al. 2016). TGCTs can be divided into two main histological types, seminomas and non-seminomas, with the latter comprising embryonal carcinomas (ECs), teratomas, choriocarcinomas, and yolk sac tumours. The various histological subtypes of TGCT have remarkably similar DNA copy number aberration (CNA) patterns, although some particular differences have been described (Kraggerud et al. 2002, Skotheim et al. 2006, Korkola et al. 2008). The isochromosome 12p and/or gain of 12p sequences are pathognomonic to TGCT and used for diagnostic purposes for extragonadal tumours of unknown origin (Sandberg et al. 1996). Most genome-wide DNA copy number studies of TGCTs to date have been performed using relatively low-resolution technologies, but recently TCGA published a multilevel genomics paper, including next-generation sequencing and high-resolution single nucleotide polymorphism (SNP) microarray analysis of 150 TGCTs, including 27 tumours classified as EC (18 pure EC and 9 mixed) according to the International Classification of Diseases for Oncology (ICD-O) morphological codes (Shen et al. 2018).
EC is a pluripotent histological subtype of TGCT that can be present alone or as one of several components in the tumour. ECs can be considered the malignant counterpart of embryonic stem (ES) cells, as both are pluripotent and have the capacity to differentiate. Identification of molecular differences between the two cell types may help resolve the tumourigenic mechanisms and cellular pathways involved. We previously identified a discriminating gene expression signature between EC and ES cell lines, including a number of pluripotency and cancer-related genes (Alagaratnam et al. 2013). ES cell lines have been characterized for DNA CNAs on high-resolution SNP platforms (Närvä et al. 2010, Amps et al. 2011), where several higher-passage cells showed aberrations similar to those in TGCTs (Baker et al. 2007).
In this study, we profiled 13 pure primary EC tumours, as well as 12 cell lines (5 EC and 7 ES), on the high-resolution, whole-genome Affymetrix SNP 6.0 DNA copy number platform. We present a comprehensive overview of the EC subtype, identifying recurring regions of loss of heterozygosity (LOH) and focal regions of gains and losses, which harbour genes that may be of importance in EC development. These genes were further investigated in publicly available multi-omics datasets, and a transcriptional impact was confirmed for several of the genes.

Sample preparations
Genomic DNA from 13 primary ECs had previously been isolated by phenol/chloroform extraction. Tumour percentage was estimated by an experienced pathologist on the basis of haematoxylin and eosin stained sections for 10/13 samples. For each case, the tumour percentage was calculated as the average of the tumour percentage of three sections, taken at either end and in the middle of the tumour sample used for DNA isolation. The median pathology tumour percentage was 49%, and ranged from 22% to 78%. Ten of the 13 primary ECs included in the current study have previously been analysed by chromosomal comparative genomic hybridization (cCGH; n = 6, Kraggerud et al. 2002) and array CGH (aCGH; n = 5, Skotheim et al. 2006), with one sample (EC no. 1838) analysed with both technologies.

DNA copy number profiling of primary tumours and cell lines
Three sets of samples were analysed for genome-wide DNA copy number on Affymetrix SNP 6.0 microarrays: primary EC tumours (n = 13), EC cell lines (n = 5), and ES cell lines (n = 7). For each sample, 500 ng of genomic DNA was used as input for the Cytogenetics Copy Number Assay protocol for Genome-Wide Human SNP 6.0 arrays (Thermo Fisher Scientific). The samples were individually processed and hybridized as described in the Affymetrix Cytogenetics Copy Number Assay User Guide (P/N 702607 Rev. 2).

Data processing, target region analysis and statistics
The resulting cell intensity (CEL) files after hybridization were within recommended QC thresholds (CQC >0.4; MAPD <0.35). Signal extraction and pre-processing of the raw data were performed using the PennCNV protocol modified for Affymetrix genotyping arrays with Affymetrix Power Tools version 1.15.0 as described earlier (Sveen et al. 2016). HapMap samples previously analysed on the SNP Array 6.0 (n = 270) were used as reference for normalization, log R ratio (LRR), and B-allele frequency (BAF) calculation. Probes targeting the allosomes, control probes (n = 3643), duplicate probes (one of the two probes covering overlapping genomic loci; n = 187), and probes mapping to regions with recurrent high-frequency aberrations in non-cancer samples from several organs (n = 6668) were removed (Sveen et al. 2016). For copy number analysis, preprocessed LRR data from primary tumours and cell lines were used for single-sample segmentation, using the Piecewise Constant Fitting (PCF) algorithm in the R package copynumber (version 1.14.0). The user-defined penalty parameter γ was set to 100 and the minimum number of probes per segment, kmin, was set to 5. PCF segments with copy number estimates ≥0.15 were called as gains and segments with estimates ≤−0.15 were called as losses. The results were visualized using the copynumber R package. In addition, CNAs (gains, amplifications (defined as high gains >0.45), and deletions) were extracted for 27 target genes, earlier identified by our group as differentially expressed in EC vs ES (Alagaratnam et al. 2013).
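The calling rules stated above can be sketched in a few lines of Python. This is an illustration only and not the R pipeline used in the study: the function name `call_segment` and the label strings are hypothetical, and only the cut-offs (≥0.15 for gain, ≤−0.15 for loss, >0.45 for amplification) are taken from the text.

```python
# Cut-offs as stated in the text; everything else here is a hypothetical sketch.
GAIN_THRESHOLD = 0.15
LOSS_THRESHOLD = -0.15
AMPLIFICATION_THRESHOLD = 0.45


def call_segment(estimate: float) -> str:
    """Classify one PCF segment from its copy number estimate."""
    if estimate > AMPLIFICATION_THRESHOLD:
        return "amplification"  # high gain, strictly above 0.45
    if estimate >= GAIN_THRESHOLD:
        return "gain"
    if estimate <= LOSS_THRESHOLD:
        return "loss"
    return "neutral"


if __name__ == "__main__":
    # Made-up segment estimates, for illustration only.
    for value in (0.60, 0.20, 0.05, -0.30):
        print(value, call_segment(value))
```

Checking the amplification cut-off before the gain cut-off keeps the two categories mutually exclusive, so an amplified segment is never also reported as a plain gain.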
For genomic identification of significant cancer-related regions/genes, PCF-segmented data for the primary ECs were used as input for the GISTIC 2.0.22 algorithm (Mermel et al. 2011). Copy number estimates >0.1 were called as copy number gain, while estimates <−0.1 were called as loss. The broad length cut-off was set to 0.5 (-brlen 0.5), the confidence level was set to 0.90 (-conf 0.90), normal arbitrated peel-off was performed (-armpeel 0), and we calculated the significance of deletions at a gene level (-genegistic 1), otherwise default settings. The reference genome file hg19.mat was used. Significant broad events were defined as events with a q-value <0.05, and significant focal events as events with q-values <0.25.
Preprocessed and normalized LRR and BAF data for the primary ECs were analysed using the allele-specific copy number analysis of tumours (ASCAT) v.2.3 algorithm to obtain allele-specific copy number estimates (Van Loo et al. 2010). ASCAT data were subsequently used to call regions with amplifications and LOH.
However, as blood/germline DNA was not analysed, the LOH regions may include germline homozygous regions. By ASCAT, we also estimated ploidy and aberrant cell fraction of each tumour. The penalty parameter was set to 50 and discrete copy number states were determined relative to the median genome-wide copy number in each tumour sample.
The fraction of the genome with CNA or LOH was calculated as the number of aberrant bases out of the total number of bases with copy number and LOH estimate available, respectively.
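As a minimal sketch of this calculation, assuming each segment is given as inclusive start and end coordinates plus a call, the fraction is the summed length of aberrant segments divided by the summed length of all segments that have an estimate. The function name, the tuple layout and the use of `None` for "no estimate available" are hypothetical choices for illustration.

```python
from typing import Iterable, Optional, Tuple

# (start, end, call): inclusive coordinates; call is None when no estimate exists.
Segment = Tuple[int, int, Optional[str]]


def fraction_aberrant(segments: Iterable[Segment]) -> float:
    """Return aberrant bases divided by bases with an estimate available."""
    total_bases = 0
    aberrant_bases = 0
    for start, end, call in segments:
        if call is None:
            continue  # excluded from numerator and denominator
        length = end - start + 1  # assumes inclusive coordinates
        total_bases += length
        if call != "neutral":
            aberrant_bases += length
    if total_bases == 0:
        raise ValueError("no segments with an estimate available")
    return aberrant_bases / total_bases


if __name__ == "__main__":
    # Made-up segments: 100 aberrant bases out of 400 with an estimate.
    example = [(1, 100, "gain"), (101, 400, "neutral"), (401, 500, None)]
    print(fraction_aberrant(example))
```

The same function would serve for LOH if the calls are LOH states rather than gain/loss labels, since the text describes the two fractions as computed in the same way against their respective denominators.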
Copy number estimates per gene were retrieved by mapping chromosomal segments from each sample to the R-implemented transcript database TxDb.Hsapiens.UCSC.hg19.knownGene (v3.2.2) (Carlson et al. 2015), utilizing the findOverlaps function from the GenomicRanges R package (v1.28.3) (Lawrence et al. 2013). Gene symbols were collected using the R package org.Hs.eg.db (Carlson 2017) and updated to the approved symbols according to the HUGO Gene Nomenclature Committee. For GISTIC, the output contains the genes located in the identified focal regions. However, to obtain a final target gene list, the regions identified with focal CNAs by GISTIC were also manually examined for protein-coding genes in Ensembl (Version 87, GRCh37) and these were added to the list of target genes. All genomic positions refer to genome version GRCh37 (hg19). Pseudogenes and genes annotated as non-coding in Ensembl were not considered. Analysis of DNA and RNA level data from TCGA is described in the Supplementary Materials and methods (see section on supplementary materials given at the end of this article).
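The segment-to-gene mapping is an interval overlap query. The sketch below shows the idea in plain Python and is not a reimplementation of GenomicRanges: the `Interval` type and the function names are hypothetical, the gene names and gene coordinates are made up, and only the segment coordinates are taken from the amplicon reported in this article (chr12: 8,024,362-8,123,900).

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # inclusive
    end: int    # inclusive
    label: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_per_segment(segments: List[Interval], genes: List[Interval]) -> Dict[str, List[str]]:
    """Map each segment label to the labels of all overlapping genes."""
    return {
        seg.label: [gene.label for gene in genes if overlaps(seg, gene)]
        for seg in segments
    }


if __name__ == "__main__":
    amplicon = Interval("chr12", 8024362, 8123900, "amplicon")
    # Hypothetical genes with made-up coordinates, for illustration only.
    genes = [
        Interval("chr12", 8050000, 8090000, "GENE_A"),  # fully inside
        Interval("chr12", 8120000, 8200000, "GENE_B"),  # partial overlap
        Interval("chr12", 9000000, 9100000, "GENE_C"),  # no overlap
    ]
    print(genes_per_segment([amplicon], genes))
```

A partial overlap counts as a hit here, which matches the way a gene can be reported as only partly contained in a segment. The nested scan is quadratic; for genome-wide data an indexed approach such as the one behind findOverlaps is the practical choice.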

DNA copy number aberrations in primary ECs, compared to EC and ES cell lines
By use of PCF segmentation, we identified similar CNAs in primary ECs and EC cell lines (Fig. 1). In general, the frequencies of CNAs were higher for EC cell lines than for primary EC tumours. The most frequent aberrations observed for primary ECs were gain of 12p (100% of the samples), gains of the whole or parts of chromosomes 7, 8, and 17, and losses of the whole or parts of chromosomes 4, 10, 11, 15, and 18 (>30%). From the PCF-segmented data, apart from gain of 12p, the two most frequent aberrations were a region of gain at 7p21 (12,327,412,764) and a region of loss at 10q11-q21 (47,757,156,269; Fig. 1). We observed that the 13 primary ECs varied markedly in both the number of gains and losses and the proportion of the genome affected by CNAs. The aberrations were typically broad events of chromosome arm length, and the median genome-wide CNA for the 13 EC samples was 12% (mean 23%; Supplementary Fig. 1). Three samples (EC 28, EC 1740, and EC 1838) had only nine percent genome-wide CNA, whereas the two samples with the highest percentage of aberrations (EC 1017 and EC 3113) had 53 and 56% genome-wide CNA.
We observed five recurrent CNA regions in the seven early-passage ES cell lines (Fig. 1 and Table 2), including focal loss in regions 1q21.3 and 3q22.1 (both in two ES cell lines). These regions overlap with larger segments of loss also found in primary ECs (Table 1) and cover the genes LCE1E and ALG1L2, respectively.

Significant DNA copy number events in primary ECs
PCF-segmented data from the 13 primary ECs were analysed with GISTIC to identify statistically significant CNAs, both in terms of chromosome arm-level (broad; Supplementary Table 1) and focal events. We identified three significant focal regions of gain, located at 12p13.31, 12p11.1, and 22q11.23; and five significant focal regions of loss, located at 1p36.11, 1q21.3, 3q22.1, 11q11, and 17p11.2 (Table 1 and Supplementary Fig. 2). Although the 1q21.3 and 3q22.1 segments cover the LCE1E and ALG1L2 genes, also found to be lost in ES cell lines, they were not excluded from further analyses.

Ploidy, allele-specific copy number profiles, and LOH in primary ECs
Ploidy estimates for the 13 tumours, as calculated by the ASCAT algorithm, ranged from 2.3 to 5.0. The ploidy values formed two clusters, one between 2.3 and 2.8 (9/13 tumours) and one between 4.4 and 5.0 (4/13 tumours; Supplementary Fig. 3). Individual allele-specific profiles of the 13 tumours are shown in Supplementary Fig. 4. The ASCAT analysis revealed a minimal amplicon of 100 kbp (chr12: 8,024,362-8,123,900) that was present at 15 and 31 additional copies in two individual tumours and gained across all 13 tumours (Fig. 2). For 12/13 ECs, this amplicon was the segment, or was included within the 12p segment, with the overall highest copy number. This segment contains the SLC2A3 gene and parts of SLC2A14.
LOH was determined from the allele-specific copy number profiles for the primary ECs (Fig. 1). The fraction of the genome with LOH varied from 15 to 42%, with a median of 26%. LOH was detected in one or more samples for all the autosomal chromosomes, and encompassed larger regions for 6 of 13 ECs on chromosome arms 4q, 9q, 18p, and 18q. Within these broad regions of LOH, four additional focal regions of LOH were detected (4q21.21, 4q33-q34.1, 9q34.2, and 18q12.1), indicated as peaks in Fig. 1, and present in at least 9 of the 13 ECs (Table 3). Interestingly, a region on chromosome arm 9q showed frequent LOH but no copy number loss (Fig. 1), and is thus a copy-neutral LOH.

Differentially expressed genes associated with DNA copy number levels
In a previous study, we identified 28 differentially expressed genes between EC and ES cell lines (Alagaratnam et al. 2013). The relative gene expression and the corresponding copy number changes from PCF for 27 genes (one was located on chromosome X) are shown for EC cell lines and tumours in Fig. 3. Six of the 16 genes with higher expression in EC compared to ES cell lines are localized on chromosome arm 12p (C12orf4, DPPA3, GOLT1B, NOP2, PARP11, and TULP3) and showed gain in all and amplification in most EC cell lines (4/5) and primary ECs (9/13). However, the 10 remaining genes, and the 11 genes with lower expression in EC compared to ES cell lines, were in regions with few CNAs.

Identification of target genes affected by CNAs
Within the identified regions of CNA or LOH in the EC subtype, there are several protein-coding genes of potential interest to EC development. There are 16 genes located in the GISTIC-defined focal loss or gain regions and five genes within the ASCAT-defined LOH regions. In the two regions showing the most frequent aberrations (apart from 12p), as identified from the PCF-segmented data, there are three genes known to be cancer critical genes according to COSMIC. In addition, six genes previously identified as differentially expressed between EC and ES cell lines were also found to be gained or amplified in EC tumours. Taken together, we nominate 30 protein-coding genes affected by CNAs and/or LOH to be of interest to the EC subtype (Supplementary Table 2).

Figure 2
Minimal amplicon of 100 kbp on chromosome arm 12p. Copy number aberrations on chromosome 12 from 13 primary ECs, plotted by median-adjusted copy number, from ASCAT analysis and genomic position. To allow visibility of all DNA copy number chromosome 12 segments, for each tumour, the lines were adjusted. Segments <0.5 Mb are enlarged as circles to increase their visibility.

DNA copy number and mRNA expression among TGCTs in TCGA data
For further investigation of the genes affected by CNAs and/or LOH in ECs, we analysed copy number levels for 27 of the 150 TGCT tumours from TCGA classified as EC according to the ICD-O morphological codes (18 pure EC and 9 mixed). The genes identified at 12p, including SLC2A3 and SLC2A14, were gained in all 27 samples and were highly amplified in 8 of the samples (30%; Supplementary Fig. 5). ETV1, located at 7p21, was gained in 25 of the 27 samples, while CCDC6 and NCOA4, located at 10q11-q21, had copy number loss in 21 and 20 samples, respectively. Surprisingly, many of the genes located in focal regions identified as statistically significant loss in our cohort by GISTIC, for example, 1q21.3, 1p36.11 and 3q22.1, were infrequently lost in the EC cohort from TCGA (Supplementary Fig. 5). A significant correlation (q < 0.05) between DNA copy number and mRNA expression data was seen for 15 of the 30 genes. These were ETV1 and CCDC6 (from PCF-identified gain/loss); LRP5L and SLC2A3 (from GISTIC-identified focal gain); TMEM50A and TRH (from GISTIC-identified focal loss); ANTXR2, BRD3, BRD3OS, and VAV2 (from ASCAT-identified LOH); C12orf4, DPPA3, GOLT1B, NOP2 and PARP11 (previously identified as differentially expressed between EC and ES cell lines; Supplementary Fig. 5). Correlation between copy number and gene expression remained significant for four of the genes when only considering the EC subset (n = 27). This included a strong correlation for ETV1 (R = 0.8, q < 0.0001).

Somatic mutations among TGCTs in TCGA
TCGA whole-exome sequencing data were examined for somatic mutations in the 30 genes. We found that six of the 150 TGCT samples contained markedly higher numbers of mutations genome-wide (median 1091.5 mutations) than the remaining TGCTs (median 38.5 mutations), and omitted these from further analysis. Among the included 144 tumour samples, 20 (4/20 diagnosed as EC) were found to harbour somatic, non-synonymous mutations in 11 of the 30 genes (Supplementary Table 3). Non-synonymous mutations in two or more TGCTs were identified in ANTXR2, LCE1F, SLC2A3, SLC2A14, and TULP3.

Fusion transcript breakpoints including target genes/regions among TGCTs in TCGA
Next, we evaluated whether CNAs were associated with generation of fusion genes. After analysis of RNA-sequencing data from TCGA's TGCT samples, the intersection of the outputs from two fusion finder software tools, FusionCatcher and deFuse, resulted in 1956 nominated fusion transcript breakpoints (range 2 to 49 per sample, median = 10). None of these transcript breakpoints involved the 30 genes affected by CNAs. However, when considering breakpoints of fusion transcripts within 1 Mbp of the identified CNA segments, we detected the previously described CLEC6A-CLEC4D read-through fusion transcripts (Hoff et al. 2016) in 12 of 150 TGCTs. Additionally, two fusion transcripts, LIN28A-CD52 and LRP6-LRRC23, were each detected in individual samples. Both these fusion transcripts were nominated with breakpoints joining the canonical exon boundaries of the partner genes and are predicted to maintain reading frames (Table 4). These two fusion transcripts were, however, found to be predominantly expressed in the seminoma subtype of the TCGA samples (Table 4).
Interestingly, FusionCatcher and deFuse individually nominated a vast number of breakpoints involving SLC2A3; 137 and 364, respectively. These breakpoints did not include the same partner genes in the individual samples and were therefore not considered in the intersected analysis. However, we observed that the number of breakpoints nominated per sample correlated between FusionCatcher and deFuse and that the nominated breakpoints were mostly in ECs and mixed germ cell tumours (16 and 15 out of a total of 43, respectively; Supplementary Fig. 6). Overall in 150 TGCTs, the correlation between gene expression and DNA copy numbers of SLC2A3 was significant (Spearman: R = 0.55, q = 9 × 10⁻¹²), whereas when only considering cases that had at least one nominated fusion breakpoint with SLC2A3 (n = 43), the correlation was not significant (Spearman: R = 0.26, P = 0.09).
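For readers unfamiliar with the statistic, Spearman's correlation is the Pearson correlation computed on ranks, with tied values given their average rank. The following dependency-free Python sketch illustrates that definition only; it is not the analysis code behind the values reported here, the function names are hypothetical, the example numbers are made up, and it does not compute P- or q-values.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation between paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least two values")
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Made-up copy number and expression values for one gene across samples.
    copy_number = [2.0, 3.0, 3.0, 5.0, 8.0]
    expression = [1.1, 2.4, 2.0, 4.8, 9.5]
    print(round(spearman(copy_number, expression), 3))
```

Because it works on ranks, the statistic is insensitive to the scale of the expression values and to a few extreme copy number estimates, which is useful when a small number of samples carry very high-level amplification.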

Discussion
We have here performed high-resolution DNA copy number analysis of the EC subtype of TGCT, and identified broad and focal CNAs as well as allele-specific CNAs, including LOH. We have nominated altogether 30 genes which may be related to EC within the regions affected by CNAs, including SLC2A3 from chromosome arm 12p and ETV1 on 7p21.
The CNA profiles varied in complexity among primary ECs. Both individual EC copy number profiles and the summarized overall CNA frequency plots are in agreement with TGCT profiles, and EC profiles in particular (Kraggerud et al. 2002, Skotheim et al. 2006, Korkola et al. 2008), although in this study with higher resolution than previously reported. Previous studies of the copy number landscape of EC include two aCGH studies (n = 25 (Korkola et al. 2008) [...] .32, and of loss at 22q12.2. Our results are in agreement with alterations reported in these studies. However, apart from the common 12p gain and frequent 7p gain, none of the significant focal CNAs identified here were reported by Korkola et al. (2008) or Gilbert et al. (2011). To our knowledge, only one SNP microarray study has been published, including 18 pure ECs and 9 mixed TGCTs with a dominant EC proportion out of a total of 137 TGCTs (Shen et al. 2018). EC subtype-specific CNAs are not reported in this TCGA study; however, the authors report that the CNA profiles of ECs cluster into three of five identified CNA groups. Among the focal, GISTIC-identified alterations in the TGCT cohort of TCGA, gain at 12p12 is in agreement with our results.

ES and EC cell lines
ES and EC cells have many common characteristics, and culture adaptation of ES cells has been acknowledged as a model system for EC carcinogenesis (Andrews et al. 2005). All seven ES cell lines included have previously been analysed for CNAs on SNP microarrays (Närvä et al. 2010, Amps et al. 2011). Aberrations identified in individual cell lines at early passage were also found in our dataset, including gain at 2p11.2 and 3q26.1. A recurrent gain on 20q11.21 in ES cell lines is suggested to confer a growth advantage (Amps et al. 2011). However, this gain was not found in the ES cell lines used in our study. Also, none of the primary ECs showed gain of the 20q11.21 region. Still, among EC cell lines, two showed gain and one a borderline gain, supporting that this CNA may be induced by cell culturing rather than being relevant for EC tumourigenesis.

Ploidy estimates of ECs
Ploidy estimates by ASCAT showed that 9/13 (69%) of primary ECs were hyperdiploid to triploid, while 4/13 (31%) were tetraploid to pentaploid. However, the algorithm gives an estimate of the average ploidy and does not account for sub-clonality. This result is largely in agreement with previous studies, where ECs are often categorized as aneuploid or hypotriploid, and with flow cytometry several aneuploid cell populations are often observed (Fosså et al. 1991, Burger et al. 1994). The near triploidy among ECs has also been shown in cytogenetic studies (Sandberg et al. 1996).

SLC2A3 and SLC2A14 in ECs
Gain of 12p was detected in all primary EC samples and EC cell lines, supporting its role as an early driver event in EC development. High-level amplification of 12p segments has been reported in TGCT (Kraggerud et al. 2002, Skotheim et al. 2006), mostly focusing on a 12p11.2-p12.1 amplicon (Bourdon et al. 2002). Interestingly, we identified two novel segments with focal amplification: a 3.5 Mbp segment on 12p11.1 with no annotated genes, and a 100 kbp segment on 12p13.31. The latter segment corresponds to minimal amplicons present at an estimated 15 and 31 additional copies in two individual ECs. This segment overlaps with both a larger region of amplification at 12p13 identified in a CGH study of TGCT cell lines (Henegariu et al. 2004) and a 200 kbp region/gene cluster at 12p13.31 that exhibits coordinated overexpression in both ECs and seminomas (Korkola et al. 2006). The small, 100 kbp amplified region contains two glucose transporter genes, SLC2A3 and parts of SLC2A14.
Increased SLC2A3 expression is reported in TGCTs compared to normal testis (Rodriguez et al. 2003), and validated as a sensitive and specific marker for the EC and yolk sac tumour histological subtypes (Howitt et al. 2013).
In vitro differentiation of EC cells, with subsequent loss of tumourigenic potential, is reported to repress several pluripotency genes at this locus, including NANOG, GDF3, and DPPA3, but also SLC2A3 (Giuliano et al. 2005). SLC2A14 is a paralog of SLC2A3 with major expression in testis. We showed in data from TCGA that expression significantly correlates with copy number gains for SLC2A3, but not for SLC2A14. These results imply that amplification and over-expression of SLC2A3 may be a common mechanism for activation. SLC2A3 and SLC2A14 were among the most frequently mutated of the investigated target genes (each observed with somatic mutation in three TCGA TGCTs, where one had an EC component). A large number of fusion transcript breakpoints were nominated for SLC2A3. Interestingly, expression of SLC2A3 and DNA copy number did not correlate significantly for the samples that had nominated SLC2A3 fusion breakpoints, which indicates that overexpression of SLC2A3 in these cases is regulated by mechanisms other than the number of gene copies alone.
The roles of SLC2A14 and SLC2A3 in cancer have more recently gained attention. SLC2A14 (or GLUT14) expression is deregulated in several cancer types and is suggested to be a prognostic factor for a number of cancers, for example, in thyroid carcinoma (Chai et al. 2017). SLC2A3 (alias GLUT3) encodes a glucose transporter with a five-fold higher affinity for glucose than its ubiquitous family member GLUT1 (Simpson et al. 2008), making its expression an advantage in glucose-poor microenvironments with high glucose demands, such as in certain tumour environments. Indeed, SLC2A3 expression correlates with poor survival in several cancers, including brain and gastric cancers (Flavahan et al. 2013, Schlößer et al. 2017). While broad-level gain of 12p in TGCTs appears likely to confer the pluripotent phenotype for initiation of tumourigenesis, the focal amplification of the region containing SLC2A3 may grant a proliferative advantage in progression and development of the tumour.
CNAs at 7p and 10q affect the cancer critical genes ETV1 and CCDC6
The second most frequently gained (after 12p) and the most frequently lost regions in ECs were located at 7p and at 10q, respectively. Among the genes located in these regions, ETV1, CCDC6, and NCOA4 are known cancer critical genes. Several previous studies indicate that the functions of these genes are relevant to TGCT development. Activated KIT is reported to prolong ETV1 protein stability and cooperate with ETV1 to promote tumorigenesis in gastrointestinal tumours (Chi et al. 2010). Disruption of the KIT-KITLG/MAPK signalling pathway is implicated in TGCT formation both as a predisposing germline risk factor and as a somatic driver event (Litchfield et al. 2015, 2017). ETV1 has been shown to upregulate the expression of androgen receptor target genes and promote autonomous testosterone production (Baena et al. 2013). CCDC6 is a tumour suppressor and a pro-apoptotic protein involved in DNA damage response and repair (Merolla et al. 2012). Loss of CCDC6 has been suggested to contribute to testicular neoplastic growth (Staibano et al. 2013) and could enhance tumour progression by impairing apoptosis following DNA damage (Cerrato et al. 2018). In effect, loss of CCDC6 has also been implicated as a biomarker for sensitivity of cancer cells to treatment with PARP inhibitors (Cerrato et al. 2018).

Fusion genes located on chromosome arm 12p
We have previously identified novel fusion transcripts in TGCT (Hoff et al. 2016). In this study we analysed RNA sequencing data of TGCTs from TCGA for the expression of fusion transcripts in proximity (1 Mbp) of identified regions of gain, loss, and LOH. We reasoned that CNAs may reflect structural rearrangements that form fusion genes. We repeatedly identified the fusion event CLEC6A-CLEC4D (n = 12 patients) and also two private fusion events, LIN28A-CD52 and LRP6-LRRC23. These fusions were, however, found expressed in non-EC histological subtypes (Table 4). Both genes involved in the CLEC6A-CLEC4D and the LRP6-LRRC23 fusion genes are located on chromosome arm 12p. Previously, we described several other private fusion genes on 12p (Hoff et al. 2016). The recurrent structural alterations of 12p may be a common mechanism for the generation and expression of fusion genes in TGCT. However, the biological impact of these mostly private fusion gene events is uncertain.
In conclusion, by use of high-resolution SNP microarrays and advanced analyses, we present allele-specific copy number profiles for primary ECs and several novel focal CNAs. Within the regions affected by CNAs, we report 30 target genes that may be of interest to further our understanding of the EC subtype. High amplification of a 100 kbp segment at 12p13.31 containing SLC2A3 was identified, and the second most common CNA, a gain at 7p21, encompassed the cancer critical gene ETV1. Increasing DNA copy numbers were found to be correlated with increased gene expression of SLC2A3 and ETV1.