ASC Student Supercomputing Competition 2025

Original English Document

ASC25_Preliminary_Round_Announcement.pdf

如果手机上内嵌预览仍无法正常纵向滚动，请使用“新窗口打开”或“下载 PDF”。

Preliminary Round Announcement

Dear ASC25 Teams:

Welcome to the 2025 ASC Student Supercomputer Competition (ASC25)!

The ASC Student Supercomputer Competition (ASC25) has been running for 13 years. As the world's largest supercomputing and AI "hackathon," it has always been committed to cultivating young talents and promoting innovation in supercomputing and AI. Since its launch on November 21, 2024, during the SC24 conference, ASC25 has received widespread attention, with hundreds of teams registered. The competition is about to enter the preliminary round stage on January 6, 2025, and we look forward to new sparks of innovation and collaboration.

Preliminary Round: Teams Prepare to Submit Their Work

In the preliminary round of ASC25, each participating team needs to do their best to complete the assigned tasks and submit detailed solution documents. The submission should include cluster design details, source code optimization ideas, and output result files. The ASC25 review committee will conduct strict and fair review of all solutions in English.

Submission Guidelines

Important Deadline: All participating teams must submit all required materials before February 21, 2025, 24:00 (UTC/GMT +8:00).

Solution Submission
- Solution naming format: [University/College Name]_[Contact Name] (e.g., ABC University John Doe).
- Save the solution as a single PDF file.
- Upload this PDF on the ASC25 official website: https://www.ascevents.net/Student Challenge/ASC25/Preliminary Submission.php
Other Materials
- All additional materials should be compressed into a single file, with the same naming convention as the solution file, e.g., ABC University John Doe.
- The compressed package must at least contain the following four folders (see Appendix A for requirements):
  - HPL output files
  - HPCG output files
  - AlphaFold3 inference challenge required files
  - RNA m^5C challenge required files
- Please upload the compressed package to the competition's official FTP server (the address will be notified via email later).
Submission Confirmation and Support
- Please make sure to complete and submit all required materials before the deadline, otherwise you will not receive corresponding scores.
- After all materials are successfully received, the participating team will receive a confirmation email.
- If you have any questions or need help, please contact:
  - Technical support: tech support@asc-events.org
  - General information: info@asc-events.org
  - Media: media@asc-events.org

The ASC25 Organizing Committee wishes all teams excellent results in this competition!

I. Introduction to the School's Work in Supercomputing (5 points)

Software and hardware platforms related to supercomputing
Courses, training, and interest groups related to supercomputing
Research and applications related to supercomputing
Brief description of up to 2 major achievements related to supercomputing research

II. Team Introduction (5 points)

Brief introduction to team composition
Personal introduction and photo of each member, including team photo
Team slogan or motto

III. Technical Solution Requirements (90 points)

1. HPC System Design (15 points)

Teams need to submit a theoretical design of an HPC cluster, not build a real physical cluster. The following requirements must be met:

The system is designed for optimal computational performance, with the cluster containing at least 3 compute nodes; the power consumption limit per node is 2000 W, and the total power consumption limit for the entire cluster is 4000 W.
Clearly specify the software and hardware configuration and interconnection structure of the system, explain power consumption estimation and performance estimation, and analyze the advantages and disadvantages of the designed architecture.

Note: The hardware in the table below is for reference only, based on dual-socket servers (supporting up to 2 GPUs). Note: The actual hardware configuration provided in the ASC25 finals may differ from this.

2. HPL and HPCG Benchmarks (15 points)

The solution should include a description of the software environment (operating system, compiler, math libraries, MPI software, versions, etc.), an explanation of performance optimization and testing methods, performance measurement results, problems encountered and their solutions, etc. If you can provide in-depth analysis of HPL and HPCG algorithms and source code, you will receive extra points.

HPL download: http://www.netlib.org/benchmark/hpl/
HPCG download: https://github.com/hpcg-benchmark/hpcg

It is recommended to perform HPL and HPCG verification and optimization tests on x86 CPUs and data center-grade GPUs. If you use other hardware platforms, you are also welcome to submit corresponding analysis and test results to demonstrate good performance.

3. AlphaFold3 Inference Optimization (30 points)

Task Introduction

Protein structure prediction is an important problem in biology that has lasted for half a century. The emergence of AlphaFold has significantly broken through this long-standing scientific challenge, and it was awarded half of the 2024 Nobel Prize in Chemistry. The latest version, AlphaFold3, has brand new capabilities that can accurately predict the structures of complexes containing ligands, proteins, and nucleic acids, which may revolutionize drug development, disease treatment, and understanding of the essence of life.

AlphaFold3 uses a diffusion model-based architecture. The input is one or more biological molecule sequences (including proteins, RNA, etc.), and the overall process is divided into two stages: data pipeline and model inference. The data pipeline is mainly completed on the CPU, using Jackhmmer/Nhmmer to construct protein or RNA MSAs in gene databases; model inference is performed on the GPU, by inputting MSA, templates, and original sequences into the Pairformer and diffusion modules to obtain the final structure prediction results.

In this preliminary round, only the model inference part is involved. The organizing committee has provided 12 single protein sequence examples, and MSAs and templates have been pre-generated, so there is no need to run the data pipeline again. The preliminary round task goal is to minimize inference time, specifically focusing on the following two aspects:

Optimization of the GPU inference process to minimize inference time. (10 points)
Migrate the inference code from GPU to CPU and optimize it to minimize the corresponding CPU inference time. (20 points)

For example, in the figure below, after running the inference stage, "Running model inference for seed 1 took 53.76 seconds" indicates that GPU inference took approximately 53.76 seconds.

After installing AlphaFold3, you can use the following command to skip the data pipeline and perform inference:

cd alphafold
python run_alpha_fold.py \
  --json_path=JSON文件路径 \
  --model_dir=模型参数路径 \
  --no_run_data_pipeline \
  --output_dir=输出目录

After inference is complete, a corresponding directory will be generated in the output directory based on the name field of the input file. The following example shows the output file structure for a file named hello_fold (only 1 seed was run):

Notes:

AlphaFold3 source code can be obtained from: https://github.com/googledeepmind/alphafold3 (this test used version 3.0.0).

AlphaFold3 model parameters can be obtained by filling out this form: https://forms.gle/sv vp Y 4 u 2 j sHE wW YS 6.

The input files for this task (including MSA) can be obtained from the ASC repository: https://github.com/ASC-Competition. No additional multi-library/multi-sequence database downloads are needed.

Result Submission

After completing the optimization, please run the 12 input cases on both GPU and CPU, and upload the corresponding results to the official FTP server. Since the total output file size exceeds 1GB, please package it and name it AlphaFold3.tar.gz before uploading. The reference file structure is as follows:

AlphaFold3
  ├── GPU-optimization   (Scripts or code for GPU inference process)
  ├── GPU-results
  │    ├── case_name_1
  │    │    (Default directory for inference output, consistent with the 'name' field in the input file. Structure is the same as the hello_fold example, similar content omitted below)
  │    └── SUMMARY_time.out  (Summary of inference times for all cases before and after optimization)
  ├── CPU-optimization   (Scripts or code for CPU inference process)
  └── CPU-results
       ├── case_name_1
       ├── case_name_2
       ├── ......
       └── case_name_12
       ├── case_name_1.log
       ├── case_name_2.log
       ├── ......
       └── case_name_12.log
       └── SUMMARY_time.out

Scoring Rules

Scoring is mainly based on the improvement of inference performance on CPU and GPU, while also referencing the details and principles of optimization strategies in the solution.
Please note:
1. BF16 precision is used by default, and cannot be lower than 16-bit;
2. AlphaFold3 model parameters cannot be modified;
3. The recycling count and diffusion count in config.json cannot be modified;
4. The provided JSON input files (including modelSeeds, sequences, MSA, etc.) cannot be modified;
5. The output directory structure and file content cannot be changed, including structure files, confidence files, etc.;
6. Both inference speed and structure prediction accuracy are important; points will be deducted if the structure is obviously unreasonable;
7. Logs with inference process time information (start time, time spent in each stage, etc.) must be submitted;
8. The solution should detail the inference process, machine configuration, environment setup, optimization strategies, and performance comparison, which will be the basis for scoring;
9. If required files are missing or not submitted as required, the score for this section will be 0.

4. RNA m^5C Modification Site Detection and Performance Optimization Challenge (30 points)

Task Introduction

RNA molecules commonly have various chemical modifications that have significant impacts on gene expression regulation, post-transcriptional processes, and protein translation. More than 170 RNA modifications have been discovered to date, among which 5-methylcytosine (m^5C) widely exists in different types of RNA and plays an important role in gene expression and regulation.

With the development of high-throughput sequencing (HTS), various methods such as RNA-Bis-seq and UBS-seq have been used to detect m^5C. However, most detection methods rely on the base conversion signal from cytosine (C) to thymine (T), which can easily lead to high false positives. How to ensure the accuracy and reliability of m^5C detection while reducing the false positive rate is one of the core challenges in current m^5C research.

This task requires participants to implement an analysis pipeline for detecting RNA m^5C based on given bioinformatics software, and to improve the pipeline as needed to improve detection accuracy and reliability while minimizing runtime. The competition will mainly examine m^5C detection accuracy, false positive control, and operational efficiency.

Unlike single tools, this task integrates a multi-component tool chain: starting from raw sequencing data (FASTQ), performing adapter removal, low-quality sequence cleaning, rRNA/tRNA filtering, genome alignment, m^5C site detection and strict filtering, and finally outputting a high-confidence m^5C candidate site list.

Reference: Dai, Q., Ye, C., Irkliyenko, I. et al. Ultrafast bisulfite sequencing detection of 5-methyl cytosine in DNA and RNA. Nat Biotechnol 42, 1559-1570 (2024). https://doi.org/10.1038/s41587-023-02034-w

Reference code: https://github.com/y9c/m5C-UBSseq

The main analysis pipeline can be found in the docs directory, and the Snakefile is also very helpful.

Main Analysis Pipeline

Dataset

Raw input data download link: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE225614

It is recommended to use SRA Toolkit to download and extract the data.

Pipeline Guidance and Parameters

Reference Genome Index Construction
- Build indexes for rRNA, tRNA, and genome respectively, and include C->T converted sequences in the reference to make subsequent detection compatible.
- Input: rRNA, tRNA, genome reference sequences
- Output: rRNA, tRNA, genome indexes, and corresponding FAI and SAF files
Data Cleaning
- Remove adapters, trim low-quality bases, remove poly(A) tails, etc., to generate clean sequences for downstream analysis.
- Input: Raw FASTQ files
- Output: Cleaned FASTQ files, and statistics of filtered too-short or untrimmed sequences
rRNA Alignment
- Align cleaned reads to the rRNA reference to filter rRNA contamination.
- Input: Cleaned FASTQ
- Output: Reads aligned to rRNA, unaligned reads, alignment statistics, unaligned FASTQ
tRNA Alignment
- Align reads that were not aligned to rRNA in the previous step to the tRNA reference to filter tRNA contamination.
- Input: rRNA-unaligned FASTQ
- Output: Reads aligned to tRNA, unaligned reads, alignment statistics, unaligned FASTQ
Genome Alignment
- Align reads obtained after filtering rRNA and tRNA to the C->T converted genome index.
- Input: FASTQ not aligned to rRNA/tRNA
- Output: BAM aligned to genome, unaligned reads, alignment statistics
Sort BAM Files
- Sort BAM files for subsequent processing.
- Input: Aligned BAM
- Output: Sorted BAM
Merge, Statistics, and Deduplication
- Merge sorted alignment results, collect alignment statistics, and remove duplicate reads to avoid false positives caused by duplicates.
- Input: Sorted BAM
- Output:
  1. Merged alignment statistics (TSV)
  2. Deduplicated BAM and log
- After deduplication, remember to index the final BAM (.bam.bai).
Detection and Filtering of Sites
- Identify possible C->T conversion sites, outputting "unfiltered" and "filtered" unique/multi-mapped cases separately.
- Input: Deduplicated BAM
- Output: Four TSVs containing unfiltered uniq, unfiltered multi, filtered uniq, filtered multi cases
join_pileup: Merge the base conversion statistics of the same sample under the four cases of unfiltered uniq, unfiltered multi, filtered uniq, and filtered multi into one file.
- Script: join_pileup.py
- Input: The four TSVs above
- Output: One .arrow file, merging base counts by ref, pos, strand, etc.
group_pileup: Merge data from the same group (multiple samples), calculate coverage and key indicators, and generate preliminary m^5C candidate sites.
- Script: group_pileup.py
- Input: Multi-sample .arrow files
- Output: Merged .arrow containing coverage, ratios, and other indicators
- Related indicators:
  - u: Unconverted base count under filtered uniq conditions (multi-sample merged)
  - d: Total coverage under filtered uniq conditions (converted + unconverted, multi-sample merged)
  - ur: Unconverted ratio, u / d
  - mr: Multiple alignment ratio
  - cr: Proportion of reads lost from unfiltered to filtered
combined_select_sites: Perform preliminary threshold filtering on group_pileup output to obtain candidate m^5C sites.
- Script: select_sites.py
- Input: .arrow file generated by group_pileup
- Output: m^5C candidate site .tsv with only core columns (ref, pos, strand)
- Default filtering thresholds:
```
d >= 20
u >= 3
ur >= 0.02
cr < 0.5
mr < 0.2
```
stat_sample_background: Perform background estimation and statistical tests (such as binomial tests) for each sample to confirm final m^5C candidate sites.
- Script: filter_sites.py
- Input:
  1. join_pileup .arrow file (each sample)
  2. combined_select_sites .tsv file
- Output:
  1. Global background methylation ratio bg_ratio for each sample
  2. Detection result for each site in that sample (p-value, whether it passes threshold determination, etc.)
Pipeline logic:
1. First calculate the sample background methylation ratio bg_ratio (average unconverted ratio of remaining positions after removing all candidate sites).
2. Perform binomial test for each candidate site: if pval < 0.001 and (u>=2) and (d>=10) and (ur>0.02), then the site is determined as m^5C in this sample.
Merge Different Replicates (Biological Replicates)
- Only sites with p-values below 1e-6 in all three replicates are retained as final m^5C sites.

Processing Software and Parameter Recommendations

Reference Index Construction
- Software: hisat3n-build, samtools
- Recommended parameters: -p 12 --base-change C,T
Data Cleaning
- Software: cutseq
- Recommended parameters: -t 20 -A INLINE -m 20 --trim-polyA --ensure-inline-barcode
rRNA, tRNA Filtering and Genome Alignment
- Software: hisat3n, samtools
- rRNA, tRNA alignment parameters: --base-change C,T --mp 8,2 --no-spliced-alignment --directional-mapping
- Genome alignment parameters: --base-change C,T --pen-noncan-splice 20 --mp 4,1 --directional-mapping
Sorting and Deduplication
- Software: samtools sort, java + umi_collapse.jar
- Recommended parameters (samtools): -@ 20 -m 3G --write-index
- Recommended parameters (umi collapse): bam -t 2 -T 20 --data naive --merge avgqual --two-pass
Site Detection and Filtering
- Software: samtools view, hisat3n-table, bgzip
- Main parameters (samtools): -e "rlen<100000"
- Main parameters (hisat3n-table): -p N, -u/-m, --alignments, --ref..., --base-change C,T
- Main parameters (samtools view): -Q 10, -e "[XM]20 <= (qlen-sclen) && [Zf] <= 3 && 3*[Zf] <= [Zf]+[Yf]"
Related Scripts
- See https://github.com/y9c/m5C-UBSseq
- Example figure:

Result Submission

Pipeline Description File
- Record the naming convention of each intermediate file (such as alignment rate, QC statistics, etc.).
- If any scripts or programs have been modified, please attach documentation explaining the modifications and reasons.
- Also submit the time spent on each step (can be collected using the time command).
m^5C Site Files
- Output three filtered TSVs (e.g., SRR23538290.filtered.tsv, SRR23538291.filtered.tsv, SRR23538292.filtered.tsv).
Software Packaging
- Package the entire workflow into a software tool or container (e.g., using conda to unify environment configuration) for easy reproduction and to reduce environment issues.
Record Complete Pipeline Time
- Time statistics from the start of cutseq to the end of the entire pipeline, with screenshots as proof.
Package the required files and name them RNA.tar.gz before submission.

Scoring Criteria

Precision: Calculate the ratio of true positives (TP) to all detected positives (TP+FP).
$Precision = \frac{TP}{TP + FP}$
Correlation: Calculate the Pearson correlation coefficient for the unconverted ratio (ur) of true positive sites.
$r = \frac{\sum_{k=1}^{n}\bigl(ur_{\mathrm{true},k}-\overline{ur_{\mathrm{true}}}\bigr)\bigl(ur_{\mathrm{detect},k}-\overline{ur_{\mathrm{detect}}}\bigr)}{\sqrt{\sum_{k=1}^{n}\bigl(ur_{\mathrm{true},k}-\overline{ur_{\mathrm{true}}}\bigr)^{2}\cdot \sum_{k=1}^{n}\bigl(ur_{\mathrm{detect},k}-\overline{ur_{\mathrm{detect}}}\bigr)^{2}}}$
Accuracy Requirements: Precision >= 95% and Correlation >= 90%.
Performance Optimization: Under the premise of meeting accuracy requirements, the higher the operational efficiency, the higher the score. The optimization strategy should be detailed in the solution.

Workflow Example

Build Genome and ncRNA Index:
Data Processing Pipeline (Stage 1) Example (using one sample as an example):

cutseq /asc25/SRR23538290/SRR23538290.fastq -t 20 -A INLINE -m 20 --trim-polyA --ensure-inline-barcode \
  -o /asc25/SRR23538290/SRR23538290.fastq_cut \
  -s /asc25/SRR23538290/SRR23538290.fastq_too_short \
  -u /asc25/SRR23538290/SRR23538290.fastq_untrimmed

hisat-3n/hisat-3n --index /asc25/ncrna_ref/Homo_sapiens.GRCh38.ncrna.fa \
  --summary-file /asc25/SRR23538290/map2ncrna.output.summary \
  --new-summary -q -U /asc25/SRR23538290/SRR23538290.fastq_cut \
  -p 16 --base-change C,T --mp 8,2 --no-spliced-alignment --directional-mapping \
  | /asc25/samtools-1.21/samtools view -@ 16 -e '!flag.unmap' -O BAM \
  -U /asc25/SRR23538290/SRR23538290.ncrna.unmapped.bam \
  -o /asc25/SRR23538290/SRR23538290.ncrna.mapped.bam

samtools-1.21/samtools fastq -@ 16 -O /asc25/SRR23538290/SRR23538290.ncrna.unmapped.bam \
  > /asc25/SRR23538290/SRR23538290.mRNA.fastq

hisat-3n/hisat-3n --index /asc25/ref/Homo_sapiens.GRCh38.dna.primary_assembly.fa -p 16 \
  --summary-file /asc25/SRR23538290/map2genome.output.summary \
  --new-summary -q -U /asc25/SRR23538290/SRR23538290.mRNA.fastq \
  --directional-mapping --base-change C,T --pen-noncan-splice 20 --mp 4,1 \
  | samtools-1.21/samtools view -@ 16 -e '!flag.unmap' -O BAM \
  -U /asc25/SRR23538290/SRR23538290.mRNA.genome.unmapped.bam \
  -o /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.bam

samtools-1.21/samtools sort -@ 16 --write-index -O BAM \
  -o /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.bam \
  /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.bam

samtools-1.21/samtools view -@ 20 -F 3980 -c \
  /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.bam \
  > /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.bam.tsv

java -server -Xms8G -Xmx40G -Xss100M -Djava.io.tmpdir=/asc25/SRR23538290 \
  -jar /asc25/UMI_Collapse-1.0.0/umi_collapse.jar bam -t 2 -T 16 \
  --data naive --merge avgqual --two-pass \
  -i /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.bam \
  -o /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.dedup.bam \
  > /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.dedup.log

samtools-1.21/samtools index -@ 8 \
  /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.dedup.bam \
  /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.dedup.bam.bai

samtools-1.21/samtools view -e "rlen<100000" -h \
  /asc25/SRR23538290/SRR23538290.mRNA.genome.mapped.sorted.dedup.bam \
  | hisat-3n/hisat-3n-table -p 16 -u --alignments - \
  --ref /asc25/ref/Homo_sapiens.GRCh38.dna.primary_assembly.fa \
  --output-name /dev/stdout --base-change C,T \
  | cut -f 1,2,3,5,7 \
  | gzip -c > /asc25/SRR23538290/SRR23538290_unfiltered_uniq.tsv.gz

The above pipeline needs to be executed for each dataset.

Data Processing Pipeline (Stage 2)

Notes

Do not make any changes to the raw sequencing data.
Do not modify the final output and log files.
Under the premise of ensuring the pipeline and results are correct, minimize the runtime as much as possible.

For further questions, please contact tech support@asc-events.org

Original English Document​

Preliminary Round Announcement​

Preliminary Round: Teams Prepare to Submit Their Work​

Submission Guidelines​

I. Introduction to the School's Work in Supercomputing (5 points)​

II. Team Introduction (5 points)​

III. Technical Solution Requirements (90 points)​

1. HPC System Design (15 points)​

2. HPL and HPCG Benchmarks (15 points)​

3. AlphaFold3 Inference Optimization (30 points)​

Task Introduction​

Result Submission​

Scoring Rules​

4. RNA m^5C Modification Site Detection and Performance Optimization Challenge (30 points)​

Task Introduction​

Main Analysis Pipeline​

Dataset​

Pipeline Guidance and Parameters​

Processing Software and Parameter Recommendations​

Result Submission​

Scoring Criteria​

Workflow Example​

Notes​

Original English Document

Preliminary Round Announcement

Preliminary Round: Teams Prepare to Submit Their Work

Submission Guidelines

I. Introduction to the School's Work in Supercomputing (5 points)

II. Team Introduction (5 points)

III. Technical Solution Requirements (90 points)

1. HPC System Design (15 points)

2. HPL and HPCG Benchmarks (15 points)

3. AlphaFold3 Inference Optimization (30 points)

Task Introduction

Result Submission

Scoring Rules

4. RNA m^5C Modification Site Detection and Performance Optimization Challenge (30 points)

Task Introduction

Main Analysis Pipeline

Dataset

Pipeline Guidance and Parameters

Processing Software and Parameter Recommendations

Result Submission

Scoring Criteria

Workflow Example

Notes