25 August 2023

Dear Community,

I want to align Smart-seq single-cell data to the human genome using Alevin and/or STARsolo and then create a Seurat object. The data are paired-end, and each FASTQ file is around 4-10 GB in size.

R1 and R2 pairs look like this:

R1:

```
@A00814:396:HYJJ7DMXX:2:1101:1976:1000 1:N:0:TTATAACC+NCGATATC
NTCTCTGTATCAGCATATTAGCAATAACATATTTTTAAATGAAGGTATGTA
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00814:396:HYJJ7DMXX:2:1101:4146:1000 1:N:0:TTATAACC+NCGATATC
NGCATCTTTATGGTGTTCTCTGTATTTCCTGAATTTGAATGTTGGCCTGCC
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
```

R2:

```
@A00814:396:HYJJ7DMXX:2:1101:1976:1000 2:N:0:TTATAACC+NCGATATC
GGTGCACATGAAGGCTATGTTTGCACTGTATTATGGTTTAAGTGTATAATA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00814:396:HYJJ7DMXX:2:1101:4146:1000 2:N:0:TTATAACC+NCGATATC
AAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAACCTAGAAAGGCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF,
```

Do these data contain barcodes? If so, where are they?

The string "TTATAACC+NCGATATC" appears in every header within the same FASTQ file, while different FASTQ files have different values at the end of the header. If "TTATAACC+NCGATATC" is the cell barcode, can I consider the whole FASTQ file to belong to a single cell? Or does this FASTQ file contain sequence information for many cells?
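The observation that one value appears in every header of a file can be checked programmatically rather than by eye. Below is a minimal sketch, assuming standard four-line FASTQ records and Illumina-style headers in which the index is the text after the last colon. The function name `index_counts` and the `max_reads` parameter are hypothetical, introduced only for this illustration.

```python
import gzip
from collections import Counter
from itertools import islice


def index_counts(fastq_path, max_reads=None):
    """Tally the index field (text after the last ':') of each FASTQ header.

    Assumes four lines per record, so every fourth line is a header.
    """
    # Open gzip-compressed files transparently; plain text otherwise.
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        headers = islice(handle, 0, None, 4)  # lines 0, 4, 8, ...
        if max_reads is not None:
            headers = islice(headers, max_reads)  # optional subsample
        for header in headers:
            index_field = header.rstrip("\n").rsplit(":", 1)[-1]
            counts[index_field] += 1
    return counts
```

Calling `index_counts("sample_R1.fastq.gz", max_reads=100000)` (the file name is a placeholder) returns a `Counter` keyed by index string. A single key would be consistent with a file that has already been demultiplexed by sample index; whether one index pair corresponds to one cell depends on how the libraries were prepared and is best confirmed with the sequencing facility.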

I am looking forward to your assistance.

Thank you
