Dear Community,
I want to align single-cell data generated on the SmartSeq platform to the human genome using Alevin and/or STARsolo, and then create a Seurat object. The data are paired-end, and each FASTQ file is around 4-10 GB in size.
R1 and R2 pairs look like this:
R1:
```
@A00814:396:HYJJ7DMXX:2:1101:1976:1000 1:N:0:TTATAACC+NCGATATC
NTCTCTGTATCAGCATATTAGCAATAACATATTTTTAAATGAAGGTATGTA
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00814:396:HYJJ7DMXX:2:1101:4146:1000 1:N:0:TTATAACC+NCGATATC
NGCATCTTTATGGTGTTCTCTGTATTTCCTGAATTTGAATGTTGGCCTGCC
+
#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
```
R2:
```
@A00814:396:HYJJ7DMXX:2:1101:1976:1000 2:N:0:TTATAACC+NCGATATC
GGTGCACATGAAGGCTATGTTTGCACTGTATTATGGTTTAAGTGTATAATA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00814:396:HYJJ7DMXX:2:1101:4146:1000 2:N:0:TTATAACC+NCGATATC
AAACACTCTGCAGGATATTATCCAGGAGAACTTCCCCAACCTAGAAAGGCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFF,
```
Does this data have barcodes? If so, where are they?
"TTATAACC+NCGATATC" appears at the end of every header within this FASTQ file, while other FASTQ files carry different values in that position. If "TTATAACC+NCGATATC" is the cell barcode, can I assume that the whole FASTQ file belongs to a single cell? Or does the file contain reads from many cells?
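To make the observation above reproducible, here is a minimal Python sketch that tallies the last colon-separated field of every header in a FASTQ file. It assumes plain four-line FASTQ records with headers laid out like the ones shown above; the function name `count_header_indexes` and the file name in the usage note are hypothetical and only for illustration.

```python
import gzip
from collections import Counter
from itertools import islice


def count_header_indexes(fastq_path, max_reads=None):
    """Tally the last colon-separated field of each FASTQ header line.

    Assumes four-line records whose header ends in a field such as
    'TTATAACC+NCGATATC'. Reads gzip-compressed files if the path ends in '.gz'.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    counts = Counter()
    with opener(fastq_path, "rt") as handle:
        # Header lines are every 4th line, starting with the first one.
        headers = islice(handle, 0, None, 4)
        if max_reads is not None:
            # Optionally stop early, since the files are several GB each.
            headers = islice(headers, max_reads)
        for header in headers:
            header = header.strip()
            if not header:
                continue  # skip a trailing blank line, if any
            counts[header.rsplit(":", 1)[-1]] += 1
    return counts
```

Calling, for example, `count_header_indexes("sample_R1.fastq", max_reads=1_000_000)` returns a `Counter` mapping each distinct value to the number of reads carrying it. A single key would confirm that one value is shared by all sampled reads in that file; it does not by itself say whether that value identifies a cell.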
I am looking forward to your assistance.
Thank you.