Need help with verifying the results of DNA sequence analysis work. Can anyone help?

11 May 2023

Dear colleagues,

I am writing to you to request your assistance in evaluating the results of my research on DNA comparison and analysis. I am not an expert in genetic engineering and would like to receive expert feedback on my work.

The following tasks were performed as part of the study:

DNA comparison was performed for segments of influenza A viruses (H1N1 and H3N2), with the results presented in the bestmatch.json file. An example of element-wise comparison is provided in the pa_pb1.json file. The quality of each match is expressed as a weight, for example "w":0.472249629.

A search was conducted for repeated (identical) segments within the DNA sequence. The original file is HA.seq. The results are presented in the following format: [{segment length, number of variations with this length, number of occurrences of these variations in the original, larger sequence}].

[ {2, 25, 403380}, {3, 18, 114124}, {4, 16, 31748}, {5, 16, 7710}, {6, 16, 2893}, {7, 14, 685}, {8, 3, 282}, {9, 5, 137}, {10, 3, 3}, {11, 4, 5}, {12, 2, 2}, {13, 5, 135}, {14, 5, 6}, {15, 4, 132}, {16, 5, 6}, {17, 5, 134}, {18, 5, 6}, {19, 4, 132}, {20, 5, 134}, {21, 4, 5}, {22, 5, 132}, {23, 3, 130}, {24, 3, 130}, {25, 2, 2}, {26, 3, 129}, {27, 3, 129}, {28, 3, 129}, {29, 2, 128}, {30, 1, 127}, {50, 16, 32} ]

DNA was divided into "words." The results are presented in the HA_seq.json file.
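For anyone who wants to cross-check the repeated-segment counts independently, the following is a minimal sketch in plain Python. It is not the KnoDL method. It assumes that a "variation" means a distinct substring of the given length that occurs more than once, and that "occurrences" is the total number of times those substrings appear (overlapping positions included). The function names `repeated_kmer_stats` and `repeat_table` and the toy sequence are hypothetical, chosen only for illustration.

```python
from collections import Counter


def repeated_kmer_stats(seq, k):
    """Return (k, number of distinct repeated substrings, their total occurrences).

    Assumption: a substring counts as "repeated" if it occurs at least twice,
    and overlapping occurrences are counted separately.
    """
    # Count every substring of length k using a sliding window.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Keep only the substrings that appear more than once.
    repeated = {kmer: n for kmer, n in counts.items() if n > 1}
    return k, len(repeated), sum(repeated.values())


def repeat_table(seq, lengths):
    """Build the list of (length, variations, occurrences) triples."""
    return [repeated_kmer_stats(seq, k) for k in lengths]


# Toy sequence for illustration only; replace with the contents of HA.seq.
toy = "ACGTACGTTTACG"
table = repeat_table(toy, range(2, 5))
print(table)  # [(2, 5, 12), (3, 3, 7), (4, 2, 4)]
```

One sanity check this makes easy: with only the four letters A, C, G and T there are at most 16 distinct segments of length 2, so a count above 16 at that length would suggest that the input contains additional symbols (for example ambiguity codes or separators), or that "variation" is defined differently from the assumption above.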

I would be grateful if someone from the ResearchGate community could provide their professional insight into my results and assist in their analysis.

The source data was obtained from:

https://www.kaggle.com/datasets/premlert/influenza-a-h1n1-h3n2-segment

The result files can be downloaded via the following link:

https://disk.yandex.ru/d/uB29TVo67Hdmzw

During the study, I used our data processing technology, KnoDL, which does not require knowledge of data structure, machine learning, or neural network technologies. All operations took an average of 1-2 minutes on a personal laptop.

Sincerely,

Dmitriy Pospelov

Meshack Owira Amimo

I am not a biologist, but I should think that hidden Markov models are a prerequisite for undertaking this kind of task. DNA strands for various genes are aligned along a given strand sequence: the DNA of your mouth has a unique code, different from that of your legs. The mathematics that models this kind of sequence comes from operations research and biostatistics. Building these models presupposes that you are familiar with stationary Markov chains and have some familiarity with Bayesian statistics. This will give you an added advantage as you read about, and work at understanding and building, such sequence-based models.

A link such as the one below should give you a head start:

https://www.youtube.com/watch?v=7ZOYDAqfq6U
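To make the Markov chain idea concrete, here is a minimal sketch in plain Python that estimates first-order transition probabilities, P(next base | current base), from a single sequence. It is only the observable chain, not a full hidden Markov model, and it assumes an A/C/G/T alphabet; the function name `transition_probabilities` and the toy sequence are hypothetical.

```python
def transition_probabilities(seq, alphabet="ACGT"):
    """Estimate first-order Markov transition probabilities from one sequence.

    Returns a nested dict: probs[current][next] = P(next | current).
    Symbols outside the assumed alphabet are skipped.
    """
    # Start every transition count at zero so each row has all four columns.
    counts = {a: {b: 0 for b in alphabet} for a in alphabet}
    # Walk over adjacent pairs (current base, next base).
    for cur, nxt in zip(seq, seq[1:]):
        if cur in counts and nxt in counts[cur]:
            counts[cur][nxt] += 1
    # Normalise each row so that it sums to 1 (or stays 0 if the base never occurs).
    probs = {}
    for a, row in counts.items():
        total = sum(row.values())
        probs[a] = {b: (n / total if total else 0.0) for b, n in row.items()}
    return probs


# Toy sequence for illustration only.
probs = transition_probabilities("ACGTACGTTTACG")
print(probs["T"])  # {'A': 0.5, 'C': 0.0, 'G': 0.0, 'T': 0.5}
```

Comparing such transition tables between two sequences is one simple, assumption-laden way to see whether they share local composition before moving on to hidden Markov models proper.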
