I have a protein sequence file, and I noticed that its FASTA file contains a sequence related to one of the protein's antibodies. I want to dock the antibody to the protein. Does this file cover both the protein and the antibody?
Dear Researcher, first decide the objectives of your research, then choose a suitable data set for analysis. If your FASTA protein sequence belongs to any chain of an antibody, then treat it as an antibody. I think you want to study antigen-antibody interactions for reverse vaccinology.
Where did you get your FASTA file from? What does it represent?
First of all, look at your FASTA file in a text editor - does it contain one or multiple sequences? Each entry in a multi-sequence FASTA file starts with a header line, where a ">" is followed by some header text, and continues with one or more lines of sequence data. It may be that your FASTA file contains the sequence(s) of the antibody-antigen complex. Alternatively, it may represent an artificial fusion construct. Also, have you positively identified the antibody? Some V-type immunoglobulin domain sequences look a lot like antibody variable domain sequences.
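As a quick way to check how many entries a FASTA file holds, here is a minimal Python sketch of the format described above. The function name `parse_fasta` and the toy entries are made up for illustration; they are not taken from the UniProt file under discussion.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence data may span several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Toy input (hypothetical entries, not real proteins).
example = """>toy_entry_1 made-up header
MKTAYIAK
QRQISFVK
>toy_entry_2 made-up header
GSHMLEDP
"""

records = parse_fasta(example)
for header, seq in records:
    print(header, len(seq))
```

A file with a single ">" line holds one sequence; if there are several, each header should tell you which chain or protein it belongs to.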
In reply to Honegger: it is a FASTA file from the UniProt website.
I have attached some pictures related to the structure. The highlighted part is the antibody sequence (based on the literature). I bought a CD133 antibody from Biorbyt, and the ID O43490 is given as the reference for its sequence. On the other hand, we can see that this ID belongs to the CD133 protein, not just to an antibody. I have concluded that the entry O43490 contains both the CD133 protein and its antibody.
I don't see any antibody sequence in that UniProt sequence! Are you mistaking the antibody recognition sequence (aka Epitope sequence) KYGRTIGYFEHYLQ for an antibody sequence? This is not the sequence of the antibody, but the part of the antigen sequence this specific antibody binds to!
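To illustrate the point that an epitope is simply a stretch of the antigen's own sequence, here is a small Python sketch that locates a peptide inside a longer sequence. The `toy_antigen` string is a made-up placeholder, not the real CD133 sequence; in practice you would paste in the sequence from the UniProt entry.

```python
def find_peptide(antigen, peptide):
    """Return 1-based (start, end) positions of every exact occurrence."""
    hits = []
    start = antigen.find(peptide)
    while start != -1:
        hits.append((start + 1, start + len(peptide)))
        start = antigen.find(peptide, start + 1)
    return hits


epitope = "KYGRTIGYFEHYLQ"  # recognition sequence quoted in the post above
# Hypothetical flanking residues around the epitope, for demonstration only.
toy_antigen = "MAAAA" + epitope + "GGGSS"

hits = find_peptide(toy_antigen, epitope)
print(hits)
```

If the peptide is found inside the antigen sequence, it is part of the antigen, which is what you expect for an epitope and not for an antibody chain.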
Right. Thanks for your answer. I am actually a beginner in this topic. I am trying to reproduce an article, but it does not say how the antibody was used in the docking. I want to study the interactions between the CD133 protein and its antibody using docking, and I have no idea how to find and include the antibody structure. You can find the relevant passage from the article below, with a figure attached. Thanks again for your guidance.
"Want to understand molecular interactions between CD133 (marker of circulating endothelial progenitor cells) and anti-CD133 antibodies and choose the best of commercially available anti-CD133 antibodies, we conducted full atomistic numerical simulations. Since CD133 cell surface protein is still not crystallized and its molecular structure remains unknown, we decided to perform homology modeling of its amino acids sequence (O43490, uniprot database) on recently crystallized P-glycoprotein (P21447, uniprot database) template, known also as ABCB1a protein (5KPI, RCSB PDB database [18]. P-glycoprotein family is the closest choice for sequence identity, matching, however, only on 176% level (data not shown). Such a low identity is usually disqualifying for further molecular studies, but our goal was principally not to understand structure-function relationship but only to verify pro-endothelial properties of anti-CD133 antibodies presence on stent surfaces. Homology modeling was performed using Modeller 9.18 software [19], and 10 generated homology CD133 models with the highest both molpdf (between 9316 and 7824) and DOPE (between -71992 and -67427) scores were kept for further analysis. Since we cannot objectively choose only one homology model, and since they are all structurally resembling, we decided to keep them all, increasing thus statistical insight, as it is also routinely done taking account dynamical aspect of protein. Next, two commercially available non-oxidized anti-CD133 antibody reconnaissance sequences, namely the short one (KYGRTIIGYFEHYLQ) and the long one (NHQVRTRIKRSRKLADSNFKD), were freely docked onto extracellular domain of each of ten CD133 homology models, using AutoDock Vina software [20]. Both anti-CD133 reconnaissance sequences were let to be flexible during docking simulation. On each of ten CD133 homology models only 20 best scoring docks were kept for further interactions analysis, giving a sample set of 200 configurations for each of anti-CD133, being a right
How about actually sharing the reference for the article? This sentence makes absolutely no sense: "Next, two commercially available non-oxidized anti-CD133 antibody reconnaissance sequences, namely the short one (KYGRTIIGYFEHYLQ) and the long one (NHQVRTRIKRSRKLADSNFKD), were freely docked onto extracellular domain of each of ten CD133 homology models, using AutoDock Vina software [20]"
As you showed by identifying and highlighting these sequences in the CD133 sequence, these are the epitope sequences. This is confirmed by the information snippet from the antibody vendor: the antibody in question was raised by immunisation with this synthetic peptide coupled to KLH (keyhole limpet hemocyanin). If you read the vendor information carefully, it is not even a monoclonal antibody, but a rabbit polyclonal antiserum, which will be a mixture of many different antibodies. Therefore no sequence will be available for this antibody from which you could derive the 3D model required for docking.
So in this paper they seem to have docked the epitope peptide to a low confidence homology model of the full antigen - this makes absolutely no sense!!! If this interpretation is correct (and I need to read the full paper to make sure of that judgement), the paper should never have made it through the review process!!! Was the paper published in some predatory journal that does not even do a decent review process???
A protein FASTA file typically contains the sequence of a protein, not of its antibody. To perform molecular docking between an antibody and a protein, separate files or sequences are needed. You would need to identify the antibody sequence, preprocess the data, and then run the docking with molecular docking tools to understand the potential binding interactions.
Susanta Roy, you answered the question without reading the previous answers. We have established that the problem consisted of mistaking the sequence of the epitope peptide for that of the antibody - of course the epitope peptide sequence is part of the full antigen sequence.
Hence, as you said, the article's claim to have considered the interactions between the anti-CD133 antibody and the CD133 protein is incorrect, and there is no way to do this.
I need to evaluate the interaction of a nanoparticle and the anti-CD133 antibody with the CD133 protein. As I understand it, this is impossible, and I can only consider the interaction of the nanoparticle with the protein, without the antibody.
Do you need this specific antibody, or will any antibody against human CD133 do? While none of the 10 anti-CD133 antibodies listed in the TABS database (http://tabs.craic.com/users/sign_in) have their sequence published, this paper: www.ncbi.nlm.nih.gov/pmc/articles/PMC5527709/ lists the sequence of the single-chain fragment in figure 2. The scFv is derived from this hybridoma: pubmed.ncbi.nlm.nih.gov/20674577/
This study uses scFv independently derived from the same hybridoma :
Article Production of a Ribosome-Displayed Mouse scFv Antibody Again...
Since the sequences are in an image, you have to transcribe them into a text file.
You can use this text file to do a homology search for the closest experimental structures in the PDB https://www.rcsb.org/search/advanced or SAbDab https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabdab. I do not think they contain this specific antibody, as I did not find it by a full-text search for CD133.
For specialized antibody homology modelling, you can submit the VL and VH sequences to SAbPred https://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/sabpred.
Do you think this assignment is correct? The model is attached as well. Can I then use the generated PDB file together with the CD133 protein to perform the docking?
Heavy sequence = ELKSSGGGGSGGGGGGSSRSSLEVKLVESGPELKKPGETVKISCKASGYTFTDYSMHWVNQAPGKGLKWMGWINTETGEPSYADDFKGRFAFSLETSASTAYLQINNLKNEDTATYFCATDYGDYFDYWGQGTTLTVSS Light sequence = MDIVLSQSPAIMSASPGEKVTISCSASSSVSYMYWYQQKPGSSPKPWIYRTSNLASGVPARFSGSGSGTSYSLTISSMEAEDAATYYCQQYHSYPPTFGAGTKL
You have included the last 3 amino acids of VL and the linker connecting the VL and VH domains in the heavy chain sequence. The model most likely omits them, as SAbPred looks for antibody-typical sequences for its alignment.
VH would be: EVKLVESGPELKKPGETVKISCKASGYTFTDYSMHWVNQAPGKGLKWMGWINTETGEPSYADDFKGRFAFSLETSASTAYLQINNLKNEDTATYFCATDYGDYFDYWGQGTTLTVSS,
The initiation methionine at the start of the VL domain would be cleaved off upon cytoplasmic expression in E. coli or replaced by a signal sequence in other expression systems.
However, the model looks like an antibody Fv fragment.
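The trimming described above can be sketched in a few lines of Python. This is only an illustration of that reading of the posted sequences: the helper names and the choice of "EVKLVESG" as the VH start motif are assumptions made for this example, and appending the three residues back onto VL follows the remark that they belong to the end of VL.

```python
# Sequences exactly as posted in the question above.
heavy_as_posted = "ELKSSGGGGSGGGGGGSSRSSLEVKLVESGPELKKPGETVKISCKASGYTFTDYSMHWVNQAPGKGLKWMGWINTETGEPSYADDFKGRFAFSLETSASTAYLQINNLKNEDTATYFCATDYGDYFDYWGQGTTLTVSS"
light_as_posted = "MDIVLSQSPAIMSASPGEKVTISCSASSSVSYMYWYQQKPGSSPKPWIYRTSNLASGVPARFSGSGSGTSYSLTISSMEAEDAATYYCQQYHSYPPTFGAGTKL"


def split_posted_heavy(heavy, vh_start="EVKLVESG", vl_tail_len=3):
    """Split the posted 'heavy' string into (VL tail, linker, VH).

    vh_start and vl_tail_len are assumptions taken from the reply above.
    """
    idx = heavy.find(vh_start)
    if idx == -1:
        raise ValueError("VH start motif not found")
    return heavy[:vl_tail_len], heavy[vl_tail_len:idx], heavy[idx:]


def clean_light(light, vl_tail):
    """Drop a leading initiation Met and append the VL residues that had
    ended up at the start of the posted heavy-chain string."""
    core = light[1:] if light.startswith("M") else light
    return core + vl_tail


vl_tail, linker, vh = split_posted_heavy(heavy_as_posted)
vl = clean_light(light_as_posted, vl_tail)

print("VL tail:", vl_tail)
print("linker :", linker)
print("VH     :", vh)
print("VL     :", vl)
```

The resulting `vl` and `vh` strings are what you would submit as the light and heavy chains to an antibody modelling server; check them by eye against the figure in the paper before using them.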
Can you advise me on the units of these energy values (attached picture)? The software is Molegro Virtual Docker. How should I refer to these numbers? In addition, I have a nanoparticle that is large: it has 23,000 atoms. The docking ran, but I received a warning that the particle is too big. Do you think the docking was done correctly?
You would have to consult the software manual to see that. However, if these were total energies of complex formation, I would expect negative values for a stable complex. Since you are docking a protein antigen to an antibody, I would retrieve several different experimental protein-antibody complexes from the PDB, energy-minimise these with the same methods as your best docked complexes, and check whether their predicted interaction energies lie in the same ballpark as the predicted interaction energies for your docked complexes.
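One simple way to make "in the same ballpark" concrete is to compare each docked score against the spread of scores obtained for the reference complexes. The sketch below is only one possible criterion (mean plus or minus two standard deviations), and every number in it is a made-up placeholder, not output from Molegro Virtual Docker or any real complex.

```python
import statistics


def within_reference_band(docked, reference, k=2.0):
    """Flag each docked score as inside/outside mean +/- k*stdev of the
    reference scores. The threshold k is an arbitrary choice."""
    mean = statistics.mean(reference)
    sd = statistics.stdev(reference)
    low, high = mean - k * sd, mean + k * sd
    return {name: low <= score <= high for name, score in docked.items()}


# Placeholder scores for minimised experimental complexes (hypothetical,
# in whatever unit the docking program reports).
reference_scores = [-120.0, -95.0, -110.0, -105.0]

# Placeholder scores for your own docked poses (hypothetical).
docked_scores = {"pose_1": -101.0, "pose_2": 35.0}

flags = within_reference_band(docked_scores, reference_scores)
for name, ok in flags.items():
    print(name, "comparable to references" if ok else "outside reference range")
```

A pose that falls far outside the reference band, especially one with a positive score, deserves a closer look before you interpret it.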