There is a technique used in biotechnology known as phage display. In this technique a virus that attacks bacteria is used to display a short peptide. Without boring the reader with the details, it has been described as evolution in an instant. You expose a library of peptide-displaying phage to a protein target. You wash away the non-binding phage, amplify the binding phage, and repeat. What you hope to end up with are peptides that bind to the target. You repeat because it takes a while to get rid of non-binding phage. They must be slowly selected against in an evolution-like process.
During the selection process many other things can happen. The library of phage can become contaminated. Most interesting, however, is the selection of phage that are already in the original library. These phage will predominate after a while because they grow faster than the other phage. At least that is the theory. A company called New England Biolabs sells the phage libraries. Each lot appears to have a set of contaminants that will come out after a period of selection. They do not seem to appear because they bind to a protein target. They lack a gene that the rest of the library has. This gene causes the phage plaques (formed on bacterial lawns) to turn blue. The contaminating phage form white plaques. Perhaps this is part of the selection of the phage, since the normal state of the DNA lacks the blue plaque gene.
One such phage that has been seen displays the peptide GETRAPL. That is code for 7 amino acids: glycine-glutamic acid-threonine-arginine-alanine-proline-leucine. A co-worker of mine found this sequence. I recognized it and ran it in our database. Sure enough, it showed up several times. There was no particular protein linked to it, however. It was linked to the later stages of selection. We noted that the plaques were white in all cases. We typed it into Google and found an interesting group of papers.
Researchers at different laboratories had found this sequence while using phage display against their targets. The first paper was "Design and Assay of Inhibitors of HIV-1 Vpr Cell Killing and Growth Arrest Activity Using Microbial Assay Systems". The next was "Development of efficient viral vectors selective for vascular smooth muscle cells". The targets were completely unrelated, yet both groups used the peptide to validate their work. They published and ...?
In physics you set up a system. In that system many things are happening. Pressure changes, the concentration of molecules changes, heat is given off and so on. One of the questions you can ask about a system is whether or not work has been done over a certain period of time. For example, you load a box onto a conveyor belt that takes the box and drops it onto another conveyor belt that brings the box back for you to load onto the first conveyor belt. You are sweating. Boxes are moving but only in a circle. Work is not really getting done.
The GETRAPL story is like that. Did work get done? Did anything scientific get done? The papers on GETRAPL appear to have ended. No one is talking about the peptide, but there was a time when this little contaminant from a phage display library made two labs get out their typewriters and type up the story of how they got the peptide and what they did with it. And it did something! The circle here is starting with the library, getting the same sequence, publishing and moving on to something else. The way to stop the non-working cycle is to publish a paper on the sequence and let people know when they are being misled. The system here is science as it is being practiced by PhD scientists and journal editors. The people who sell the libraries can also stop the non-work cycle. The system is not producing work.
Feynman said that he hoped we learned something during our educations. Something you can't teach. You can easily learn how a phage library is made and used. You can learn some computer software that will help you assess the peptides you end up with. Finding a protein-binding peptide is a rare event, but that is not what is taught. To me, that is the interesting thing about the technique. It is just one of many techniques that can be used to study a protein and its interactions. You can learn all about the protein prior to using phage display. You can get up and give a talk to the Academy of Sciences on the details of your work. But in the end you have to look at your results. You have to use something inside of you that is not taught. Critical thinking perhaps. Can it really be binding to the target? How can I prove that? How can I disprove that? How can I be sure of any result?
One way of looking at GETRAPL is very obvious to laboratory people. You look at the DNA sequences of each phage that displays GETRAPL. Are they identical, or are there various sequences using different codons that code for GETRAPL? The latter would be strong evidence that it is not being selected for because the original phage grows faster than the rest of the library. Another modern-day trick that we applied was to Google the sequence and see if anyone could back up your hopes and dreams that it binds to a specific target. Your education can teach you about the tools. It is up to you to use them.
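The codon check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard genetic code, the function names `translate` and `coding_variants` are invented here, and the two inserts are chosen for illustration rather than taken from our database.

```python
from collections import defaultdict

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA insert into one-letter amino acid codes."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def coding_variants(inserts):
    """Group inserts by the peptide they encode (hypothetical helper)."""
    variants = defaultdict(set)
    for dna in inserts:
        variants[translate(dna)].add(dna.upper())
    return dict(variants)


# Illustrative inserts: same peptide, different codon for the first glycine.
inserts = ["GGGGAGACTCGTGCGCCGCTT", "GGTGAGACTCGTGCGCCGCTT"]
for peptide, dnas in coding_variants(inserts).items():
    print(peptide, "is encoded by", len(dnas), "distinct insert(s)")
```

One distinct insert per peptide points toward a single clone; several distinct inserts for the same peptide is the case that needs a different explanation.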
There are many phage like GETRAPL. There is a paper published by one of the leading phage display scientists that discusses their existence. What is selecting these phage is still unknown. Developing an explanation and proving it is science. There are scientists who currently work on software that can take a list of sequences that come out of a phage display selection process and help you understand what you've got. If you sequence 100 phage and they are all the same sequence, for example, the computer only sees one sequence. The computer weeds out repeated sequences because repetition can indicate selection based on non-binding factors such as faster-growing phage. The software continues to add new features to help you find binding phage. Someday you could simply put files into this software and it will give you a list of possible binding targets. It could search your own database as well as those on the internet or in PubMed. This is still not going to tell you the whole story. You have to develop an assay. They teach you about assays. The assays, however, give you another set of data that you have to interpret. This is what Feynman was hoping you would know how to do. Think scientifically. Ignore your hopes and dreams of getting published and moving on to more interesting projects. Don't act on your desire to take that peptide and fuse it into some elaborate molecule you designed and toss that into an elaborate assay and prove that you can cure AIDS! Slow down. Is the data really saying what you think it is? Think critically and proceed logically.
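The "100 phage, one sequence" idea can be sketched as below. This is a minimal illustration, not how any existing tool works: the function name `collapse_repeats` and the one-entry `SUSPECT_PEPTIDES` set are assumptions made up for this example, and a real list of target-unrelated peptides would have to come from a curated source.

```python
from collections import Counter

# Hypothetical stand-in for a curated list of target-unrelated peptides.
SUSPECT_PEPTIDES = frozenset({"GETRAPL"})


def collapse_repeats(peptides, suspects=SUSPECT_PEPTIDES):
    """Collapse repeated peptides into one row each, keeping the read count."""
    cleaned = [p.strip().upper() for p in peptides if p.strip()]
    counts = Counter(cleaned)
    return [
        {"peptide": pep, "reads": n, "suspect": pep in suspects}
        for pep, n in counts.most_common()
    ]


# One hundred identical reads collapse to a single flagged row.
for row in collapse_repeats(["GETRAPL"] * 100):
    print(row)
```

Keeping the read count next to each unique peptide preserves the information that something dominated the pool, which is exactly the signal worth questioning.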
There are many other peptides that have been discovered that are mere contaminants. The claims that their finders have made on their behalf are quite fancy. Big science words were used and put together in a way that is certainly feasible. No work was done however. Just words used to describe a set of letters that represent amino acids. What is really going on can open doors. What is really going on can make the planes land! But they fade away, these research projects. No resolution. No work was done. No planes landed.
10 comments:
Dear Ginsberg,
At last, I found your writing on GETRAPL, just tonight. Resurrecting my phage library expertise in the NGS age was a fantastic experience, and after some successful outcomes, I happened to GEt TRAPped
- is it not magic, cannot be by chance, must be the work of a library hacker...-
together with my colleagues this morning by an ~80% consensus of GETRAPL obtained after 5 rounds of selection, when we wanted to further enrich two separate third-round amplificates, left over for some time, on our antibodies.
Since I had lost my memories of the cheeky background peptides, I had to spend some time digging for the solution.
I find the progress amazing, except for the "getrapper", which, as you said, is still an unsolved/untold story.
To add some new findings: the peptide is coded by various DNA sequences in the selected library and not a single clone, so GETRAPL must have a biological function or it has mutated since... don't you think so? Actually, all the short-term evolution marks can be seen in a library if you look deep...
So, I decided to do some more digging. I have the models, means, and tools to do that.
Please contact me if you need some more info.
It was nice to read you, both in the specific and the general sense.
I only ask that we speak clearly here. Post your sequences. Let's see what you have got. Maybe I can help you interpret those arced loops on a DNA chromatogram.
To speak clearly, in capillary sequencing (QC before NGS) of two separate Round 5 libraries, from selections on two different, unrelated antibodies, I got the same predominant (~80% estimated) insert sequences in both cases:
ggkgagactcgtgckccgcwt
ggggagactcgtgcgccgctt
ggtgagactcgtgcgccgcat
ggtgagactcgtgcgccgcag
which encode:
G E T R A P (L,H,Q)
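The translation of the four reads above can be checked with a short Python sketch. It assumes the standard genetic code and the standard IUPAC ambiguity letters (k = g or t, w = a or t); the function names are made up for this example. Note that expanding an ambiguity code lists every combination the letters allow, which is not the same as showing that each combination is actually present in the phage pool.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def translate(dna):
    """Translate an unambiguous, in-frame DNA string."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def peptides_encoded(read):
    """Every peptide consistent with a read that may contain ambiguity codes."""
    choices = [IUPAC[base] for base in read.upper()]
    return sorted({translate("".join(combo)) for combo in product(*choices)})


reads = [
    "ggkgagactcgtgckccgcwt",
    "ggggagactcgtgcgccgctt",
    "ggtgagactcgtgcgccgcat",
    "ggtgagactcgtgcgccgcag",
]
for read in reads:
    print(read, "->", peptides_encoded(read))
```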
(I would attach chromatograms if I knew how to do it in the blog.)
Those libraries showed very mild enrichment after 3 rounds of selection, due to the nature of the native epitopes of the two antibodies.
My feeling is that something happened to the phage preps and a fast-growing clone or clones amplified preferentially, and in our case in a spectacularly predominant way.
Round 4 of the two selections are also enriched for GETRAPL.
From previous NGS datasets, I could see GETRAPL clones with variations in the sequence.
I think it is worth publishing our work, also to support your long-lasting fight against GETRAPL on the Cargo Cult Science Blog.
I was not joking.
http://www.proxencell.com
To summarise, the evidence we have that the phage-displayed peptide GETRAPL is a target-unrelated peptide, and for how it becomes enriched, is as follows:
1. NGS datasets show that the coding sequence of the peptide is variable, not a single clone.
2. Capillary sequencing of the total library from late selection rounds (4-5) shows that GETRAPL was gradually and predominantly (~80% of the sequences) amplified in the case of two unrelated rabbit polyclonal antibody preparations, which supports unrelatedness to the target (with difficult-to-mimic epitopes).
3. Such enrichment was reached from a 4 out of 5000 frequency of sequences encoding GETRAPL in the 3rd round as analysed by NGS, which is indicative of something going wrong in the selection process.
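As a rough check on point 3, the jump from 4 in 5000 at round 3 to about 80% at round 5 can be put into numbers. This is back-of-the-envelope arithmetic only: the two frequencies come from different methods (NGS and an estimate from capillary sequencing), and the function name `fold_enrichment` is made up for this sketch.

```python
def fold_enrichment(start_fraction, end_fraction, rounds):
    """Return (total fold change, geometric-mean fold change per round)."""
    total = end_fraction / start_fraction
    per_round = total ** (1 / rounds)
    return total, per_round


round3 = 4 / 5000   # NGS frequency of GETRAPL-encoding reads at round 3
round5 = 0.80       # estimated fraction at round 5 from capillary sequencing

total, per_round = fold_enrichment(round3, round5, rounds=2)
print(f"total: {total:.0f}-fold, per round: {per_round:.1f}-fold")
```

That works out to roughly a 1000-fold rise over two rounds, or about 32-fold per round.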
We are getting further evidence by cloning and the analysis of GETRAPL clones, including a model system to test the mechanism of amplification of those clones.
You're doing a lot of work on one of the many contaminants from the NEB 7mer library, lot 3. You won't find GETRAPL anywhere else if I am correct. I don't think there is any logical reason to believe the sequence imparts any "fast growing" properties. The frequencies you mention come and go. When you are dealing with large numbers of clones, seeing GETRAPL come out of selections against different targets is enough information to know it is a fast grower. An 80% frequency in one particular set of panning is not nearly as significant as any percentage above zero from two different unrelated targets.
I am not worried much about GETRAPL; we have succeeded many times recently in epitope mapping, with very good results, using the same library lot. I did not mean to spend much time on artefacts except when they are of interest.
According to some recent papers, GETRAPL is a weak binder of various targets, which is wrong.
http://www.ncbi.nlm.nih.gov/pubmed/21248664
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3245166/pdf/gkr922.pdf
On the other hand, MimoDB has already cleared up the situation, so you are right to say not to spend much time on it, however...
SAROTUP is the work of scientists who understand the value of the truth. It's an example of science sorting things out. It's a tool I wish I had when I had to work in the Cargo Cults.