Homology Modeling FABP

Kelley Quinn
5 years ago

Homology modeling is a technique used to approximate the 3D structure of a protein when no experimentally determined structure exists. It operates on the principle that protein tertiary structure is better conserved than amino acid sequence, so proteins that share high sequence homology should also share high structural homology.

The protein being modeled is called the target protein. A protein with an experimentally determined structure that is used to model the target is called a template protein. Template proteins must share high sequence homology with the target protein so that their 3D structures can be expected to be very similar. The amino acid sequence of the target protein is mapped onto the 3D structure of the template to create an approximate model, called a homology model.

1. Acquire target protein sequence and reference PDB files
a. The target protein for this semester is Fatty Acid Binding Protein (FABP), which has PDB ID# 2HMB. Everyone will be modeling this protein.
b. Go to the RCSB Protein Data Bank website and search for PDB ID# 2HMB.
c. From the right-hand menu, download the FASTA sequence and the PDB text file.
d. Save the FASTA file and the PDB file to your homology_modeling directory.
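Once the FASTA file is saved, it can be worth checking that it holds the sequence you expect before moving on. The sketch below is one possible way to read a FASTA file in Python, assuming the usual layout of a header line starting with `>` followed by one or more sequence lines. The function name `read_fasta` and the example record are illustrative and not part of the tutorial; the sequence shown is a made-up placeholder, not the real 2HMB sequence.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]          # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Placeholder record for illustration only (NOT the real 2HMB sequence).
example = "\n".join([
    ">EXAMPLE_1|Chain A|hypothetical protein",
    "ACDEFGHIKL",
    "MNPQRSTVWY",
])

records = read_fasta(example)
for name, seq in records.items():
    print(name, len(seq))
```

To use it on your own download, read the saved file into a string first (for example with `open(path).read()`, where `path` is wherever you saved the FASTA file) and pass that string to `read_fasta`.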

2. Choose a template protein
Everyone will choose a different template protein to use in modeling FABP. Potential templates can be identified by performing a BLAST search on the target protein to produce a scored list of proteins with highly similar sequences.
a. Go to the NCBI BLAST website and click protein blast.
b. Open your target FASTA file in a text editor. Highlight and copy the sequence.
c. Paste the sequence into the BLAST search field. Select Protein Data Bank proteins (pdb) as the database and make sure blastp (protein-protein BLAST) is selected as the algorithm. Click the BLAST button at the bottom to begin your search.

d. The results page displays information about the top alignment hits to the primary sequence you submitted. A graph shows the distribution of BLAST scores, beginning with the hits that scored 200 points or higher. Mouse over each bar to see which protein it corresponds to, and click it to jump to that protein's entry in the scored list further down the page. Hits that scored above 200 will likely include other Fatty Acid Binding Proteins, mutants, and analogs from different species. Hits in the intermediate score range will reveal more protein diversity. A second scored list containing detailed alignment information for each hit can be found at the bottom of the page. Be sure to note what counts as a positive result and how the percentages for identities, positives, and gaps are calculated. Understanding how BLAST scores similar sequences will be important for discussing the percent similarity of your template to your target.
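To make the identities, positives, and gaps percentages concrete, the sketch below computes them for a pair of already-aligned strings. It assumes the usual BLAST convention that each count is divided by the alignment length (gaps included) and that a "positive" is a residue pair with a positive substitution score, which includes identical pairs. The `POSITIVE_PAIRS` set is a small hand-picked subset of pairs that score positively in BLOSUM62, used only to keep the example short; a real calculation would use the full matrix. The function name and the example alignment are hypothetical.

```python
# A small illustrative subset of non-identical residue pairs that score
# positively in BLOSUM62. This is an assumption made for brevity; a real
# calculation would look up every pair in the full matrix.
POSITIVE_PAIRS = {
    frozenset("IV"), frozenset("IL"), frozenset("LM"),
    frozenset("KR"), frozenset("DE"), frozenset("FY"), frozenset("ST"),
}


def alignment_stats(query, subject):
    """Return percent identities, positives, and gaps for two aligned strings.

    Both strings must have the same non-zero length, with '-' marking gaps.
    """
    if len(query) != len(subject) or not query:
        raise ValueError("aligned sequences must have equal, non-zero length")
    length = len(query)
    identities = positives = gaps = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            gaps += 1
        elif q == s:
            identities += 1
            positives += 1  # identical residues also count as positives
        elif frozenset((q, s)) in POSITIVE_PAIRS:
            positives += 1
    return {
        "identities": 100.0 * identities / length,
        "positives": 100.0 * positives / length,
        "gaps": 100.0 * gaps / length,
    }


# Made-up 7-column alignment: 4 identical pairs, 2 similar pairs, 1 gap.
stats = alignment_stats("MKV-LDE", "MRVALEE")
for key, value in stats.items():
    print(f"{key}: {value:.1f}%")
```

Comparing these three numbers for your chosen template against the values on the BLAST results page is a quick way to confirm that you are reading the report correctly.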

e. After you have examined the results page, choose a protein from the list to serve as your template. A higher BLAST score means a better template, but we will choose from the intermediate range in order to test the limits of homology modeling. Report your protein choice to the TA to confirm that everyone is assigned a unique template. Save the BLAST results page for your records.

3. Acquire template protein sequence and reference PDB files
Repeat the steps for acquiring the target protein files to get the template protein files. PDB ID# 2FLJ is used as an example in this tutorial.
a. Return to the RCSB Protein Data Bank website and search for the PDB ID# of your template.
b. As before, download the FASTA sequence and the PDB text file.
c. Save the FASTA file and the PDB file to your homology_modeling directory.

4. Load files, align sequences, and build the model in the Molecular Operating Environment (MOE)
MOE will be used to build the homology model from the FASTA sequence of your target and the PDB file of your template. After these files are loaded, the sequences are aligned and the target sequence is superimposed onto the template structure. From this, MOE will build multiple homology models, minimize them, and then select the best one to be the final model.


a. Open MOE. In the MOE main window, click File->Open. Navigate to your homology_modeling directory and click CWD to set it as your current working directory. Select your target FASTA sequence from the list and click OK to exit the window.
b. Click Window->Sequence Editor in the MOE main window, and your target amino acid sequence will be displayed in the sequence editor.
c. Click Display and toggle on Compound Name, Actual Secondary Structure, and Single Letter Residues.


d. In the MOE main window, click File->Open as before. Select your template PDB file and click OK to load it into MOE. Keep the defaults in the pop-up window and click OK.
e. It is important to delete any superfluous chains, waters, or heteroatoms from your template protein so that they are not considered in the alignment. If your template has any of these chains, left-click the chain number in the sequence editor to select it, and click Edit->Delete Selected Chains.
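The tutorial does this cleanup in the MOE interface. As an alternative view of what "deleting superfluous chains, waters, and heteroatoms" means at the file level, the sketch below filters a PDB file in Python. It relies on the fixed-column PDB layout, in which columns 1-6 hold the record name and column 22 holds the chain ID. The function name `clean_pdb` and the four example records (including their coordinates) are made up for illustration. This is a simplification: it also drops header records and does not handle alternate locations or multi-model files.

```python
def clean_pdb(pdb_text, keep_chain):
    """Keep only ATOM records belonging to one chain.

    HETATM records (waters, ligands, ions) and atoms from other chains are
    dropped. All other record types are dropped too, for simplicity.
    """
    kept = []
    for line in pdb_text.splitlines():
        # Columns 1-6: record name. Column 22 (index 21): chain identifier.
        if line.startswith("ATOM  ") and len(line) > 21 and line[21] == keep_chain:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"


# Made-up records for illustration: two atoms in chain A, one in chain B,
# and one water molecule (HETATM, residue name HOH).
example = "\n".join([
    "ATOM      1  N   MET A   1      11.104   6.134  -6.504  1.00  0.00           N",
    "ATOM      2  CA  MET A   1      12.560   6.213  -6.325  1.00  0.00           C",
    "ATOM      3  N   MET B   1      21.104   6.134  -6.504  1.00  0.00           N",
    "HETATM    4  O   HOH A 201      15.000   5.000  -2.000  1.00  0.00           O",
])

cleaned = clean_pdb(example, "A")
print(cleaned)
```

Only the two chain A atoms and the closing `END` line survive, which mirrors what should remain in the sequence editor after the extra chains are deleted.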

f. To align the target and template sequences, select both chains and click Homology->Align. Keep the defaults in the pop-up window and click OK. MOE will align the sequences, placing gaps if necessary. Compare the alignment with your saved BLAST search results to confirm that they are the same.
g. To begin modeling, click Homology->Homology Model in the sequence editor. The Current System and Output Database fields at the top of this window give the names of the files that data will be saved to. The current molecular state (structure AND alignment) will be saved to the MOE file given in Current System. All intermediate models plus a final model (based on the best intermediate model) will be saved to the MDB file given in Output Database. Ten intermediate models are generated by default.
h. Click Potential Setup in the lower right-hand corner and load Amber94 as the force field. Click Close to exit this window, then OK to begin building your homology model.
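The alignment in step f places gaps so that matching residues line up. The tutorial does not describe how MOE computes its alignment, so the sketch below is a stand-in rather than MOE's method: a minimal Needleman-Wunsch global alignment with a flat match/mismatch/gap scoring scheme. The scoring values, the function name, and the two short example sequences are all assumptions chosen for illustration; real protein aligners use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


# Made-up sequences: the second is the first with one residue removed.
aligned_a, aligned_b, total = needleman_wunsch("MKVLDE", "MKLDE")
print(aligned_a)
print(aligned_b)
print(total)
```

The aligner inserts a single gap opposite the missing residue, which is the kind of gap placement to look for when comparing the MOE alignment with your saved BLAST results.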

i. In the MOE main window, you will see the different intermediate models being built and minimized. After 10 models, MOE will refine the final model and leave it as the only remaining structure in the main window. Click File->Save and save the final model as both .moe and .pdb files.

Questions:
1. Briefly describe the mechanism of homology modeling.
2. What are the advantages and disadvantages of homology modeling?
3. What factors are important to the quality of a homology model?
4. Describe what is meant by protein tertiary structure.
5. Explain why, when given the amino acid sequence of a protein, it is difficult for computational methods to predict its 3D structure.
