Background:
Proteins bind ligands/substrates through molecular interactions provided by specific amino acids in the binding pocket. These interactions (e.g. hydrogen bonds, hydrophobic interactions, and electrostatic interactions) are key to binding and, in turn, to modulating the protein’s function. As a result, one common drug design strategy is to design molecules that make favorable interactions with these essential amino acids in order to outcompete the native substrate and inhibit the protein’s function. At the same time, regions that vary between species can be exploited to design organism-specific drugs and avoid off-target effects (e.g. you may want a drug that is inert in humans but highly active in dogs). In this case you want to take advantage of the sequence differences between species. In this lab, you are going to identify and visualize these important amino acids through protein sequence analysis using bioinformatics tools.

Instructions:
1. We are going to use a sequence visualization program (Jalview). This is where we can identify the residues we think are interacting with the molecule in the active site.
2. On the page for your protein on the PDB website, click “FASTA Sequence” under the “Display Files” menu.
3. This will display the FASTA sequence in your browser (the amino acid sequence of your enzyme).
4. Copy this sequence, then go to the BLAST web server.
5. Select Protein BLAST (blastp).
6. Paste in your sequence. Wait! Don’t hit submit yet. Instead, let’s add some parameters to make sure you have enough sequence diversity to get a good analysis of conservation:
   a. Change the database to Reference Proteins (refseq_protein). This database is a bit less redundant than the default nr database. Ideally we would use a database that is non-redundant at the 75% level (other servers such as HMMER allow that, but they are more difficult to use).
   b. Expand “Algorithm Parameters” and select 250 for “Max Target Sequences”. This will make sure you have enough sequences to measure conservation while making sure that all the proteins identified are predicted to be highly related.
   *IMPORTANT NOTE: If you see percentages in the “Ident” column of the BLAST results going below 30%, re-adjust this to 100 or even 50. Below 30% you are in what is considered the “twilight zone”, and the sequences you are finding are not necessarily related in structure or function. Basically, you have drifted too far in evolutionary space. Roughly speaking, proteins with 30-70% identity have been found to be related in both structure and function; above 70%, proteins are almost certain to be nearly identical in structure and function.
7. Hit “BLAST” to start the search.
8. Analyze your results. Make sure all the hits are “Pink or Red” for alignment score (i.e. large portions of the sequence aligned). In the “Descriptions” table under “Sequences producing significant alignments”, make sure the Ident column never goes below 30. If it does, see the note above.
9. If all looks good, select “Multiple Alignment” near the top in the “Other Reports” section.
10. From this output (don’t worry if an error is thrown for the graphical overview), select “Download” -> “Fasta plus gaps”.
11. Now we are going to open another program named Jalview. Click “Launch Jalview Desktop”, then open the downloaded file.
12. Under the “File” menu, choose “Input Alignment”, then “From File”, and open the alignment file you just downloaded from BLAST.
13. In Jalview, open the “Colour” menu at the top and click “Percentage Identity”. Then go back to “Colour” and click “Above Identity Threshold”. Change the occurrence to 99% conserved.
14. The residues highlighted here are critical for either protein structure or function. All of the residues you identified in Lab 1 (i.e. the catalytic residues) should be highlighted.
15. With the percentage identity colouring active, move the threshold bar back and forth to determine the percent conservation of a residue (look for the point where it switches from colored to not colored).
16. Now, in the assignment below, you will need to find non-catalytic residues (i.e. residues not directly involved in the reaction mechanism) that either are highly conserved OR have low conservation. Roughly speaking (based on the search parameters above):
   High Conservation residues have >70% identity across the homologous sequences.
   Low Conservation residues have <30% identity across the homologous sequences.
   Residues in the 30-70% range are likely important but not critical for structure or function.

Assignment
---
For your assigned enzyme group, complete the following assignments with the help of the Lab 2 instructions:
1) What NON-CATALYTIC residues in the active site of the enzyme are most likely important for function?
   a. Highlight three highly conserved residues. Show both structural interactions AND conservation using PyMOL and Jalview. Hypothesize why they are critical for function.
   b. Highlight one low-conservation residue. Show both structural interactions AND conservation using PyMOL and Jalview. What other amino acids are observed at this position? Hypothesize why the alternate residues are acceptable.
2) How does the inhibitor interact, and how would you further optimize it?
   a. Describe WHY it acts as an inhibitor.
   b. What key interactions are made with the inhibitor? Do any of those interactions mimic the natural substrate?
   c. If you were to modify the inhibitor, what new interactions would you try to take advantage of? (Hint: conserved residues in the protein are unlikely to change while maintaining protein structure or function.)

Case 1 = ACE: 2X91
Case 2 = OD: 2TOD
Case 3 = α-Fucosidase: 2ZX5
Case 4 = PNP: 1A9S
Case 5 = CD: 2FR6
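Optional cross-check: the conservation thresholds in the instructions (>70% high, <30% low, 30-70% intermediate) can also be applied outside Jalview. The sketch below is a minimal Python illustration, not part of the lab workflow. It assumes the input is a gapped FASTA alignment like the "Fasta plus gaps" download, takes the most common residue in each column as the consensus, and counts gaps in the denominator. Jalview's own percentage identity calculation may treat gaps differently, so the numbers need not match exactly. The function names and the toy alignment are hypothetical.

```python
from collections import Counter

# Toy alignment for illustration only (hypothetical sequences, not real BLAST output).
EXAMPLE = """>query
MKHDL
>hit1
MKHEL
>hit2
MRHQV
>hit3
MKH-V
"""

def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def column_identity(records):
    """Return (consensus residue, percent identity) for each alignment column.

    Assumption: gaps ('-') are never the consensus but still count in the
    denominator, so a gappy column scores lower.
    """
    seqs = [seq.upper() for _, seq in records]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    result = []
    for col in range(length):
        counts = Counter(s[col] for s in seqs if s[col] != "-")
        if not counts:
            result.append(("-", 0.0))
            continue
        residue, n = counts.most_common(1)[0]
        result.append((residue, 100.0 * n / len(seqs)))
    return result

def classify(percent):
    """Apply the lab's rough thresholds: >70 high, <30 low, else intermediate."""
    if percent > 70:
        return "high"
    if percent < 30:
        return "low"
    return "intermediate"

records = read_fasta(EXAMPLE)
for position, (residue, percent) in enumerate(column_identity(records), start=1):
    print(position, residue, f"{percent:.0f}%", classify(percent))
```

To use it on your own data, read the downloaded alignment file into a string and pass it to read_fasta in place of EXAMPLE. Note that the column positions are alignment positions, which differ from PDB residue numbers whenever gaps are present.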
