Exercise 1

(i) The purpose of this exercise is to explore various software tools for tackling bioinformatics problems. More specifically, you should investigate (without solving) the examples of software tools listed on the Bioinformatics Armory page https://rosalind.info/problems/list-view/?location=bioinformatics-armory (see also https://rosalind.info/problems/locations/) and write a short report.

(ii) Many freely accessible tools exist for multiple sequence alignment. In this report you will compare the tools offered by NCBI and EBI. Visit the NCBI and EBI websites and report the key features of their multiple alignment tools. For NCBI, the key tools are at https://www.ncbi.nlm.nih.gov/projects/msaviewer/ and https://www.ncbi.nlm.nih.gov/tools/cobalt/re_cobalt.cgi, and for EBI at https://www.ebi.ac.uk/jdispatcher/msa/. Be sure to try a good number of the tools.

Hint: it is sufficient simply to use the various tools; you do not need to solve the problems above. That is, the purpose of the exercise is to get acquainted with some ready-made tools, not to become an expert in them.

Exercise 2

(i) Visit the NCBI database at https://www.ncbi.nlm.nih.gov/sars-cov-2/ to study the SARS-CoV-2 coronavirus. Using the SARS-CoV-2 sequence record at https://www.ncbi.nlm.nih.gov/nuccore/NC_045512, download the coronavirus spike protein sequence. Then, from https://www.uniprot.org/uniprotkb/A0A6B9WHD3/entry, download the Bat-RaTG13 coronavirus spike protein sequence (https://en.wikipedia.org/wiki/RaTG13) and implement the classic dynamic programming global alignment algorithm, with appropriate weights, to identify the longest common subsequence of the two sequences. Report the final result.

(ii) View the structures of the two proteins from the previous question using the SWISS-MODEL tool (https://swissmodel.expasy.org/interactive) and download the .pdb files (a text file format describing the three-dimensional structures of molecules held in the Protein Data Bank). Then use the DALI tool at http://ekhidna.biocenter.helsinki.fi/dali/ to compare the structures of the two proteins. Report what you observe about the correlation between sequence and structure.

Sub-question (iii) (ungraded): anyone who wants to go deeper can visit https://biologicalmodeling.org/coronavirus/home, which covers a similar (but not identical) problem.

Sub-question (iv) (ungraded): try to solve the protein structure prediction problem using the various new machine learning algorithms (https://www.nature.com/articles/s41592-023-01790-6). See https://www.ebi.ac.uk/tools/sss/fasta/, https://colab.research.google.com/github/deepmind/alphafold/blob/main/notebooks/AlphaFold.ipynb (ESMFold.ipynb), and ESMFold (https://www.science.org/doi/10.1126/science.ade2574, https://esmatlas.com/resources?action=fold).
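The global alignment task in Exercise 2 (i) can be approached in several ways. The following is a minimal sketch, not a prescribed solution. It assumes the weights match = 1, mismatch = 0, gap = 0, under which the optimal Needleman-Wunsch score equals the length of the longest common subsequence. The exercise does not fix the weights, so this choice is an assumption, and the names `global_align` and `lcs_from_alignment` are hypothetical. The sequences are assumed to be available as plain strings, for example after stripping the header line from a downloaded FASTA file.

```python
def global_align(a, b, match=1, mismatch=0, gap=0):
    """Needleman-Wunsch global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). The default weights are an
    assumption chosen so that the score equals the LCS length.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                row_a.append(a[i - 1])
                row_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


def lcs_from_alignment(aligned_a, aligned_b):
    """Collect the identical, non-gap columns of an alignment."""
    return "".join(
        x for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )


if __name__ == "__main__":
    # Toy strings for illustration only, not the spike protein sequences.
    s, al_a, al_b = global_align("AGGTAB", "GXTXAYB")
    print(s, lcs_from_alignment(al_a, al_b))
```

On the toy strings the score is 4 and the recovered subsequence is GTAB. For the two spike proteins (each a little over a thousand residues) the full score matrix fits comfortably in memory. With biologically motivated weights, such as a substitution matrix and negative gap penalties, the same recurrence yields a conventional global alignment instead of an LCS.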
Main keywords