Analysis of Short Sequence Motifs with Applications to 5' Splice Site Identification

Fund Project: National Natural Science Foundation of China (60471003)

    Abstract:

    Short sequence motif analysis is an important component of gene sequence analysis, and biological signal identification generally relies on motif information. The number of candidate short sequence motifs is usually very large, however. Using all of them for signal identification introduces a large number of parameters and obscures the main characteristics of the signal. To find the motifs that play a key role in identifying signal sites, this paper uses information gain as the evaluation criterion and ranks the motifs with a stepwise selection strategy. Motifs with prominent information gain are then taken from the ranking as the key evidence for identifying the biological signal, so that good results are obtained with fewer motifs. Combined with the selected motifs, a maximum entropy model is used as an estimate of the true distribution of the signal sequences, and a given sequence is identified on that basis. Finally, the method is applied to the identification of 5' splice sites, with satisfactory experimental results.
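    The stepwise ranking described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact definition of information gain, so this example assumes a common one: the reduction in class-label entropy (real site vs. decoy) obtained by adding a motif's presence/absence to the motifs already selected. A motif is assumed here to be a (position, k-mer) pair. The function names (`rank_motifs`, `has_motif`) and the toy sequences are hypothetical, and the maximum entropy modelling step is not covered.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(keys, labels):
    """H(label | key): label entropy averaged over groups sharing a key."""
    groups = {}
    for key, label in zip(keys, labels):
        groups.setdefault(key, []).append(label)
    return sum(len(g) / len(labels) * entropy(g) for g in groups.values())


def has_motif(seq, motif):
    """True if the k-mer occurs at the given 0-based position of seq."""
    pos, kmer = motif
    return seq[pos:pos + len(kmer)] == kmer


def rank_motifs(seqs, labels, k=2, max_motifs=5, min_gain=1e-9):
    """Greedy forward selection of (position, k-mer) motifs by information gain.

    Assumed criterion (not taken from the paper): at each step, pick the motif
    whose presence/absence most reduces the remaining label entropy, given the
    motifs already selected. Stop when no motif adds at least min_gain bits.
    """
    candidates = sorted(
        {(i, s[i:i + k]) for s in seqs for i in range(len(s) - k + 1)}
    )
    signatures = [()] * len(seqs)   # hits of the selected motifs, per sequence
    remaining = entropy(labels)     # label uncertainty not yet explained
    ranked = []
    while candidates and len(ranked) < max_motifs:
        scored = []
        for motif in candidates:
            keys = [sig + (has_motif(s, motif),)
                    for sig, s in zip(signatures, seqs)]
            scored.append((remaining - conditional_entropy(keys, labels), motif))
        gain, best = max(scored, key=lambda t: t[0])  # first maximum wins ties
        if gain < min_gain:
            break
        ranked.append((best, gain))
        candidates.remove(best)
        signatures = [sig + (has_motif(s, best),)
                      for sig, s in zip(signatures, seqs)]
        remaining -= gain
    return ranked


# Made-up 9-mers (3 exonic + 6 intronic bases); every one has GT at positions 3-4.
seqs = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "GAGGTGAGT",   # label 1: sites
        "CTTGTCCCA", "ACCGTTTGC", "TGAGTCTAC", "GCTGTACCT"]   # label 0: decoys
labels = [1, 1, 1, 1, 0, 0, 0, 0]

ranked = rank_motifs(seqs, labels)
for motif, gain in ranked:
    print(motif, round(gain, 3))
```

    On this toy data a single motif already separates the two classes, so the selection stops after one motif. The GT dinucleotide at positions 3-4 is present in every sequence, decoys included, and therefore carries no information gain under this criterion. Real data would call for longer motifs, many more sequences, and a stopping threshold chosen on held-out data.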

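Cite this article

Yan Chun, Du Yaohua, Wang Zhengzhi. Analysis of Short Sequence Motifs with Applications to 5' Splice Site Identification [J]. Journal of National University of Defense Technology, 2006, 28(1): 51-56.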

History
  • Received: 2005-09-12
  • Published online: 2013-03-25