Modeling General and Specific Aspects of Documents with a Probabilistic Topic Model

Keywords: Probabilistic logic; Computer science; Data science; Artificial intelligence

Authors

Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyvers

Source

Journal: The MIT Press eBooks [The MIT Press]
Date: 2007-09-07; Pages: 241-248; Citations: 183

Identifier

DOI: 10.7551/mitpress/7503.003.0035

Abstract

Techniques such as probabilistic topic models and latent-semantic indexing have been shown to be broadly useful at automatically extracting the topical or semantic content of documents, or more generally for dimension-reduction of sparse count data. These types of models and algorithms can be viewed as generating an abstraction from the words in a document to a lower-dimensional latent variable representation that captures what the document is generally about beyond the specific words it contains. In this paper we propose a new probabilistic model that tempers this approach by representing each document as a combination of (a) a background distribution over common words, (b) a mixture distribution over general topics, and (c) a distribution over words that are treated as being specific to that document. We illustrate how this model can be used for information retrieval by matching documents both at a general topic level and at a specific word level, providing an advantage over techniques that only match documents at a general level (such as topic models or latent-semantic indexing) or that only match documents at the specific word level (such as TF-IDF).
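The three-part document representation described in the abstract can be illustrated with a minimal sketch. The abstract names only the three components; the fixed mixing weights, the function name `word_probability`, and the toy distributions below are illustrative assumptions, not the authors' formulation or estimation procedure.

```python
# Illustrative sketch only: a word's probability in one document as a
# weighted combination of (a) a background distribution, (b) a mixture
# over general topics, and (c) a document-specific distribution.
# The mixing weights and all numbers are assumptions for demonstration.

def word_probability(word, weights, background, topics, topic_mix, special):
    """Return p(word | document) under an assumed three-way mixture."""
    w_background, w_topics, w_special = weights
    # (b) general topics: sum over topics of p(topic | doc) * p(word | topic)
    p_topics = sum(
        share * topic.get(word, 0.0) for share, topic in zip(topic_mix, topics)
    )
    return (
        w_background * background.get(word, 0.0)   # (a) common words
        + w_topics * p_topics                      # (b) general topics
        + w_special * special.get(word, 0.0)       # (c) document-specific words
    )

# Toy distributions (hypothetical values).
background = {"the": 0.7, "of": 0.3}
topics = [
    {"model": 0.5, "topic": 0.5},
    {"query": 0.6, "retrieval": 0.4},
]
topic_mix = [0.75, 0.25]        # p(topic | document)
special = {"swb": 1.0}          # words specific to this document
weights = (0.5, 0.25, 0.25)     # assumed shares of the three components

vocabulary = set(background) | set(special)
for topic in topics:
    vocabulary |= set(topic)

probs = {
    w: word_probability(w, weights, background, topics, topic_mix, special)
    for w in sorted(vocabulary)
}

for w, p in probs.items():
    print(f"{w:10s} {p:.5f}")
```

Because each component is itself a probability distribution and the weights sum to one, the combined values also sum to one over the vocabulary. A word such as the hypothetical "swb" gets probability only from the document-specific component, which is the kind of specific-word match that a purely topic-level representation would smooth away.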
