Microblogging
Computer science
Social media
Social network analysis
Spam
Information retrieval
Spambot
World Wide Web
Data mining
Internet
Authors
Shen Yang, Shuchen Li, Ye Xiaoxiao, He Fangping
Source
Journal: Journal of Convergence Information Technology
[AICIT]
Date: 2010-02-18
Volume (issue): 5 (1): 135-140
Citations: 22
Identifier
DOI: 10.4156/jcit.vol5.issue1.16
Abstract
The number of microblog users is growing rapidly, and the volume of spam is growing with it. We first give a formal definition of a microblog and then divide spam into two types: news and advertisements. We collect 1,760,314 items (188 MB) of microblog news for content mining. Using ROST Content Mining, we carry out macro-level topology statistics, time series mining, and related analyses. We find that the microblog user group exhibits small-world features. Its same-degree coefficient is negative, the probability of news microblog followers is 0.0002, and the rate of second spread is 0.011. We put forward a recursive filtering method to estimate the rate of spread across repeated rounds of spreading, and we introduce a cross-relation method that converts nodes that are difficult to handle in network analysis into easier forms before performing social network analysis.
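The negative same-degree coefficient reported in the abstract can be illustrated with a small sketch. It assumes that the coefficient refers to degree assortativity, that is, the Pearson correlation between the degrees at the two ends of each edge; the abstract does not confirm this reading. The function name `degree_assortativity` and the toy follower graph are hypothetical and are not taken from the paper or from ROST Content Mining.

```python
from math import sqrt


def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each edge.

    Assumption: this is one common reading of a "same-degree coefficient";
    the paper's exact definition may differ.
    """
    # Count the degree of every node in the undirected graph.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Each undirected edge contributes both orientations, (u, v) and (v, u).
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when every endpoint has the same degree
    return cov / sqrt(var_x * var_y)


# Hypothetical toy graph: one hub account connected to five low-degree accounts.
star_edges = [("hub", f"user{i}") for i in range(5)]
r = degree_assortativity(star_edges)
print(r)
```

A negative value means that high-degree accounts tend to connect to low-degree accounts, which is the pattern a few heavily followed news accounts would produce.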