Welcome to DiSC
|
This DVD contains the proceedings of the
Thirteenth ACM SIGKDD International Conference on Knowledge Discovery
and Data Mining (SIGKDD 2007),
which was held August 12-15, 2007, in San Jose, California,
in cooperation with AAAI.
You may use the "PDF" link to retrieve a paper,
and the other links to find more information on it.
|
|
Chris Anderson
Calculating latent demand in the long tail 1
|
|
|
Usama M. Fayyad
From mining the web to inventing the new sciences underlying the internet 2-3
|
|
|
Jon M. Kleinberg
Challenges in mining social network data: processes, privacy, and paradoxes 4-5
|
|
|
Deepak Agarwal, Dhiman Barman, Dimitrios Gunopulos, Neal E. Young, Flip Korn, Divesh Srivastava
Efficient and effective explanation of change in hierarchical summaries 6-15
|
|
|
Deepak Agarwal, Andrei Z. Broder, Deepayan Chakrabarti, Dejan Diklic, Vanja Josifovski, Mayssam Sayyadian
Estimating rates of rare events at multiple resolutions 16-25
|
|
|
Deepak Agarwal, Srujana Merugu
Predictive discrete latent factor models for large scale dyadic data 26-35
|
|
|
Charu C. Aggarwal, Philip S. Yu
On string classification in data streams 36-45
|
|
|
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua Feng, Mohammed Javeed Zaki
XProj: a framework for projected structural clustering of XML documents 46-55
|
|
|
Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis
Show me the money!: deriving the pricing power of product features by mining consumer reviews 56-65
|
|
|
Andrew Arnold, Yan Liu, Naoki Abe
Temporal causal modeling with graphical Granger methods 66-75
|
|
|
Ricardo A. Baeza-Yates, Alessandro Tiberi
Extracting semantic relations from query logs 76-85
|
|
|
Hila Becker, Marta Arias
Real-time ranking with concept drift using expert advice 86-94
|
|
|
Robert Bell, Yehuda Koren, Chris Volinsky
Modeling relationships at multiple scales to improve accuracy of large recommender systems 95-104
|
|
|
Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra
Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus 105-112
|
|
|
Wanpracha Art Chaovalitwongse, Ya-Ju Fan, Rajesh C. Sachdeo
Support feature machine for classification of abnormal brain activity 113-122
|
|
|
Jianhui Chen, Zheng Zhao, Jieping Ye, Huan Liu
Nonlinear adaptive distance metric learning for clustering 123-132
|
|
|
Yixin Chen, Li Tu
Density-based clustering for real-time stream data 133-142
|
|
|
Peter A. Chew, Brett W. Bader, Tamara G. Kolda, Ahmed Abdelali
Cross-language information retrieval using PARAFAC2 143-152
|
|
|
Yun Chi, Xiaodan Song, Dengyong Zhou, Koji Hino, Belle L. Tseng
Evolutionary spectral clustering by incorporating temporal smoothness 153-162
|
|
|
Yun Chi, Shenghuo Zhu, Xiaodan Song, Jun'ichi Tatemura, Belle L. Tseng
Structural and temporal analysis of the blogosphere through community factorization 163-172
|
|
|
Sumit Chopra, Trivikraman Thampy, John Leahy, Andrew Caplin, Yann LeCun
Discovering the hidden structure of house prices with a non-parametric latent manifold model 173-182
|
|
|
Paul Cotofrei, Kilian Stoffel
Stochastic processes and temporal data mining 183-190
|
|
|
Daniel Crabtree, Peter Andreae, Xiaoying Gao
Exploiting underrepresented query aspects for automatic query expansion 191-200
|
|
|
Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli, Andrew McCallum
Canonicalization of database records using adaptive similarity measures 201-209
|
|
|
Wenyuan Dai, Gui-Rong Xue, Qiang Yang, Yong Yu
Co-clustering based classification for out-of-domain documents 210-219
|
|
|
Kaustav Das, Jeff G. Schneider
Detecting anomalous records in categorical datasets 220-229
|
|
|
Anirban Dasgupta, Petros Drineas, Boulos Harb, Vanja Josifovski, Michael W. Mahoney
Feature selection methods for text classification 230-239
|
|
|
Ian Davidson, S. S. Ravi, Martin Ester
Efficient incremental constrained clustering 240-249
|
|
|
Meghana Deodhar, Joydeep Ghosh
A framework for simultaneous co-clustering and learning from complex data 250-259
|
|
|
Chris H. Q. Ding, Rong Jin, Tao Li, Horst D. Simon
A learning framework using Green's function and kernel regularization with application to recommender system 260-269
|
|
|
Dejing Dou, Gwen A. Frishkoff, Jiawei Rong, Robert Frank, Allen D. Malony, Don M. Tucker
Development of NeuroElectroMagnetic ontologies (NEMO): a framework for mining brainwave ontologies 270-279
|
|
|
Gregory Druck, Chris Pal, Andrew McCallum, Xiaojin Zhu
Semi-supervised classification with hybrid generative/discriminative methods 280-289
|
|
|
Lisa Friedland, David Jensen
Finding tribes: identifying close-knit individuals from employment patterns 290-299
|
|
|
Gabriel Pui Cheong Fung, Jeffrey Xu Yu, Huan Liu, Philip S. Yu
Time-dependent event hierarchy construction 300-309
|
|
|
Byron J. Gao, Martin Ester, Jin-yi Cai, Oliver Schulte, Hui Xiong
The minimum consistent subset cover problem and its applications in data mining 310-319
|
|
|
Rong Ge, Martin Ester, Wen Jin, Ian Davidson
Constraint-driven clustering 320-329
|
|
|
Fosca Giannotti, Mirco Nanni, Fabio Pinelli, Dino Pedreschi
Trajectory pattern mining 330-339
|
|
|
Zhen Guo, Zhongfei Zhang, Eric P. Xing, Christos Faloutsos
Enhanced max margin learning on multimodal data mining in a multimedia database 340-349
|
|
|
Hannes Heikinheimo, Jouni K. Seppänen, Eino Hinkkanen, Heikki Mannila, Taneli Mielikäinen
Finding low-entropy sets and trees from binary data 350-359
|
|
|
Frizo A. L. Janssens, Wolfgang Glänzel, Bart De Moor
Dynamic hybrid clustering of bioinformatics by incorporating text mining and citation analysis 360-369
|
|
|
Yookyung Jo, Carl Lagoze, C. Lee Giles
Detecting research topics via the correlation between graphs and texts 370-379
|
|
|
Panagiotis Karras, Dimitris Sacharidis, Nikos Mamoulis
Exploiting duality in summarization with deterministic guarantees 380-389
|
|
|
Yiping Ke, James Cheng, Wilfred Ng
Correlation search in graph databases 390-399
|
|
|
Aleksander Kolcz, Wen-tau Yih
Raising the baseline for high-precision text classifiers 400-409
|
|
|
Srivatsan Laxman, P. S. Sastry, K. P. Unnikrishnan
A fast algorithm for finding frequent episodes in event streams 410-419
|
|
|
Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie S. Glance
Cost-effective outbreak detection in networks 420-429
|
|
|
Jinyan Li, Guimei Liu, Limsoon Wong
Mining statistically important equivalence classes and delta-discriminative emerging patterns 430-439
|
|
|
Ping Li
Very sparse stable random projections for dimension reduction in l_α (0 < α ≤ 2) norm 440-449
|
|
|
Yi Liu, Rong Jin, Anil K. Jain
BoostCluster: boosting clustering by pairwise constraints 450-459
|
|
|
David Lo, Siau-Cheng Khoo, Chao Liu
Efficient mining of iterative patterns for software specification discovery 460-469
|
|
|
Bo Long, Zhongfei (Mark) Zhang, Philip S. Yu
A probabilistic framework for relational clustering 470-479
|
|
|
Heikki Mannila, Evimaria Terzi
Nestedness and segmented nestedness 480-489
|
|
|
Qiaozhu Mei, Xuehua Shen, ChengXiang Zhai
Automatic labeling of multinomial topic models 490-499
|
|
|
David M. Mimno, Andrew McCallum
Expertise modeling for matching papers with reviewers 500-509
|
|
|
Flavia Moser, Rong Ge, Martin Ester
Joint cluster analysis of attribute and relationship data without a-priori specification of the number of clusters 510-519
|
|
|
Ramesh Nallapati, Susan Ditmore, John D. Lafferty, Kin Ung
Multiscale topic tomography 520-529
|
|
|
Siegfried Nijssen, Élisa Fromont
Mining optimal decision trees from itemset lattices 530-539
|
|
|
Gaurav Pandey, Michael Steinbach, Rohit Gupta, Tushar Garg, Vipin Kumar
Association analysis-based transformations for protein interaction networks: a function prediction case study 540-549
|
|
|
Seung-Taek Park, David M. Pennock
Applying collaborative filtering techniques to movie search for better ranking and browsing 550-559
|
|
|
Raymond K. Pon, Alfonso F. Cardenas, David Buttler, Terence Critchlow
Tracking multiple topics for finding interesting articles 560-569
|
|
|
Filip Radlinski, Thorsten Joachims
Active exploration for learning rankings from clickthrough data 570-579
|
|
|
Mark Sandler
Hierarchical mixture models: a probabilistic analysis 580-589
|
|
|
Issei Sato, Hiroshi Nakagawa
Knowledge discovery of multiple-topic document using parametric mixture model with Dirichlet prior 590-598
|
|
|
Vincent Schickel-Zuber, Boi Faltings
Using hierarchical clustering for learning the ontologies used in recommendation systems 599-608
|
|
|
D. Sculley
Practical learning from one-sided feedback 609-618
|
|
|
Benyah Shaparenko, Thorsten Joachims
Information genealogy: uncovering the flow of ideas in non-hyperlinked document databases 619-628
|
|
|
Shady Shehata, Fakhri Karray, Mohamed Kamel
A concept-based model for enhancing text categorization 629-637
|
|
|
Victor S. Sheng, Charles X. Ling
Partial example acquisition in cost-sensitive learning 638-646
|
|
|
Motoki Shiga, Ichigaku Takigawa, Hiroshi Mamitsuka
A spectral clustering approach to optimally combining numerical vectors with a modular network 647-656
|
|
|
Andrew T. Smith, Charles Elkan
Making generative classifiers robust to selection bias 657-666
|
|
|
Xiuyao Song, Mingxi Wu, Christopher M. Jermaine, Sanjay Ranka
Statistical change detection for multi-dimensional data 667-676
|
|
|
Rohini K. Srihari, Li Xu, Tushar Saxena
Use of ranked cross document evidence trails for hypothesis generation 677-686
|
|
|
Jimeng Sun, Christos Faloutsos, Spiros Papadimitriou, Philip S. Yu
GraphScope: parameter-free mining of large time-evolving graphs 687-696
|
|
|
Gaurav Tandon, Philip K. Chan
Weighting versus pruning in rule validation for detecting network and host anomalies 697-706
|
|
|
Wei Tang, Hui Xiong, Shi Zhong, Jie Wu
Enhancing semi-supervised clustering: a feature projection perspective 707-716
|
|
|
Chayant Tantipathananandh, Tanya Y. Berger-Wolf, David Kempe
A framework for community identification in dynamic social networks 717-726
|
|
|
Choon Hui Teo, Alex J. Smola, S. V. N. Vishwanathan, Quoc V. Le
A scalable modular convex solver for regularized risk minimization 727-736
|
|
|
Hanghang Tong, Christos Faloutsos, Brian Gallagher, Tina Eliassi-Rad
Fast best-effort pattern matching in large attributed graphs 737-746
|
|
|
Hanghang Tong, Christos Faloutsos, Yehuda Koren
Fast direction-aware proximity for graph mining 747-756
|
|
|
David S. Vogel, Ognian Asparouhov, Tobias Scheffer
Scalable look-ahead linear regression trees 757-764
|
|
|
Jilles Vreeken, Matthijs van Leeuwen, Arno Siebes
Characterising the difference 765-774
|
|
|
Li Wan, Wee Keong Ng, Shuguo Han, Vincent C. S. Lee
Privacy-preservation for gradient descent methods 775-783
|
|
|
Xuanhui Wang, ChengXiang Zhai, Xiao Hu, Richard Sproat
Mining correlated bursty topic patterns from coordinated text streams 784-793
|
|
|
Xuerui Wang, Chris Pal, Andrew McCallum
Generalized component analysis for text with heterogeneous attributes 794-803
|
|
|
Raymond Chi-Wing Wong, Jian Pei, Ada Wai-Chee Fu, Ke Wang
Mining favorable facets 804-813
|
|
|
Junjie Wu, Hui Xiong, Peng Wu, Jian Chen
Local decomposition for rare class analysis 814-823
|
|
|
Xiaowei Xu, Nurcan Yuruk, Zhidan Feng, Thomas A. J. Schweiger
SCAN: a structural clustering algorithm for networks 824-833
|
|
|
Rong Yan, Jelena Tesic, John R. Smith
Model-shared subspace boosting for multi-label classification 834-843
|
|
|
Dragomir Yankov, Eamonn J. Keogh, Jose Medina, Bill Chiu, Victor B. Zordan
Detecting time series motifs under uniform scaling 844-853
|
|
|
Jieping Ye, Shuiwang Ji, Jianhui Chen
Learning the kernel matrix in discriminant analysis via quadratically constrained quadratic programming 854-863
|
|
|
Junsong Yuan, Ying Wu, Ming Yang
From frequent itemsets to semantically meaningful visual patterns 864-873
|
|
|
Xian Zhang, Yu Hao, Xiaoyan Zhu, Ming Li, David R. Cheriton
Information distance from a question to an answer 874-883
|
|
|
Hongkun Zhao, Weiyi Meng, Clement T. Yu
Mining templates from search result records of search engines 884-893
|
|
|
Shuyi Zheng, Ruihua Song, Ji-Rong Wen, Di Wu
Joint optimization of wrapper generation and template detection 894-902
|
|
|
Jun Zhu, Bo Zhang, Zaiqing Nie, Ji-Rong Wen, Hsiao-Wuen Hon
Webpage understanding: an integrated approach 903-912
|
|
|
Sitaram Asur, Srinivasan Parthasarathy, Duygu Ucar
An event-based framework for characterizing the evolutionary behavior of interaction graphs 913-921
|
|
|
Rebecca Castaño, Kiri Wagstaff, Steve A. Chien, Timothy M. Stough, Benyang Tang
On-board analysis of uncalibrated data for a spacecraft at Mars 922-930
|
|
|
Andrew Fast, Lisa Friedland, Marc Maier, Brian Taylor, David Jensen, Henry G. Goldberg, John Komoroske
Relational data pre-processing techniques for improved securities fraud detection 941-949
|
|
|
Ming Hua, Jian Pei
Cleaning disguised missing data: a heuristic approach 950-958
|
|
|
Ron Kohavi, Randal M. Henne, Dan Sommerfield
Practical guide to controlled experiments on the web: listen to your customers not to the HiPPO 959-967
|
|
|
Ping Luo, Hui Xiong, Kevin Lü, Zhongzhi Shi
Distributed classification in peer-to-peer networks 968-976
|
|
|
Claudia Perlich, Saharon Rosset, Richard D. Lawrence, Bianca Zadrozny
High-quantile modeling for customer wallet estimation and other applications 977-985
|
|
|
Jun Hua Zhao, Zhao Yang Dong, Pei Zhang
Mining complex power networks for blackout prevention 986-994
|
|
|
Shubin Zhao, Jonathan Betz
Corroborate and learn facts from the web 995-1003
|
|
|
Guangyu Zhu, Timothy J. Bethea, Vikas Krishna
Extracting relevant named entities for automated expense reimbursement 1004-1012
|
|
|
Charu C. Aggarwal
A framework for classification and segmentation of massive audio data streams 1013-1017
|
|
|
Chris Curry, Robert L. Grossman, David Locke, Steve Vejcik, Joseph Bugajski
Detecting changes in large data sets of payment card data: a case study 1018-1022
|
|
|
Rong Pan, Junhui Zhao, Vincent Wenchen Zheng, Jeffrey Junfeng Pan, Dou Shen, Sinno Jialin Pan, Qiang Yang
Domain-constrained semi-supervised mining of tracking models in sensor networks 1023-1027
|
|
|
Wei Peng, Charles Perng, Tao Li, Haixun Wang
Event summarization for system management 1028-1032
|
|
|
R. Bharat Rao, Jinbo Bi, Glenn Fung, Marcos Salganicoff, Nancy Obuchowski, David P. Naidich
LungCAD: a clinically approved, machine learning system for lung cancer detection 1033-1037
|
|
|
Robert J. Yan, Charles X. Ling
Machine learning for stock selection 1038-1042
|
|
|
Yanfang Ye, Dingding Wang, Tao Li, Dongyi Ye
IMDS: intelligent malware detection system 1043-1047
|
|
|
Xiaoxin Yin, Jiawei Han, Philip S. Yu
Truth discovery with multiple conflicting information providers on the web 1048-1052
|
|
|
Srinivasan Parthasarathy
Data mining at the crossroads: successes, failures and learning from them 1053-1055
|
Copyright ©2010 Association for Computing Machinery
|