This book constitutes the refereed proceedings of the 13th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2009, held in Bangkok, Thailand, in April 2009. The 39 revised full papers and 73 revised short papers presented together with 3 keynote talks were carefully reviewed and selected from 338 submissions. The papers present new ideas, original research results, and practical development experiences from all KDD-related areas including data mining, data warehousing, machine learning, databases, statistics, knowledge acquisition, automatic scientific discovery, data visualization, causal induction, and knowledge-based systems.
List of contents
Keynote Speeches.- KDD for BSN - Towards the Future of Pervasive Sensing.- Finding Hidden Structures in Relational Databases.- The Future of Search: An Online Content Perspective.- Regular Papers.- DTU: A Decision Tree for Uncertain Data.- Efficient Privacy-Preserving Link Discovery.- On Link Privacy in Randomizing Social Networks.- Sentence-Level Novelty Detection in English and Malay.- Text Categorization Using Fuzzy Proximal SVM and Distributional Clustering of Words.- Cool Blog Classification from Positive and Unlabeled Examples.- Thai Word Segmentation with Hidden Markov Model and Decision Tree.- An Efficient Method for Generating, Storing and Matching Features for Text Mining.- Robust Graph Hyperparameter Learning for Graph Based Semi-supervised Classification.- Regularized Local Reconstruction for Clustering.- Clustering with Lower Bound on Similarity.- Approximate Spectral Clustering.- An Integration of Fuzzy Association Rules and WordNet for Document Clustering.- Nonlinear Data Analysis Using a New Hybrid Data Clustering Algorithm.- A Polynomial-Delay Polynomial-Space Algorithm for Extracting Frequent Diamond Episodes from Event Sequences.- A Statistical Approach for Binary Vectors Modeling and Clustering.- Multi-resolution Boosting for Classification and Regression Problems.- Interval Data Classification under Partial Information: A Chance-Constraint Approach.- Negative Encoding Length as a Subjective Interestingness Measure for Groups of Rules.- The Studies of Mining Frequent Patterns Based on Frequent Pattern Tree.- Discovering Periodic-Frequent Patterns in Transactional Databases.- Quantifying Asymmetric Semantic Relations from Query Logs by Resource Allocation.- Acquiring Semantic Relations Using the Web for Constructing Lightweight Ontologies.- Detecting Abnormal Events via Hierarchical Dirichlet Processes.- Active Learning for Causal Bayesian Network Structure with Non-symmetrical Entropy.- A Comparative Study of Bandwidth Choice in Kernel Density Estimation for Naive Bayesian Classification.- Analysis of Variational Bayesian Matrix Factorization.- Variational Bayesian Approach for Long-Term Relevance Feedback.- Detecting Link Hijacking by Web Spammers.- A Data Driven Ensemble Classifier for Credit Scoring Analysis.- A Multi-partition Multi-chunk Ensemble Technique to Classify Concept-Drifting Data Streams.- Parameter Estimation in Semi-Random Decision Tree Ensembling on Streaming Data.- Exploiting the Block Structure of Link Graph for Efficient Similarity Computation.- Online Feature Selection Algorithm with Bayesian ℓ1 Regularization.- Feature Selection for Local Learning Based Clustering.- RV-SVM: An Efficient Method for Learning Ranking SVM.- A Kernel Framework for Protein Residue Annotation.- Dynamic Exponential Family Matrix Factorization.- A Nonparametric Bayesian Learning Model: Application to Text and Image Categorization.- Short Papers.- Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for Handling the Class Imbalanced Problem.- Using Highly Expressive Contrast Patterns for Classification - Is It Worthwhile?.- Arif Index for Predicting the Classification Accuracy of Features and Its Application in Heart Beat Classification Problem.- UCI++: Improved Support for Algorithm Selection Using Datasetoids.- Accurate Synthetic Generation of Realistic Personal Information.- An Efficient Approximate Protocol for Privacy-Preserving Association Rule Mining.- Information Extraction from Thai Text with Unknown Phrase Boundaries.- A Corpus-Based Approach for Automatic Thai Unknown Word Recognition using Ensemble Learning Techniques.- A Hybrid Approach to Improve Bilingual Multiword Expression Extraction.- Addressing the Variability of Natural Language Expression in Sentence Similarity with Semantic Structure of the Sentences.- Scalable Web Mining with Newistic.- Building a Text Classifier by a Keyword and Unlabeled Documents.- A Discriminative Approach to Topic-Based Citation Recommendation.-