Research Interests
I am primarily interested in the theory side of Data Mining, Machine Learning and, to an ever-increasing extent, Statistics. I enjoy solving the mathematical problems that arise in these fields and developing efficient algorithms that implement the solutions on large and/or high-dimensional data sets. Application areas include:
- Itemset and association rule mining
- Uncertain and probabilistic databases
- Rule mining
- Classification, in particular in imbalanced datasets
- Statistical approaches in data mining
- Spatial-temporal data mining
- Data streams
- Feature generation and selection
- Clustering
- Graph mining
- Mining heterogeneous and mixed data
I am particularly interested in the use of probability and statistics in Data Mining, for example the integration of rigorous statistical approaches into Data Mining techniques. I think it is important to ensure that the decisions made by an algorithm and the information provided to the user are meaningful (in particular, statistically significant) as well as easy to understand.
I have often drawn inspiration from viewing problems and solutions from a geometric perspective, in particular by using vectors and vector-based algorithms as well as graph-based approaches to solve problems.
I am also interested in parallel, randomised and approximation algorithms to deal with the inherent complexity of Data Mining and Machine Learning problems.
Research Problems
Please see here (NOTE: this information is out of date).