Taxonomy of Classification and Clustering
by Boris Mirkin
Copyright 2020
(Some of the leaf topics are supplied with synonyms after a colon.)

Root
├── 1 Clustering methods
│   ├── 1.1 Partition
│   │   ├── 1.1.1 K-means
│   │   ├── 1.1.2 Number of clusters
│   │   ├── 1.1.3 Self-organizing map: SOM, Kohonen map
│   │   ├── 1.1.4 Affinity propagation
│   │   ├── 1.1.5 Block-modeling
│   │   └── 1.1.6 Semi-supervised clustering: Semi-supervised learning, Labeled data
│   ├── 1.2 One cluster: single cluster
│   ├── 1.3 Hierarchical clustering
│   │   ├── 1.3.1 Agglomerative
│   │   ├── 1.3.2 Divisive
│   │   └── 1.3.3 Decision tree: classification tree, conceptual clustering
│   ├── 1.4 Fuzzy clustering: soft clustering, fuzzy partition
│   ├── 1.5 Probabilistic clustering
│   │   ├── 1.5.1 Density function: distribution
│   │   ├── 1.5.2 Mixture
│   │   ├── 1.5.3 Expectation Maximization: Maximum likelihood
│   │   └── 1.5.4 Markov models: stochastic model, hidden Markov model, Markov decision process
│   ├── 1.6 Cluster-wise regression: Regression clustering
│   ├── 1.7 Nature-inspired clustering: evolutionary algorithm, genetic algorithm, particle swarm, simulated annealing
│   ├── 1.8 Consensus clustering: Ensemble clustering
│   └── 1.9 Other clustering
├── 2 Data formats
│   ├── 2.1 Similarity
│   │   ├── 2.1.1 Additive clustering
│   │   ├── 2.1.2 Graph mining: Pattern mining
│   │   ├── 2.1.3 Community detection: clique
│   │   ├── 2.1.4 Spectral clustering: Normalized cut, Laplacian normalization
│   │   ├── 2.1.5 Maximum spanning tree: minimum spanning tree
│   │   ├── 2.1.6 Single linkage: nearest neighbor
│   │   ├── 2.1.7 Complete linkage: farthest neighbor
│   │   └── 2.1.8 Kernel function
│   ├── 2.10 Image analysis
│   │   ├── 2.10.1 Segmentation
│   │   ├── 2.10.2 Skeleton: framework, blueprint
│   │   └── 2.10.3 Edge detection
│   ├── 2.11 Vector space
│   │   ├── 2.11.1 Nominal features: categorical, character, attribute
│   │   ├── 2.11.2 Binary feature: Dummy variable
│   │   ├── 2.11.3 Data coding: data encoding, quantization
│   │   ├── 2.11.4 Feature weighting: feature selection, variable weighting, variable selection
│   │   └── 2.11.5 Data normalization: standardization, preprocessing
│   ├── 2.2 Distance: dissimilarity, metric
│   ├── 2.3 Tree: hierarchy, dendrogram
│   ├── 2.4 Preference: ordering, judgement
│   ├── 2.5 Incomplete: missing, gap
│   ├── 2.6 Spatial data: Geospatial data
│   ├── 2.7 Temporal data: dynamic, time series
│   ├── 2.8 Three-way: tensor, tricluster
│   └── 2.9 Text: Document
│       ├── 2.9.1 Document filtering: document summarization
│       ├── 2.9.2 Topic modeling: latent allocation
│       ├── 2.9.3 Information retrieval: document retrieval, search, tf-idf coding
│       └── 2.9.4 Text mining: information extraction
├── 3 Related methods
│   ├── 3.1 Ranking: ordering, seriation
│   ├── 3.10 Association
│   │   ├── 3.10.1 Contingency table
│   │   ├── 3.10.2 Comparing partitions: Rand index
│   │   └── 3.10.3 Association index: chi-squared
│   ├── 3.11 Regression analysis
│   │   ├── 3.11.1 Linear regression: correlation coefficient
│   │   ├── 3.11.2 Non-linear regression: Power law
│   │   └── 3.11.3 Spline
│   ├── 3.12 Formal concept: Galois association
│   ├── 3.2 Scaling: normalization, multidimensional scaling
│   ├── 3.3 Classifier: Pattern recognition
│   │   ├── 3.3.1 Discriminant analysis
│   │   ├── 3.3.2 Support vector machine: SVM
│   │   ├── 3.3.3 Nearest Neighbor
│   │   ├── 3.3.4 Neural network: Feedforward network, Backpropagation, Deep learning, Learning rate
│   │   ├── 3.3.5 Regularizer: Generalization
│   │   └── 3.3.6 Ensemble: Bagging, Boosting, Random forest
│   ├── 3.4 Dimensionality reduction
│   │   ├── 3.4.1 Singular value decomposition: SVD
│   │   ├── 3.4.2 Principal component analysis
│   │   ├── 3.4.3 Correspondence analysis: Dual scaling
│   │   └── 3.4.4 Independent component
│   ├── 3.6 Biclustering: Coclustering, Two-mode clustering
│   ├── 3.7 Visualization
│   │   ├── 3.7.1 Biplot
│   │   ├── 3.7.2 Dendrogram
│   │   ├── 3.7.3 Cloud
│   │   └── 3.7.4 Parallel coordinates
│   ├── 3.8 Matrix factorization
│   │   ├── 3.8.1 Non-negative decomposition
│   │   └── 3.8.2 Archetype analysis
│   └── 3.9 Anomaly detection: outlier
├── 4 Analysis of properties
│   ├── 4.1 Probability and statistics
│   │   ├── 4.1.1 Mixture of distributions
│   │   ├── 4.1.2 Bayesian
│   │   ├── 4.1.3 Maximum likelihood
│   │   └── 4.1.4 Markov-chain Monte Carlo methods
│   ├── 4.2 Geometry
│   ├── 4.3 Optimization: Minimization, Maximization
│   ├── 4.4 Cluster structures
│   ├── 4.5 Computational experiment
│   │   ├── 4.5.1 Cross-validation
│   │   ├── 4.5.2 Resampling: random sample
│   │   ├── 4.5.3 Leave-one-out: jackknife
│   │   ├── 4.5.4 Bootstrap
│   │   └── 4.5.5 Data generation: synthetic data
│   └── 4.6 Efficiency
│       ├── 4.6.1 Accuracy: Precision, Sensitivity
│       ├── 4.6.2 False positive: False negative, True positive, F-score
│       ├── 4.6.3 Cluster recovery: Ground truth, Kappa statistic
│       ├── 4.6.4 Receiver operating characteristic: Area under the curve, ROC curve, AUC
│       └── 4.6.5 Quadratic error: Mean squared error, squared error, inertia
└── 5 Applications
    ├── 5.1 Classification: typology, taxonomy
    ├── 5.10 Sociology: social science
    ├── 5.2 Evolution: phylogeny, Darwin
    ├── 5.3 Psychology
    ├── 5.4 Anthropology
    ├── 5.5 Geography
    ├── 5.6 Recommender systems: recommendation engine, collaborative filtering
    ├── 5.7 Interpretation: Explaining
    ├── 5.8 Knowledge representation and reasoning
    │   ├── 5.8.1 Ontology: Knowledge graph, knowledge base
    │   ├── 5.8.2 Reasoning: Expert system, Inference engine, production system, if-then rules
    │   ├── 5.8.3 Robotics: Autonomous device
    │   ├── 5.8.4 Belief network: Bayesian network, decision network
    │   └── 5.8.5 Fuzzy logic: Fuzzy systems
    └── 5.9 Bioinformatics: genomics, proteomics
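
A minimal sketch (plain Python, given only as an illustration) of how a fragment of this taxonomy and its synonyms-after-a-colon convention could be stored and processed. The dictionary layout and the helper name split_entry are illustrative assumptions, not part of the taxonomy itself.

def split_entry(entry):
    # Split a leaf entry "Topic: synonym1, synonym2" into (topic, [synonyms]);
    # an entry without a colon has no synonyms.
    if ":" in entry:
        topic, synonyms = entry.split(":", 1)
        return topic.strip(), [s.strip() for s in synonyms.split(",") if s.strip()]
    return entry.strip(), []

# A fragment of branch 1.1 (Partition), keyed by the taxonomy numbering.
partition = {
    "1.1.1": "K-means",
    "1.1.2": "Number of clusters",
    "1.1.3": "Self-organizing map: SOM, Kohonen map",
    "1.1.6": "Semi-supervised clustering: Semi-supervised learning, Labeled data",
}

for code, entry in partition.items():
    topic, synonyms = split_entry(entry)
    print(code, topic, synonyms)
    # e.g. 1.1.3 Self-organizing map ['SOM', 'Kohonen map']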