Taxonomy of Classification and Clustering
by Boris Mirkin
Copyright 2020
(Some of the leaf topics are supplied with synonyms after a colon.)

Root
├── 1 Clustering methods
│   ├── 1.1 Partition
│   │   ├── 1.1.1 K-means
│   │   ├── 1.1.2 Number of clusters
│   │   ├── 1.1.3 Self-organizing map: SOM, Kohonen map
│   │   ├── 1.1.4 Affinity propagation
│   │   ├── 1.1.5 Block-modeling
│   │   └── 1.1.6 Semi-supervised clustering: Semi-supervised learning, Labeled data
│   ├── 1.2 One cluster: single cluster
│   ├── 1.3 Hierarchical clustering
│   │   ├── 1.3.1 Agglomerative
│   │   ├── 1.3.2 Divisive
│   │   └── 1.3.3 Decision tree: classification tree, conceptual clustering
│   ├── 1.4 Fuzzy clustering: soft clustering, fuzzy partition
│   ├── 1.5 Probabilistic clustering
│   │   ├── 1.5.1 Density function: distribution
│   │   ├── 1.5.2 Mixture
│   │   ├── 1.5.3 Expectation Maximization: Maximum likelihood
│   │   └── 1.5.4 Markov models: stochastic model, hidden Markov model, Markov decision process
│   ├── 1.6 Cluster-wise regression: Regression clustering
│   ├── 1.7 Nature-inspired clustering: evolutionary algorithm, genetic algorithm, particle swarm, simulated annealing
│   ├── 1.8 Consensus clustering: Ensemble clustering
│   └── 1.9 Other clustering
├── 2 Data formats
│   ├── 2.1 Similarity
│   │   ├── 2.1.1 Additive clustering
│   │   ├── 2.1.2 Graph mining: Pattern mining
│   │   ├── 2.1.3 Community detection: clique
│   │   ├── 2.1.4 Spectral clustering: Normalized cut, Laplacian normalization
│   │   ├── 2.1.5 Maximum spanning tree: minimum spanning tree
│   │   ├── 2.1.6 Single linkage: nearest neighbor
│   │   ├── 2.1.7 Complete linkage: farthest neighbor
│   │   └── 2.1.8 Kernel function
│   ├── 2.10 Image analysis
│   │   ├── 2.10.1 Segmentation
│   │   ├── 2.10.2 Skeleton: framework, blueprint
│   │   └── 2.10.3 Edge detection
│   ├── 2.11 Vector space
│   │   ├── 2.11.1 Nominal features: categorical, character, attribute
│   │   ├── 2.11.2 Binary feature: Dummy variable
│   │   ├── 2.11.3 Data coding: data encoding, quantization
│   │   ├── 2.11.4 Feature weighting: feature selection, variable weighting, variable selection
│   │   └── 2.11.5 Data normalization: standardization, preprocessing
│   ├── 2.2 Distance: dissimilarity, metric
│   ├── 2.3 Tree: hierarchy, dendrogram
│   ├── 2.4 Preference: ordering, judgement
│   ├── 2.5 Incomplete: missing, gap
│   ├── 2.6 Spatial data: Geospatial data
│   ├── 2.7 Temporal data: dynamic, time series
│   ├── 2.8 Three-way: tensor, tricluster
│   └── 2.9 Text: Document
│       ├── 2.9.1 Document filtering: document summarization
│       ├── 2.9.2 Topic modeling: latent allocation
│       ├── 2.9.3 Information retrieval: document retrieval, search, tf-idf coding
│       └── 2.9.4 Text mining: information extraction
├── 3 Related methods
│   ├── 3.1 Ranking: ordering, seriation
│   ├── 3.10 Association
│   │   ├── 3.10.1 Contingency table
│   │   ├── 3.10.2 Comparing partitions: Rand index
│   │   └── 3.10.3 Association index: chi-squared
│   ├── 3.11 Regression analysis
│   │   ├── 3.11.1 Linear regression: correlation coefficient
│   │   ├── 3.11.2 Non-linear regression: Power law
│   │   └── 3.11.3 Spline
│   ├── 3.12 Formal concept: Galois association
│   ├── 3.2 Scaling: normalization, multidimensional scaling
│   ├── 3.3 Classifier: Pattern recognition
│   │   ├── 3.3.1 Discriminant analysis
│   │   ├── 3.3.2 Support vector machine: SVM
│   │   ├── 3.3.3 Nearest Neighbor
│   │   ├── 3.3.4 Neural network: Feedforward network, Backpropagation, Deep learning, Learning rate
│   │   ├── 3.3.5 Regularizer: Generalization
│   │   └── 3.3.6 Ensemble: Bagging, Boosting, Random forest
│   ├── 3.4 Dimensionality reduction
│   │   ├── 3.4.1 Singular value decomposition: SVD
│   │   ├── 3.4.2 Principal component analysis
│   │   ├── 3.4.3 Correspondence analysis: Dual scaling
│   │   └── 3.4.4 Independent component
│   ├── 3.6 Biclustering: Coclustering, Two-mode clustering
│   ├── 3.7 Visualization
│   │   ├── 3.7.1 Biplot
│   │   ├── 3.7.2 Dendrogram
│   │   ├── 3.7.3 Cloud
│   │   └── 3.7.4 Parallel coordinates
│   ├── 3.8 Matrix factorization
│   │   ├── 3.8.1 Non-negative decomposition
│   │   └── 3.8.2 Archetype analysis
│   └── 3.9 Anomaly detection: outlier
├── 4 Analysis of properties
│   ├── 4.1 Probability and statistics
│   │   ├── 4.1.1 Mixture of distributions
│   │   ├── 4.1.2 Bayesian
│   │   ├── 4.1.3 Maximum likelihood
│   │   └── 4.1.4 Markov-chain Monte Carlo methods
│   ├── 4.2 Geometry
│   ├── 4.3 Optimization: Minimization, Maximization
│   ├── 4.4 Cluster structures
│   ├── 4.5 Computational experiment
│   │   ├── 4.5.1 Cross-validation
│   │   ├── 4.5.2 Resampling: random sample
│   │   ├── 4.5.3 Leave-one-out: jackknife
│   │   ├── 4.5.4 Bootstrap
│   │   └── 4.5.5 Data generation: synthetic data
│   └── 4.6 Efficiency
│       ├── 4.6.1 Accuracy: Precision, Sensitivity
│       ├── 4.6.2 False positive: False negative, True positive, F-score
│       ├── 4.6.3 Cluster recovery: Ground truth, Kappa statistic
│       ├── 4.6.4 Receiver operating characteristic: Area under the curve, ROC curve, AUC
│       └── 4.6.5 Quadratic error: Mean squared error, squared error, inertia
└── 5 Applications
    ├── 5.1 Classification: typology, taxonomy
    ├── 5.10 Sociology: social science
    ├── 5.2 Evolution: phylogeny, Darwin
    ├── 5.3 Psychology
    ├── 5.4 Anthropology
    ├── 5.5 Geography
    ├── 5.6 Recommender systems: recommendation engine, collaborative filtering
    ├── 5.7 Interpretation: Explaining
    ├── 5.8 Knowledge representation and reasoning
    │   ├── 5.8.1 Ontology: Knowledge graph, knowledge base
    │   ├── 5.8.2 Reasoning: Expert system, Inference engine, production system, if-then rules
    │   ├── 5.8.3 Robotics: Autonomous device
    │   ├── 5.8.4 Belief network: Bayesian network, decision network
    │   └── 5.8.5 Fuzzy logic: Fuzzy systems
    └── 5.9 Bioinformatics: genomics, proteomics
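
A minimal sketch (plain Python, given only as an illustration) of how a fragment of this taxonomy and its synonyms-after-a-colon convention could be stored and processed. The dictionary layout and the helper name split_entry are illustrative assumptions, not part of the taxonomy itself.

def split_entry(entry):
    # Split a leaf entry "Topic: synonym1, synonym2" into (topic, [synonyms]);
    # an entry without a colon has no synonyms.
    if ":" in entry:
        topic, synonyms = entry.split(":", 1)
        return topic.strip(), [s.strip() for s in synonyms.split(",") if s.strip()]
    return entry.strip(), []

# A fragment of branch 1.1 (Partition), keyed by the taxonomy numbering.
partition = {
    "1.1.1": "K-means",
    "1.1.2": "Number of clusters",
    "1.1.3": "Self-organizing map: SOM, Kohonen map",
    "1.1.6": "Semi-supervised clustering: Semi-supervised learning, Labeled data",
}

for code, entry in partition.items():
    topic, synonyms = split_entry(entry)
    print(code, topic, synonyms)
    # e.g. 1.1.3 Self-organizing map ['SOM', 'Kohonen map']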