Machine learning
Notes on Machine Learning
This collection of links covers machine learning and artificial intelligence, from foundational concepts to advanced techniques and frameworks.
Machine Learning Overview
- The difference between AI and Machine Learning: Clarifying misconceptions and highlighting the distinctions between AI and ML.
Core Concepts
- Data Handling and Generation: Techniques for creating synthetic data, used to train models where real data is scarce.
- Model Explainability: Making machine learning models understandable to humans.
- Handling Drift: Maintaining model performance over time by addressing concept, model, and data drift.
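As a rough illustration of the data drift item above, the sketch below compares a reference sample of one numeric feature with a more recent sample using the Population Stability Index (PSI). This is only one of several possible drift checks and is not taken from the linked notes. The function name `population_stability_index`, the equal-width binning, and the `1e-6` floor for empty bins are all assumptions made for this example.

```python
import math


def population_stability_index(reference, current, bins=10):
    """PSI between two samples of a single numeric feature.

    Bin edges are equal-width and derived from the reference sample;
    current values outside that range fall into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty (illustrative choice).
        return [max(c / len(values), 1e-6) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # Each term is non-negative, so the PSI grows as the distributions diverge.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
```

Identical samples give a PSI of zero, and the value grows as the current sample moves away from the reference. The alerting threshold is a judgment call that depends on the feature and the application.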
Model Development and Evaluation
- Optimisation Techniques: Algorithms to improve model performance, including both gradient-based and gradient-free options.
- Gradient Descent, Stochastic Gradient Descent, Stochastic Gradient Descent with momentum, Mini-Batch Gradient Descent, Adagrad, RMSProp, AdaDelta, Adam, Gradient-free optimisation
- Model Selection and Evaluation: Strategies for selecting the best model and assessing its performance.
- Cross-validation, Kernel functions: Interpretation and applications, Kernel functions
- Error and Performance Metrics: Metrics to evaluate model errors and performance.
- Error metrics: Distance metrics
- Performance: Language performance metrics, Model performance metrics
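To make the gradient-based optimisers listed above more concrete, here is a minimal sketch of gradient descent with momentum on a one-dimensional quadratic. The function names, the test function f(x) = (x - 3)^2, and the hyperparameter values are assumptions chosen for illustration, not taken from the linked notes. Setting `beta=0` reduces the update to plain gradient descent.

```python
def minimise_with_momentum(grad, x0, lr=0.1, beta=0.9, steps=300):
    """Gradient descent with momentum on a single scalar parameter.

    grad  : function returning the gradient at x
    lr    : learning rate (step size)
    beta  : momentum coefficient; 0 gives plain gradient descent
    """
    x, velocity = x0, 0.0
    for _ in range(steps):
        # Keep a decayed fraction of the previous step and add the new gradient step.
        velocity = beta * velocity - lr * grad(x)
        x += velocity
    return x


def grad_quadratic(x):
    """Gradient of f(x) = (x - 3)^2, which has its minimum at x = 3."""
    return 2.0 * (x - 3.0)


x_momentum = minimise_with_momentum(grad_quadratic, x0=0.0)
x_plain = minimise_with_momentum(grad_quadratic, x0=0.0, beta=0.0)
```

Both runs should end close to the minimiser x = 3. Stochastic and mini-batch variants use the same update, but estimate the gradient from a random subset of the training data at each step.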
Machine Learning Methods and Techniques
- Time-Series Analysis: Analysing and predicting data that changes over time, and detecting anomalies in streaming data.
- Clustering: Grouping data points based on their similarities.
- Fairness: Ensuring models do not perpetuate biases.
- Transformations: Preprocessing steps for effective algorithm performance.
- Recurrent Neural Networks (RNN): Recognising patterns in sequences of data with LSTM networks.
Machine Learning Applications and Frameworks
- Supervised and Unsupervised Learning: Differentiating these foundational approaches with examples.
- Supervised methods: Random Forest, Regression: Gaussian Process Regression
- Unsupervised methods: Self-organising maps
LLMs (Large Language Models)
- Overview of LLMs: Understanding the architecture and capabilities of large language models, including their applications in natural language processing and beyond.
- Introduction to LLMs, Applications of LLMs
- Retrieval-Augmented Generation (RAG): Enhancing LLMs with external knowledge retrieval.
- LLM Evaluation and Benchmarking: Methods for assessing the performance of LLMs, including metrics and benchmarks relevant to their capabilities.
- Tying into existing evaluation metrics: Language performance metrics, Model performance metrics
- LLM Evaluation: LLM evaluation
- LLM Security and Safety: Exploring vulnerabilities, attack vectors, and defence mechanisms for large language models, including techniques to prevent misuse and ensure responsible deployment.
- Adversarial Attacks: Understanding how malicious inputs can manipulate model behaviour through techniques like prompt injection and jailbreaking.
- Detection and Prevention: Methods for identifying and mitigating security threats to LLMs in production environments.
- Alignment and Safety: Approaches to ensure LLMs behave according to human values and resist harmful outputs.
- Machine Learning Frameworks and Tools: Tools and frameworks for ML development, from model building to deployment.
- KServe, Cookiecutter Data Science, Scikit-learn, Model serving
Fundamental Theory and Statistics
- Statistics in Machine Learning: Statistical concepts and methods crucial for ML algorithms and evaluation.