Tools & Setup

The tools I reach for across recommendation systems, reinforcement learning, LLM products, and production ML infrastructure — shaped by building at Farfetch and Wellhub.

Core ML & Data Science

The fundamentals — used daily since Farfetch and still the backbone of most production work.

Python: Language I think in; glue for everything below.
scikit-learn: Classical ML, preprocessing, and evaluation utilities.
XGBoost: Gradient boosting for tabular data; fast and battle-tested.
LightGBM: Efficient gradient boosting for large-scale datasets.
Pandas: Data wrangling, EDA, quick prototyping.
NumPy: Numerical computing and array operations.
SQL / BigQuery: Analytics queries and warehouse workloads.
Spark: Distributed data processing at scale.
Optuna: Hyperparameter optimization with Bayesian search and pruning.

Deep Learning

From custom LSTM architectures at Farfetch to LLM fine-tuning at Wellhub.

PyTorch: Default framework for training and experimenting.
Transformers: Pretrained models, tokenizers, pipelines.
PEFT / TRL: Efficient fine-tuning with LoRA, QLoRA, and RLHF.
DeepSpeed: Distributed training and inference optimization.
Axolotl: Streamlined LLM fine-tuning workflows.
Unsloth: Fast and memory-efficient LLM fine-tuning.

Recommendation & Personalization

The RecSys stack — from candidate retrieval through ranking to real-time serving.

Vespa.ai: Real-time serving and ranking for recommendations at Farfetch. Candidate retrieval, filtering, and learning-to-rank in one system.
Vowpal Wabbit: Contextual bandits for adaptive personalization.
FAISS: Dense vector similarity for candidate retrieval.
Weaviate: Vector database for semantic search and retrieval.
Elasticsearch: Hybrid search and filtering in recommendation pipelines.
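
To make "dense vector similarity for candidate retrieval" concrete, here is a minimal sketch of the idea in plain Python. It is an exact, brute-force cosine search, not FAISS or any of the tools above, and the item names, vectors, and the `top_k` helper are all invented for illustration. In practice a library like FAISS does the same job over millions of vectors, usually with approximate indexes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, item_vectors, k=2):
    """Score every item against the query and return the k most similar item ids."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in item_vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]


# Hypothetical 3-dimensional item embeddings, made up for this example.
items = {
    "sneakers": [0.9, 0.1, 0.0],
    "boots": [0.8, 0.3, 0.1],
    "handbag": [0.1, 0.9, 0.2],
    "scarf": [0.0, 0.4, 0.9],
}
user = [1.0, 0.2, 0.0]  # hypothetical user embedding

candidates = top_k(user, items, k=2)
print(candidates)
```

The retrieved candidates would then go to a ranking stage; the brute-force scan here is only reasonable for toy catalogue sizes.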

Reinforcement Learning & Bandits

From contextual bandits in production to policy optimization research.

Stable Baselines3: Reliable RL algorithm implementations.
Gymnasium: Standard API for RL environments.
Vowpal Wabbit: Contextual bandits and online learning at scale.
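
For readers new to contextual bandits, the loop is: observe a context, pick an action, see a reward, update. The sketch below is a deliberately simple tabular epsilon-greedy version in plain Python. It is not how Vowpal Wabbit works internally and not a production setup; the class name, contexts, actions, and click probabilities are all hypothetical and exist only to show the loop.

```python
import random
from collections import defaultdict


class EpsilonGreedyContextualBandit:
    """Tabular epsilon-greedy bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> number of updates
        self.values = defaultdict(float)  # (context, action) -> mean observed reward

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best-known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def update(self, context, action, reward):
        # Incremental mean: new_mean = old_mean + (reward - old_mean) / n
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


def simulate(rounds=5000, seed=0):
    # Hypothetical click probabilities per (context, action), invented for the demo.
    click_prob = {
        ("new_user", "popular"): 0.30, ("new_user", "personalized"): 0.10,
        ("returning", "popular"): 0.10, ("returning", "personalized"): 0.40,
    }
    env = random.Random(seed + 1)
    bandit = EpsilonGreedyContextualBandit(["popular", "personalized"], seed=seed)
    for _ in range(rounds):
        context = env.choice(["new_user", "returning"])
        action = bandit.choose(context)
        reward = 1.0 if env.random() < click_prob[(context, action)] else 0.0
        bandit.update(context, action, reward)
    return bandit


bandit = simulate()
for key in sorted(bandit.values):
    print(key, round(bandit.values[key], 2))
```

After enough rounds the estimated values should favour a different action for each context, which is the whole point of conditioning on context. Real systems replace the lookup table with a model over context features.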

LLM Tooling

Building and orchestrating LLM-powered products.

LangChain: Composable chains for LLM applications.
LangGraph: Stateful agent orchestration with cycles.
LlamaIndex: RAG pipelines and data connectors for LLMs.
DSPy: Programmatic prompt optimization and pipelines.
LiteLLM: Unified API proxy for multiple LLM providers.
vLLM: High-throughput model serving with PagedAttention.
Ollama: Local LLM inference for development and testing.

Evaluation & Monitoring

Experiment tracking, model evaluation, and production dashboards.

Weights & Biases: Training dashboards and hyperparameter sweeps.
MLflow: Experiment tracking and model registry.
RAGAS: RAG evaluation framework.
DeepEval: LLM evaluation and testing.
Opik: LLM observability and tracing.
LangSmith: LLM application tracing and evaluation.
Looker: Production dashboards and business analytics.

Infrastructure & MLOps

Training orchestration, real-time pipelines, and deployment. Databricks and Azure are the primary platforms.

Databricks: Unified analytics platform for ML and data engineering.
Azure: Primary cloud platform for compute, storage, and ML services.
Kubernetes: Container orchestration for scalable deployments.
Docker: Containerisation for reproducible everything.
Kubeflow: ML pipelines on Kubernetes, from training to serving.
Kafka: Event streaming for real-time data pipelines.
Argo: Kubernetes-native workflow orchestration and continuous delivery.
Airflow: Workflow scheduling and pipeline orchestration.
Terraform: Infrastructure as code for cloud provisioning.
AWS: Secondary cloud (S3, SageMaker, Lambda).

Apps & Prototyping

Rapid prototyping for data apps and ML demos.

Streamlit: Quick interactive dashboards and data apps.