LoRA and DoRA Implementation
Parameter-efficient fine-tuning from first principles — every matrix decomposition derived and implemented in PyTorch without libraries. Validated against Hugging Face PEFT outputs for correctness.
Personal project
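The core decompositions can be sketched in a few lines of numpy. This is an illustrative sketch, not the project's actual code: the dimensions, init scale, and variable names are assumptions. It shows the LoRA forward pass `h = Wx + (α/r)·BAx`, its equivalence to the merged weight `W + (α/r)·BA`, and DoRA's split of the merged weight into a per-column magnitude vector and a normalized direction.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 16, 4, 8   # toy dimensions (assumed for illustration)

W = rng.normal(size=(d_out, d_in))    # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01 # low-rank factor A (trainable, Gaussian init)
B = np.zeros((d_out, r))              # factor B starts at zero, so the adapter is a no-op at init
x = rng.normal(size=(d_in,))
scaling = alpha / r

# LoRA forward pass: base output plus scaled low-rank update
h_lora = W @ x + scaling * (B @ (A @ x))

# Equivalent merged weight: W' = W + (alpha/r) * B A
W_merged = W + scaling * (B @ A)
assert np.allclose(h_lora, W_merged @ x)

# DoRA: re-decompose the merged weight into magnitude m (trainable,
# initialized to the column norms of W) times a unit-norm direction.
m = np.linalg.norm(W, axis=0, keepdims=True)
col_norm = np.linalg.norm(W_merged, axis=0, keepdims=True)
W_dora = m * (W_merged / col_norm)
h_dora = W_dora @ x
```

Because `B` is zero at initialization, `W_merged` equals `W`, the column norms cancel against `m`, and both adapters reproduce the frozen model's output exactly, which is the property a PEFT cross-check would verify first.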
Chat inference on Apple Silicon using MLX — exploring the runtime, quantization options, and packaging story for local LLM deployment with Mistral and Llama2.