LLM-From-Scratch
PublicMedical Language Model fine-tuned using pretraining, instruction tuning, and Direct Preference Optimization (DPO). Progresses from general medical knowledge to specific instruction following, with experiments in preference alignment for improved medical text generation and understanding.