Machine-Learning-and-Language-Model
This project explores GPT-2 and Llama models through pre-training, fine-tuning, and Chain-of-Thought (CoT) prompting. It includes memory-efficient optimizations (SGD, LoRA, BAdam) and evaluations on math reasoning datasets (GSM8K, NumGLUE, SimulEq, SVAMP).
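As a quick illustration of the LoRA idea mentioned above (this is a hypothetical NumPy sketch, not code from this repo): rather than updating a full weight matrix `W`, LoRA freezes `W` and trains two small low-rank factors `B` and `A`, so the effective weight becomes `W + (alpha / r) * B @ A`.

```python
import numpy as np

class LoRALinear:
    """Minimal LoRA sketch: a frozen base weight plus a trainable
    low-rank update (alpha / r) * B @ A. Names and shapes here are
    illustrative assumptions, not this project's actual API."""

    def __init__(self, d_in, d_out, r=4, alpha=8, seed=0):
        rng = np.random.default_rng(seed)
        # Frozen pretrained weight (d_out x d_in).
        self.W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
        # Trainable low-rank factors: A is small-random, B is zero,
        # so the update starts at exactly zero.
        self.A = rng.standard_normal((r, d_in)) * 0.01
        self.B = np.zeros((d_out, r))
        self.scale = alpha / r

    def forward(self, x):
        # Base projection plus the scaled low-rank correction.
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

layer = LoRALinear(d_in=16, d_out=8)
x = np.ones((2, 16))
y = layer.forward(x)
# Because B is zero-initialized, the output initially equals the
# frozen base projection x @ W.T.
assert np.allclose(y, x @ layer.W.T)
```

Only `A` and `B` (here 4x16 and 8x4) would receive gradients, which is why LoRA cuts optimizer memory so sharply compared with full fine-tuning of the 8x16 base matrix.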