ReaderLM v2, introduced by Jina AI, is a small language model with 1.5 billion parameters, designed specifically for converting HTML to Markdown and for extracting JSON from HTML with high accuracy. It supports 29 languages and handles a combined input and output length of up to 512,000 tokens. Thanks to a new training paradigm and higher-quality training data, it improves significantly on its predecessor in handling long text and in generating Markdown: it uses Markdown syntax proficiently and can produce complex elements such as code fences, nested lists, tables, and LaTeX equations. ReaderLM v2 also generates JSON directly from HTML, so users can extract specific information from raw HTML according to a provided JSON schema without an intermediate Markdown conversion.
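To make the schema-guided extraction concrete, the sketch below shows how a request might be prepared: raw HTML and a JSON schema are combined into one instruction, and the combined input and output length is checked against the 512,000-token limit. The prompt wording, the function names `build_extraction_prompt` and `fits_budget`, and the sample page are all assumptions made for illustration; they are not the model's documented prompt format, and the model call itself is deliberately left out.

```python
import json

# Combined input + output limit stated in the text above.
MAX_COMBINED_TOKENS = 512_000


def build_extraction_prompt(html: str, schema: dict) -> str:
    """Combine raw HTML and a JSON schema into one instruction string.

    The wording and layout here are hypothetical, not an official template.
    """
    schema_text = json.dumps(schema, indent=2)
    return (
        "Extract the information described by the JSON schema "
        "from the HTML below and answer with JSON only.\n\n"
        f"JSON schema:\n{schema_text}\n\n"
        f"HTML:\n{html}"
    )


def fits_budget(input_tokens: int, max_output_tokens: int) -> bool:
    """Return True if input plus requested output stays within the limit."""
    return input_tokens + max_output_tokens <= MAX_COMBINED_TOKENS


# Hypothetical product page and the fields we want back.
html = (
    "<html><body><h1>Acme Widget</h1>"
    "<span class='price'>19.99</span></body></html>"
)
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["title", "price"],
}

prompt = build_extraction_prompt(html, schema)
```

The resulting `prompt` would be passed to the model through whichever inference stack is in use. Because the schema travels with the HTML, the response can be parsed with `json.loads` and validated against the same schema, with no Markdown step in between.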