Medal's AI lab General Intuition raised a $133.7M seed round led by Khosla Ventures and General Catalyst. The lab leverages Medal's library of game clips to train AI models focused on spatio-temporal reasoning, with its dataset as a key advantage.
SAIL-VL2, a multimodal model from TikTok's SAIL team and the LV-NUS Lab, delivers strong results across 106 datasets at compact 2B and 8B parameter scales, outperforming peers on complex reasoning benchmarks such as MMMU and MathVista and rivaling large closed models.
Google launched an MCP server that gives AI agents efficient access to public datasets, reducing errors and grounding answers in verifiable data. Standardized access simplifies data retrieval and speeds up development of data-driven applications.
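For context, here is a minimal sketch of how an agent could call such a server over MCP using the official Python SDK; the launch command and tool name below are illustrative assumptions, not Google's published interface.

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Hypothetical launch command for a public-datasets MCP server;
# the real server's command and tool names may differ.
params = StdioServerParameters(command="dataset-mcp", args=["--readonly"])

async def main():
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()  # discover what the server exposes
            print([t.name for t in tools.tools])
            # "query_dataset" is an assumed tool name for illustration.
            result = await session.call_tool(
                "query_dataset",
                arguments={"dataset": "census", "question": "population of Texas in 2020"},
            )
            print(result.content)

asyncio.run(main())
```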
Tencent released WeChat-YATT, a scalable and efficient training library for reinforcement learning and multimodal learning, optimized for large models and datasets.
Radal is a no-code platform that allows you to fine-tune small language models using your own data. Connect your datasets, configure training visually, and deploy models in minutes.
An AI image annotation tool for quickly building datasets for complex scenarios with minimal overhead.
A platform for unstructured data processing that helps businesses quickly build industry datasets and integrate them into LLM RAG knowledge bases.
RAG-FiT is a library designed to enhance LLMs' capability to utilize external information by fine-tuning models with specifically created RAG-enhanced datasets.
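To make the idea concrete (this is a generic sketch of RAG-augmented training data, not RAG-FiT's actual API), a training record packs retrieved passages into the prompt so the model learns to ground its answer in them:

```python
# Generic illustration of building a RAG-augmented fine-tuning record.
# This shows the technique only; it is NOT RAG-FiT's interface.

def build_rag_example(question: str, passages: list[str], answer: str) -> dict:
    """Pack retrieved context into the prompt so the model learns to use it."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    return {"prompt": prompt, "completion": " " + answer}

# Hypothetical sample record.
example = build_rag_example(
    "When was the Model Context Protocol announced?",
    ["The Model Context Protocol was announced in November 2024."],
    "November 2024.",
)
print(example["prompt"])
```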
azure: $0.36 per M input tokens, $0.72 per M output tokens, 128k context length
GilbertAkham
This is a multi-task fine-tune of DeepSeek-R1-Distill-Qwen-1.5B, trained across multiple datasets via LoRA adapters. It generalizes well across tasks and handles a wide range of natural-language and reasoning workloads.
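As a rough sketch of the approach (the model card's actual recipe and hyperparameters are not given here, so the values below are illustrative), attaching a LoRA adapter to the base model with Hugging Face peft looks like this:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Illustrative LoRA hyperparameters; the published recipe may differ.
config = LoraConfig(
    r=16,                # adapter rank
    lora_alpha=32,       # scaling factor
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
# From here, train as usual (e.g., transformers.Trainer) on the mixed multi-task data.
```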
bartowski
This is a quantized version of internlm's JanusCoder-14B model, built with standard quantization tooling and calibration datasets. It provides files across a range of quantization types, from small, lower-quality variants to larger, higher-quality ones, and runs in LM Studio or in projects built on llama.cpp.
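To illustrate the llama.cpp route, loading a GGUF quantization with the llama-cpp-python bindings takes a few lines; the file name below is a placeholder for whichever quantization file you download:

```python
from llama_cpp import Llama

# Placeholder file name: substitute the quantization you actually downloaded.
llm = Llama(model_path="JanusCoder-14B-Q4_K_M.gguf", n_ctx=8192)

out = llm.create_completion(
    "Write a Python function that reverses a linked list.",
    max_tokens=256,
    temperature=0.2,
)
print(out["choices"][0]["text"])
```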
mradermacher
Lamapi/next-12b is a 12-billion-parameter multilingual large language model available in multiple quantized versions, supporting natural language processing tasks such as text generation, question answering, and chat. Trained on datasets from multiple domains, it is efficient and lightweight.
LiquidAI
PyLate is a library focused on sentence-similarity computation and information retrieval. It performs efficient retrieval across multiple datasets, providing strong support for research and applied work in the field. This model supports 8 languages and performs strongly on multiple benchmarks.
onnx-community
Granite-4.0-1B is a lightweight instruction model developed by IBM, fine-tuned from Granite-4.0-1B-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using techniques such as supervised fine-tuning, reinforcement learning, and model merging. It is suited to on-device deployment and research use cases.
Jesteban247
brats_medgemma_light is a fusion model based on unsloth/medgemma-4b-it, fine-tuned on the BraTS and TextBraTS datasets. It is a lightweight vision-language model specifically designed for brain MRI interpretation and radiology text generation.
mlfoundations-cua-dev
OLGA is an online reinforcement-learning GUI grounding agent built on Qwen3-VL-30B-A3B-Instruct, a mixture-of-experts model with 3.3 billion active parameters. It is trained with a new data recipe combining existing datasets, new data collection, automatic filtering, and online reinforcement learning, achieving leading grounding performance among open-source models.
maomao0819
BEVANet is a deep learning model for real-time semantic segmentation that performs well on datasets such as Cityscapes, reaching 81.0% mIoU at 32.8 FPS on an RTX 3090 and balancing accuracy against speed.
ibm-granite
Granite-4.0-350M is a lightweight instruction model developed by IBM, fine-tuned from Granite-4.0-350M-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement learning, and model merging. It offers strong instruction-following and tool-calling capabilities.
Granite-4.0-1B is a lightweight instruction model developed by IBM, fine-tuned from Granite-4.0-1B-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement learning, and model merging.
Granite-4.0-H-350M is a lightweight instruction model developed by IBM, fine-tuned from Granite-4.0-H-350M-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement learning, and model merging. It offers strong instruction-following capabilities and multilingual support.
rand0nmr
Wan2.2 is a major upgrade to the Wan foundation video model. It introduces a Mixture-of-Experts (MoE) architecture, incorporates carefully curated aesthetic data, and is trained on larger datasets to improve generation of complex motion. The model generates 5-second videos at 480P and 720P, with significant gains in video quality and performance.
unsloth
Granite-4.0-H-Small is a 32-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-H-Small-Base. It combines open-source instruction datasets with internal synthetic datasets and uses techniques such as supervised fine-tuning, reinforcement-learning alignment, and model merging. Instruction following and tool calling are significantly improved, making it particularly suitable for enterprise applications.
Granite-4.0-H-Micro is a 3-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-H-Micro-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement-learning alignment, and model merging. It uses a structured chat format and excels at instruction following and tool calling.
Granite-4.0-H-Tiny is a 7-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-H-Tiny-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement-learning alignment, and model merging. Enhanced instruction following and tool calling make it particularly suitable for enterprise applications.
Granite-4.0-Micro is a 3-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-Micro-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement-learning alignment, and model merging. Enhanced instruction following and tool calling make it particularly suitable for enterprise applications.
Granite-4.0-H-Small is a 32-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-H-Small-Base. It combines open-source instruction datasets with internal synthetic datasets and was developed using supervised fine-tuning, reinforcement-learning alignment, and model merging, with significant improvements in instruction following and tool calling.
Salesforce
GTA1 is a state-of-the-art GUI grounding model trained with reinforcement learning (GRPO), designed specifically for graphical user interface automation. Unlike methods that rely on long chain-of-thought reasoning, its GRPO training directly rewards actionable, well-grounded responses, and it demonstrates excellent grounding and agent performance across multiple challenging datasets.
ce-lery
This is a Japanese pretrained language model based on the Mistral 300M architecture. It is trained on the Wikipedia and CC100 datasets and adopts the byte-fallback option of the SentencePiece tokenizer to suppress generation of unknown tokens.
Granite-4.0-H-Tiny is a 7-billion-parameter long-context instruction model developed by IBM, fine-tuned from Granite-4.0-H-Tiny-Base. Trained on a combination of open-source instruction datasets and internal synthetic datasets, it delivers professional, accurate, and safe responses, supports multiple languages and tool calling, and is suitable for enterprise applications.
An MCP server that provides financial market data, supporting queries for stock and cryptocurrency prices and financial data.
A FastMCP-based server project that provides an access interface to Israel's OpenBudget data, supporting query and search across various budget-related datasets.
This project provides a server based on the Model Context Protocol (MCP) for accessing the open data API of Statistics Netherlands (CBS), enabling AI tools to query statistical datasets, dimensions, and observations through the MCP protocol.
The Foundry MCP Server is a Model Context Protocol server for interacting with the Foundry platform, supporting AI assistants to operate on datasets, ontology objects, and execute functions.
The Malaysia Open Data MCP service provides convenient access to government datasets and collections, supporting enhanced unified search, Parquet file parsing, a hybrid data-access architecture, and multi-provider geocoding.
An MCP server for accessing and operating on Hugging Face datasets.
An MCP server that provides read-only access to the Hugging Face Hub API, exposing resources such as models and datasets for LLM interaction.
The Honeycomb MCP server is an interface that enables Claude AI to interact with the Honeycomb API via the Model Context Protocol (MCP), allowing operations such as retrieving, creating, and updating datasets.
This project is a Kaggle MCP server built on the FastMCP library, which provides functions for searching and downloading Kaggle datasets and can generate EDA notebook prompts.
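As a hedged sketch of what such a server's core could look like (the tool name and return shape are illustrative, not this project's actual code), FastMCP can expose a Kaggle dataset search as an MCP tool in a few lines:

```python
from fastmcp import FastMCP
from kaggle.api.kaggle_api_extended import KaggleApi

mcp = FastMCP("kaggle-search")  # illustrative server name
api = KaggleApi()
api.authenticate()  # reads credentials from ~/.kaggle/kaggle.json

@mcp.tool()
def search_datasets(query: str, max_results: int = 5) -> list[str]:
    """Search Kaggle datasets and return their refs (owner/slug). Illustrative tool."""
    results = api.dataset_list(search=query)
    return [str(d.ref) for d in results[:max_results]]

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```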
Honeycomb MCP is a Model Context Protocol server for interacting with Honeycomb observability data, supporting querying datasets across multiple environments.
The EOSC Data Commons MCP Server provides an HTTP interface to help users find the required datasets and tools through a search API and large language models.
This project implements an API server that complies with the MCP protocol for searching the PRIDE Archive proteomics database and supports AI models to interact with proteomics datasets through structured function calls.
An MCP server designed for quantitative research, used to manage the research knowledge graph and support the structured representation of research projects, datasets, variables, hypotheses, statistical tests, models, and results.
The Power BI MCP Server is a service based on the Model Context Protocol (MCP) that allows interaction with Power BI datasets through natural language, enabling automatic DAX query generation, data exploration, and instant insight capabilities.
MCP Analyst is an MCP server that enables Claude to analyze local CSV or Parquet files, suitable for scenarios involving large datasets that exceed the context window limit or require cost optimization.
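The pattern is worth spelling out: the server runs queries locally and returns only small results, so raw rows never enter the model's context. A minimal sketch with DuckDB, using a hypothetical file and columns:

```python
import duckdb

# Hypothetical file and columns; the point is that only the aggregate,
# not the raw rows, ever reaches the model's context window.
result = duckdb.sql("""
    SELECT category, COUNT(*) AS n, AVG(price) AS avg_price
    FROM 'sales.parquet'
    GROUP BY category
    ORDER BY n DESC
    LIMIT 10
""").fetchall()
print(result)
```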
dank-mcp is a server based on the Model Context Protocol (MCP), specifically designed to answer questions about cannabis datasets. It stores and processes data through DuckDB, supports integration with LLM tools such as Claude Desktop, and provides a query service for cannabis product information for educational purposes.
The Powerdrill MCP Server is a service based on the Model Context Protocol, providing tools for interacting with Powerdrill datasets and supporting authentication through a user ID and project API key.
The Opera Omnia MCP server provides access to rich JSON datasets, supporting random selection, filtering, and content generation.
An automated tool for building and maintaining MCP server datasets, integrating GitHub resources and curated lists for daily updates and categorization management.
A NodeJS-based Kaggle MCP service for exploring datasets and creating notebooks.