Information

Latest AI News

Explore AI Frontiers, Master Industry Trends

AI Daily Brief

Your Daily AI Brief - Never Miss What's Next

Information

AI Product Finder

Smart Product Discovery - Comprehensive Market Intelligence

AI Product Rankings

AI Product Power Rankings - Performance, Buzz & Trends

AI Product Submit

Submit Your AI Product - Amplify Reach & Drive Growth

Tools

AI Tools Directory

Discover The Best AI Websites & Tools

Tools

GEO Brand Visibility

All-in-One GEO Brand Insights Platform

AI Visibility Audit

Quickly check how your brand is perceived and presented in AI-powered search results.

AI Search Visibility Checker

Detect brand's visibility on AI platforms

GEO Promotion Link Detection

Quickly evaluate the citation of promotion articles on AI platforms

Service

GEO Ranking Optimization System

Own your own GEO system and become a professional GEO optimization service provider.

GEO Services

Achieve Dominant Visibility in AI Search for Your Business or Brand with GEO Services

Information

MCP Servers

Discover Popular AI-MCP Services - Find Your Perfect Match Instantly

MCP Client

Easy MCP Client Integration - Access Powerful AI Capabilities

MCP Case Tutorials

Master MCP Usage - From Beginner to Expert

MCP Ranking

Top MCP Service Performance Rankings - Find Your Best Choice

MCP Service Submission

Publish & Promote Your MCP Services

Tools

MCP Playground

Test MCP Services Freely - Quick Online Experience

MCP Inspector

Quick MCP Service Testing - Fast Deployment

Information

LLM API Hub

One-stop integration for all major LLM APIs.

AI Models Finder

Comprehensive AI Models Collection for All Your Development & Research Needs

Model Providers

Discover Trusted AI Model Partners - Guaranteed Reliable Support

LLM Leaderboard

AI LLM Power Rankings - Performance, Buzz & Trends

Tools

Compare LLMs

Multi-Dimensional Large Model Comparison - Find Your Perfect Match

LLM Cost Calculator

Calculate AI Model Costs Accurately - Optimize Your Budget

LLM Arena

Multi-Model Real-Time Evaluation & Quick Output Comparison

AI Model Compatibility Checker

Free PC Hardware Test for DeepSeek & Llama

AI Deployment Calculator

Enter Your Large Model Computing Requirements for Instant GPU, Memory & Server Configuration Recommendations

moondream

A powerful small visual language model, accessible everywhere.

Common Product •Image•Visual•Language Model

moondream is a 1.6-billion-parameter model built using SigLIP, Phi-1.5, and the LLaVA training dataset. Because it was trained on the LLaVA dataset, the weights are released under the CC-BY-SA license. You can try it out on Hugging Face Spaces. The model's scores on the VQAv2, GQA, VizWiz, and TextVQA benchmarks, alongside those of comparable models, are as follows ("-" means not reported):

Model (parameters)     VQAv2   GQA    VizWiz   TextVQA
LLaVA-1.5 (13.3B)      80.0    63.3   53.6     61.3
LLaVA-1.5 (7.3B)       78.5    62.0   50.0     58.2
MC-LLaVA-3B (3B)       64.2    49.6   24.9     38.6
LLaVA-Phi (3B)         71.4    -      35.9     48.6
moondream1 (1.6B)      74.3    56.3   30.3     39.8
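To put the benchmark figures quoted above in perspective, the following is a minimal Python sketch that expresses one model's scores as a fraction of another's. The numbers are copied from the description; the names `SCORES`, `BENCHMARKS`, and `relative_to` are hypothetical helpers introduced here for illustration and are not part of moondream or any library.

```python
# Benchmark scores as quoted in the description (None = not reported).
# Order of each tuple follows BENCHMARKS.
BENCHMARKS = ("VQAv2", "GQA", "VizWiz", "TextVQA")

SCORES = {
    "LLaVA-1.5 (13.3B)": (80.0, 63.3, 53.6, 61.3),
    "LLaVA-1.5 (7.3B)": (78.5, 62.0, 50.0, 58.2),
    "MC-LLaVA-3B": (64.2, 49.6, 24.9, 38.6),
    "LLaVA-Phi": (71.4, None, 35.9, 48.6),
    "moondream1": (74.3, 56.3, 30.3, 39.8),
}


def relative_to(model, baseline):
    """Return {benchmark: model_score / baseline_score}, rounded to 3 places.

    Benchmarks where either model has no reported score are skipped.
    """
    return {
        name: round(m / b, 3)
        for name, m, b in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
        if m is not None and b is not None
    }


if __name__ == "__main__":
    for name, ratio in relative_to("moondream1", "LLaVA-1.5 (7.3B)").items():
        print(f"{name}: {ratio:.1%} of LLaVA-1.5 (7.3B)")
```

Read this way, the quoted numbers suggest moondream1 stays close to the much larger LLaVA-1.5 (7.3B) on VQAv2 and GQA, while the gap is wider on VizWiz and TextVQA.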

moondream Visit Over Time

Monthly Visits

493,360,068

Bounce Rate

36.08%

Pages per Visit

6.1

Visit Duration

00:06:29

[Charts not shown: moondream visit trend, visit geography, and traffic sources]

moondream Alternatives

moondream — A powerful small visual language model, accessible everywhere.

Image

•Visual•Language Model

534

MouSi — Multimodal Visual Language Model

Productivity

•Multimodal•Visual Language Model

450

Llama-3.2-11B-Vision — A multimodal large language model that supports image and text processing.

Productivity

•Multimodal•Image Processing

924

OpenGVLab InternVL — An AI visual language model providing image analysis and description services.

Chatting

•Image Recognition•Deep Learning

204

CogVLM — A powerful open-source visual language model

Image

•Visual Language Model•Image Description

1398

InternLM-XComposer-2.5 — A Multifunctional Large Visual Language Model

Productivity

•Visual Language Model•Long Context Processing

774

Trustworthy Language Model (TLM) Playground — Try Cleanlab's Trustworthy Language Model (TLM) in your browser

Productivity

•Natural Language Processing•Language Model

234

InternVL2_5-1B-MPO — A multimodal large language model that enhances integrated understanding of visual and language data.

Productivity

•Multimodal•Large Language Model

396

Qwen-VL — General-purpose Visual Language Model

Productivity

•Visual•Language Model

2592

DeepSeek-VL2-Tiny — Advanced Large-scale Mixture of Experts Visual Language Model

Image

•Visual Language Model•Mixture of Experts

708

Aquila-VL-2B-llava-qwen — A visual-language model that intelligently processes both image and text information.

Image

•Visual Language Model•Multimodal

306

Visual Anagrams — Visual illusions are created using a pre-trained diffusion model.

Image

•Visual Illusion•Diffusion Model

144

FastVLM — Efficient visual encoding technology improves the performance of visual language models.

Productivity

•Visual Model•Image Processing

PaliGemma 2 — PaliGemma 2 is a powerful visual language model that is easy to fine-tune.

Productivity

•Visual Language Model•Machine Learning

204

InternLM-XComposer2 — A large visual language model specializing in free-form text-to-image synthesis and understanding.

Design

•Visual Language Model•Text-Image Synthesis

2004

VSP-LLM — A framework that combines Visual Speech Processing with Large Language Models

Programming

•Visual Speech Processing•Large Language Models

2706

Pixtral-12B-2409 — A multimodal model with 12 billion parameters, integrating a visual encoder for image and text processing.

Productivity

•Multimodal•Image Processing

294

Vary — Visual Vocabulary Expansion for Large-Scale Visual Language Models

Image

•Visual Language Model•Image Understanding

1050

Pali3 — PaLI-3 Visual Language Model: Smaller, Faster, Stronger

Productivity

•Visual Language Model•Image Encoding

1044

Florence-2-base — An advanced visual foundation model that supports various visual and vision-language tasks.

Image

•Visual Model•Multi-Task Learning

552

VLM-R1 — VLM-R1 is a stable and versatile reinforcement learning-enhanced visual-language model focused on visual understanding tasks.

Image

•Visual-Language Model•Reinforcement Learning

498

BlueLM Large Model — An independently developed intelligent language understanding model by vivo

Chinese Selection

•Language Model•Natural Language Processing

31374

Visual Sketchpad — A visual reasoning tool for multimodal large language models (LLMs)

Productivity

•Multimodal•Visual Reasoning

336

Qwen2-VL — A next-generation visual language model that offers a clearer view of the world.

Image

•Visual Language Model•Multilingual Support

390

CogAgent — An open-source end-to-end visual language model (VLM) based GUI agent

Programming

•Visual Language Model•GUI Agent

474

MiniGPT-4 — An advanced large language model enhanced for visual language understanding.

Image

•Visual Language Understanding•Image Description

156

InternVL — Open-Source Visual Foundation Model

Image

•Open Source•Foundation Model

2388

VisRAG — A retrieval-augmented generation model based on visual language modeling.

Image

•Visual Language Model•Retrieval-Augmented Generation

324

VMamba — Visual state-space model with linear complexity and global perception.

Image

•Visual Model•Image Processing

444

Llama-3.2-90B-Vision — A multimodal large language model optimized for visual recognition and image reasoning.

Productivity

•Machine Learning•Visual Recognition

240
