A research team from the University of Surrey and Stanford University has developed a new method that teaches Artificial Intelligence (AI) to understand human line sketches, even those drawn by non-artists. The model recognizes scene sketches nearly as well as humans do.
Dr. Yulia Gryaditskaya, a lecturer at the Centre for Vision, Speech and Signal Processing (CVSSP) and the Surrey Institute for People-Centred Artificial Intelligence (PAI) at the University of Surrey, said: "Sketches are a powerful visual communication language. They can sometimes be more expressive and flexible than verbal language. Developing tools that understand sketches is a step towards more powerful human-computer interaction and more efficient design workflows — for example, searching for or creating images by sketching."

People of all ages and backgrounds draw to explore new ideas and communicate, yet AI systems have long struggled to understand sketches. Teaching an AI to understand images typically requires a time-consuming, labor-intensive process of collecting a label for every pixel in an image, from which the AI then learns.
Instead, the research team taught the AI by pairing sketches with written descriptions of the scenes. The model learned to group pixels into objects and match those groups with the categories mentioned in the descriptions. As a result, it demonstrated a richer, more human-like understanding of sketches than previous systems: it correctly identified and labeled objects such as kites, trees, and giraffes with 85% accuracy, outperforming models that rely on per-pixel labels. Beyond identifying objects in complex scenes, it could also determine which object each individual stroke was meant to depict. Notably, the method works on informal sketches drawn by non-artists, including sketch styles it was never explicitly trained on.
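The core idea of matching grouped strokes to caption categories can be sketched in a few lines. The snippet below is an illustrative toy example, not the authors' implementation: the embeddings are hand-made vectors, whereas a real system would obtain them from a trained vision-language model, and the function and variable names are hypothetical.

```python
# Illustrative toy sketch (not the paper's code): assign each stroke the
# caption category whose text embedding is most similar, by cosine similarity.
import numpy as np

def cosine(a, b):
    # Cosine similarity between two vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def label_strokes(stroke_embs, text_embs):
    """Assign each stroke the category whose text embedding is most similar."""
    return {
        stroke_id: max(text_embs, key=lambda cat: cosine(s, text_embs[cat]))
        for stroke_id, s in stroke_embs.items()
    }

# Toy embeddings for three categories drawn from a hypothetical caption,
# "a giraffe under a tree with a kite".
text_embs = {
    "giraffe": np.array([1.0, 0.1, 0.0]),
    "tree":    np.array([0.0, 1.0, 0.1]),
    "kite":    np.array([0.1, 0.0, 1.0]),
}
stroke_embs = {
    "stroke_1": np.array([0.9, 0.2, 0.1]),   # most similar to "giraffe"
    "stroke_2": np.array([0.1, 0.1, 0.95]),  # most similar to "kite"
}

print(label_strokes(stroke_embs, text_embs))
# → {'stroke_1': 'giraffe', 'stroke_2': 'kite'}
```

Because the labels come from free-text descriptions rather than a fixed pixel-label set, this style of matching is what lets such a model recognize categories it was never explicitly annotated with.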
Dr. Judith Fan, an assistant professor of psychology at Stanford University, said: "Drawing and writing are among the most quintessentially human activities, long used to capture people's observations and ideas. This work represents exciting progress towards AI systems that understand the essence of what people are trying to convey, whether through pictures or words." The research is part of the Surrey Institute for People-Centred Artificial Intelligence, in particular its SketchX programme, which uses AI to understand how we see the world through the way we draw.
Professor Yi-Zhe Song, co-director of the Surrey Institute for People-Centred Artificial Intelligence and head of SketchX, said: "This research is a prime example of how AI can enhance fundamental human activities like sketching. By understanding rough sketches with near-human accuracy, this technology has great potential to amplify people's natural creativity, regardless of artistic ability."
Paper link: https://arxiv.org/abs/2312.12463