AI Challenges Medical Professional Exams: GPT-4 Shines in Japan's Physical Therapist Exam

AIbase基地

Published in AI News · 4 min read · Sep 2, 2024

A recent peer-reviewed study published in the journal Cureus has shown that OpenAI's GPT-4 language model successfully passed the Japanese National Physical Therapy Examination without any additional training.

Researchers gave GPT-4 1,000 questions spanning memory, comprehension, application, analysis, and evaluation. GPT-4 answered 73.4% of them correctly overall and passed all five test sections. However, the study also revealed limits to the model's capabilities in certain areas.

GPT-4 scored 80.1% on general questions but only 46.6% on practical questions. Similarly, it did far better on text-only questions (80.5% correct) than on questions with images and tables (35.4% correct). This finding is consistent with previous studies that highlighted GPT-4's limitations in visual understanding.
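The figures above allow a rough consistency check. The article does not state how many questions fall into each category, but if the overall accuracy is a weighted average of two category accuracies, the implied split can be backed out. The following is a minimal sketch under that assumption; the function name `implied_share` is hypothetical, and because the reported percentages are rounded, the resulting shares are only approximate.

```python
def implied_share(overall: float, acc_a: float, acc_b: float) -> float:
    """Return the fraction s of questions in group A implied by
    overall = s * acc_a + (1 - s) * acc_b.

    Assumption: every question belongs to exactly one of the two groups.
    """
    if acc_a == acc_b:
        raise ValueError("group accuracies must differ to solve for the share")
    return (overall - acc_b) / (acc_a - acc_b)


# Percentages as reported in the article (rounded to one decimal place).
OVERALL = 73.4

# General (80.1%) vs. practical (46.6%) questions.
general_share = implied_share(OVERALL, 80.1, 46.6)

# Text-only (80.5%) vs. image/table (35.4%) questions.
text_share = implied_share(OVERALL, 80.5, 35.4)

print(f"Implied share of general questions:   {general_share:.2f}")
print(f"Implied share of text-only questions: {text_share:.2f}")
```

Under these assumptions, roughly 80% of the questions would be general and roughly 84% text-only, which suggests that the weaker categories make up a minority of the exam and explains why the overall score stays above 70%. These shares are inferred, not reported, so they should be checked against the study itself.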

Notably, question difficulty and text length had little impact on GPT-4's performance. Despite being trained primarily on English data, the model also performed well with Japanese inputs.

The researchers noted that while this study demonstrates GPT-4's potential in clinical rehabilitation and medical education, the results should be viewed with caution. They emphasized that GPT-4 cannot answer all questions correctly, and that future assessments are needed to evaluate new versions and the model's capabilities in written and reasoning tests.

Additionally, the researchers suggested that multimodal models like GPT-4V could bring further improvements in visual understanding. Currently, Google's Med-PaLM 2 and Med-Gemini, as well as medical models based on Meta's Llama 3, are under active development, aiming to surpass general-purpose models in medical tasks.

However, experts believe it may still take a long time for medical AI models to be widely applied in practice. The error rates of current models are still too high for medical settings, and significant progress in reasoning capabilities is needed before these models can be safely integrated into daily medical practice.

This article is from AIbase Daily

—— Created by the AIbase Daily Team
