Researchers from the University of Tokyo, working with Alternative Machine Corporation, have developed Alter3, a humanoid robot system that maps natural language commands directly to robot actions. The system uses GPT-4 as its backend model, enabling it to carry out complex tasks such as taking a selfie or pretending to be a ghost.

This is one of a growing number of research efforts that combine foundation models with robotic systems. Although these systems have not yet matured into scalable commercial solutions, they have pushed robotics research forward in recent years and show significant promise.

Alter3 uses GPT-4 as its backend model, receiving natural language instructions that describe an action or a situation the robot should respond to. First, the model uses an "agent framework" to plan the sequence of steps the robot must take to achieve its goal. A coding agent then generates the commands the robot needs in order to execute each step. Since GPT-4 was not trained on Alter3's programming commands, the researchers relied on its in-context learning ability to adapt its output to the robot's API.
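To make the two-stage idea concrete, here is a minimal sketch of the planning stage, assuming the OpenAI chat API as the GPT-4 backend. The function name, prompt wording, and parsing logic are illustrative stand-ins, not the authors' actual implementation.

```python
# Planning stage sketch: ask GPT-4 to decompose a natural-language
# instruction into a short list of body movements (assumed prompt design).
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def plan_steps(instruction: str) -> list[str]:
    """Ask GPT-4 to break an instruction into motion steps, one per line."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You control a humanoid robot. Break the user's "
                        "instruction into a short numbered list of body "
                        "movements, one movement per line."},
            {"role": "user", "content": instruction},
        ],
    )
    text = response.choices[0].message.content
    # Keep only non-empty lines; each surviving line is one planned step.
    return [line.strip() for line in text.splitlines() if line.strip()]


steps = plan_steps("Take a selfie with your phone.")
```

A second, coding-oriented agent would then consume each planned step, as described next.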

The prompt therefore includes a list of available commands and a set of examples showing how each command is used. The model then maps each step to one or more API commands, which are sent to the robot for execution. A sketch of this in-context prompt follows.
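The sketch below illustrates how such a prompt might be assembled. The command names (e.g. `set_axis`, `wait`) and the example pair are hypothetical placeholders, since Alter3's real control API is not given in this article; only the overall pattern of command reference plus few-shot examples reflects the described approach.

```python
# In-context learning sketch for the coding stage: a command reference and a
# worked example are placed in the prompt, and GPT-4 maps a planned step to
# robot API commands. All command names here are assumptions.
from openai import OpenAI

client = OpenAI()

COMMAND_REFERENCE = """
Available robot commands (hypothetical):
  set_axis(axis_id, value)  # move one joint axis to a value in [0, 255]
  wait(seconds)             # pause before the next command

Example:
  Step: "Raise the right arm."
  Commands:
    set_axis(21, 200)
    wait(1.0)
"""


def step_to_commands(step: str) -> str:
    """Map one planned step to a block of robot API commands."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": COMMAND_REFERENCE},
            {"role": "user", "content": f'Step: "{step}"\nCommands:'},
        ],
    )
    return response.choices[0].message.content
```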

Researchers added functionality that allows humans to provide feedback, such as "raise the arm a bit higher." These instructions are sent to another GPT-4 agent, which reasons about the code, makes necessary corrections, and returns the sequence of actions to the robot. The improved action recipes and code are stored in a database for future use.
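A hedged sketch of this feedback-and-memory loop is shown below: a second GPT-4 agent revises the command sequence based on the verbal correction, and the improved recipe is cached for reuse. The plain dictionary stands in for whatever database the authors actually used, and the prompt wording is an assumption.

```python
# Feedback loop sketch: apply human feedback to an existing command
# sequence and store the refined result for future use (assumed design).
from openai import OpenAI

client = OpenAI()
action_memory: dict[str, str] = {}  # instruction -> refined command sequence


def refine_commands(instruction: str, commands: str, feedback: str) -> str:
    """Ask a reviewer agent to correct the commands per the human feedback."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "You revise robot command sequences. Apply the human "
                        "feedback and return only the corrected commands."},
            {"role": "user",
             "content": f"Instruction: {instruction}\n"
                        f"Current commands:\n{commands}\n"
                        f"Feedback: {feedback}"},
        ],
    )
    refined = response.choices[0].message.content
    action_memory[instruction] = refined  # remember the improved recipe
    return refined
```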

The researchers ran multiple tests on Alter3, including everyday actions such as taking a selfie and drinking tea, as well as imitative behaviors such as pretending to be a ghost or a snake. They also tested the model's ability to handle situations that require careful action planning. GPT-4's broad knowledge of human behavior makes it possible to create more realistic behavior plans for humanoid robots like Alter3. Their experiments also showed that the robot could portray emotions such as embarrassment and joy.

Key Points:

- 💡 Alter3 is the latest humanoid robot using GPT-4 technology for reasoning, capable of directly mapping natural language commands to robot actions.

- 💡 Researchers leverage GPT-4's in-context learning capabilities to adapt its output to the robot's API, enabling the robot to execute the required sequence of actions.

- 💡 Incorporating human feedback and memory enhances Alter3's performance, and the experiments showed that the robot could portray emotions such as embarrassment and joy.