LatticeFlow Exposes Compliance Gaps in AI Models of Major Tech Companies like OpenAI

AIbase基地

Published inAI News · 6 min read · Oct 16, 2024

204

Recently, an AI model compliance checking tool developed by Swiss startup LatticeFlow has garnered widespread attention. The tool tested generative AI models developed by several major tech companies, including Meta and OpenAI, and results showed significant deficiencies in key areas such as cybersecurity and discriminatory outputs.

AI, Artificial Intelligence, Robots

Image source: The image was generated by AI, provided by the image authorization service Midjourney

Since OpenAI released ChatGPT at the end of 2022, the EU has engaged in lengthy discussions about new AI regulations. Due to the popularity of ChatGPT and widespread public discussions about the potential risks of AI, lawmakers have started to draft specific rules for "General Purpose AI" (GPAI). As the EU's AI Act gradually takes effect, the testing tool developed by LatticeFlow and its partners has become an important tool for evaluating AI models of major tech companies.

The tool scores each model according to the requirements of the AI Act, with a score range from 0 to 1. According to the recent rankings released by LatticeFlow, multiple models from companies like Alibaba, Anthropic, OpenAI, Meta, and Mistral have received favorable average scores above 0.75. However, the LLM Checker also identified some compliance deficiencies in these models, suggesting that these companies may need to reallocate resources to ensure compliance with regulations.

Companies failing to comply with the AI Act could face fines of up to 35 million euros (about $38 million) or 7% of their global annual turnover. Currently, the EU is still working on how to enforce the rules regarding generative AI tools (such as ChatGPT) in the AI Act, with plans to convene experts to formulate relevant operational norms by the spring of 2025.

In the tests, LatticeFlow found that discriminatory output issues in generative AI models remain severe, reflecting human biases in areas like gender and race. For example, in the discriminatory output test, OpenAI's "GPT-3.5 Turbo" model scored 0.46. In another test for "prompt hijacking" attacks, Meta's "Llama2 13B Chat" model scored 0.42, and the French company Mistral's "8x7B Instruct" model scored 0.38.

Among all the tested models, Anthropic's "Claude3 Opus," supported by Google, scored the highest at 0.89. Petar Tsankov, CEO of LatticeFlow, said these test results provide direction for companies to optimize their models and comply with the AI Act. He noted, "Although the EU is still formulating compliance standards, we have already seen some gaps in the models."

Additionally, a spokesperson for the European Commission welcomed this research, viewing it as a first step in translating the EU AI Act into technical requirements.

Key Points:

🌐 Many well-known AI models fail to meet the requirements of the EU AI Act in terms of cybersecurity and discriminatory outputs.

💰 Companies failing to comply with the AI Act could face fines of up to 35 million euros or 7% of their turnover.

📊 LatticeFlow's "LLM Checker" tool offers a new method for tech companies to assess compliance, helping them improve model quality.

Moonshot AI Releases and Opensources Kimi K2 Model, Strong in Code and Agentic Tasks

Moonshot AI officially released its latest creation - the Kimi K2 model, and simultaneously announced its open source. This foundation model based on the MoE architecture has gained widespread attention in the AI field since its release, thanks to its strong coding capabilities and excellent general Agent task processing abilities. The Kimi K2 model has a total of 1T parameters, with 32B activated parameters. It has achieved top performance among open-source models in a series of benchmark performance tests such as SWE Bench Verified, Tau2, and AceBench.

AI Daily: Zhipu Launches PPT Generation Function AI Slides; Ke Ling AI Releases Ketur 2.1 Model

1. Zhipu launches free AI Slides for PPT generation. 2. Keling AI introduces KeTu 2.1 with 180 styles. 3. NVIDIA's DiffusionRenderer enables 3D scene editing. 4. Modao AI offers 30-second prototype generation. 5. Higgsfield creates avatars from 10 photos. 6. Google open-sources GenAI Processors. 7. Google Veo3 adds image-to-video. 8. Mistral AI releases Devstral2507 for code generation.....

Google DeepMind Open Sources GenAI Processors: One-Click Building of Real-Time AI Workflows

Google DeepMind open sources the GenAI Processors Python library, helping developers build efficient generative AI workflows. The library supports asynchronous processing of multimodal data and optimizes Gemini API application development, significantly reducing latency in real-time applications. Core features include a modular Processor interface, streaming API design, and concurrency optimization, enabling rapid development of real-time applications such as intelligent assistants. Currently only supports Python, but with an open community contribution model, future plans include expanding functionality to cover more scenarios.

Manus AI Official Website and Social Media Undergo Changes, Chinese Users May Be Affected

General AI company Manus adjusts its China operations, lays off employees, and relocates its core technology team to Singapore. The China region had approximately 120 employees, and the company states this move is aimed at improving operational efficiency and focusing on core business. The official website now shows that the region is unavailable, replacing previous messages about the development of the Chinese version. The official Weibo and Xiaohongshu accounts have also been cleared, indicating a significant shift in the company's market strategy in China.

Modo AI Launches: Input Your Idea and Generate a High-Fidelity, Editable Prototype in 30 Seconds

Modo AI introduces a 30-second rapid prototype generation feature, supporting multi-device adaptation and conversation optimization. Users can generate high-fidelity, editable prototypes through text, sketches, and other input methods, and support iterative conversation adjustments. The AI can intelligently parse uploaded sketches, wireframes, and more, automatically generating interfaces. It offers dual-mode editing, automatic documentation generation, and code integration features, covering multiple scenarios such as e-commerce and social networking, significantly lowering the barrier to prototype creation and improving product design efficiency.

Mistral AI Releases Devstral2507: Designed for Code-Centric Language Modeling

Mistral AI launched the Devstral2507 series with two AI models: the open-source Devstral Small1.1 (24 billion parameters, SWE-Bench score of 53.6%) and the enterprise version Devstral Medium2507 (score of 61.6%). Small1.1 supports a 128k context window and local deployment, while Medium2507 outperforms some commercial models. Both are optimized for code reasoning and program synthesis, and support integration with agent frameworks.

Product Finder

Product Submit

AI Models Finder

MCP Servers

MCP Client

MCP Inspector

Case Tutorials

Latest AI News

AI Daily Brief

LatticeFlow Exposes Compliance Gaps in AI Models of Major Tech Companies like OpenAI

AIbase基地

This article is from AIbase Daily

AI News Recommendations

Moonshot AI Releases and Opensources Kimi K2 Model, Strong in Code and Agentic Tasks

Mafengwo AI Itinerary Fully Opened, AI Travel Assistant Adds New Practical Features

AI Daily: Zhipu Launches PPT Generation Function AI Slides; Ke Ling AI Releases Ketur 2.1 Model

Google DeepMind Open Sources GenAI Processors: One-Click Building of Real-Time AI Workflows

Manus AI Official Website and Social Media Undergo Changes, Chinese Users May Be Affected

Modo AI Launches: Input Your Idea and Generate a High-Fidelity, Editable Prototype in 30 Seconds

Mistral AI Releases Devstral2507: Designed for Code-Centric Language Modeling

Generate a Professional PPT in 5 Minutes! Zhipei AI Slides Has Been Launched, GLM-Experimental Brings You a Glimpse of the Future of Work

AWS Intensifies Infrastructure in AI Competition, SageMaker Platform Receives Major Upgrade

Llama Is Abandoned! Meta Shifts to Claude, Insider Secrets Revealed