Microsoft Researchers' SpreadsheetLLM Project Teaches AI to Understand Spreadsheet Content

AIbase基地

Published in AI News · 4 min read · Jul 22, 2024

Microsoft researchers recently unveiled SpreadsheetLLM, a research project aimed at addressing the challenges large language models (LLMs) face when parsing spreadsheets.

According to a paper published on arXiv on July 12, SpreadsheetLLM uses an encoding framework that enables LLMs to "read" spreadsheet content. The research is expected to make data management and analysis in spreadsheets significantly more efficient, letting users question an AI in natural language without needing to master complex formulas and operations.

Paper Address: https://arxiv.org/html/2407.09025v1#abstract

Spreadsheets pose several challenges to LLMs. First, spreadsheets can be extremely large, exceeding the number of tokens an LLM can process at once. Second, spreadsheets use a two-dimensional layout and structure, whereas LLMs are adept at handling linear, sequential inputs. Finally, LLMs are typically not trained specifically to interpret cell addresses and spreadsheet formats.
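To make the mismatch between a two-dimensional grid and a linear input concrete, here is a minimal sketch of a naive serialization that flattens a sheet into one string of address and value pairs. The function names and the `address,value|...` layout are assumptions chosen for illustration, not the encoding defined in the paper. Every cell becomes its own entry, so the string grows with the total number of cells.

```python
from string import ascii_uppercase


def column_letter(index):
    """Convert a zero-based column index to spreadsheet letters (0 -> A, 26 -> AA)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = ascii_uppercase[remainder] + letters
    return letters


def serialize_grid(grid):
    """Flatten a list of rows into 'address,value' pairs joined by '|' (hypothetical format)."""
    parts = []
    for row_number, row in enumerate(grid, start=1):
        for col_index, value in enumerate(row):
            parts.append(f"{column_letter(col_index)}{row_number},{value}")
    return "|".join(parts)


sheet = [
    ["Region", "Sales"],
    ["North", 120],
]
print(serialize_grid(sheet))  # A1,Region|B1,Sales|A2,North|B2,120
```

Even this two-by-two sheet needs four entries, and the model has to infer from the addresses alone that "Region" and "North" sit in the same column.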

Microsoft's SpreadsheetLLM technology consists of two main components. The first is SheetCompressor, which reduces the complexity of spreadsheets to make them easier for LLMs to understand. SheetCompressor includes three modules: structural anchors, a token-reduction method, and clustering of similar cells to improve efficiency. Using these modules, the Microsoft team reduced the number of tokens required for encoding by 96% and reported a 12.3% improvement over the best existing models on spreadsheet table detection. The second component is Chain of Spreadsheet, which teaches LLMs how to find relevant information and generate responses within compressed spreadsheets.
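One way to picture the token-reduction idea is an inverted index: instead of listing every cell, group the addresses that share a value and drop empty cells entirely. The sketch below is an illustrative assumption, not the team's implementation, and the helper names `invert_cells` and `compression_ratio` are hypothetical. It also shows the arithmetic behind the headline figure: a 96% token reduction corresponds to roughly a 25-fold compression.

```python
from collections import defaultdict
from typing import Dict, List


def invert_cells(cells: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each non-empty value to the list of addresses that hold it."""
    index = defaultdict(list)
    for address, value in cells.items():
        if value == "":
            continue  # empty cells carry no content, so they are skipped
        index[value].append(address)
    return dict(index)


def compression_ratio(token_reduction: float) -> float:
    """Convert a fractional token reduction (0.96 = 96%) into an 'N times smaller' ratio."""
    return 1.0 / (1.0 - token_reduction)


cells = {"A1": "Status", "A2": "Open", "A3": "Open", "A4": "", "A5": "Closed"}
print(invert_cells(cells))
# {'Status': ['A1'], 'Open': ['A2', 'A3'], 'Closed': ['A5']}
print(round(compression_ratio(0.96)))  # 25
```

Repeated values such as "Open" are stored once with all of their addresses, which is where the savings come from on sheets with many duplicate or blank cells.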

If applied successfully, this technology is expected to significantly enhance the capabilities of Microsoft's Copilot in Excel, enabling it to handle more complex data analysis tasks. However, the method still faces issues with the accuracy of generated data and with high computational resource consumption. The research team's future plans include encoding cell background colors and deepening the model's understanding of how cell contents relate to one another.

Key Takeaways:

📊 **Challenges for Large Language Models (LLMs) in Spreadsheets**: Spreadsheets have complex structures and a two-dimensional layout, which falls outside the linear, sequential input that LLMs typically handle.

🔍 **SpreadsheetLLM Technology Analysis**: Microsoft has introduced two core technologies, SheetCompressor and Chain of Spreadsheet, which greatly enhance the ability of LLMs to understand spreadsheets.

🛠️ **Impact on Microsoft's AI Tools**: SpreadsheetLLM is expected to strengthen the capabilities of Microsoft's Copilot in Excel, but currently faces challenges with the accuracy of generated data and computational resource consumption.

This article is from AIbase Daily
