Last Week in AI #297

 Last Week in AI #297

Top News

Alibaba releases an ‘open’ challenger to OpenAI’s o1 reasoning model

Alibaba’s Qwen team has developed a new AI model, QwQ-32B-Preview, which rivals OpenAI’s o1 model in reasoning capabilities. The model, which contains 32.5 billion parameters and can consider prompts up to 32,000 words in length, outperforms OpenAI’s o1-preview and o1-mini models on certain benchmarks, including the AIME and MATH tests. However, it has some limitations, such as switching languages unexpectedly and underperforming on tasks requiring common sense reasoning. The model is available for download under an Apache 2.0 license, but only certain components have been released, making it impossible to fully replicate or understand its inner workings. This development comes as the effectiveness of “scaling laws” in AI is being questioned, leading to a search for new AI approaches and techniques, such as test-time compute, which is used in models like o1 and QwQ-32B-Preview.

More on this:

DeepSeek Introduces DeepSeek-R1-Lite-Preview with Complete Reasoning Outputs Matching OpenAI o1

DeepSeek has launched a new AI model, DeepSeek-R1-Lite-Preview, designed to address the reasoning gaps in current AI models by providing complete reasoning outputs. The model matches OpenAI’s o1 preview-level performance and is available for testing through DeepSeek’s chat interface. It incorporates Chain-of-Thought (CoT) reasoning capabilities, allowing the AI to present its thought process in real time, which is crucial for users who require detailed insight into how an AI model arrives at its conclusions. The model has demonstrated its capabilities through benchmarks like AIME and MATH, and is set to be open-sourced, making it accessible to the broader community for experimentation and integration.

Ai2 releases new language models competitive with Meta’s Llama

The nonprofit AI research organization Ai2 has released a new family of AI models, OLMo 2, which meets the Open Source Initiative’s definition of open source AI. This means that the tools and data used to develop it are publicly available. The OLMo 2 family includes two models, one with 7 billion parameters (OLMo 7B) and one with 13 billion parameters (OLMo 13B), which can perform a range of text-based tasks. Ai2 used a dataset of 5 trillion tokens to train the models, resulting in models that are competitive with other open models like Meta’s Llama 3.1. The OLMo 2 models can be downloaded from Ai2’s website and used commercially under the Apache 2.0 license.

Creating AI video just got easier — Luma Labs gives Dream Machine a huge upgrade

Luma Labs has announced a significant upgrade to its Dream Machine generative AI platform, including a new image model called Photon and a more collaborative approach to AI video creation. The upgrade, the largest since Dream Machine’s launch in June, offers faster video generation and improved natural language understanding. Photon, the new text-to-image model, is touted to be up to 800% faster than similar models, with accurate text rendering and easy character creation. The Dream Machine platform, available on both web and iOS, is also getting a new user interface and will be able to understand instructions and context, allowing users to brainstorm ideas without needing to learn prompt engineering.

Other News

Tools

OpenAI gives ChatGPT an upgrade — reclaims top spot in LLM leaderboard – OpenAI’s latest update to the GPT-4o model has significantly enhanced ChatGPT’s creative writing abilities, allowing it to surpass Google’s Gemini in the LLM leaderboard and become more engaging and insightful in its responses.

Google is prepping Gemini to take action inside of apps – Google’s Gemini Assistant is set to gain agentic-like abilities through a new “app functions” API in Android 16, allowing it to perform tasks within apps, similar to Apple’s upcoming enhancements for Siri in iOS 18.

NVIDIA Introduces Hymba 1.5B: A Hybrid Small Language Model Outperforming Llama 3.2 and SmolLM v2 – NVIDIA’s Hymba 1.5B model combines transformer attention and state space models in a hybrid-head parallel architecture, achieving superior performance and efficiency on smaller devices compared to other sub-2B models.

How OpenAI stress-tests its large language models – OpenAI employs a combination of human testers and automated systems to identify and address unwanted behaviors in its language models, using techniques like red-teaming to explore vulnerabilities and improve safety measures.

Anthropic launches tool to connect AI systems directly to datasets – Anthropic’s Model Context Protocol (MCP) allows AI systems to universally connect to various data sources, streamlining integration and enhancing performance across different platforms.

Rabbit now lets you teach the R1 to perform tasks for you – Rabbit’s new “teach mode” for R1 devices allows users to create AI agents that learn and perform demonstrated tasks, though it remains experimental and may face challenges with CAPTCHA-protected sites.

ChatGPT’s Live Video feature could be inching towards broader rollout – OpenAI is preparing to roll out ChatGPT’s Live Video feature in beta, which will allow users to interact with their surroundings through a live camera feed in the Advanced Voice Mode.

Introducing AI Backgrounds, HD Video Calls, Noise Suppression and More for Messenger Calling – Messenger has introduced new features such as AI-generated backgrounds, HD video calls, noise suppression, and hands-free calling to enhance the user experience during video and audio calls.

Microsoft will soon let you clone your voice for Teams meetings – Microsoft’s new Interpreter tool for Teams will allow users to clone their voices for real-time multilingual speech-to-speech translation, raising both opportunities for enhanced communication and concerns about potential misuse and security risks.

AI coding tool Cursor adds autonomous coding agents in latest update – Cursor’s latest update introduces AI agents capable of autonomously handling coding tasks and terminal operations, enhancing automation and project management within the modified Visual Studio Code environment.

Nvidia claims a new AI audio generator can make sounds never heard before – Nvidia’s Fugatto AI music editor can generate unprecedented sounds and transform audio inputs into unique compositions, including altering voices and creating novel sound effects.

Runway launches Frames — a new AI image generator that creates custom worlds – Runway’s new AI image generator, Frames, offers enhanced stylistic control and visual fidelity, allowing users to create consistent and unique worlds across video generations.

Anthropic says Claude AI can match your unique writing style – Anthropic’s Claude AI now allows users to customize the chatbot’s writing style to match their own or choose from preset options, enhancing personalization and appropriateness for various communication tasks.

ElevenLabs’ new feature is a NotebookLM competitor for creating GenAI podcasts – ElevenLabs’ new GenFM feature allows users to create AI-generated multispeaker podcasts by uploading various content types, incorporating natural human elements like “ums” and “ahs” to enhance conversational flow, and supports 32 languages.

ElevenLabs launches GenFM to turn user content into AI-powered podcasts – GenFM by ElevenLabs transforms user-uploaded content into AI-generated podcasts with ultra-realistic voices, supporting 32 languages and offering versatile applications like news summarization and audio storytelling.

Google has a new chess game that lets you design the pieces with AI – GenChess, a free game by Google, allows players to use AI to design custom chess pieces and play against computer-generated opponents.

Business

Nvidia Doubles Profit as A.I. Chip Sales Soar – Nvidia’s significant profit increase and optimistic revenue forecast highlight the strong demand for its new A.I. chip, Blackwell, despite concerns about the sustainability of its market dominance.

Baidu’s supercheap robotaxis should scare the hell out of the US – Baidu’s Apollo Go has launched its cost-effective RT6 robotaxi in China, priced under $30,000, posing a competitive threat to US companies like Waymo, which face higher production costs and tariffs.

Baidu says self-driving vehicle costs drop to US$34,525 as mass production ramps up – Baidu has significantly reduced the production cost of its Apollo RT6 self-driving vehicle, positioning it as the world’s only mass-produced Level-4 autonomous vehicle and enhancing its competitive edge in the autonomous driving market.

AI chip startup MatX, founded by Google alums, raises Series A at $300M+ valuation, sources say – MatX, co-founded by former Google engineers, has raised $80 million in a Series A round led by Spark Capital to develop chips optimized for large AI workloads, aiming to outperform Nvidia’s GPUs.

Apple Readies More Conversational Siri in Bid to Catch Up in AI – Apple is working to enhance Siri’s conversational abilities to better compete with advanced AI models like ChatGPT.

Google’s DeepMind and YouTube built and shelved ‘Orca,’ a ‘mind-blowing’ music AI tool that hit a copyright snag – Orca, an AI music tool developed by Google’s DeepMind and YouTube, was shelved due to copyright concerns despite its ability to generate authentic-sounding music by mimicking artists using simple prompts.

OpenAI’s Sora video generator appears to have leaked – A group leaked access to OpenAI’s unreleased Sora video generator to protest the company’s alleged exploitation of artists and lack of transparency, leading to a temporary shutdown of the early access program.

Chinese Driverless-Tech Firm Pony AI Said to Raise $260 Million in US IPO – Pony AI Inc.’s successful $260 million US IPO highlights robust investor enthusiasm for both autonomous-driving technology and Chinese companies listing in New York.

The Man Behind Amazon’s Robot Army Wants Everyone to Have an AI-Powered Helper – Brad Porter, former Amazon robotics leader, is developing Proxie robots to assist with menial tasks in warehouses, emphasizing practical AI-powered solutions over costly humanoid robots.

PlayAI Clones Voices on Command – PlayAI, a voice cloning and text-to-speech platform, has raised $21 million in seed funding to enhance its AI voice models and address ethical concerns, despite facing criticism over safety measures and potential misuse of its technology.

Vinci KPU AI achieves top scores in HumanEval and GPQA tests – Vinci KPU, an advanced AI system by Maisa AI, excels in benchmarks like HumanEval and GPQA due to its improved reasoning, execution, and context management capabilities, achieving high accuracy and efficiency.

Amazon develops video AI model, The Information reports – Amazon’s new AI model, Olympus, can process images and videos to enhance search capabilities and reduce dependency on Anthropic’s Claude chatbot.

OpenAI moves to trademark its ‘reasoning’ models – OpenAI is seeking to trademark its new “reasoning” AI model, o1, to protect its intellectual property, while also expanding its series of models designed for complex tasks.

The AI Reporter That Took My Old Job Just Got Fired – James and Rose, AI news broadcasters at The Garden Island, were terminated after a two-month run due to negative public response and their inability to present news in an engaging manner.

How advanced are China’s self-driving taxis? – China is rapidly advancing its self-driving taxi technology, with cities like Wuhan and Beijing leading pilot projects, but challenges such as traffic congestion, high costs, and competition with traditional taxis remain significant hurdles.

A message from John Furrier, co-founder of SiliconANGLE: – French generative artificial intelligence startup Mistral AI is taking the fight to OpenAI with a host of updates announced today.

Inside Elon Musk’s Quest to Beat OpenAI at Its Own Game – nan

AI chip startup MatX, founded by Google alums, raises Series A at $300M+ valuation, sources say – MatX, co-founded by former Google engineers, has raised $80 million in a Series A round led by Spark Capital to develop chips that aim to outperform Nvidia’s GPUs in training large language models.

Research

Meet The Matrix: A New AI Approach to Infinite-Length and Real-Time Video Generation – The Matrix, developed by a team from Alibaba, the University of Hong Kong, and the University of Waterloo, is a groundbreaking AI model that generates infinite-length, high-quality video simulations with real-time interactivity, using advanced diffusion techniques and learning from both game and real-world data.

OpenScholar: The open-source A.I. that’s outperforming GPT-4o in scientific research – Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Scientists are drowning in data.

Meet LLaVA-o1: The First Visual Language Model Capable of Spontaneous, Systematic Reasoning Similar to GPT-o1 – LLaVA-o1, developed by a team of researchers, introduces a structured four-stage reasoning process and stage-level beam search to significantly enhance systematic reasoning in vision-language models, outperforming larger models with improved accuracy and efficiency.

Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions – Marco-o1 explores the potential of large reasoning models to generalize across domains lacking clear standards by utilizing advanced techniques like Chain-of-Thought fine-tuning and Monte Carlo Tree Search.

Multimodal Autoregressive Pre-training of Large Vision Encoders – AIMV2, a new family of vision encoders, demonstrates exceptional performance in both multimodal and vision-specific tasks by using a multimodal autoregressive pre-training approach that integrates image and text data.

Bringing robot skills from simulation to the real world – Simulation is increasingly being used to generate diverse and high-quality data for training general-purpose robot policies, addressing challenges in real-world data collection and enabling advancements in sim-to-real transfer for tasks like navigation and manipulation.

Universities Are Woefully Under-Resourced For AI Research. They’re Fighting To Change That. – Universities are struggling to keep up with the private sector in AI research due to a lack of resources, prompting initiatives like Empire AI and legislative efforts such as the “Create AI” act to secure federal support and funding.

Procedural Knowledge in Pretraining Drives Reasoning in Large Language Models – Investigating the pretraining data of large language models reveals that procedural knowledge significantly influences their reasoning capabilities, distinguishing it from mere retrieval of factual information.

Large Language Models as Markov Chains – The paper establishes an equivalence between large language models and Markov chains, providing insights into their performance through theoretical analysis and experimental validation.

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models – VBench++ offers a comprehensive and versatile benchmark suite for evaluating video generative models by dissecting video generation quality into specific dimensions, aligning with human perception, and supporting both text-to-video and image-to-video evaluations.

Disentangling Memory and Reasoning Ability in Large Language Models – The article discusses the importance of citing arXiv papers in repositories to ensure their visibility and accessibility on platforms like Hugging Face.

Concerns

OpenAI sued by Canada’s biggest media outlets – Canadian media companies are suing OpenAI for allegedly using their journalism without permission to train its GPT model, seeking damages and an injunction against future use.

OpenAI accidentally erases potential evidence in training data lawsuit – OpenAI engineers inadvertently erased crucial evidence in a lawsuit over AI training data, complicating efforts to trace the use of news articles in building AI models, despite attempts to recover the lost data.

Study of ChatGPT citations makes dismal reading for publishers – A study by the Tow Center for Digital Journalism reveals that ChatGPT frequently produces inaccurate citations for publishers’ content, raising concerns about the reliability and transparency of its sourcing, regardless of whether publishers have licensing deals with OpenAI.

Netflix removes AI art poster for Arcane after an outcry from creators – Netflix removed an AI-generated poster for Arcane’s second season after backlash from fans and creators, highlighting ongoing debates about AI’s role in art and its impact on artistic integrity.

Analysis

A Revolution in How Robots Learn – Robots are increasingly learning to perform complex tasks through AI-driven imitation learning, marking a significant shift from traditional programming to self-teaching capabilities.

Fun

AI created a Minecraft AI village with up to 1,000 inhabitants — Project Sid sees AI bots implement a taxation system and spread Pastafarianism religion – AI startup Altera’s Project Sid successfully created a dynamic AI society within Minecraft, where AI agents autonomously developed roles, implemented a taxation system, and even spread a parody religion, showcasing emergent human-like behaviors.


Copyright © 2024 Skynet Today, All rights reserved.



Source link

Related post

Leave a Reply

Votre adresse e-mail ne sera pas publiée. Les champs obligatoires sont indiqués avec *