Microsoft researchers have developed On-Policy Context Distillation (OPCD), a training method that permanently embeds ...
With reported 3x speed gains and limited degradation in output quality, the method targets one of the biggest pain points in production AI systems: latency at scale.
Researchers from the University of Maryland, Lawrence Livermore, Columbia and TogetherAI have developed a training technique that triples LLM inference speed without auxiliary models or infrastructure ...
As AI coding agents gain access to entire codebases, 0G delivers what centralized AI cannot — privacy enforced by code, ...
AWS Elemental Inference enables video customers to adapt video content into vertical formats optimized for mobile and social platforms in real time. Today’s viewers consume content differently than ...
AWS has announced AWS Elemental Inference, a fully managed AI service that automatically transforms and optimizes live and on-demand video broadcasts to engage ...
Conservation has long wrestled with a deceptively simple question: not whether to act, but where action will matter most. Forest restoration, protected areas, wildlife corridors, and enforcement ...
Microsoft's Phi-4-reasoning-vision-15B uses careful data curation and selective reasoning to compete with models trained on ...
Inception, the company behind the first commercial diffusion large language models (dLLMs), today announced the launch of ...
Nota AI, an AI optimization technology company, announced that it has developed a next-generation quantization technology that significantly compresses the size of Solar, a ...
The company is touting its new AWS Elemental Inference as a tool that will help broadcasters and streamers reach audiences ...