Sample MC Script for Church

dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models

dTRPO introduces two-stage trajectory reduction techniques that enable efficient policy optimization of diffusion large language models (dLLMs). This repo provides the training code, scripts, and configs ...
