•1 min read•from Machine Learning
How much can a video generated by the same diffusion model differ across GPU architectures if the initial noise latent is fixed? [D]
Our take
With identical model weights, prompt, parameters, a deterministic sampler, and a fixed initial noise latent, the same video diffusion model should produce similar outputs on different GPU architectures. Bitwise-identical results are unlikely, because floating-point arithmetic differs across hardware. The generated videos may show minor perceptual differences, but discrepancies large enough to be immediately noticeable to the human eye should be rare.
Hi! I am trying to sanity-check an assumption for diffusion video generation reproducibility.
Suppose I run the same video diffusion model on two different GPU architectures, with:
- identical model weights and implementation (same attention backend, etc.)
- identical prompt and parameters (same number of denoising steps, etc.)
- a deterministic sampler (no extra noise is injected during inference)
- the exact same starting noise latent
Could I expect more or less the same generated video?
I understand that there's no way to guarantee bitwise-identical outputs due to floating-point math differences, but could it realistically make the generated videos so different that it'd be immediately noticeable to a human eye? Or would one normally expect only tiny pixel-level/minor perceptual differences?
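One way to check this empirically would be to decode both videos and compare them frame by frame. The sketch below is only an illustration under stated assumptions: it uses plain Python floats rather than GPU tensors, the `psnr` helper and the 4-pixel "frames" are hypothetical stand-ins for real decoded frames, and PSNR is just one possible metric. It first shows why bitwise equality is not expected (floating-point addition is not associative, so a different reduction order can change the last bits), then scores how far apart two frames are.

```python
import math


def psnr(frame_a, frame_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames.

    Frames are flat sequences of pixel values of equal, non-zero length.
    Higher means more similar; identical frames give infinity.
    """
    if len(frame_a) != len(frame_b) or len(frame_a) == 0:
        raise ValueError("frames must be non-empty and the same size")
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return math.inf  # bitwise-identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


# Floating-point addition is not associative: summing the same numbers in a
# different order (as different kernels or hardware may do) can change the
# low-order bits of the result.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# Hypothetical 4-pixel frames standing in for the same frame decoded from
# the two runs; every pixel differs by one 8-bit level.
frame_gpu_a = [10, 20, 30, 40]
frame_gpu_b = [11, 21, 31, 41]
score = psnr(frame_gpu_a, frame_gpu_b)

print(f"summation order changes the result: {order_matters}")
print(f"PSNR between the two frames: {score:.2f} dB")
```

Running a metric like this per frame across the whole clip would show whether the two runs stay close throughout or drift apart as the denoising steps accumulate small differences.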