How much can a video generated by the same diffusion model differ across GPU architectures if the initial noise latent is fixed? [D]

Our take

Whether a video diffusion model reproduces the same output across GPU architectures depends on how much of the pipeline is held fixed. With identical model weights, prompt, parameters, a deterministic sampler, and a fixed initial noise latent, the outputs should be similar. Bitwise-identical results are unlikely, because different architectures can round and order floating-point operations differently. The generated videos may show minor perceptual differences, but discrepancies large enough to be immediately noticeable to the human eye should be rare.
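To make the floating-point point concrete: floating-point addition is not associative, so two kernels that accumulate the same numbers in a different order can return slightly different sums. The sketch below uses plain Python floats (IEEE 754 double precision) on the CPU rather than any GPU library, and the helper name `sum_in_order` is made up for illustration.

```python
# Floating-point addition is not associative: grouping changes the rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)        # False
print(abs(left - right))    # about 1.1e-16


def sum_in_order(values):
    """Accumulate left to right, like a naive sequential reduction."""
    total = 0.0
    for v in values:
        total += v
    return total


# The same three numbers, summed in two different orders.
print(sum_in_order([1e16, 1.0, -1e16]))   # 0.0: the 1.0 is lost to rounding
print(sum_in_order([1e16, -1e16, 1.0]))   # 1.0
```

On a GPU the same effect appears inside matrix multiplications and attention reductions, typically at lower precision (float16 or bfloat16), where each rounding step is coarser.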

Hi! I am trying to sanity-check an assumption for diffusion video generation reproducibility.

Suppose I run the same video diffusion model on two different GPU architectures, with:

identical model weights and implementation (same attention backend, etc.)
identical prompt and parameters (same number of denoising steps, etc.)
deterministic sampler (no extra noise is injected during inference)
the exact same starting noise latent

Could I expect more or less the same generated video?
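Whether rounding-level discrepancies stay small depends on whether the repeated update damps or amplifies them. The toy below is not a diffusion model. It iterates two hypothetical scalar maps (`contracting_step` and `chaotic_step`, both invented for illustration) from starting points that differ by 1e-12, a stand-in for a rounding-level difference, and records how far the two trajectories drift apart. Real samplers act on large tensors, and this sketch does not establish which regime they fall into.

```python
def contracting_step(x):
    """Hypothetical update that halves any difference at each step."""
    return 0.5 * x + 0.1


def chaotic_step(x):
    """Logistic map with r = 4, a standard example of a chaotic update."""
    return 4.0 * x * (1.0 - x)


def max_divergence(step, x0, eps, n_steps):
    """Run two trajectories from x0 and x0 + eps; return the largest gap seen."""
    a, b = x0, x0 + eps
    worst = abs(a - b)
    for _ in range(n_steps):
        a, b = step(a), step(b)
        worst = max(worst, abs(a - b))
    return worst


eps = 1e-12  # stand-in for a rounding-level discrepancy
damped = max_divergence(contracting_step, 0.2, eps, 100)
amplified = max_divergence(chaotic_step, 0.2, eps, 100)
print(f"contracting map: max gap {damped:.3e}")
print(f"chaotic map:     max gap {amplified:.3e}")
```

The same initial gap stays near 1e-12 under the contracting map and grows to order one under the chaotic map, which is why the answer for a given model is an empirical question.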

I understand that there's no way to guarantee bitwise-identical outputs due to floating-point math differences, but could it realistically make the generated videos so different that it'd be immediately noticeable to a human eye? Or would one normally expect only tiny pixel-level/minor perceptual differences?
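One way to turn "tiny pixel-level differences" into a number is to compare the two outputs frame by frame. The sketch below assumes each frame has already been decoded into a flat list of 8-bit pixel values. The function names and the toy frames are invented for illustration, and PSNR is only a rough proxy for what a viewer would actually notice.

```python
import math


def max_abs_diff(frame_a, frame_b):
    """Largest per-pixel difference between two frames."""
    return max(abs(a - b) for a, b in zip(frame_a, frame_b))


def psnr(frame_a, frame_b, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Two toy "videos" of two 4-pixel frames each (made-up values).
video_gpu_a = [[0, 128, 255, 64], [10, 20, 30, 40]]
video_gpu_b = [[0, 129, 255, 63], [10, 20, 30, 40]]

for i, (fa, fb) in enumerate(zip(video_gpu_a, video_gpu_b)):
    print(f"frame {i}: max diff {max_abs_diff(fa, fb)}, PSNR {psnr(fa, fb):.2f} dB")
```

Higher PSNR means the frames are closer, and identical frames give infinity. Tracking the value per frame also shows whether the two runs drift apart over the length of the clip.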

submitted by /u/hellosandrik