VLM Question (Image Input Bounds)

June 20, 2025

—

Hello,

I am currently running Qwen2.5-VL for image processing.

My objective is to run a single prompt over a set of images that both extracts structured data (returned as a JSON object with data fields) and produces a summary of the images. However, I only have 24 GB of VRAM to work with.

I was wondering how I can handle an arbitrary number (n) of images. I’ve thought about downscaling, but that only raises the limit: past some number of images, the GPU will still run out of memory.
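To make the constraint concrete, here is a minimal sketch of how downscaling and splitting into batches interact. It assumes roughly one visual token per 28x28 pixel block, which is my understanding of how Qwen2.5-VL counts image tokens. The function names, the `max_pixels` cap, and the `token_budget` value are hypothetical placeholders, and the actual number of tokens that fits in 24 GB would have to be measured.

```python
import math

# Assumption: about one visual token per 28x28 pixel block.
PIXELS_PER_TOKEN = 28 * 28


def estimate_visual_tokens(width, height, max_pixels=None):
    """Roughly estimate the visual tokens for one image.

    If max_pixels is given, the image is treated as downscaled to that cap.
    """
    pixels = width * height
    if max_pixels is not None and pixels > max_pixels:
        pixels = max_pixels
    return math.ceil(pixels / PIXELS_PER_TOKEN)


def chunk_by_token_budget(image_sizes, token_budget, max_pixels=None):
    """Greedily group image indices into batches that fit the token budget.

    An image that alone exceeds the budget still gets its own batch.
    """
    batches, current, used = [], [], 0
    for index, (width, height) in enumerate(image_sizes):
        cost = estimate_visual_tokens(width, height, max_pixels)
        if current and used + cost > token_budget:
            batches.append(current)
            current, used = [], 0
        current.append(index)
        used += cost
    if current:
        batches.append(current)
    return batches


# Five 1920x1080 images, capped at 1280 tokens each, hypothetical budget of 4000.
sizes = [(1920, 1080)] * 5
batches = chunk_by_token_budget(sizes, token_budget=4000, max_pixels=1280 * 28 * 28)
```

Each batch would then be sent as its own prompt, with the per-batch JSON results merged afterwards and the summary built from those partial results. Whether that is better than one large prompt is exactly the open question.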

What’s a good way to go about this?

Thanks!

submitted by /u/spuniflo to r/learnmachinelearning
