PRISMA: Performs a variety of inferences from images or videos

PRISMA is a computational photography pipeline that performs a variety of inferences on any image or video. Just as a prism refracts light into its component wavelengths, the pipeline expands an image into bands of data usable for 3D reconstruction or real-time post-processing. It integrates several algorithms and open-source pretrained models, including monocular depth estimation (MiDaS v3.1, ZoeDepth, Marigold, PatchFusion), optical flow (RAFT), segmentation masks (mmdet), and camera pose estimation (COLMAP). Results are stored in a folder named after the input file, with each band saved as a separate .png or .mp4 file. For videos, the final step attempts a sparse reconstruction, which can be used to train NeRFs (such as NVIDIA's Instant-ngp) or Gaussian Splatting models. By default, the inferred depth is exported as a heatmap that can be decoded in real time with LYGIA's heatmap GLSL/HLSL sampler, and the optical flow is encoded as hue (angle) and saturation, which can likewise be decoded in real time with LYGIA's optical flow GLSL/HLSL sampler.
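The hue and saturation encoding of optical flow mentioned above can be illustrated with a minimal Python sketch. It is not PRISMA's or LYGIA's actual code: the function names `encode_flow` and `decode_flow`, the `max_mag` normalization constant, and the choice of a fixed value channel of 1.0 are all assumptions made for illustration, and the real encoder may scale or quantize differently.

```python
import colorsys
import math


def encode_flow(dx, dy, max_mag):
    """Pack a 2D flow vector into an RGB color (hypothetical scheme).

    Hue carries the flow direction, saturation carries the magnitude
    normalized by max_mag (an assumed constant), and value is fixed at 1.0.
    """
    angle = math.atan2(dy, dx) % (2.0 * math.pi)   # direction in [0, 2*pi]
    hue = angle / (2.0 * math.pi)                  # map direction to [0, 1]
    sat = min(math.hypot(dx, dy) / max_mag, 1.0)   # clamp large motions
    return colorsys.hsv_to_rgb(hue, sat, 1.0)


def decode_flow(r, g, b, max_mag):
    """Recover the flow vector from a color produced by encode_flow."""
    hue, sat, _ = colorsys.rgb_to_hsv(r, g, b)
    angle = hue * 2.0 * math.pi
    mag = sat * max_mag
    return mag * math.cos(angle), mag * math.sin(angle)


if __name__ == "__main__":
    color = encode_flow(3.0, -4.0, max_mag=10.0)
    print("encoded RGB:", color)
    print("decoded flow:", decode_flow(*color, max_mag=10.0))
```

A shader would run the decoding half per pixel on the sampled texture. Note that flow vectors longer than `max_mag` are clamped in this sketch, so they decode with the correct direction but a reduced magnitude.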

PRISMA Alternatives

DUSt3R — Dense 3D reconstruction without camera calibration information

VisFusion — Video-based 3D scene reconstruction

NVAS3d — 3D room reconstruction for novel-view acoustic synthesis

VGGSfM — Deep learning-driven 3D reconstruction technology

3D AI Studio — AI-Generated Custom 3D Models

SF3D — Quickly generate textured 3D models

3D Creation — Easily create and utilize 3D content

Imagine 3D — Text to 3D

SceneScript — 3D scene reconstruction from Reality Labs Research

ReconFusion — 3D reconstruction with diffusion priors

Stable Fast 3D — Quickly generate 3D models from a single image.

MVDrag3D — A drag-and-drop 3D editing tool based on multi-view generative reconstruction priors.

TRELLIS 3D AI — A professional tool for easily converting images into 3D assets

Ouroboros3D — A framework for generating 3D models from 3D-aware recursive diffusion using single images.

Long-LRM — Efficient 3D Gaussian reconstruction model for fast large-scale scene regeneration

ComfyUI-3D-Pack — ComfyUI 3D Processing Plugin Bundle

CSM 3D Viewer — An online 3D model viewer that supports viewing and interaction.

EgoGaussian — 3D Scene Reconstruction and Dynamic Object Tracking Technology

Draw3D — Creative 3D Drawing Tool

ComfyUI3D Pack — ComfyUI node plugin, supports 3D processing

GRM — GRM is a large-scale Gaussian reconstruction model for high-quality and efficient 3D reconstruction and generation.

Spline AI 3D Generation — AI tool for fast 3D model generation

Flex3D — Creates high-quality 3D assets from a single image or text prompt.

Tencent Hunyuan 3D — The first open-source 3D model supporting both text and image generation.

Any Image to 3D — An AI system that converts 2D images into 3D models.

Lumiere 3D AI — Create compelling 3D product videos

Stable Video 3D — Stable Video 3D is a groundbreaking 3D generation technology that generates high-quality 3D views and novel viewpoints from a single image.

Hunyuan3D-1 — A 3D generation framework launched by Tencent, supporting generation from text and images to 3D.

VastGaussian — Unofficial implementation of Vast 3D Gaussians for Large Scene Reconstruction