2024-0040

DifuzCam: Replacing Camera Lens with a Mask and a Diffusion Model

DifuzCam is a novel approach to flat, lensless camera imaging. By replacing the traditional lens with a diffuser placed near the sensor, the camera's size, weight, and cost are drastically reduced. However, reconstructing high-quality images from the raw sensor data of such cameras remains an open challenge. Our invention addresses it by combining a pre-trained diffusion model with a control network and a learned physical mask design to reconstruct visually pleasing images with state-of-the-art quality.

UNMET NEED
While flat lensless cameras offer a promising pathway to miniaturizing imaging systems, the trade-off is that the captured data is difficult to interpret, which makes high-quality reconstruction hard. Previous attempts using optimization or deep learning have not yielded sufficiently accurate or perceptually convincing images. The lack of effective reconstruction algorithms hinders the practical use of these compact, low-cost cameras.

OUR SOLUTION
Our solution addresses the image reconstruction challenge by integrating a pre-trained diffusion model, of the kind typically used for generating high-quality images, with a custom-trained ControlNet that conditions it on the camera measurements. The diffusion model, having been trained on vast datasets of natural images, serves as a powerful prior that helps reconstruct the most likely high-quality image corresponding to the raw sensor data from the flat camera. An innovative aspect of our method is optional text guidance: when the user provides a textual description of the scene, the diffusion model can be further guided to reconstruct images that align more closely with the intended visual context. This is particularly useful in ambiguous or low-visibility scenarios where the raw data alone might not suffice to produce a clear image. We use a separable transformation and additional loss functions to improve convergence during training. Our prototype flat camera demonstrates state-of-the-art performance, outperforming previous methods both qualitatively and quantitatively. In addition, we propose learning the camera's physical mask design jointly with the model training to improve the reconstructions. Beyond flat cameras, this approach has the potential to be adapted to other imaging systems.
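The summary above does not define the separable transformation, so the following is only a minimal sketch of what such a mapping can look like, assuming it is a pair of matrices that act on the rows and columns of the raw measurement. All names and sizes here (separable_transform, w_left, w_right, the array shapes) are hypothetical and chosen for illustration; in a trained system the two matrices would be learned rather than random.

```python
import numpy as np


def separable_transform(raw, w_left, w_right):
    """Apply a separable linear map: w_left mixes rows, w_right mixes columns."""
    return w_left @ raw @ w_right.T


rng = np.random.default_rng(0)

# Hypothetical sizes: raw sensor measurement and the image-like output.
sensor_h, sensor_w = 12, 16
image_h, image_w = 8, 10

raw = rng.standard_normal((sensor_h, sensor_w))
# Random stand-ins for matrices that would be learned during training.
w_left = rng.standard_normal((image_h, sensor_h))
w_right = rng.standard_normal((image_w, sensor_w))

out = separable_transform(raw, w_left, w_right)

# The same map written as one dense matrix on the flattened measurement
# (row-major flattening), built here only to check equivalence.
dense = np.kron(w_left, w_right)
out_dense = (dense @ raw.reshape(-1)).reshape(image_h, image_w)

# Parameter counts: two small matrices versus one large dense matrix.
separable_params = w_left.size + w_right.size
dense_params = dense.size

print(out.shape, np.allclose(out, out_dense))
print(separable_params, dense_params)
```

The separable form needs far fewer parameters than an equivalent dense mapping over the flattened measurement, which is one plausible reason such a transformation would make training easier.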

Figure 1. Results comparison. We compare the results of our proposed method with those of the previous GAN-based method, “FlatNet-T”, on its published dataset and with its published network weights.

INTELLECTUAL PROPERTY
Provisional patent application
