Semantic and structural image segmentation for prosthetic vision

Melani Sanchez-Garcia et al. PLoS One. 2020 Jan 29;15(1):e0227677. doi: 10.1371/journal.pone.0227677. eCollection 2020.
Abstract

Prosthetic vision is being applied to partially restore sight to visually impaired people through retinal stimulation. However, the phosphenic images produced by the implants have very limited information bandwidth due to their poor resolution and lack of color and contrast. The ability to recognize objects and understand scenes in real environments is therefore severely restricted for prosthesis users. Computer vision can play a key role in overcoming these limitations by optimizing the visual information presented through the prosthesis. We present a new approach for building a schematic representation of indoor environments for simulated phosphene images. The proposed method combines several convolutional neural networks to extract and convey relevant information about the scene, namely the structural informative edges of the environment and the silhouettes of segmented objects. Experiments were conducted with normally sighted subjects using a Simulated Prosthetic Vision (SPV) system. The results show good accuracy in object recognition and room identification tasks for indoor scenes using the proposed approach, compared to other image processing methods.
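To make the simulation concrete, the sketch below renders a processed grayscale image as a square grid of Gaussian-profile phosphenes, as used for the 1024-phosphene stimuli in Fig 8. This is a minimal illustration in Python/NumPy; the grid size, phosphene spacing, and Gaussian width are assumed values, not the authors' exact rendering parameters.

    import numpy as np

    def render_phosphenes(img, grid=32, spacing=8, sigma=2.0):
        """Render a grayscale image (values in [0, 1], shape
        (grid*spacing, grid*spacing)) as a grid x grid pattern of
        Gaussian phosphenes; 32 x 32 = 1024 matches Fig 8.
        spacing and sigma are illustrative assumptions."""
        out = np.zeros_like(img, dtype=float)
        # One Gaussian dot profile, centered in a spacing x spacing cell.
        yy, xx = np.mgrid[:spacing, :spacing] - (spacing - 1) / 2.0
        dot = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
        for i in range(grid):
            for j in range(grid):
                ys = slice(i * spacing, (i + 1) * spacing)
                xs = slice(j * spacing, (j + 1) * spacing)
                # Phosphene brightness = mean intensity of the cell it covers.
                out[ys, xs] = img[ys, xs].mean() * dot
        return out

Calling render_phosphenes on the output of any of the three processing methods (Direct, Edge or SIE-OMS) yields a stimulus image comparable in spirit to those shown in Fig 8.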

Conflict of interest statement

The authors have declared that no competing interests exist.

Figures

Fig 1
Fig 1. Configuration of a retinal prosthesis.
The external and internal components include a micro camera, a transmitter, an external processing unit and an implanted electrode array. First, the external camera acquires an image. Then, the external processor converts the image into a suitable pattern of electrical stimulation, which is delivered to the retina through the electrode array.
Fig 2
Fig 2. Stimuli generation.
Top row: Example of a bathroom scene with the three processing methods used in this work: (a) Direct image, (b) Edge image and (c) SIE-OMS image. Bottom row: the three processing methods rendered in the SPV.
Fig 3
Fig 3. Processing pipeline.
The stimulation of the electrode array is based on two information pathways that extract the regions of pixels representing important objects (OMS) and structural edges (SIE). The regions are computed using two different types of FCN, from He et al. [55] and Fernandez-Labrador et al. [54].
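As a sketch of how the two pathways could be fused into the final schematic image (the caption states only that both cues are combined; the per-pixel maximum rule and the function name below are assumptions for illustration):

    import numpy as np

    def combine_sie_oms(sie_edges, oms_map):
        """Fuse structural informative edges (SIE) with object masks and
        silhouettes (OMS) into one schematic image. Both inputs are float
        arrays in [0, 1] with the same shape; taking the per-pixel maximum
        keeps the brightest cue at every location (an assumed rule)."""
        return np.maximum(sie_edges, oms_map)

The fused image can then be passed to a phosphene renderer such as the sketch given after the Abstract.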
Fig 4
Fig 4. Scene layout from an indoor image.
Using [54], we detect the main structure of the room by extracting the structural informative edges (SIE) (right), which are those formed by the intersections of the walls, ceiling and floor of the room (middle).
Fig 5
Fig 5. Box and mask branch from OMS.
Above: box branch for classification and bounding-box regression. Below: mask branch for predicting segmentation masks on each Region of Interest (ROI). Numbers denote spatial resolution and channels. Arrows denote convolutions, deconvolutions, or fully connected layers; x4 denotes four consecutive convolution layers. (Adapted from He et al. [55]).
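For a concrete picture of the mask branch, the following PyTorch sketch follows the standard Mask R-CNN mask head of He et al. [55]: four consecutive 3x3 convolutions (the "x4" above), a 2x deconvolution, and a 1x1 convolution producing one mask per class. The channel and class counts are the reference configuration, assumed here rather than taken from this paper.

    import torch.nn as nn

    class MaskHead(nn.Sequential):
        """Mask branch sketch after He et al. [55]. Input: per-ROI
        features (e.g. 256 x 14 x 14); output: num_classes masks at
        twice the input resolution."""
        def __init__(self, in_ch=256, num_classes=80):
            layers = []
            for _ in range(4):  # the "x4" consecutive conv layers
                layers += [nn.Conv2d(in_ch, 256, 3, padding=1), nn.ReLU()]
                in_ch = 256
            layers += [nn.ConvTranspose2d(256, 256, 2, stride=2), nn.ReLU(),
                       nn.Conv2d(256, num_classes, 1)]  # per-class masks
            super().__init__(*layers)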
Fig 6
Fig 6. Object masks and silhouettes (OMS).
Object masks were generated with [55] and sorted by probability scores to avoid occlusions between objects. The extracted information was combined into an image highlighting the silhouettes of the objects in white over the object masks in gray.
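The compositing step can be sketched as follows: instance masks are painted in ascending score order so that the most confident objects end up on top, with mask interiors in gray and their boundaries (silhouettes) in white. The gray/white levels and the dilation-based outline are illustrative assumptions.

    import numpy as np
    from scipy import ndimage

    def compose_oms(masks, scores, interior=0.5, outline=1.0):
        """Composite binary instance masks (list of HxW bool arrays) into
        one image with interiors gray and silhouettes white. Painting in
        ascending score order keeps high-confidence objects on top."""
        canvas = np.zeros(masks[0].shape, dtype=float)
        for _, m in sorted(zip(scores, masks), key=lambda t: t[0]):
            canvas[m] = interior
            # Silhouette = one-pixel border found by dilating the mask.
            border = ndimage.binary_dilation(m) & ~m
            canvas[border] = outline
        return canvas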
Fig 7
Fig 7. SPV and trial setup.
SPV setup: Subjects were seated on a chair facing a computer screen at a distance of 1 m. The visual field was 20 degrees, simulating the prosthetic device. Trial setup: Each gray rectangle represents the image shown on the computer monitor during the trial. Each image appeared for 10 seconds and switched automatically to the next image. The break between image sequences was 30 seconds. The complete experiment took approximately 15 minutes.
Fig 8
Fig 8. Examples of stimuli used in the experiment.
Six examples of indoor environments represented with 1024 phosphenes (rows: bathroom, bedroom, dining room, kitchen, living room and office, respectively). Each column shows: (a) input images, (b) images processed using the Edge method, (c) images processed using the Direct method and (d) images processed by our SIE-OMS method.
Fig 9
Fig 9. Global results by phosphenic stimuli method.
Percentage of correct, incorrect and unanswered responses in a single trial. Higher scores in correct responses indicate that subjects were able to identify and recognize the objects and the type of room in each test image; higher ratios of unanswered responses indicate that they were not. The general findings are that the SIE-OMS method improves the identification of objects, proving to be the most effective method; this translates into an increase in the number of correct answers for the room-type identification task. Results also show that the Edge method is the least effective, with the highest percentage of unanswered images for the two tasks. The test found a significant difference between the SIE-OMS and Direct methods (p<.001), and likewise between the SIE-OMS and Edge methods (p<.001). *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests: paired samples, two-tailed.
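The paired, two-tailed t-tests reported here can be reproduced with SciPy; the sketch below uses hypothetical per-subject correct-response rates (not the paper's data) to show the computation.

    from scipy import stats

    # Hypothetical per-subject proportions of correct answers for the
    # SIE-OMS and Direct methods, measured on the same eight subjects.
    sie_oms = [0.85, 0.78, 0.92, 0.80, 0.88, 0.75, 0.83, 0.90]
    direct  = [0.55, 0.48, 0.60, 0.52, 0.58, 0.45, 0.50, 0.62]

    # Paired-samples, two-tailed t-test, as used in Figs 9-11.
    t, p = stats.ttest_rel(sie_oms, direct)
    print(f"t = {t:.2f}, p = {p:.4g}")  # stars (*, **, ***) follow from p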
Fig 10
Fig 10. Object recognition results for each room-type.
Higher scores in correct responses indicate that subjects were able to recognize the objects in each room; higher ratios of unanswered responses indicate that they were not. The SIE-OMS method obtained the highest score of the three methods in all room types. The results also show that the most difficult room was the kitchen. *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests: paired samples, two-tailed.
Fig 11
Fig 11. Room identification results for each room-type.
Higher scores in correct responses indicate that subjects were able to recognize the type of room in each test image; higher ratios of unanswered responses indicate that they were not. The SIE-OMS method obtained the highest score of the three methods in all room types. As with object identification, the results also showed that the most difficult room was the kitchen. *** = p<.001; ** = p<.01; * = p<.05; ns = p>.05. All t-tests: paired samples, two-tailed.
Fig 12
Fig 12. Successful and failed image results.
Some examples of phosphenic images generated with the three methods: successful images (top rows) and images failed by the subjects (bottom rows) with the three approaches: Edge, Direct and SIE-OMS, respectively.

References

    1. Hartong DT, Berson EL, Dryja TP. Retinitis pigmentosa. The Lancet. 2006;368(9549):1795–1809. doi:10.1016/S0140-6736(06)69740-7
    2. Yu DY, Cringle SJ. Retinal degeneration and local oxygen metabolism. Experimental Eye Research. 2005;80(6):745–751. doi:10.1016/j.exer.2005.01.018
    3. Zhou DD, Dorn JD, Greenberg RJ. The Argus® II retinal prosthesis system: An overview. In: 2013 IEEE International Conference on Multimedia and Expo Workshops (ICMEW). IEEE; 2013. p. 1–6.
    4. Cheng DL, Greenberg PB, Borton DA. Advances in retinal prosthetic research: a systematic review of engineering and clinical characteristics of current prosthetic initiatives. Current Eye Research. 2017;42(3):334–347. doi:10.1080/02713683.2016.1270326
    5. Lovell NH, Hallum LE, Chen S, Dokos S, Byrnes-Preston P, Green R, et al. Advances in retinal neuroprosthetics. In: Handbook of Neural Engineering. 2007. p. 337–356.

Grants and funding

This work was supported by projects DPI2015-65962-R, RTI2018-096903-B-I00 (MINECO/FEDER, UE) and BES-2016-078426 (MINECO). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.