In everyday life we use our sense of sight, usually without stopping to think about how it works or which components and processes come into play between our brain and our eyes. It is generally accepted that sight depends on a cognitive process that receives and interprets signals from the outside world. Many centuries ago, Aristotle maintained in his Metaphysics that «the sight, better than other senses, makes us aware of the objects, and discovers among them a great number of differences». Thus, although as living beings we are characterized by our ability to distinguish the makeup of things that belong to a real and bounded space, we also obey a higher intuition that, along with memory, defines us as rational beings: in other words, it makes us human.
On the other hand, recent years have brought many scientific innovations that aim to give a kind of reasoned vision to the technological prototypes now emerging in today's competitive market. Some machines built on artificial intelligence offer, among their many other qualities, the capacity to set their sight on the real, tangible world. In this way, the device exercises a kind of freedom to act and decide as appropriate, according to the use and service that society has assigned to it. In short, this is the purpose of computer vision.
A first look at computer vision
This discipline emerged in the 1960s with a preconceived and somewhat misleading idea among the first scientists specializing in artificial intelligence: that giving images to a computer to interpret was relatively easy. It turned out to be quite the opposite. At that time, many issues had to be taken into account before significant advances in computer vision could be made.
In this sense, computer vision continues to be understood as a scientific discipline that studies and promotes the training of devices and computers to interpret images of the real world, in such a way that both the processes and the results are as similar as possible to human capabilities.
Vision and memory: looking from the mind
To that end, it is necessary to understand the relation between the cognitive process and the visual capacity in a computing context. Several interesting theoretical positions address the problem of computational memory (a cornerstone of artificial vision) from different angles. Some defend the need for an application external to the device that performs the primary memory processes it needs to operate. Other academics consider it valid to work from an intrinsic notion, that is, to try to define a cognitive structure within the construction of the electronic model itself. Still other developers contribute through a more basic and rudimentary exploration, simulating specific memory tasks on some computers.
Consequently, the combination of memory and vision is crucial in computer science: the machine must be able to draw on stored information that is constantly checked against the images it finds in real life. In a nutshell, the machine must follow a series of steps to achieve its purpose: it needs to learn in order to see.
Levels of computer vision
The study of artificial vision divides its components into three essential levels, from the most elementary to the most complex:
- Level 2D. Detection of the basic features of a flat image: lines, strokes, edges, arcs, zero crossings.
- Level 2½D. Identification, in the image, of specific cues belonging to stereoscopy (the technique concerned with the three-dimensional illusion of objects), such as lighting, reflectance and shading, in conjunction with the elementary features mentioned above.
- Level 3D. Representation of the objects in real space. This makes it possible to classify the images and compare them with information previously stored on the device.
Therefore, various theoretical schools within computer science hold that these three levels, taken together, should form an adequate guideline for experimentation and development in computer vision.
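To make the 2D level more concrete, here is a minimal sketch of edge detection on a tiny grayscale image. It is only an illustration under assumptions of our own: the Sobel kernels are one common choice among many, and the function names, the sample image and the threshold value are hypothetical and do not come from any particular system described in this article.

```python
# Illustrative sketch of the 2D level: marking edges in a small grayscale image.
# The kernels (Sobel), the names and the threshold are assumptions for this example.

SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]   # responds to horizontal changes in brightness
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]    # responds to vertical changes in brightness


def apply_kernel(image, row, col, kernel):
    """Weighted sum of the 3x3 neighbourhood centred on (row, col)."""
    total = 0
    for dr in range(3):
        for dc in range(3):
            total += kernel[dr][dc] * image[row + dr - 1][col + dc - 1]
    return total


def edge_map(image, threshold=100):
    """Return a grid of 0s and 1s, where 1 marks a pixel lying on an edge.

    Border pixels are left at 0 because their neighbourhood is incomplete.
    """
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = apply_kernel(image, r, c, SOBEL_X)
            gy = apply_kernel(image, r, c, SOBEL_Y)
            if abs(gx) + abs(gy) > threshold:
                edges[r][c] = 1
    return edges


# A 5x6 image: dark on the left half, bright on the right half.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]

for line in edge_map(image):
    print(line)
```

In this sample the marked pixels line up along the boundary between the dark and bright halves, which is the kind of elementary feature that the higher levels then build on.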
Applications in sight
Computer vision has boomed in many areas of industry and technology. To name a few cases, it is crucial in optical metrology, where an artificial vision system makes it possible to design an object with the proper physical measurements and magnitudes. It also has a significant application in data reading, which relies on recognizing and verifying characters and codes so that information (files, documents, products) can be compiled correctly for later storage and categorization; an everyday example is the machines that read the barcodes of products at a supermarket. Other cases are control and surveillance systems, which detect anomalies in a space previously outlined by the user, and, in the medical field, X-rays and other complex systems for detecting diseases through tomography, radiography and magnetic resonance imaging, among other uses.
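As a small illustration of the "verification" step in code reading, the sketch below checks the last digit of a barcode number once the digits have already been recognized. It assumes the EAN-13 format, which is widely used on retail products but is not named in this article; the function names are hypothetical, and the sample number is only an example of a well-formed code.

```python
# Illustrative sketch: verifying an EAN-13 barcode number after it has been read.
# Assumption: the code follows the EAN-13 scheme (12 data digits + 1 check digit).


def ean13_check_digit(first_twelve):
    """Compute the check digit for the first 12 digits of an EAN-13 code."""
    total = 0
    for position, char in enumerate(first_twelve):
        weight = 1 if position % 2 == 0 else 3   # weights alternate 1, 3, 1, 3, ...
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_ean13(code):
    """Return True if `code` is 13 digits long and its last digit matches."""
    if len(code) != 13 or not code.isdigit():
        return False
    return ean13_check_digit(code[:12]) == int(code[12])


print(is_valid_ean13("4006381333931"))   # well-formed sample code
print(is_valid_ean13("4006381333932"))   # same code with a wrong last digit
```

A reader that recognizes one digit incorrectly will usually produce a number that fails this check, so the scan can be rejected and repeated instead of storing wrong information.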
Among the other benefits that computer vision has brought to the new culture of innovation and competitiveness is its current focus on robotics. The research carried out on domestic robots is notable: the aim is to create a humanlike prototype that can do the cleaning chores and take care of children and elderly people. One of the enthusiastic researchers in this area, the Mexican scientist José Martínez Carranza, who holds a PhD in Computer Science from the University of Bristol, once said: «Not only in Mexico, but in the whole world, there is a need for robots that can help us in our homes, robots that can clean. There are companies that are already selling robots that can clean floors, for example. Then, it is about having a robot that can take care of the kids, that can keep them company, that can talk to you». Building a product that offers such a wide variety of functions essential to human needs may sound fanciful or distant; the research, on the contrary, shows that this path will soon be travelled.
Thus, within the development of new robotic models, we find considerable attempts to conquer qualities that are characteristically human (and that will soon cease to be exclusive to us if we share them with computer products), such as emotions, reflection, hearing, speech and even vision.
Written by Andrés Márquez (Communication and Diffusion, PIT-UAS), translated by Belem Ruiz (Edition and Communication, PIT-UAS).