CHAPTER - COMPUTER VISION
GRADE X AI
Computer vision is a
field of computer science that focuses on enabling computers to identify and
understand objects and people in images and videos. Like other types of AI,
computer vision seeks to perform and automate tasks that replicate human
capabilities. In this case, computer vision seeks to replicate both the way
humans see, and the way humans make sense of what they see.
The range of practical
applications for computer vision technology makes it a central component of
many modern innovations and solutions. Computer vision can be run in the cloud
or on premises.
How computer vision works
Computer vision
applications use input from sensing devices, artificial intelligence, machine
learning, and deep learning to replicate the way the human vision system works.
Computer vision applications run on algorithms that are trained on massive
amounts of visual data or images in the cloud. They recognize patterns in this
visual data and use those patterns to determine the content of other images.
How an image is analyzed with computer vision
A sensing device captures
an image. The sensing device is often just a camera, but could be a video
camera, medical imaging device, or any other type of device that captures an
image for analysis.
The image is then sent to
an interpreting device. The interpreting device uses pattern recognition to
break the image down, compare the patterns in the image against its library of
known patterns, and determine if any of the content in the image is a match.
The pattern could be something general, like the appearance of a certain type
of object, or it could be based on unique identifiers such as facial features.
A user requests specific
information about an image, and the interpreting device provides the
information requested based on its analysis of the image.
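The "compare against a library of known patterns" step above can be pictured with a toy example. This is only a minimal sketch under simple assumptions: the images are tiny 3x3 grids of 0s and 1s, and the names `KNOWN_PATTERNS`, `similarity`, and `best_match` are made up for illustration. Real interpreting devices use far richer patterns.

```python
# Hypothetical library of known 3x3 patterns (1 = bright pixel, 0 = dark pixel).
KNOWN_PATTERNS = {
    "vertical line": [[0, 1, 0],
                      [0, 1, 0],
                      [0, 1, 0]],
    "horizontal line": [[0, 0, 0],
                        [1, 1, 1],
                        [0, 0, 0]],
    "diagonal line": [[1, 0, 0],
                      [0, 1, 0],
                      [0, 0, 1]],
}


def similarity(image, pattern):
    """Return the fraction of pixels that are the same in both grids."""
    total = 0
    same = 0
    for image_row, pattern_row in zip(image, pattern):
        for a, b in zip(image_row, pattern_row):
            total += 1
            if a == b:
                same += 1
    return same / total


def best_match(image):
    """Compare the image with every known pattern and return the closest one."""
    best_name, best_score = None, -1.0
    for name, pattern in KNOWN_PATTERNS.items():
        score = similarity(image, pattern)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# A captured image: a vertical line with one noisy pixel in the corner.
captured = [[0, 1, 0],
            [0, 1, 0],
            [0, 1, 1]]

name, score = best_match(captured)
print(name, round(score, 2))  # vertical line 0.89
```

Even with one noisy pixel, the captured image is still closest to the "vertical line" pattern, which is the idea behind matching against a library.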
Deep learning and computer vision
Modern computer vision
applications are shifting away from statistical methods for analyzing images
and increasingly relying on what is known as deep learning. With deep learning,
a computer vision application runs on a type of algorithm called a neural network,
which allows it to deliver even more accurate analyses of images. In addition,
a deep learning model learns from the images it is trained on, so it generally
becomes more accurate as it is given more examples.
Object Classification
The system classifies the
objects in an image according to a defined category. For example, with object
classification, a computer could distinguish people from objects in a photo and
determine how many people appear in the photo.
Object Identification
The system identifies a
particular object in a photo, video, or image. For example, with object
identification, the system would be able to not only distinguish people in a
photo, but also analyze their appearance to determine the identity or traits of
those people.
Object Tracking
The system analyzes a
video to process the location of a moving object over time. For example, with
object tracking, a parking lot surveillance camera could identify cars in a
parking lot and provide information about the location and movements of those cars
over time.
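To get a feel for object tracking, here is a minimal sketch. It assumes each video frame is a tiny grid in which the moving object is marked with 1s, and it follows the object by computing its centre in every frame. The function name `centroid` and the sample frames are made up for illustration; real trackers are much more sophisticated.

```python
def centroid(frame):
    """Return the (row, column) centre of all bright pixels (value 1) in a frame."""
    row_sum, col_sum, count = 0, 0, 0
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value == 1:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None  # nothing to track in this frame
    return (row_sum / count, col_sum / count)


# Three hypothetical frames of a video: the object moves to the right and down.
frames = [
    [[1, 0, 0, 0],
     [0, 0, 0, 0]],
    [[0, 1, 0, 0],
     [0, 0, 0, 0]],
    [[0, 0, 0, 0],
     [0, 0, 1, 0]],
]

track = [centroid(frame) for frame in frames]
print(track)  # [(0.0, 0.0), (0.0, 1.0), (1.0, 2.0)]
```

The list `track` is the "location over time" that the text describes.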
Optical Character Recognition
The system identifies
letters and numbers in images and converts that text into machine-encoded text
that can be read by other computer applications or edited by users.
Applications of CV
Computer vision as a concept was first introduced in the 1970s, and its
potential uses generated a lot of excitement. In recent years, considerable
technological advances have moved computer vision to the top of many
companies’ priority lists. Let’s examine a few of its applications:
Facial Recognition
In the era of smart cities and smart homes, computer vision plays an essential
role in making homes smarter. One of its most important applications is facial
recognition for security, which can be used to identify visitors or to maintain
a visitor log.
Face Filters
Many of the features
in today’s apps, including Instagram and Snapchat, rely on computer vision. One
of them is the use of face filters. Through the camera, the algorithm recognises
a person’s facial features and applies the chosen face filter.
Google’s Search by Image
Most searches on Google’s search engine are text-based, but it also has the
intriguing option of searching with an image. This makes use of computer
vision: it examines various attributes of the input image and compares them
with those in its database of images to provide the search result.
Retail
Retail is one of the fastest-growing industries, and it is also using computer
vision to improve the user experience. Retailers can analyse navigational
routes, find walking patterns, and track customer movements through stores
using computer vision techniques.
Self-Driving Cars
Computer vision is a
fundamental technology behind autonomous vehicles. Most leading car
manufacturers in the world are investing in artificial intelligence to develop
on-road versions of hands-free driving technology.
Medical Imaging
Computer-supported medical imaging software has been a reliable resource for
doctors over the past few decades. It doesn’t just produce and analyse images;
it also acts as a doctor’s assistant to aid in interpretation.
The software is used to
interpret 2D scan images and transform them into interactive 3D models that
give medical professionals a thorough insight into a patient’s health.
Google Translate App
To read signs written in
a foreign language, all you have to do is point the camera on your phone at the
text, and the Google Translate app will almost immediately translate them
into the language of your choice. This is a useful application of
computer vision: it uses optical character recognition to read the image
and augmented reality to overlay the translation.
Computer Vision Tasks
The many computer vision
applications are based on a variety of tasks that are carried out to extract
specific information from the input image. This information may be used for
prediction or serve as the foundation for further analysis. A computer vision
application performs the following tasks:
Classification
The image classification
problem is the task of assigning an input image one label from a fixed set of
categories. This is one of the core problems in CV that, despite its
simplicity, has a large variety of practical applications.
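The idea of "one label from a fixed set of categories" can be shown with a toy example. This is only a sketch: it assumes a tiny grayscale image stored as a list of rows, a made-up set of two categories, and a simple hand-written rule based on average brightness. Real classifiers learn their rules from many example images instead.

```python
CATEGORIES = ("dark", "bright")  # the fixed set of labels (hypothetical)


def classify(image):
    """Assign exactly one label to the image using its average pixel value."""
    pixels = [value for row in image for value in row]
    average = sum(pixels) / len(pixels)
    return CATEGORIES[1] if average >= 128 else CATEGORIES[0]


night_photo = [[10, 20],
               [30, 40]]      # average 25
day_photo = [[200, 220],
             [240, 250]]      # average 227.5

print(classify(night_photo))  # dark
print(classify(day_photo))    # bright
```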
Classification + Localisation
This task involves both identifying what object is present in the image and
identifying where in the image that object is located. It is used only for
single objects.
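A minimal sketch of classification plus localisation, assuming the image is a small grid in which the single object is marked with 1s. The location is reported as a bounding box, and the label comes from a made-up rule about the shape of the box. The names `locate` and `classify` are illustrative only.

```python
def locate(image):
    """Return the bounding box (top, left, bottom, right) of the bright pixels."""
    coords = [(r, c)
              for r, row in enumerate(image)
              for c, value in enumerate(row)
              if value == 1]
    if not coords:
        return None  # no object in the image
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return (min(rows), min(cols), max(rows), max(cols))


def classify(box):
    """Label the object by the shape of its bounding box (toy rule)."""
    top, left, bottom, right = box
    height = bottom - top + 1
    width = right - left + 1
    return "tall object" if height > width else "wide object"


image = [[0, 0, 0, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 0, 0, 0]]

box = locate(image)
print(box)            # (1, 1, 2, 3)
print(classify(box))  # wide object
```

The output answers both questions at once: what the object is (the label) and where it is (the box).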
Object Detection
Object detection is the process of finding occurrences of
real-world items like faces, bicycles, and buildings in pictures or videos. To
identify occurrences of a certain object category, object detection algorithms
frequently employ extracted features and learning techniques. It is frequently
used in applications like image retrieval and automatic car parking systems.
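Finding every occurrence of an object can be sketched by sliding a small template over the image. Note that this is a simplification: it looks for exact copies of a hypothetical 2x2 template, while real detectors use extracted features and learning so that they also find objects that look slightly different.

```python
def find_occurrences(image, template):
    """Slide the template over the image and return the top-left corner
    (row, column) of every place where it matches exactly."""
    image_h, image_w = len(image), len(image[0])
    temp_h, temp_w = len(template), len(template[0])
    hits = []
    for top in range(image_h - temp_h + 1):
        for left in range(image_w - temp_w + 1):
            window = [row[left:left + temp_w]
                      for row in image[top:top + temp_h]]
            if window == template:
                hits.append((top, left))
    return hits


template = [[1, 1],
            [1, 1]]   # the "object" we are looking for

image = [[1, 1, 0, 0, 0],
         [1, 1, 0, 1, 1],
         [0, 0, 0, 1, 1]]

print(find_occurrences(image, template))  # [(0, 0), (1, 3)]
```

Unlike classification plus localisation, the result here can contain several objects in the same image.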
Instance Segmentation
Instance segmentation is the process of identifying each instance of the items
in an image, categorising it, and then assigning each pixel a label based on
that. An image is fed into a segmentation algorithm, which produces a list of
regions (or segments).
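The pixel-by-pixel labelling can be sketched with a simpler stand-in technique called connected-component labelling: bright pixels that touch each other get the same instance number. This sketch assumes a small grid of 0s and 1s and does not categorise the objects, which a real instance segmentation model would also do. The function name `label_instances` is made up for illustration.

```python
def label_instances(image):
    """Give every bright pixel an instance number.
    Pixels that touch (up, down, left, right) share the same number."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]  # 0 means background
    next_label = 0
    for r in range(height):
        for c in range(width):
            if image[r][c] == 1 and labels[r][c] == 0:
                # Found a new object: spread its number to all touching pixels.
                next_label += 1
                labels[r][c] = next_label
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
    return labels, next_label


image = [[1, 1, 0, 0],
         [1, 0, 0, 1],
         [0, 0, 1, 1]]

labels, count = label_instances(image)
for row in labels:
    print(row)
# [1, 1, 0, 0]
# [1, 0, 0, 2]
# [0, 0, 2, 2]
print(count)  # 2
```

Each number in the output marks one region (segment), so the two separate objects can be told apart pixel by pixel.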
Basics of Images
We all see a lot of
images around us and use them daily through our mobile phones or
computer systems. But do we ask ourselves some basic questions about them
while using them so regularly?
Basics of Pixels
The word “pixel” is short for “picture element.” Every digital image is made
up of pixels.
They are the tiniest
pieces of data that go into a picture. They are normally arranged in a
2-dimensional grid and are either circular or square.
Resolution
The number of pixels in an
image is sometimes called its resolution. One approach is to express
resolution as the width by the height, for example, a monitor resolution of
1280×1024. This means there are 1280 pixels from side to side and 1024 pixels
from top to bottom.
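The total number of pixels for the 1280×1024 example can be worked out with a short calculation. The variable names are just for illustration.

```python
width = 1280   # pixels from side to side
height = 1024  # pixels from top to bottom

total_pixels = width * height
print(total_pixels)                         # 1310720
print(round(total_pixels / 1_000_000, 2))   # 1.31 (about 1.31 megapixels)
```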
Pixel value
Each of the pixels that
make up an image that is stored on a computer has a pixel value that specifies
its brightness and/or intended colour. The byte image, which stores this number
as an 8-bit integer with a possible range of values from 0 to 255, is the most
popular pixel format.
Zero is typically used to
represent no colour or black, and 255 is used to represent full colour or
white.
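The range 0 to 255 comes directly from using 8 bits per pixel. The short sketch below shows the arithmetic, along with a small helper (a made-up function called `clamp`) that keeps any number inside the valid range.

```python
BITS_PER_PIXEL = 8                # one byte per pixel
LEVELS = 2 ** BITS_PER_PIXEL      # how many different values fit in 8 bits
BLACK, WHITE = 0, LEVELS - 1      # the smallest and largest pixel values


def clamp(value):
    """Force any number into the valid 0-255 pixel range."""
    return max(BLACK, min(WHITE, value))


print(LEVELS)        # 256
print(WHITE)         # 255
print(clamp(300))    # 255
print(clamp(-20))    # 0
print(clamp(128))    # 128
```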
Grayscale Images
Grayscale images are
images which have a range of shades of gray without apparent colour. The
darkest possible shade is black, which is the total absence of colour or a
pixel value of zero. The lightest possible shade is white, which is the total
presence of colour or a pixel value of 255. Intermediate shades of gray are
represented by equal brightness levels of the three primary colours. In a
grayscale image, each pixel takes 1 byte, and the image is a single plane
stored as a 2D array of pixels. The size of a grayscale image is defined as the
Height x Width of that image. Let us look at an image to understand grayscale
images.
Here is an example of a grayscale image. As you can check, the pixel values are within the range of 0-255. Computers store the images we see in the form of these numbers.
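In code, a grayscale image can be pictured as a 2D list of numbers. The tiny 2x3 image below is made up for illustration; it shows the Height x Width size and checks that every value lies between 0 and 255.

```python
# A hypothetical grayscale image with 2 rows and 3 columns.
gray_image = [
    [0,   64, 128],
    [192, 255, 32],
]

height = len(gray_image)
width = len(gray_image[0])
size_in_bytes = height * width   # 1 byte per pixel

all_values = [value for row in gray_image for value in row]
in_range = all(0 <= value <= 255 for value in all_values)

print(height, width, size_in_bytes)      # 2 3 6
print(min(all_values), max(all_values))  # 0 255
print(in_range)                          # True
```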
RGB Images
Most images we encounter
are coloured images. Three main colours (Red, Green, and Blue) make up these
images. Red, green, and blue can be combined in various intensities to create
all the colours that we see on a screen.
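Here is a small sketch of how colours are combined from red, green, and blue values. It assumes each colour is written as three numbers (R, G, B) from 0 to 255. The helper names `mix` and `rgb_to_hex` are made up for illustration.

```python
def rgb_to_hex(red, green, blue):
    """Write an (R, G, B) colour as the hex code used on web pages."""
    return "#{:02x}{:02x}{:02x}".format(red, green, blue)


def mix(colour_a, colour_b):
    """Add two colours channel by channel, never going above 255."""
    return tuple(min(255, a + b) for a, b in zip(colour_a, colour_b))


RED = (255, 0, 0)
GREEN = (0, 255, 0)

yellow = mix(RED, GREEN)
print(yellow)               # (255, 255, 0)
print(rgb_to_hex(*yellow))  # #ffff00
```

Full red and full green together give yellow, which you can confirm in the exercise tool that follows.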
Exercise
Let us experience!
Go to this online tool:
https://www.w3schools.com/colors/colors_rgb.asp. Using this tool,
try to answer all the questions below.
1) What is the output
colour when you put R=G=B=255?
___________________________________________________________________________
2) What is the output
colour when you put R=G=B=0?
___________________________________________________________________________
3) How does the colour
vary when you set any one of the three to 0 and then keep on varying
the other two?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
4) How does the output
colour change when all three colours are varied in the same
proportion?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
5) What is the RGB value
of your favourite colour from the colour palette?
___________________________________________________________________________
Task: Go to www.piskelapp.com
and create your own pixel art. Try to make
a GIF of your pixel art using the online app.