AI - COMPUTER VISION NOTES

CHAPTER - COMPUTER VISION

GRADE X AI

Computer vision is a field of computer science that focuses on enabling computers to identify and understand objects and people in images and videos. Like other types of AI, computer vision seeks to perform and automate tasks that replicate human capabilities. In this case, computer vision seeks to replicate both the way humans see, and the way humans make sense of what they see.

The range of practical applications for computer vision technology makes it a central component of many modern innovations and solutions. Computer vision can be run in the cloud or on premises.

How computer vision works

Computer vision applications use input from sensing devices, artificial intelligence, machine learning, and deep learning to replicate the way the human vision system works. Computer vision applications run on algorithms that are trained on massive amounts of visual data or images in the cloud. They recognize patterns in this visual data and use those patterns to determine the content of other images.

How an image is analyzed with computer vision

A sensing device captures an image. The sensing device is often just a camera, but could be a video camera, medical imaging device, or any other type of device that captures an image for analysis.

The image is then sent to an interpreting device. The interpreting device uses pattern recognition to break the image down, compare the patterns in the image against its library of known patterns, and determine if any of the content in the image is a match. The pattern could be something general, like the appearance of a certain type of object, or it could be based on unique identifiers such as facial features.

A user requests specific information about an image, and the interpreting device provides the information requested based on its analysis of the image.
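The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not how a real system is built: the `LIBRARY` of known patterns, the 3x3 images, and the function names `similarity` and `interpret` are all made up for this example, and matching is done by simply counting identical pixels.

```python
# Hypothetical library of known patterns: tiny 3x3 black-and-white images
# (1 = white pixel, 0 = black pixel).
LIBRARY = {
    "vertical bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "horizontal bar": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
}


def similarity(image, pattern):
    """Fraction of pixels that are identical in two same-sized images."""
    pairs = [(a, b)
             for row_a, row_b in zip(image, pattern)
             for a, b in zip(row_a, row_b)]
    return sum(a == b for a, b in pairs) / len(pairs)


def interpret(image):
    """Return the name of the best-matching known pattern and its score."""
    best_name = max(LIBRARY, key=lambda name: similarity(image, LIBRARY[name]))
    return best_name, similarity(image, LIBRARY[best_name])


# Step 1: the "sensing device" delivers an image (here, a slightly damaged bar).
captured = [[0, 1, 0], [0, 1, 0], [0, 0, 0]]

# Steps 2 and 3: the "interpreting device" compares it with the library
# and reports the closest match.
name, score = interpret(captured)
print(name, round(score, 2))  # vertical bar 0.89
```

The captured image differs from the vertical bar in only one of its nine pixels, so that pattern wins with a score of 8/9.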

Deep learning and computer vision

Modern computer vision applications are shifting away from statistical methods for analyzing images and increasingly relying on what is known as deep learning. With deep learning, a computer vision application runs on a type of algorithm called a neural network, which allows it to deliver more accurate analyses of images. In addition, deep learning allows a computer vision program to learn from the images it is trained on, so it generally becomes more accurate as it is given more data.

Object Classification

The system classifies the objects in an image according to a defined category. For example, with object classification, a computer could distinguish people from objects in a photo and determine how many people appear in the photo.

Object Identification

The system identifies a particular object in a photo, video, or image. For example, with object identification, the system would be able to not only distinguish people in a photo, but also analyze their appearance to determine the identity or traits of those people.

Object Tracking

The system analyzes a video to process the location of a moving object over time. For example, with object tracking, a parking lot surveillance camera could identify cars in a parking lot and provide information about the location and movements of those cars over time.
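As a rough sketch of the idea, a video can be treated as a list of frames, and the position of the object can be recorded frame by frame. The tiny frames and the helper names `centroid` and `track` below are assumptions made for this example; real trackers must also cope with several objects, noise, and objects that disappear from view.

```python
def centroid(frame):
    """Return the (row, column) centre of all non-zero pixels, or None."""
    points = [(r, c)
              for r, row in enumerate(frame)
              for c, value in enumerate(row)
              if value]
    if not points:
        return None
    mean_row = sum(r for r, _ in points) / len(points)
    mean_col = sum(c for _, c in points) / len(points)
    return (mean_row, mean_col)


def track(frames):
    """Record where the object is in every frame of the video."""
    return [centroid(frame) for frame in frames]


# A hypothetical 3-frame "video" in which one bright pixel (the car) moves.
frames = [
    [[1, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 1]],
]

path = track(frames)
print(path)  # [(0.0, 0.0), (0.0, 1.0), (1.0, 2.0)]
```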

Optical Character Recognition

The system identifies letters and numbers in images and converts that text into machine-encoded text that can be read by other computer applications or edited by users.

Applications of CV

Computer vision as a concept was first introduced in the 1970s, and its potential uses generated a great deal of excitement. However, it is the considerable technological advances of recent years that have elevated computer vision to the top of many companies’ priority lists. Let’s examine a few of its applications:

Facial Recognition

Computer vision is essential to the advancement of the home in the era of smart cities and smart homes. One of its most important applications is facial recognition for security, which can be used to identify visitors or to maintain a visitor log.

Face Filters

Much of the functionality in today’s apps, including Instagram and Snapchat, relies on computer vision. One example is the use of face filters. Through the camera, the algorithm recognises a person’s facial features and movements and applies the chosen filter.

Google’s Search by Image

Most searches on Google’s search engine use text, but it also offers the option of searching with an image. This makes use of computer vision: it examines numerous attributes of the input image and compares them with those of the images in its database to produce the search results.

Computer Vision in Retail

Retail is one of the fastest-growing industries, and it also uses computer vision to improve the customer experience. Using computer vision techniques, retailers can track customer movements through stores, analyse navigational routes, and find walking patterns.

Self-Driving Cars

Computer Vision is the fundamental technology behind developing autonomous vehicles. Most leading car manufacturers in the world are reaping the benefits of investing in artificial intelligence for developing on-road versions of hands-free technology.

Medical Imaging

A reliable resource for doctors over the past few decades has been computer-supported medical imaging software. It doesn’t just produce and analyse images; it also works as a doctor’s helper to aid in interpretation.

The software is used to interpret 2D scan images and transform them into interactive 3D models that give medical professionals a thorough insight into a patient’s health.

Google Translate App

To read signs written in a foreign language, all you have to do is point your phone’s camera at the text, and the Google Translate app will almost immediately translate it into the language of your choice. This is a useful application of computer vision: it uses optical character recognition to read the text in the image and augmented reality to overlay the translation.

Computer Vision Tasks

Computer vision applications are built on a variety of tasks that extract specific information from the input image. This information may be used for prediction or may serve as the foundation for further analysis. A computer vision application performs the following tasks:

Classification

Image classification is the task of assigning an input image one label from a fixed set of categories. This is one of the core problems in CV and, despite its simplicity, has a large variety of practical applications.
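A very small sketch can make "one label from a fixed set of categories" concrete. The three labels and their reference brightness values in `PROTOTYPES` are invented for this example, and the rule used here (pick the label whose reference brightness is closest to the image's average brightness) is far simpler than the trained models used in practice.

```python
# Hypothetical fixed set of categories, each with a reference brightness.
PROTOTYPES = {"dark": 40, "medium": 128, "bright": 220}


def mean_brightness(image):
    """Average pixel value of a grayscale image stored as a list of rows."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def classify(image):
    """Assign exactly one label: the one with the closest reference brightness."""
    brightness = mean_brightness(image)
    return min(PROTOTYPES, key=lambda label: abs(PROTOTYPES[label] - brightness))


night = [[10, 20], [30, 40]]       # average 25
snow = [[250, 240], [230, 245]]    # average 241.25

print(classify(night))  # dark
print(classify(snow))   # bright
```

Whatever the input image looks like, the output is always exactly one of the labels in the fixed set.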

Classification + Localisation

This task involves both identifying what object is present in the image and locating where in the image that object appears. It is used only for single objects.
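The "where" part of this task is usually reported as a bounding box around the object. The sketch below assumes a grayscale image with a single bright object on a dark background; the `threshold` value and the function name `bounding_box` are choices made for this example only.

```python
def bounding_box(image, threshold=128):
    """Return (top, left, bottom, right) of the pixels at or above threshold."""
    rows = [r for r, row in enumerate(image)
            if any(p >= threshold for p in row)]
    cols = [c for row in image
            for c, p in enumerate(row) if p >= threshold]
    if not rows:
        return None  # nothing bright enough was found
    return (min(rows), min(cols), max(rows), max(cols))


# A 4x5 grayscale image with one bright object in the middle.
image = [
    [0, 0, 0, 0, 0],
    [0, 0, 200, 210, 0],
    [0, 0, 220, 230, 0],
    [0, 0, 0, 0, 0],
]

box = bounding_box(image)
print(box)  # (1, 2, 2, 3)
```

A full classification + localisation system would return this box together with the object's label.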

Object Detection

Object detection is the process of finding occurrences of real-world objects such as faces, bicycles, and buildings in images or videos. To identify occurrences of a certain object category, object detection algorithms frequently employ extracted features and learning techniques. Object detection is commonly used in applications such as image retrieval and automated car parking systems.

Instance Segmentation

The process of identifying instances of the items, categorising them, and then assigning each pixel a label based on that is known as instance segmentation. An image is sent into a segmentation algorithm, which produces a list of regions (or segments).
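The idea of giving every pixel a label that says which instance it belongs to can be sketched with a much simpler classical technique, connected-component labelling. This is a stand-in for illustration, not the trained models used for real instance segmentation: it assumes the object pixels are already marked with 1 in a `mask`, and it treats pixels that touch horizontally or vertically as part of the same instance.

```python
from collections import deque


def label_instances(mask):
    """Give each connected group of 1-pixels its own label (1, 2, 3, ...)."""
    height, width = len(mask), len(mask[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if mask[r][c] and labels[r][c] == 0:
                count += 1                      # a new instance starts here
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:                    # spread the label to neighbours
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


mask = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]

labels, count = label_instances(mask)
print(count)   # 2
for row in labels:
    print(row)
# [1, 1, 0, 0]
# [1, 0, 0, 2]
# [0, 0, 2, 2]
```

Each non-zero number in the output marks one region (segment); background pixels keep the label 0.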

Basics of Images

We all see a lot of images around us and use them daily, either on our mobile phones or on a computer. But do we ever ask ourselves some basic questions about the images we use so regularly?

Basics of Pixels

A picture element is referred to as a “pixel.” In digital form, pixels make up each and every image.

Pixels are the smallest units of information that make up a picture. They are normally arranged in a 2-dimensional grid and are usually round or square.

Resolution

The number of pixels in an image is sometimes called its resolution. When the term is used to describe the pixel count, one convention is to express resolution as the width by the height, for example, a monitor resolution of 1280×1024. This means there are 1280 pixels from side to side and 1024 pixels from top to bottom.
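The total number of pixels for a given resolution is simply the width multiplied by the height. A short calculation for the 1280×1024 monitor mentioned above (the function name `pixel_count` is just a label chosen for this example):

```python
def pixel_count(width, height):
    """Total number of pixels in a width x height grid."""
    return width * height


width, height = 1280, 1024
total = pixel_count(width, height)
megapixels = total / 1_000_000

print(total)       # 1310720
print(megapixels)  # 1.31072
```

So a 1280×1024 screen shows a little over 1.3 million pixels, or about 1.3 megapixels.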

Pixel value

Each of the pixels that make up an image that is stored on a computer has a pixel value that specifies its brightness and/or intended colour. The byte image, which stores this number as an 8-bit integer with a possible range of values from 0 to 255, is the most popular pixel format.

Zero is typically used to represent no colour or black, and 255 is used to represent full colour or white.
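The range 0 to 255 follows directly from using 8 bits: 8 bits can represent 2^8 = 256 different values. The sketch below works this out and shows one common way to keep a value inside the valid range; the helper name `clamp` is an assumption made for this example.

```python
BITS = 8
LEVELS = 2 ** BITS        # number of different values 8 bits can hold
MAX_VALUE = LEVELS - 1    # the largest value, since counting starts at 0


def clamp(value):
    """Force a number into the valid 8-bit pixel range 0..255."""
    return max(0, min(MAX_VALUE, int(value)))


print(LEVELS)      # 256
print(MAX_VALUE)   # 255
print(clamp(300))  # 255
print(clamp(-20))  # 0
print(clamp(100))  # 100
```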

Grayscale Images

Grayscale images are images which have a range of shades of gray without apparent colour. The darkest possible shade is black, which is the total absence of colour, or a pixel value of 0. The lightest possible shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented by equal brightness levels of the three primary colours. In a grayscale image, each pixel is 1 byte in size, and the image consists of a single plane, that is, one 2D array of pixels. The size of a grayscale image is defined as the height x width of that image. Let us look at an image to understand grayscale images.

Here is an example of a grayscale image. As you can check, the pixel values are within the range 0-255. Computers store the images we see in the form of these numbers.
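In code, such an image can be pictured as a 2D list of numbers. The pixel values below are made up for illustration; the sketch simply checks the height x width rule and the 0 to 255 range described above, assuming 1 byte per pixel.

```python
# A hypothetical 2x3 grayscale image: 2 rows (height) and 3 columns (width).
image = [
    [0, 64, 128],
    [192, 255, 32],
]

height = len(image)
width = len(image[0])
size_in_bytes = height * width          # 1 byte per pixel

# Packing the pixels into raw bytes only works if every value is 0..255.
raw = bytes(p for row in image for p in row)

darkest = min(p for row in image for p in row)
lightest = max(p for row in image for p in row)

print(height, width)      # 2 3
print(size_in_bytes)      # 6
print(len(raw))           # 6
print(darkest, lightest)  # 0 255
```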

RGB Images

Most of the images we encounter are coloured images. These images are made up of three primary colours: Red, Green, and Blue. Red, green, and blue can be combined in various intensities to create the other colours we see on a screen.
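A coloured pixel can be pictured as three numbers, one each for red, green, and blue. The sketch below assumes each of the three values is stored in 1 byte (0 to 255), so a colour image needs three times as many bytes as a grayscale image of the same height and width. The `to_gray` helper uses a plain average of the three values, which is only one simple way to turn a colour into a shade of gray.

```python
def to_gray(pixel):
    """Replace a colour with a gray of equal R, G and B (simple average)."""
    red, green, blue = pixel
    level = (red + green + blue) // 3
    return (level, level, level)


# A hypothetical 2x2 colour image; each pixel is an (R, G, B) triple.
image = [
    [(255, 0, 0), (0, 255, 0)],      # pure red, pure green
    [(0, 0, 255), (255, 165, 0)],    # pure blue, an orange shade
]

height, width = len(image), len(image[0])
size_in_bytes = height * width * 3   # 3 bytes per pixel (assumed 8 bits each)

gray_image = [[to_gray(p) for p in row] for row in image]

print(size_in_bytes)     # 12
print(gray_image[0][0])  # (85, 85, 85)
print(gray_image[1][1])  # (140, 140, 140)
```

Notice that every gray pixel has the same value for all three colours, as described in the grayscale section.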

Exercise

Let us experience!

Go to this online link: https://www.w3schools.com/colors/colors_rgb.asp and, using the tool there, try to answer all the questions below.

1) What is the output colour when you put R=G=B=255?

___________________________________________________________________________

2) What is the output colour when you put R=G=B=0?

___________________________________________________________________________

3) How does the colour vary when you set any one of the three to 0 and then keep varying the other two?

___________________________________________________________________________

4) How does the output colour change when all three colours are varied in the same proportion?

___________________________________________________________________________

5) What is the RGB value of your favourite colour from the colour palette?

___________________________________________________________________________

Task: Go to www.piskelapp.com and create your own pixel art. Try to make a GIF of your pixel art using the online app.
