00:00:06
Hey friends, welcome to the YouTube channel ALL ABOUT ELECTRONICS.
00:00:10
So in this video, we will learn about floating point numbers.
00:00:14
And we will see how very large numbers, like the mass of a planet or Avogadro's
00:00:20
number, and similarly very small numbers, like the mass of an atom or Planck's constant,
00:00:26
are stored in computers.
00:00:28
So, during the video, we will also see the difference between fixed point numbers
00:00:32
and floating point numbers.
00:00:34
And with this comparison, we will understand the importance of floating point numbers
00:00:38
in digital systems.
00:00:40
So first, let us understand what fixed point numbers are.
00:00:45
So in our day-to-day life, we all deal with integers as well as real numbers.
00:00:51
Now when these numbers are represented in the fixed point representation, then the position
00:00:55
of the radix point or the decimal point remains fixed.
00:00:59
So all the integers are examples of fixed point numbers.
00:01:03
So for the integers, there is no fractional part or in other words, the fractional part
00:01:08
is equal to zero.
00:01:10
So by default, the position of the decimal point is just after the least significant
00:01:15
digit.
00:01:16
Since there is no fractional part, we typically do not write this decimal point.
00:01:22
But we can say that there is a decimal point on the right-hand side of this least significant
00:01:27
digit.
00:01:29
And this position of this decimal point will also remain fixed.
00:01:33
So similarly, for the real numbers, the position of the decimal point is just before the fractional
00:01:38
part.
00:01:39
For example, this 11.75 is a real number, where 11 is the integer part and 75 just after
00:01:48
this decimal point represents the fractional part.
00:01:51
So when these real numbers are represented in the fixed point representation, then the
00:01:55
position of this decimal point remains fixed.
00:01:58
Now in any digital system, these numbers are stored in a binary format using a certain
00:02:03
number of bits.
00:02:04
Let's say in one digital system, these numbers are stored in a 10-bit format.
00:02:10
Now the issue with the fixed point representation is that with the given number of bits, the
00:02:14
range of the numbers that we can represent is very small.
00:02:18
So if we take the case of the integers, and specifically, an unsigned integer, then in
00:02:24
the 10-bit format, we can represent any number between 0 and 1023.
00:02:29
On the other hand, for the signed integers, the MSB is reserved for the sign bit.
00:02:35
So using the 10 bits, in the two's complement form, we can represent any number between -512 and +511.
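As a quick check of these two ranges, here is a minimal Python sketch. The function names are made up for this illustration, and the signed range assumes the two's complement form.

```python
def unsigned_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value of an unsigned integer with `bits` bits."""
    return 0, 2**bits - 1


def signed_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value of a two's complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


print(unsigned_range(10))  # (0, 1023)
print(signed_range(10))    # (-512, 511)
```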
00:02:41
That means using the 10 bits, the range of the numbers that we can represent is very
00:02:46
limited.
00:02:47
So here, basically the range refers to the difference between the smallest and the largest
00:02:52
number.
00:02:53
So of course, by increasing the number of bits, we can increase this range.
00:02:58
But still, if we want to represent very large numbers, like 10^24 or 10^25, for example
00:03:05
the mass of the earth, then we need 80 bits or more.
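The bit count for such large integers can be checked with a small Python sketch. The helper name `bits_needed` is hypothetical, and the value used for the mass of the earth (about 5.972 x 10^24 kg) is an assumption added here, not a figure from the video.

```python
def bits_needed(value: int) -> int:
    """Number of bits needed to store a non-negative integer in plain binary."""
    return max(1, value.bit_length())


print(bits_needed(1023))           # 10
print(bits_needed(10**24))         # 80
print(bits_needed(10**25))         # 84

# Assumed value: mass of the earth, roughly 5.972 x 10^24 kg.
earth_mass_kg = 5972 * 10**21
print(bits_needed(earth_mass_kg))  # 83
```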
00:03:10
And the issue of the range becomes even more prominent with the real numbers.
00:03:14
So when we are dealing with the real numbers, then we always come across this decimal point
00:03:19
or in general this radix point.
00:03:22
So the digits on the left of this decimal point represent the integer part and the
00:03:26
digits on the right represent the fractional part.
00:03:29
So to store such numbers in a binary format in computers, some bits are reserved for
00:03:34
the integer part and some bits are reserved for the fractional part.
00:03:39
So let's say, once again, these real numbers are stored in a 10-bit format.
00:03:44
And out of the 10 bits, 6 bits are reserved for the integer part and 4 bits are reserved
00:03:48
for the fractional part.
00:03:50
Now when we store these numbers in a binary format, then there is no provision for storing
00:03:55
this binary point explicitly.
00:03:57
But here, we have different sections for the integer as well as the fractional part.
00:04:02
And accordingly, each bit will have its place value.
00:04:06
So here, just after the 2^0s place, we will assume that there is a binary point.
00:04:12
So out of the 10 bits, if we reserve 6 bits for the integer part, then for the unsigned
00:04:17
numbers, we can represent any number between 0 and 63.
00:04:22
And for the fractional part, the maximum number that we can represent is equal to 0.9375.
00:04:29
And the minimum non-zero number will be equal to 0.0625.
00:04:34
That means in the 10-bit format, if we want to represent any real number, then the minimum
00:04:38
non-zero number that we can represent is equal to 0.0625, while if we see the maximum
00:04:44
number, then that is equal to 63.9375.
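To make these limits concrete, here is a minimal Python sketch of the 10-bit format with 6 integer bits and 4 fraction bits. The names `SCALE` and `decode` are chosen for this illustration only. The binary point is never stored; it is implied by dividing the raw bit pattern by 2^4.

```python
INT_BITS = 6
FRAC_BITS = 4
SCALE = 2**FRAC_BITS  # 16: the implied binary point sits 4 bits from the right


def decode(raw: int) -> float:
    """Interpret an unsigned 10-bit pattern as a 6.4 fixed-point value."""
    if not 0 <= raw < 2 ** (INT_BITS + FRAC_BITS):
        raise ValueError("pattern does not fit in 10 bits")
    return raw / SCALE


print(decode(0b0000000001))  # 0.0625   (smallest non-zero value)
print(decode(0b1111111111))  # 63.9375  (largest value)
print(decode(0b0010111100))  # 11.75    (001011 . 1100)
```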
00:04:49
That means in general, in this fixed-point representation, the location of the radix
00:04:53
point is fixed.
00:04:55
And once we decide it, then it will not change.
00:04:58
So in a 10-bit fixed-point representation, once we freeze this specific format, like
00:05:03
6 bits for the integer and 4 bits for the fraction, then we cannot represent
00:05:08
any step smaller than this 0.0625, so every fractional part has to be a multiple of 0.0625.
00:05:12
For example, if we want to represent this 22.0125 or this 35.0025, then we cannot represent
00:05:20
it exactly in this 10-bit fixed-point representation.
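The following Python sketch shows what happens to these two numbers in the 6.4 format. It assumes the value is rounded to the nearest representable step, which is one possible choice and not something stated in the video. The helper names are hypothetical.

```python
FRAC_BITS = 4
SCALE = 2**FRAC_BITS  # one step is 1/16 = 0.0625
MAX_RAW = 2**10 - 1   # largest 10-bit pattern


def to_fixed(value: float) -> int:
    """Round a non-negative value to the nearest 6.4 fixed-point pattern."""
    raw = round(value * SCALE)
    if not 0 <= raw <= MAX_RAW:
        raise ValueError("value is outside the range of the 10-bit format")
    return raw


def from_fixed(raw: int) -> float:
    """Convert a 6.4 fixed-point pattern back to a real value."""
    return raw / SCALE


for value in (11.75, 22.0125, 35.0025):
    stored = from_fixed(to_fixed(value))
    print(value, "is stored as", stored)
# 11.75 is stored exactly, 22.0125 becomes 22.0 and 35.0025 becomes 35.0
```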
00:05:23
So if we want to represent such smaller numbers, then we need to assign more bits for this
00:05:28
fractional part, like 5 bits or 6 bits for the fractional part.
00:05:33
So, of course, by doing so, certainly we can increase the precision.
00:05:38
But now, our range will get compromised.
00:05:41
For example, with 6 bits for the fractional part, we now have only 4 bits for the integer part.
00:05:46
And now, in these 4 bits, we can represent any number between 0 and 15.
00:05:52
That means in this fixed-point representation, once the location of this radix point is fixed,
00:05:57
then our range and the precision will also get fixed.
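This trade-off between range and precision can be tabulated with a short Python sketch. The function name `fixed_point_limits` is made up for this illustration, and the numbers assume unsigned values in a 10-bit word.

```python
TOTAL_BITS = 10


def fixed_point_limits(frac_bits: int, total_bits: int = TOTAL_BITS) -> tuple[float, float]:
    """Return (step, largest value) for an unsigned fixed-point format."""
    step = 1 / 2**frac_bits                # smallest non-zero value
    largest = (2**total_bits - 1) * step   # all bits set to 1
    return step, largest


for frac_bits in (4, 5, 6):
    step, largest = fixed_point_limits(frac_bits)
    print(f"{TOTAL_BITS - frac_bits} integer bits, {frac_bits} fraction bits:",
          f"step = {step}, largest = {largest}")
# 6 integer bits, 4 fraction bits: step = 0.0625, largest = 63.9375
# 5 integer bits, 5 fraction bits: step = 0.03125, largest = 31.96875
# 4 integer bits, 6 fraction bits: step = 0.015625, largest = 15.984375
```

Every extra fraction bit halves the step, and at the same time halves the largest number that fits.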
00:06:01
But in the floating-point representation, it is possible to change the location of
00:06:05
this radix point or the binary point dynamically.
00:06:08
For example, for a given number of bits, let's say 10 bits, if we want more range,
00:06:13
then we can shift this binary point towards the right.
00:06:17
Or for example, for some application, if we require more precision, then it is possible
00:06:23
to shift the radix point towards the left.
00:06:26
That means using the floating-point representation, it is possible to represent very large
00:06:30
numbers like the distance between the planets or the mass of the planets, as well as very small
00:06:35
numbers like the mass of the atom.
00:06:40
So this floating-point representation provides both good range and good precision.
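As a small illustration, the Python sketch below stores Avogadro's number and Planck's constant in ordinary floating-point variables. It assumes that the Python `float` is a 64-bit floating-point number, which is the case on common platforms. The two constants are the defined SI values and are not taken from the video.

```python
import sys

AVOGADRO = 6.02214076e23   # per mole
PLANCK = 6.62607015e-34    # joule second

# Largest finite value and smallest normalized positive value of a float.
print(sys.float_info.max)
print(sys.float_info.min)

# Both constants fit in the same type, far inside its range.
print(sys.float_info.min < PLANCK < AVOGADRO < sys.float_info.max)  # True
```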
00:06:46
So now, let's see how to represent these floating-point numbers.
00:06:51
So the representation of this floating-point number is very similar to how we represent
00:06:55
decimal numbers in the scientific notation.
00:06:58
So in the scientific notation, the radix point or the decimal point is set in such
00:07:03
a way that we have only one significant digit before the decimal point.
00:07:08
So for the integers, by default, the radix point or this decimal point is set to the
00:07:13
right-hand side of this least significant digit.
00:07:17
So here, to represent this number in the scientific notation, the decimal point is shifted to
00:07:22
the left-hand side by five decimal places.
00:07:25
And that is why the exponent here is equal to 5.
00:07:29
So as you can see over here, we have only one significant digit before the decimal point.
00:07:34
But if the same number is represented like this, then that is not the scientific notation.
00:07:40
Because if you see over here, then the digit before the decimal point is 0.
00:07:45
But in the scientific notation, it has to be non-zero.
00:07:48
Similarly, if you take this number, then in the scientific notation, this is how it can
00:07:54
be represented.
00:07:56
So here, for the scientific notation, the decimal point is shifted to the right by three
00:08:01
decimal places.
00:08:03
And that is why over here, in the exponential term, we have this 10 to the power minus 3.
00:08:08
So in the scientific notation, we have two components in total.
00:08:12
That is the significand and the exponent.
00:08:15
So here, in the second representation, if you see the significand, then that is equal
00:08:20
to 4.345.
00:08:22
And similarly, the exponent is equal to minus 3.
00:08:26
And here of course, since we are representing decimal numbers, the base of the exponent
00:08:31
is equal to 10.
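The normalization into a significand and an exponent can be sketched in Python with the standard `decimal` module. The function name `to_scientific` is hypothetical. The input 0.004345 is inferred from the significand 4.345 and the exponent minus 3 mentioned above, and 123456 is only an assumed example of a number whose decimal point moves five places to the left.

```python
from decimal import Decimal


def to_scientific(text: str) -> tuple[Decimal, int]:
    """Split a decimal number into (significand, exponent) with base 10."""
    number = Decimal(text)
    if number == 0:
        raise ValueError("zero has no normalized form")
    exponent = number.adjusted()            # position of the first non-zero digit
    significand = number.scaleb(-exponent)  # move the decimal point next to it
    return significand, exponent


print(to_scientific("0.004345"))  # (Decimal('4.345'), -3)
print(to_scientific("123456"))    # (Decimal('1.23456'), 5)
```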
00:08:33
So here in the scientific notation, we are normalizing the numbers so that we have only
00:08:38
one significant digit before the decimal point.
00:08:41
And because of this normalization, it is possible to represent all the numbers in a uniform
00:08:46
fashion.
00:08:47
For example, if we take the case of this number, then the same number can also be represented
00:08:53
like this.
00:08:55
And of course, the value of the number will still remain the same.
00:08:58
But as you can see, all these representations are different.
00:09:02
So that is why it is good to have a uniform representation for each number.
00:09:07
So in general, we can say that in the scientific notation, this is how the decimal number is
00:09:12
represented, where this D represents the decimal digit.
00:09:16
So similarly, this floating point representation is very similar.
00:09:21
And here, this B represents the binary digits.
00:09:24
So if you see this floating point representation, then it consists of three parts.
00:09:29
That is the sign, the fraction, and the exponent.
00:09:33
And here, the base of the exponent is equal to 2.
00:09:37
So in this representation also, the binary numbers are first normalized in this format.
00:09:43
So in the scientific notation, we have seen that we must have only one significant digit
00:09:47
before the decimal point.
00:09:49
Now in the case of binary, we have only two digits, that is 0 and 1.
00:09:55
And therefore in the binary, the only possible significant digit is equal to 1.
00:10:00
That means in this floating point representation, this significant digit just before the binary
00:10:04
point will always remain 1.
00:10:07
So we can say that, this is the general representation for the floating point number.
00:10:12
So now let's see, how to normalize any binary number and how to represent it in the floating
00:10:17
point representation.
00:10:20
So let's say, this is our binary number.
00:10:23
And we want to represent this number in the normalized form.
00:10:26
So for that, we need to shift this binary point in such a way that just before the binary
00:10:31
point, the significant digit is equal to 1.
00:10:35
That means here, we need to shift the binary point to the left by 2 bits.
00:10:40
And that is why over here, this exponent is equal to 2.
00:10:44
That means whenever we shift the radix point to the left by a 1-bit position, then the
00:10:48
exponent will increase by 1.
00:10:50
So here, since the radix point is shifted to the left side by 2 bits, the exponent
00:10:55
will increase by 2.
00:10:57
So similar to the left-hand side, when the radix point is shifted to the right by a 1-bit
00:11:02
position, then the exponent will decrease by 1.
00:11:05
For example, if we have this number, then to represent it in a normalized form,
00:11:11
we need to shift the binary point to the right side by 2 bits.
00:11:15
And that is why here the exponent will decrease by 2.
00:11:18
Or in other words, here this exponent is equal to minus 2.
00:11:22
So these two representations are in the normalized form.
00:11:26
So in this way, we can normalize any binary number and we can represent it in the floating
00:11:31
point form.
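The normalization rule can be sketched in Python as follows. The function name `normalize_binary` and the two inputs 101.11 and 0.011 are assumptions for this illustration, since the numbers shown on screen are not part of the transcript. The exponent is simply the place value of the leading 1.

```python
def normalize_binary(text: str) -> tuple[str, int]:
    """Return (significand, exponent) with exactly one 1 before the binary point."""
    integer_part, _, fraction_part = text.partition(".")
    digits = integer_part + fraction_part
    if not digits or set(digits) - {"0", "1"}:
        raise ValueError("expected a binary number such as 101.11")
    first_one = digits.find("1")
    if first_one == -1:
        raise ValueError("zero has no normalized form")
    # Left shifts of the binary point raise the exponent, right shifts lower it.
    exponent = len(integer_part) - 1 - first_one
    rest = digits[first_one + 1:].rstrip("0")
    return "1." + (rest or "0"), exponent


print(normalize_binary("101.11"))  # ('1.0111', 2)   point moved left by 2 bits
print(normalize_binary("0.011"))   # ('1.1', -2)     point moved right by 2 bits
```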
00:11:33
So now, let's see how this floating point number is actually stored in the memory.
00:11:38
So while storing, 1 bit is reserved for the sign.
00:11:42
That means while this number is stored, then the MSB will represent the sign bit.
00:11:47
So if this bit is 0, then it means that the number is positive.
00:11:52
And whenever this bit is equal to 1, then it indicates that the number is negative.
00:11:57
So after the sign bit, a few bits are reserved for storing the exponent value.
00:12:03
And then the remaining bits are reserved for storing this fractional part.
00:12:07
So now if you see this significand, then here the integer part of this significand will
00:12:12
always remain 1.
00:12:14
And therefore, this 1 is not stored, and instead only the fractional part is stored.
00:12:21
So this fractional part is also referred to as the mantissa or the fraction.
00:12:26
That means while storing this floating point number, we have 3 parts in total.
00:12:30
That is the sign, the exponent and the mantissa.
00:12:33
Now to store this floating point number, a certain standard has been defined.
00:12:38
Like how many bits will be reserved for the exponent as well as the mantissa part.
00:12:42
And similarly, how to store this mantissa as well as the exponent part.
00:12:46
Because this exponent part, if you see, can be positive or negative.
00:12:51
That means we need to decide how to store this exponent part.
00:12:55
So to store such numbers, a common standard has been defined.
00:12:59
And one such commonly used standard is IEEE 754.
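As a preview, the Python sketch below pulls the three stored parts out of a 32-bit IEEE 754 number using the standard `struct` module. The field widths used here (1 sign bit, 8 exponent bits, 23 fraction bits) and the exponent offset of 127 come from the single precision format of the standard, not from this video. The function name `float32_fields` is hypothetical.

```python
import struct


def float32_fields(value: float) -> tuple[int, int, int]:
    """Split a number, stored as a 32-bit float, into (sign, exponent, fraction)."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31                # 1 bit: 0 is positive, 1 is negative
    exponent = (bits >> 23) & 0xFF   # 8 bits, stored with an offset of 127
    fraction = bits & 0x7FFFFF       # 23 bits, the leading 1 is not stored
    return sign, exponent, fraction


# -11.75 is -1011.11 in binary, that is -1.01111 x 2^3 in the normalized form.
sign, exponent, fraction = float32_fields(-11.75)
print(sign)                # 1
print(exponent - 127)      # 3
print(f"{fraction:023b}")  # 01111000000000000000000
```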
00:13:03
So in the next video, we will see the format of this IEEE standard and we will understand
00:13:08
how floating point numbers are stored as per this standard.
00:13:13
But I hope in this video, you understood the difference between the fixed point numbers
00:13:16
and the floating point numbers.
00:13:19
And how, using this floating point representation, it is possible to represent very large
00:13:23
numbers or very small numbers with good precision.
00:13:27
So if you have any question or suggestion, then do let me know here in the comment section
00:13:31
below.
00:13:32
If you like this video, hit the like button and subscribe to the channel for more such
00:13:36
videos.