Questions tagged [floating-point]

Mathematical questions concerning floating-point numbers, a finite approximation of the real numbers used in computing.

465 questions
0
votes
1 answer

Max Mantissa $2^{bits}-1$

If we look at a $5$-bit mantissa, the max value will be $11111_2$, which is $2^5-1$. Why does it have the form $2^{bits}-1$? Is there a combinatorial explanation?
newhere
  • 3,115
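
A small Python sketch of one way to see the pattern (the helper name `max_mantissa` is made up for illustration): an all-ones field of $n$ bits is the geometric sum $2^0 + 2^1 + \dots + 2^{n-1}$, and it is also one less than the number of distinct $n$-bit patterns, $2^n$, because counting starts at zero.

```python
def max_mantissa(bits: int) -> int:
    """Largest unsigned integer that fits in a field of `bits` bits (all ones)."""
    return int("1" * bits, 2)

# The all-ones pattern equals the geometric sum 2^0 + ... + 2^(n-1) = 2^n - 1
for n in range(1, 11):
    assert max_mantissa(n) == sum(2**k for k in range(n)) == 2**n - 1

print(max_mantissa(5))  # 31
```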
0
votes
0 answers

How many discrete points can be expressed in $[-1/2, 1/2)$ with IEEE 754 double floats and what is the meaning of precision?

I know that IEEE 754 double floats (64-bit floating-point numbers) are known to provide 52 bits of precision (or 53 bits including the implicit 1). But I do not know the exact meaning of the precision. Suppose we want to approximate a rational number $v$ using…
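
A rough way to count the points, sketched in Python under the assumption that $+0$ and $-0$ count as one point and that subnormals are included. It relies on the fact that, for non-negative doubles, the raw bit patterns read as integers are ordered the same way as the values. The helper names are made up for illustration.

```python
import struct

def bits_of(x: float) -> int:
    """Raw 64-bit pattern of a double, read as an unsigned integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def count_doubles_in_interval() -> int:
    """Count distinct real values representable as doubles in [-1/2, 1/2)."""
    b = bits_of(0.5)   # patterns 0 .. b-1 are exactly the doubles in [0, 1/2)
    nonnegative = b    # +0.0 up to the double just below 1/2
    negative = b       # magnitudes with patterns 1 .. b: just below 0 down to -1/2
    return nonnegative + negative  # -0.0 is not counted separately from +0.0

count = count_doubles_in_interval()
assert count == 1022 * 2**53   # exponent field of 0.5 is 1022
print(count)
```

The points are far from evenly spaced: each binade $[2^{-k-1}, 2^{-k})$ holds the same $2^{52}$ values, so they cluster toward zero.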
0
votes
1 answer

Floating-point arithmetic and loss of precision: Shifting mantissa until exponents match

My book says the following about floating-point arithmetic involving the addition/subtraction of two numbers, $x$ and $y$, that differ in their exponent: In adding or subtracting two floating-point numbers, their exponents must match before their…
Aleksandr Hovhannisyan
  • 2,983
  • 4
  • 34
  • 59
0
votes
1 answer

What is the purpose of regime bits in posit encoding?

Why do we need regime bits in posit? posit encoding:
kevin998x
  • 123
0
votes
0 answers

Why is the cardinality not the same for rounded floats?

I need to solve a question: Two float values are equivalent if they return the same integer with Math.round(). Why do the equivalence classes arising from this equivalence relation not have the same cardinality? I was thinking that cardinality…
0
votes
1 answer

How to calculate a floating point of negative number?

In IEEE double precision $n=53$. So to represent 16 I can do the following: the next biggest number after $16$ is $+(.10 \dots 01)_2 \cdot 2^5 = 2^{-1} \cdot 2^5 + 2^{-53} \cdot 2^5 = 16+2^{-48}$. Now the next biggest number after $-16$ is $-16+2^{-49}$, but how to show this formally like I did…
ASROMA
  • 579
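
The two neighbors quoted in the excerpt can be checked numerically. This is only a sanity check in Python (it needs Python 3.9+ for `math.nextafter`), not a formal derivation: the gap above $16$ is $2^{-48}$, while the gap above $-16$ is $2^{-49}$ because the neighbor's magnitude falls into the smaller binade $[8, 16)$.

```python
import math

# Next representable double in the direction of +infinity
above_16 = math.nextafter(16.0, math.inf)
above_minus_16 = math.nextafter(-16.0, math.inf)

assert above_16 == 16 + 2**-48
assert above_minus_16 == -16 + 2**-49   # magnitude 16 - 2^-49 lies in [8, 16)
print(above_16 - 16.0, above_minus_16 + 16.0)
```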
0
votes
1 answer

Conversion of floating point to an 8-bit binary word

What is the representation of $-(52.625)_{10}$ in an 8-bit word? In an 8-bit word, the first bit represents the sign of the number, the next three bits represent the exponent, and the last 4 bits the mantissa. Now since there are no bits to represent the sign…
shadow kh
  • 953
  • 5
  • 15
0
votes
2 answers

How to convert 601.0 to IEEE-754 Single Precision

I am trying to understand how to convert from decimal to the IEEE-754 single-precision binary representation. I made up a random number, which happens to be 601.00. I tried my best to figure it out and this is what I got: Step 01: I divided 601 by 9 (since…
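
For comparison, the expected encoding can be obtained mechanically. The sketch below uses Python's `struct` module to split the single-precision encoding into its three fields; the helper name `float32_fields` is made up for illustration. It rests on $601 = 1001011001_2 = 1.001011001_2 \times 2^9$.

```python
import struct

def float32_fields(x: float) -> tuple:
    """Split the IEEE 754 single-precision encoding of x into (sign, biased exponent, fraction)."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return raw >> 31, (raw >> 23) & 0xFF, raw & 0x7FFFFF

sign, exponent, fraction = float32_fields(601.0)
assert sign == 0
assert exponent == 9 + 127              # true exponent 9 plus the bias 127
assert fraction == 0b001011001 << 14    # bits after the leading 1, padded to 23 bits
print(f"{sign} {exponent:08b} {fraction:023b}")
```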
0
votes
0 answers

Calculate the largest and smallest number given exp, bias, and fract

Given 8 bits total, where 3 bits are exp bits, 4 bits are frac bits, and 1 is the sign bit, I have to find the largest and smallest values. 0 110 1111 - largest; 1 110 1111 - smallest. 1) E = exponent - Bias = 6 - 3 = 3 2) M = 1 + f = 1 + 1/2 + 1/4 + 1/8 +…
Inferno
  • 11
0
votes
1 answer

Closest number to 1.22

Given: 1 bit for sign, 3 bits for exp, 4 bits for fract. How to find the closest floating-point number to 1.22?
Inferno
  • 11
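
With so few bits, brute force is a reasonable check. The Python sketch below makes several assumptions the excerpt does not state: an IEEE-style layout with bias 3, an implicit leading 1 for normal numbers, exponent field 0 for subnormals, and exponent field 7 reserved for infinities and NaN. The function name `decode` is made up for illustration.

```python
def decode(exp_field: int, frac_field: int, bias: int = 3) -> float:
    """Value of a positive number in the assumed 1-3-4 toy format."""
    if exp_field == 0:  # subnormal: no implicit leading 1
        return (frac_field / 16) * 2.0 ** (1 - bias)
    return (1 + frac_field / 16) * 2.0 ** (exp_field - bias)

# Exponent field 7 is assumed reserved (infinity/NaN), so only 0..6 are enumerated
candidates = [(decode(e, f), e, f) for e in range(7) for f in range(16)]
value, e, f = min(candidates, key=lambda c: abs(c[0] - 1.22))

assert value == 1.25   # neighbors are 1.1875 and 1.25; 1.25 is closer to 1.22
print(f"0 {e:03b} {f:04b} -> {value}")
```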
0
votes
0 answers

Is assigning a single-precision IEEE 754 float to a double reversible?

Within the scope of the IEEE 754 standard, let's assign a single-precision variable s to a double-precision variable d and then assign d to a single-precision variable s'. Is this operation reversible (lossless) for any value that can be represented in…
Vlad
  • 129
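
An empirical check, sketched in Python: `struct` unpacks a single-precision bit pattern into a Python float (which is a double) and packs it back. The sample below leaves out NaN patterns on purpose, since whether NaN payloads survive is a separate question; the helper names are made up for illustration. A passing sample is evidence, not a proof.

```python
import random
import struct

def roundtrip_bits(raw: int) -> int:
    """single bit pattern -> Python float (a double) -> single bit pattern."""
    (as_double,) = struct.unpack(">f", struct.pack(">I", raw))
    (back,) = struct.unpack(">I", struct.pack(">f", as_double))
    return back

def is_nan_pattern(raw: int) -> bool:
    """True when the exponent field is all ones and the fraction is nonzero."""
    return (raw >> 23) & 0xFF == 0xFF and raw & 0x7FFFFF != 0

random.seed(0)
samples = [random.getrandbits(32) for _ in range(100_000)]
# Edge cases: +0, -0, smallest subnormal, largest finite, +inf, -inf
samples += [0x00000000, 0x80000000, 0x00000001, 0x7F7FFFFF, 0x7F800000, 0xFF800000]

assert all(roundtrip_bits(r) == r for r in samples if not is_nan_pattern(r))
```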
0
votes
2 answers

How to convert 16.75 to floating-point representation?

Convert 16.75 to base-2 floating-point representation. Need help with the formula. Thanks.
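
The arithmetic behind the conversion can be checked in a few lines of Python: $16.75 = 10000.11_2$, and shifting the binary point four places gives the normalized form $1.000011_2 \times 2^4$. The output format of `float.hex` (hexadecimal significand, binary exponent) is Python's own notation, used here only as a cross-check.

```python
x = 16.75

# 16 = 10000 in binary and 0.75 = 0.11 in binary, so 16.75 = 10000.11
assert x == 0b1000011 / 2**2

# Normalized form: 1.000011 x 2^4
assert x == (1 + 0b000011 / 2**6) * 2**4

print(x.hex())  # 0x1.0c00000000000p+4
```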
0
votes
1 answer

Can anyone please explain how 2.8 modulo 2 is 0.7999999999999998?

I am not a mathematician, and just started programming in JavaScript and wonder how 2.8 % 2 = 0.7999999999999998. Note: I know it is the remainder operation. Maybe I forgot my school mathematics concepts. Can anyone please explain this? Thanks.
jsingh
  • 31
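
The effect is not specific to JavaScript. The sketch below uses Python, whose floats are the same IEEE 754 doubles, and `fractions.Fraction` to expose the exact stored value: the double nearest to 2.8 is slightly below 2.8, and the remainder operation then returns that stored value minus 2 without any further rounding.

```python
from fractions import Fraction

stored = Fraction(2.8)               # exact value of the double nearest to 2.8
assert stored < Fraction(28, 10)     # it lies slightly below the decimal 2.8

remainder = 2.8 % 2
assert Fraction(remainder) == stored - 2   # the remainder is computed exactly
assert remainder != 0.8                    # 0.8 rounds to a different double

print(remainder)  # 0.7999999999999998
```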
0
votes
1 answer

biased exponent vs unbiased exponent

Following IEEE-754, I am looking for an example that shows processing an unbiased floating point representation is harder than processing a biased one. All I see in the texts is that unbiased numbers have to be compared with a negative system…
mahmood
  • 223
0
votes
1 answer

Precision and Accuracy

How would I go about calculating the precision and accuracy of a given number? For example 0.05 has an accuracy of 2 and a precision of 3. 1 has an accuracy of 0 and a precision of 1. Is there an algorithm for calculating this?
cxdf
  • 111