Quantcast
Channel: Simple Talk
Viewing all articles
Browse latest Browse all 182

Rounding Numbers

$
0
0

I would hope by now everybody has figured out when you see a price like $39.99 that it might as well be forty bucks. Dropping that penny from the price is done is to fool the consumer! Since we read numbers left to right, the merchant hopes you will see the 3 and assume the price is emotionally less than 40. This is the concept of “significant digits” which figures in the rounding rules.

This concept of “lying to yourself” can be formalized with different functions and can turn out to be useful in the real world. I am going to be somewhat informal in this discussion, so bear with me. Let’s start off with the two functions that you see in programming languages and everyday arithmetic.

FLOOR() and CEILING() Functions

The FLOOR(x) function is easy to define informally. Cut off the fraction and simply return the integer part of a real number. This is what you were doing when you Ignored the pennies in those prices I just talked about. There is actually more to worry about than just this simple heuristic.

When you took freshman calculus, those functions of one variable that you studied were nice and smooth. However, when you graph out FLOOR(x), You get a “staircase” of half open intervals, with one “step” for each integer. The domain Is all the real numbers, and the range is all the integers. But this function is not continuous; think about taking the first derivative and you have problems. Likewise, if you try to take the integral, there can be some problems with the half open intervals. Do not worry about this, unless you’re in a calculus class.

When the parameter is a positive real number or zero, such as FLOOR(21.53) = 21, this is natural and easy. But when the parameter is negative, such as FLOOR(-21.53) = -22, you must remember that negative numbers get smaller as you move away from zero. The CEILING() function returns the smallest integer value which is greater than or equal to it’s parameter. This means when the parameter is positive, CEILING(21.53); = 22 and when the parameter is negative, CEILING(-21.53) = -21.

You can write the FLOOR(x) or the CEILING(x) with an expression in the parameter using the other function. The CEILING(x) function returns the next highest integer. You probably won’t see CEILING(x) used too often because it’s generally easier to write it in terms of FLOOR(x+1).

TRUNC function

Truncation is another function related to FLOOR and CEILING and that is the truncation function TRUNC(x,d). The name pretty well explains itself. There’s no attempt to change a digit on the cusp. This function will simply chop off the digits below a certain degree of precision (which is the second parameter).

SELECT FLOOR(54.4) AS Floor, CEILING(54.4) AS Ceiling, 
       TRUNC(54.9,0) AS Trunc;
Floor   Ceiling     Trunc
------- ----------- -------------
54      55          54.0

And for the negative values:

SELECT FLOOR(-54.4) AS Floor, CEILING(-54.4) AS Ceiling, 
       TRUNC(-54.9,0,1) AS Trunc;
Floor   Ceiling     Trunc
------- ----------- -------------
-55     -54         -54.0

Rounding

All these functions just discussed are useful, but there are situations where reducing a real number to an integer is not useful. This is where we get into what people usually think of rounding to be. We talk about rounding to (n) decimal places. In fact, this is a parameter in the ROUND() function in SQL Server and most other SQL products so you can decide if you want to round to the nearest penny, nearest dollar, or whatever unit your measurement is in.

Decades ago, before we had calculators and had take square roots by hand, we were taught that when we wanted to round up a number to (n) decimal places, the rule was to look at the (n+1) decimal place. If position(n+1) < 5, then it was rounded down; if position(n+1) > 5 then increment that digit, rounding up. Thus 3.141 became 3.14 and 3.147 became 3.15. The problem was with 3.145 because we didn’t have a clear rule for position(n+1) = 5 digit.

Of course, typically you will round up the digit with the value of 5, but it is important to understand the rule because it can be different.

Scientific Rounding

The rule I just gave was known as scientific rounding back in the dark ages of public school education. The usual correction given for the cusp case of five was if position(n+1) >= 5 then increment. This was a “benefit of the doubt.” sort of rule.

It typically worked well in situations where the value being rounded was a value in a relatively small set, usually one data point. Since this is how a scientific experiment would behave, we call it scientific grounding.

In SQL Server, we can use the FLOOR() function to round down a decimal number to the nearest integer. For example, if you have a DECIMAL (2,1) number (3.8), the FLOOR function will round it down to 3. In regular arithmetic, this is just fine, but SQL is a strongly typed language; we really wanted (3.0) to match the DECIMAL data type used in the DDL. The trailing zeros tell us to what number of decimal places the original calculations were done. In some situations, you might also consider displaying leading zeros to show the source data type.

Commercial Rounding

In the old days, when we were given a long list of values and got a five, we wanted to round up or down In such a way as to average out the errors, so that the totals would be more accurate. It’s called commercial rounding because it was first used in banks and financial computations.

The rules were the same as the scientific rounding for digits not equal to five. When you got to five you were to go over to the position (n+2) digit and see if it was odd or even To determine whether you’re going to round up or down. If this position has another five, then go to the position (n+3) digit and repeat the process until you got an answer or ran out of digits. The idea was that as you got to smaller and smaller decimal fractions, the digits would tend to be random. This randomness would smooth out total errors.

In fact, there’s a piece of IT folklore that crops up every now and then about an embezzler who saves the rounding errors until they accumulate enough to be worth cashing out. The version of this folklore that I learned was a story about a bank that had promotional campaign. They wanted to look at their customers “from A to Z” in an ad campaign, so they pulled off the customers with names at both ends of the alphabet. The first customer was named something like ”Aaron” and the last customer was “Zwabo”; this last customer did not really exist. It was a dummy account set up by the embezzler where all the rounding errors were accumulated.

Statistical Rounding

If you know something about the population, Assuming that the distribution of odd and even digits after position(n) would be evenly distributed is a leap of faith. If we know something about our population, then we can determine if the data is skewed. For example, if I am measuring liquids, I would expect the total number of liters pumped to customers to be less than the total number of liters In the original storage tank. Between evaporation, spillage and cumulative rounding error at the pump, I should see a slight loss.

The systematic sort of error occurs in many engineering applications. To give you another piece of folklore, when Rolls-Royce decided to use automatic transmissions in their automobiles, they bought patents rights from an American motor company. Why reinvent the wheel?

Unfortunately, they built these transmissions to Rolls-Royce tolerances and precision. The gears meshed so precisely that transmission fluid. was squeezed out. The transmissions did not work. Another example is that you can get a handgun in which the parts fit together so precisely that the gun has a kick that make it almost impossible to aim after the first shot. This is real mess for quality control people; they can check the individual parts and find that they are just what they’re supposed to be, but the machinery as an assembled whole does not work.

The ROUND() Function

As to have parameters for the value you want to round and how much rounding you want to do. The syntax for the ROUND function in Oracle, MySQL and PostgreSQL is:

ROUND (<input>, <decimal places>)

The purpose of the ROUND() function is to take a number and round it to the specified number of decimal places to either side of the decimal point. ROUND can also be used with DATE input values in some products.

As a bit of an aside, in Oracle, you can supply a DATE value to this function, as well as the format, and it will round the date to that format. While it is possible, I strongly recommend against it. It’s not standard and it’s rather confusing. Mixing temporal and numeric values makes no sense. What do you think the square root of Ramadan should be? Christmas or Thanksgiving?

In ANSI Standard SQL, I can use the EXTRACT (<field name> FROM <temporal value>) function, where <field name> is YEAR, MONTH, DAY, HOUR, MINUTE, SECOND or an interval specification, defined by concatenation of the appropriate temporal field names. Let’s not worry about that right now. }}

Oracle presents some particular problems. Their data storage model is based on a COBOL PICTURE approach to data. That is, they see numbers as strings of digits and not an internal binary representation. This is why they can talk about rounding to the left or right of the decimal point. It is also why they can mix temporal and numeric data in the inputs.

Instead of following the ISO 8601 standards for temporal data, they use a proprietary display format; This means that ’2024-01-01’ Is stored as ‘01/JAN/24’, so it is mess to sort and depends on English to read. Don’t feel too smug SQL Server users; you have all kinds of options with the non-standard CONVERT() function to screw up your dates.

Oracle also has some weird rules for their BINARY_FLOAT and BINARY_DOUBLE data type.

You can use a negative value for the <decimal places> to get rid of the fractional part, the number and the lower digits. For example, to round number to the hundreds:

SELECT ROUND(12345.67, -2);

Will return 12300.00

To round to the nearest 5, you use the ROUND function and divide your number by 5, then multiply it by 5 . For example:

SELECT ROUND(3162.845 / 5, 0) * 5;

Returns 3165, This expression is equal to the following steps: 3162.845 / 5.0 = 632.569; ROUND(632.569,0) = 633; 633 * 5 = 3165;

The action of dividing, rounding, and then multiplying causes the result to be bumped to a multiple of five. Play with this a bit and you can find how to do this for any value not just five.

Floating-Point Math

When I was learning to write FORTRAN, we had to spend two weeks learning how to handle floating-point arithmetic. Since all these values are approximate, the floating-point implementation or hardware is constantly doing rounding under the covers. Unfortunately, this leads to some problems. It means that ((a + b) + c) will not always equal (a + (b + c)) in even simple arithmetic computations. It also means that a test for equality can succeed when it’s not an exact equality. In floating-point math two values can be said to be equal, if they differ by less than a certain valued, called epsilon.

In Standard SQL, we have REAL, FLOAT(p) and DOUBLE PRECISION data types. A lot of programmers today have never had to work with floating-point. Originally, there was a subtle difference between FLOAT and REAL; At that time, floating-point numbers could be based on a decimal or a binary model for the exponent and mantessa. Every vendor had slightly different implementations. This disappeared with IEEE Standard 754, which uses 32 bits for single precision and 64 bits for double precision.

But wait, it gets worse! The IEEE model of arithmetic also includes the concept of a NaN (not a number). These are bit configurations that hold special symbols. In fact, the IEEE standard even defines rules for floating-point arithmetic with NaNs,. Given the problems that people have with NULL in SQL, it’s probably not worth it to add more complexity to the language.

The two interesting NaNs for us are -∞ (negative infinity) and +∞ (positive infinity), since they could potentially be stored in data. The IEEE Standards set them up as a way to handle overflow problems.

During the last few decades, software has evolved to handle larger and larger numbers precisely in INTEGER and DECIMAL data types. Since most databases are commercial and not scientific, there’s little need for floating-point math in the real world. In the last 20 years or so, the only time I have seen someone actually declare a column to be of type REAL, it was a serious design error. They were using it to store currency amounts or measurements that should have been in DECIMAL. Given the programmer the benefit of the doubt, I will assume they are old Fortran programmers who simply collapsed back into the keywords they remembered from their original language.

Conclusion:

If you need to do rounding, read the manual for your particular SQL product and maybe take a moment or two to make sure if you are rounding numbers correctly based on your clients needs. It may not be as easy as you think. Make sure that you test for positive, zero and negative numbers as needed, too.

The post Rounding Numbers appeared first on Simple Talk.


Viewing all articles
Browse latest Browse all 182

Trending Articles