Displaying all posts in the series entitled Mathematics.

Calendrical Calculations, Part 2: Mod Math

— Arvada, Colorado UNITED STATES

I'm hungry. Now, I have one of those microwavable Asian meals; all I need to do is add water and put it in the microwave. The instructions are telling me to heat it up for ninety seconds. Okay, but my microwave won't accept ninety seconds. Instead, I have to express the value in components of minutes and seconds. Now, the answer is 1:30—I know this, and I didn't need a calculator to figure this out… but how would we program a computer to calculate this for us?

These types of problems come up frequently when working with calendars, so I want to talk about this before we delve further, because different computer languages do different things and even come to different conclusions when you ask them to perform the same calculation.

When I first learned long division back in third grade, I was taught to compute remainders. For example, if I was asked to calculate nine divided by four, I would have told you that it was two with a remainder of one. The next year, in fourth grade, I was taught how to divide using decimals. Now, nine divided by four became 2.25.

In some computer languages like JavaScript or PHP, if I program it to divide two numbers, I will get the fourth-grade answer—that is, a floating-point number as a result. In JavaScript, console.log(90 / 60) gives me 1.5. Simple.

However, when working with calendars, it's typically the third-grade answer that we want. This is called integer division. Other computer languages like C, C++, Java and C# will do integer division by default if we're dividing two integers. For example, in Java, evaluating System.out.println(90 / 60); will give a result of 1 and not 1.5 as we may expect. What about the remainder then? We would have to do something called a modulo operation to obtain that. In Java, the modulo operation would look like this: System.out.println(90 % 60);. That percent sign is essentially saying, “Divide 90 by 60, but return the remainder instead of the quotient”. The result that we get from that is 30. Combining those two results, we just split 90 seconds into its components of 1 minute and 30 seconds.
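Here's a minimal sketch of the microwave problem in JavaScript. The function name splitSeconds is made up for illustration, and the sketch assumes a non-negative number of seconds. Since JavaScript's / gives the fourth-grade answer, Math.floor stands in for integer division here:

```javascript
// Split a non-negative number of seconds into whole minutes and leftover seconds.
// splitSeconds is a hypothetical helper, not part of any library.
function splitSeconds(totalSeconds) {
	const minutes = Math.floor(totalSeconds / 60); // the "third-grade" quotient
	const seconds = totalSeconds % 60;             // the remainder
	return { minutes, seconds };
}

const time = splitSeconds(90);
console.log(`${time.minutes}:${String(time.seconds).padStart(2, "0")}`); // prints "1:30"
```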

Integer division and modulo are two operations that have a special mathematical relationship. The general rule is this:

$$\text{dividend} \bmod \text{divisor} = \text{dividend} - \text{divisor} \times \operatorname{round}\left(\frac{\text{dividend}}{\text{divisor}}\right)$$

Most computer languages follow this rule. If that's the case, how then can computer languages come to different results? Well, you'll notice that there's a round function in that formula. While each computer language may follow this general rule, computer languages do not necessarily round using the same method.

So let's talk about rounding now. In mathematics, there are several different ways to round numbers. We're going to focus on two: the floor function and the ceiling function. You'll typically find these in any programming language's standard library.

The ceiling function (sometimes spelled ceil) will round the parameter to the closest integer that is greater than or equal to the parameter. We don't use it that much for calendrical calculations. The floor function, on the other hand, gets used a great deal. It does the opposite of ceiling: it rounds the parameter to the closest integer that is less than or equal to the parameter. When we do integer division and modulo, we almost always want to use the floor function as our rounding method.

That is not what we get with C, C++, Java, C#, JavaScript or PHP. When these languages perform integer division, they simply get rid of the digits on the right side of the decimal point of the result. This is called truncation. Now, if you think about it, truncation sounds like it's the same as the floor function. That's because it is… for positive numbers! As soon as negative numbers get thrown into the mix, we end up with results consistent with performing the ceiling function. That's bad!

The consequence for modulo arithmetic is this: if we use the floor function when doing integer division, the result of the modulo operation will always have the same sign as the divisor. However, if we truncate the result of the integer division instead, the result of the modulo operation will always have the same sign as the dividend.
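To see the two rounding methods side by side, here's a small JavaScript sketch of the general rule with the rounding function passed in as a parameter. The name modWith is hypothetical and only there for illustration. (A result of zero has no sign to speak of, so the pairs below all leave a remainder.)

```javascript
// The general rule: dividend - divisor * round(dividend / divisor),
// where "round" is whichever rounding method we choose to plug in.
function modWith(round, dividend, divisor) {
	return dividend - divisor * round(dividend / divisor);
}

const pairs = [[9, 4], [-9, 4], [9, -4], [-9, -4]];
for (const [dividend, divisor] of pairs) {
	const floored = modWith(Math.floor, dividend, divisor);   // sign follows the divisor
	const truncated = modWith(Math.trunc, dividend, divisor); // sign follows the dividend
	console.log(dividend, divisor, floored, truncated);
}
// prints:
//  9  4  1  1
// -9  4  3 -1
//  9 -4 -3  1
// -9 -4 -1 -1
```

The truncated column matches what JavaScript's own % operator returns for the same pairs.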

Here's an example. Last week, I talked about representing dates as linear numbers. I put all of my family members' birthdates into a table with the linear-date values. One benefit of representing dates like this is that it's easy to find out what day of the week a certain date is. You just divide the number by seven and take the remainder. The remainder will correspond to a day of the week. What happens if we do that to my family members' birthdates?

Name       Ordinal Date  Ordinal Date mod 7  Day of the Week
Lance      20368         5                   Thursday
Lucy       20559         0                   Saturday
Samuel     29728         6                   Friday
Helena     30227         1                   Sunday
Nicholas   30855         6                   Friday
Susan      31069         3                   Tuesday
Daniel     31475         3                   Tuesday
Juliana    37382         2                   Monday
Ana        39348         1                   Sunday
Amy        40466         6                   Friday
Catherine  41169         2                   Monday
Anastasia  41363         0                   Saturday
Emily      41731         4                   Wednesday
Patricia   42152         5                   Thursday
Elisa      42155         1                   Sunday

It worked great. Now suppose that I want to add my great-grandmother to the list. She was born 1893 February 13. In the number system that we're using, that translates to -2511. That's a negative number, but the beauty of representing dates on a linear scale is that negative numbers aren't a problem. So what day of the week was my great-grandmother born on? Well, if we do the modulo operation in a computer language that uses the floor method like Ruby, Lua, Perl or Python, we get 2 which corresponds to Monday. However, what happens when we try that modulo operation in computer languages that use the truncate method like C, Java or PHP? Well, the dividend is negative and, in these programming languages, the result of the modulo operation has to have the same sign as the dividend, so we end up with… -5. Positive dividends can have positive results and negative dividends can have negative results. Since both positive and negative dividends are possible, the unfortunate consequence of this is that we need to program our code to handle negative and non-negative results.
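Here's a sketch of that day-of-the-week lookup in JavaScript, whose % operator uses the truncate method. The names floorMod, dayOfWeek and DAY_NAMES are made up for illustration; the remainder-to-day mapping is read straight off the table above (0 is Saturday, 1 is Sunday and so on).

```javascript
// Remainder-to-day mapping taken from the table above.
const DAY_NAMES = ["Saturday", "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday"];

// Modulo using the floor method (hypothetical helper).
function floorMod(x, y) {
	return x - y * Math.floor(x / y);
}

// Look up the day of the week for a linear date (hypothetical helper).
function dayOfWeek(linearDate) {
	return DAY_NAMES[floorMod(linearDate, 7)];
}

console.log(-2511 % 7);           // -5 (truncate method: no such index in DAY_NAMES)
console.log(floorMod(-2511, 7));  // 2
console.log(dayOfWeek(-2511));    // Monday
console.log(dayOfWeek(20368));    // Thursday
```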

With calendars, we consistently want to use the floor method, so how do we deal with these problems?

One way is to use a computer language that has floor division right out of the box. Guido van Rossum, the creator of Python, once wrote a blog post about Python's use of the floor method instead of the truncate method. He made the right choice in my opinion.

Some computer languages give you options. For example, while Java's / and % operators use the truncate method, Java now has floorDiv and floorMod functions that you can use when you need the floor method. Both Haskell and Lisp have a rem function that uses the truncate method and a mod function that uses the floor method.

Sometimes, we don't have a choice in which programming language we use. In those instances, one thing that can help is to rewrite code to avoid negative numbers. Obviously, that isn't always an option.

The method that I see most often in production code is to check if the result of the modulo operation is negative and, if it is, add the divisor to it. Here's a C function that does just that.

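int Modulo(int X, int Y) {
	int Result;
	
	/* % truncates in C, so Result takes the sign of X. */
	Result = X % Y;
	if (Result < 0) {
		/* Shift a negative remainder up into the range 0..Y-1 (assumes Y > 0). */
		Result += Y;
	}
	return Result;
}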

This works all of the time… unless your divisor is negative.
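One common way to cover negative divisors as well is to add the divisor unconditionally and then take the remainder a second time. Here's a sketch of that idea in JavaScript rather than C; the name floorModInt is hypothetical, and it assumes integer inputs and a non-zero divisor.

```javascript
// Floor-style modulo built only from the truncating % operator.
// The first % may land on the wrong side of zero; adding y and taking
// % again pulls the result onto the same side as the divisor.
function floorModInt(x, y) {
	return ((x % y) + y) % y;
}

console.log(floorModInt(-2511, 7)); // 2
console.log(floorModInt(9, -4));    // -3
console.log(floorModInt(-9, -4));   // -1
```

One wart to be aware of: in JavaScript, a zero result with a negative divisor comes back as negative zero, which still compares equal to 0.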

If you're using computer languages that don't have integer division (like PHP and JavaScript), there's a more straightforward solution.

function Modulo(X, Y) {
	return X - Y * Math.floor(X / Y);
}

The advantage with this method is that it works even when the divisor is negative.

Now that we have a proper understanding of the nuances of integer division and modulo operations in programming languages, we can move on to applying these in converting calendar dates to linear dates. More on that to come.

Tags: Calendars, Mathematics, Software Development


Calendrical Calculations, Part 1: Cyclical vs. Linear

— Arvada, Colorado UNITED STATES

By some accident of probability, I have exactly eight nieces and no nephews. The fact that they're all nieces isn't relevant to the story; it's just interesting. The fact that there are eight of them is relevant.

So, it's my job as the bachelor uncle to buy the cool toys for my nieces on their birthdays. Since there are eight of them, I'm going to need help remembering their birthdates. Yes, I know that they have calendars and apps for your phone that will tell you these sorts of things, but I want us to understand how those apps work.

First, I e-mail my sister in Mexico, Helena, and my sister-in-law across town, Susan, to get a list of all the birthdates in their families. I'm going to compile their responses into a table of some sort.

Amy        10/15/10
Ana        23.9.2007
Anastasia  30.3.2013
Catherine  9/17/12
Daniel     March 4, 1986
Elisa      31.5.2015
Emily      4/2/14
Helena     3.10.1982
Juliana    6.5.2002
Lance      October 6, 1955
Lucy       April 14, 1956
Nicholas   6/22/84
Patricia   5/28/15
Samuel     22.5.1981
Susan      1/22/85

Yuck. The dates don't look the same. Helena lives in Mexico, so, as one would expect, she provided the dates with the day of the month first, the month next and the year last. Susan, on the other hand, lives in the United States, so she gave me the dates in the American format: the month, followed by the day of the month and then the year. Helena used four-digit years while Susan gave me two-digit years. I myself entered a few birthdates for family members that I already knew, but I spelled out the months. When I got the responses, I just threw them into a table the same way that I got them. They need to be consistent.

Let's try putting them all in my format.

Amy        October 15, 2010
Ana        September 23, 2007
Anastasia  March 30, 2013
Catherine  September 17, 2012
Daniel     March 4, 1986
Elisa      May 31, 2015
Emily      April 2, 2014
Helena     October 3, 1982
Juliana    May 6, 2002
Lance      October 6, 1955
Lucy       April 14, 1956
Nicholas   June 22, 1984
Patricia   May 28, 2015
Samuel     May 22, 1981
Susan      January 22, 1985

Now, it's sorted alphabetically by name. Here's the thing: I need to know whose birthday is coming up next. If the list were in order by birthday, I might be able to figure this out more easily. Let's tell the computer to sort it that way.

Lucy       April 14, 1956
Emily      April 2, 2014
Susan      January 22, 1985
Nicholas   June 22, 1984
Anastasia  March 30, 2013
Daniel     March 4, 1986
Samuel     May 22, 1981
Patricia   May 28, 2015
Elisa      May 31, 2015
Juliana    May 6, 2002
Amy        October 15, 2010
Helena     October 3, 1982
Lance      October 6, 1955
Catherine  September 17, 2012
Ana        September 23, 2007

That didn't turn out the way that I expected. It sorted the list alphabetically by month. Also, it didn't even sort properly within the month. Look at the month of May: Samuel, Patricia and Elisa's birthdays seem to fall in order. However, Juliana's birthday is before any of theirs, but she's listed after. That's because ‘6’ comes after ‘2’ and ‘3’ when we sort alphabetically. We need to make these numbers somehow.
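The May problem is easy to reproduce. Here's a small JavaScript sketch using the four May birthdays: the default sort compares the text character by character, while the second sort pulls the day of the month out as a number first. The helper name dayOfMonth is made up for illustration.

```javascript
const mayBirthdays = ["May 6, 2002", "May 22, 1981", "May 31, 2015", "May 28, 2015"];

// Default sort: compares the strings character by character.
const asText = [...mayBirthdays].sort();
console.log(asText);
// [ 'May 22, 1981', 'May 28, 2015', 'May 31, 2015', 'May 6, 2002' ]

// Extract the day of the month as a number (hypothetical helper).
function dayOfMonth(text) {
	return parseInt(text.split(" ")[1], 10); // "6," parses as 6
}

// Numeric sort on the day of the month.
const asNumbers = [...mayBirthdays].sort((a, b) => dayOfMonth(a) - dayOfMonth(b));
console.log(asNumbers);
// [ 'May 6, 2002', 'May 22, 1981', 'May 28, 2015', 'May 31, 2015' ]
```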

At my work, when we enter in dates into the computer system, we enter a two-digit month, a two-digit day of the month and then a two-digit year. For example, today, 2018 September 16 would be entered in as “091618”. Let's try entering the dates with this method and then sorting on that.

Susan      012285
Daniel     030486
Anastasia  033013
Emily      040214
Lucy       041456
Juliana    050602
Samuel     052281
Patricia   052815
Elisa      053115
Nicholas   062284
Catherine  091712
Ana        092307
Helena     100382
Lance      100655
Amy        101510

Yes, this really is how we enter the dates into the computer at work, and it's a bad way to do it. At least now, they're in the order that I want. Also, Catherine's having a birthday tomorrow! I need to get her a gift tonight! Before I can do that, I need to know how old she's going to be. Of course, I could just look at the number and deduce that she'll be six, but I need to get the computer to figure that out. Actually, what would be helpful is to sort the list so that the oldest people are on top and the youngest are on the bottom. To do that, I need to change the date format again.

Lance      19551006
Lucy       19560414
Samuel     19810522
Helena     19821003
Nicholas   19840622
Susan      19850122
Daniel     19860304
Juliana    20020506
Ana        20070923
Amy        20101015
Catherine  20120917
Anastasia  20130330
Emily      20140402
Patricia   20150528
Elisa      20150531

Okay, we have a four-digit year, two-digit month and two-digit day of the month. This is still a bad way to do it in my opinion. However, there are computer systems that use this method. MySQL uses a similar method for storing its dates. Why is this bad then?

Let's take a moment to talk about time in the platonic sense. Not only do we perceive time in cycles, we order our lives around these cycles as well. Think of how often you wake up, attend religious ceremonies, get paid, pay the rent or the mortgage, pay taxes, vote or watch the Olympics. All of these events occur in cycles. The day is probably the most fundamental of these cycles, but even the day is broken up into smaller cycles. An analogue clock is the absolute perfect way to represent this. The instant that the second hand reaches the top of the clock, a new cycle of sixty seconds begins, and the number of these cycles that have passed is represented by the minute hand (and, similarly, the hour hand). Since we perceive time in cycles, it is natural that we represent time that way in our speech and our writing.

However, it is exactly deplorable to do mathematical calculations when time is represented in cycles. (It's possible… but it's also exactly deplorable.)

When dates get represented as numbers, they need to act like numbers. For example, if I'm travelling on the highway and if I'm at mile marker 269 and if I know my destination is at mile marker 116, I can easily subtract the two numbers to know that my destination is 153 miles away. This makes sense because… that's… just how numbers work!

Suppose that I want to know how much older Patricia is than Elisa. Using the numbers on our table, we can subtract Patricia's birthday (20150528) from Elisa's birthday (20150531). Doing the math tells us that Patricia is three days older than Elisa. Perfect, right?

No! Let's try that again, except, let's see how much older Lance is than Nicholas. By doing the same subtraction, we get 289,616 days—which is close to 793 years. I happen to know that Lance was twenty-eight when Nicholas was born. Why did the math work for Patricia and Elisa but not for Lance and Nicholas? It's because Patricia and Elisa were both born in the same month—May of 2015. Lance and Nicholas weren't. This system has gaps in it—literally. For instance, the difference between 20171231 and 20180101 is only one day. However, if we subtract the two numbers, we get 8,870 and not… one (you know, kind of like we should). Also, what about a date like 20171581? That's not even a date that ever existed, but what would a computer program even do if it was given that input?
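Here's a JavaScript sketch of the gap problem. It compares subtraction on the packed year-month-day numbers with subtraction on a simple count of days. The helper names packDate and linearDay are hypothetical, and the day count uses JavaScript's own epoch (1970 January 1, UTC) purely for illustration; any fixed starting point gives the same differences.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Pack a date into a single YYYYMMDD number, like the table above.
function packDate(year, month, day) {
	return year * 10000 + month * 100 + day;
}

// Whole days since 1970 January 1 (UTC), an arbitrary epoch for this sketch.
function linearDay(year, month, day) {
	return Date.UTC(year, month - 1, day) / MS_PER_DAY;
}

// New Year's Eve to New Year's Day
console.log(packDate(2018, 1, 1) - packDate(2017, 12, 31));   // 8870
console.log(linearDay(2018, 1, 1) - linearDay(2017, 12, 31)); // 1

// Lance to Nicholas
console.log(packDate(1984, 6, 22) - packDate(1955, 10, 6));   // 289616
console.log(linearDay(1984, 6, 22) - linearDay(1955, 10, 6)); // 10487
```

Dividing 10,487 days by 365.2425 gives about 28.7 years, which agrees with Lance being twenty-eight when Nicholas was born.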

We want the dates to work the same way that our mile markers worked. In order to get that to happen, we have to abandon notating our dates in cycles and put them on a linear scale like the mile markers. (I do have to point out that the mile-marker analogy breaks down as soon as we realize that the mile markers reset at the state line.)

This is the way that most software that calculates dates already works! If you're a spreadsheet person, try this: open up your spreadsheet software (Excel, LibreOffice or, if you don't have either of these, Google Sheets). Select a cell and type in today's date (or press Ctrl+;). Select another cell and type in your birthdate. Now select a third cell and enter a formula to subtract your birthdate from today's date. Once you press enter, the result that you should see is the number of days old you are. Divide that by 365.2425, and you should get your age in years.

This works because each cell in a spreadsheet has a value and a format. The format that you see is the date represented in the cycles that we humans are used to (i.e., years, months, days). However, the value that's in the cell is a number. If you want to see this number, select a cell and then change the number format for the cell to ‘General’ or ‘Automatic’. (This process is different in different spreadsheet programs.) It works the other way around, too. Type ‘37045’ into a cell. Now, change the number format of the cell to a date. You'll end up with 2001 June 3.
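The value-versus-format idea can be imitated in a few lines of JavaScript. This sketch assumes that day 0 of the spreadsheet is 1899 December 30, which is the default in LibreOffice and Google Sheets and which matches Excel for dates from 1900 March 1 onward. The names SHEET_EPOCH, serialToIsoDate and dateToSerial are made up for illustration.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumption: spreadsheet day 0 is 1899 December 30.
const SHEET_EPOCH = Date.UTC(1899, 11, 30);

// Value -> format: turn a serial number into a YYYY-MM-DD string.
function serialToIsoDate(serial) {
	return new Date(SHEET_EPOCH + serial * MS_PER_DAY).toISOString().slice(0, 10);
}

// Format -> value: turn a calendar date into its serial number.
function dateToSerial(year, month, day) {
	return (Date.UTC(year, month - 1, day) - SHEET_EPOCH) / MS_PER_DAY;
}

console.log(serialToIsoDate(37045));     // 2001-06-03
console.log(dateToSerial(2012, 9, 17));  // 41169
```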

If we take that table of dates from above and put it in a spreadsheet, we'd end up with this data:

Lance      20368
Lucy       20559
Samuel     29728
Helena     30227
Nicholas   30855
Susan      31069
Daniel     31475
Juliana    37382
Ana        39348
Amy        40466
Catherine  41169
Anastasia  41363
Emily      41731
Patricia   42152
Elisa      42155

While that's what the spreadsheet will actually store in the background, we can give the spreadsheet instructions to format the data however we want. It will convert the dates from these linear formats into the cyclical representations that we humans deal with. How those conversions are done is a topic for another time, but these dates on a linear scale are an improvement. Remember how I said that mathematical calculations are exactly deplorable with cyclical representations? Well, now, mathematical calculations are elegant and straightforward. Need to know what date it will be a hundred days from now? Tell the spreadsheet to add the days and let it do the math for you!

Tags: Calendars, Mathematics, Software Development


Alcohol by Volume vs. Alcohol by Weight

— Arvada, Colorado UNITED STATES

Let's talk about math. If math scares you, don't worry: we're really going to talk about beer.

Beer (or any alcoholic drink for that matter) is mostly just water with some ethyl alcohol thrown in for good measure and trace amounts of other stuff that provides the flavor. Typically, about 5% of a beer is alcohol, so a 12 fl oz serving of beer will have 0.6 fl oz of alcohol. Now, I said that a beer is typically 5% alcohol, but beer does come in many different strengths. For example, last night, at the Wynkoop Brewing Company, I had a beer that was 8.2% alcohol, and, because the content was so high, the bar gave me a “short pour” which means that they gave me a brandy snifter's worth of beer.

One common strength of beer that is seen especially in the retail area is 3.2% beer. In Colorado, where I live, beer can only be sold in a grocery store if its strength is no more than 3.2%. If you're fine with drinking 3.2% beer, you can pick up your watered-down, mass-marketed, American pale lager at the same time you buy your milk and eggs. If you prefer wine, whisky or beer that's stronger than 3.2%, you need to make a special trip to the liquor store.

So, if we do the simple math on this and divide 3.2 by 5, 3.2% beer has only 64% of the amount of alcohol that a normal 5% beer has. That math would be fine except that it's completely wrong.

Because of some legal or historical oddity that I think has to do with taxation laws (but I am not sure), when a beer advertises itself as 3.2%, what it means is 3.2% by weight, as in 3.2% of the total weight of the liquid is alcohol. Just about every other time you look at an alcoholic beverage label, the alcohol content is listed by volume… and they are most definitely not the same. Well, they are the same, but only in exactly two instances: 0% alcohol by weight is 0% by volume, and 100% by weight is also 100% by volume. However, between those two extremes, the relationship is not linear.

Most calculators that I come across on the Internet, however, do assume that the relationship is linear with the common assumption that the alcohol by weight (ABW) is 80% of the alcohol by volume (ABV). This may be close enough for the range of alcohol that covers most beers, but “close enough” only counts in horseshoes and hand grenades. We can do better.

So, let's think: how is alcohol by volume (or ABV) defined? Well, it's a ratio of the volume of alcohol divided by the total volume. Now, the total volume of an alcoholic beverage is alcohol and water… and some other stuff that provides the taste, but for our purposes, let's just ignore that other stuff—alcohol, water and nothing else. When we put it together, we get something like this:

$$\text{ABV} = \frac{\text{VolumeOfAlcohol}}{\text{VolumeOfAlcohol} + \text{VolumeOfWater}}$$

Okay, but we're dealing in weights, not volumes. Since a volume of something is equal to the weight of that something divided by its density, we can replace the volumes with weights, so let's consult Wikipedia (the most peer-reviewed publication on the planet). If we look up the appropriate articles, we'd find that ethyl alcohol (the type used in alcoholic beverages) has a density of 0.78945 kilograms per liter at 20°C (68°F). At that same temperature, water has a density of 0.9982336 kilograms per liter. For this exercise, let's round it to 0.99823 kilograms per liter. If we replace the volumes in our equation, we end up with:

$$\text{ABV} = \frac{\dfrac{\text{WeightOfAlcohol}}{0.78945\ \text{kg/l}}}{\dfrac{\text{WeightOfAlcohol}}{0.78945\ \text{kg/l}} + \dfrac{\text{WeightOfWater}}{0.99823\ \text{kg/l}}}$$

…and if we simplify that, do the dimensional analysis and eliminate the decimals, we get:

$$\text{ABV} = \frac{99823\,\text{WeightOfAlcohol}}{99823\,\text{WeightOfAlcohol} + 78945\,\text{WeightOfWater}}$$

Our units cancel out in the dimensional analysis, which, since the ABV is a ratio, makes sense. Now, remember when I said that we're dealing with weights? I lied. What we actually know is the alcohol by weight (ABW), and we know the ABW is the ratio of the weight of the alcohol to the total weight. This is very similar to the equation that we started with for the volumes:

$$\text{ABW} = \frac{\text{WeightOfAlcohol}}{\text{WeightOfAlcohol} + \text{WeightOfWater}}$$

Well, we could try to solve this all as a system of equations. Let's solve our ABW equation for either the weight of the alcohol or the weight of the water. (It won't matter which one in the end, but let's pick weight of water.) When we go through the steps to solve for the weight of water, we get:

$$\text{WeightOfWater} = \frac{\text{WeightOfAlcohol}\,(1 - \text{ABW})}{\text{ABW}}$$

Let's plug that into the ABV equation and simplify. If you've been following along on your TI-89 from high school, you should get:

$$\text{ABV} = \frac{99823\,\text{ABW}}{20878\,\text{ABW} + 78945}$$

…and there's our formula. What's interesting is that when we plug in the weight of water from the ABW equation, the weight of the alcohol gets canceled out. If we had plugged in the weight of alcohol instead, we would have gotten the same result.
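For anyone without a TI-89 handy, the substitution can be worked by hand. This is just one way to lay out the steps: replace the weight of water in the ABV equation, then multiply the top and bottom by ABW and divide them by the weight of the alcohol, which is where the weight of the alcohol drops out.

```latex
\begin{align*}
\text{ABV}
  &= \frac{99823\,\text{WeightOfAlcohol}}
          {99823\,\text{WeightOfAlcohol}
           + 78945\,\text{WeightOfAlcohol}\,\dfrac{1 - \text{ABW}}{\text{ABW}}} \\[1ex]
  &= \frac{99823\,\text{ABW}}{99823\,\text{ABW} + 78945\,(1 - \text{ABW})} \\[1ex]
  &= \frac{99823\,\text{ABW}}{20878\,\text{ABW} + 78945}
\end{align*}
```

The 20878 in the last line is simply 99823 minus 78945.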

If you need to convert the other way around (i.e., obtain the ABW from an ABV), you just need to solve the above formula for ABW:

$$\text{ABW} = \frac{78945\,\text{ABV}}{99823 - 20878\,\text{ABV}}$$

So, when we put 3.2% into the equation for the ABW, we come within a rounding error of 4.0%. Remember how we did the math and we said that 3.2% beer is 64% as strong as regular beer? Well, it turns out that, since 3.2% beer should really be marketed as 4.0% beer, it's really 80% as strong as regular beer. Keep that in mind before you consider chugging down another 3.2% bottle after you've already had a few: it may be more potent than you think.
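Here's a JavaScript sketch of both conversions, with the percentages written as fractions (3.2% is 0.032). The function names abvFromAbw and abwFromAbv are made up for illustration, and the constants come from the rounded densities used above, so the same simplifications apply: alcohol and water only, at 20°C.

```javascript
// Convert alcohol by weight to alcohol by volume (both as fractions, 0 to 1).
function abvFromAbw(abw) {
	return (99823 * abw) / (20878 * abw + 78945);
}

// Convert alcohol by volume back to alcohol by weight.
function abwFromAbv(abv) {
	return (78945 * abv) / (99823 - 20878 * abv);
}

console.log((abvFromAbw(0.032) * 100).toFixed(2)); // prints "4.01"
console.log(abvFromAbw(0));                        // 0
console.log(abvFromAbw(1));                        // 1
```

The two endpoints come out the same by weight and by volume, just as expected, and feeding the result of one function into the other gets back to the starting value (give or take floating-point rounding).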

Tags: Beer, Mathematics
