recent comments

recent articles

  • How long would it take to read Wikipedia?

    Almer S. Tigelaar 21 / 02 / 2012

    Wikipedia has become the de facto encyclopedia on the Internet. A traditional encyclopedia spans many textbook volumes which would take any normal person ages to read. Few people would likely engage in such an endeavor. However, since Wikipedia is readily accessible: should you take up the challenge?

    read more 0 comments
  • Life in a Day

    Almer S. Tigelaar 09 / 02 / 2012

    The premise behind the YouTube documentary “Life in a Day” is interesting: invite everyone around the world to shoot video on one specific day: July 24th 2010. Have people upload their raw footage and edit it so it becomes a short, ninety minute, documentary that chronicles a single day on our planet. Does this extreme form of crowdsourcing actually work?

    read more 0 comments
  • Top 8 Prejudices about Americans

    Almer S. Tigelaar 07 / 02 / 2012

    When travelling abroad it is difficult to go with an open mind. Despite our best efforts we bring with us an excess of prejudice shaped by our own culture and view of the destination country. So to it was for me when I visited the United States. When coming back, people at home are very insistent that you play into their prejudice regarding where you’ve been as well, perhaps as a means of reinforcing their own identity.

    read more 0 comments

Category: Puzzles

Puzzle #4: Working Days

Almer S. Tigelaar 06 / 10 / 2011, 09:00

Calculating the number of days between two arbitrary dates is doable for humans and trivial on modern computers. However, calculating the number of working days: Mondays through Fridays, is more complicated. In this puzzle you are asked to devise an algorithm for doing this. It is something you can do on the back of a napkin, so you don’t need anything besides pen, paper and the rational part of your brain.

Problem:
Given two dates d_{1}, d_{2} where d_{2}>d_{1}, what is the number of workdays w between these two dates? Assume that only Saturdays and Sundays are not workdays: holidays can be ignored.

You may use the following definitions augmented with any standard mathematical functions, like div and min.

  • d is a date
  • Subscripts are used to indicate different dates: d_{1},d_{2}\cdots d_{n}
  • d_{x}-d_{y} yields the number of days between two dates. This is negative if d_{x}<d_{y} and zero if the dates are identical
  • weekday\left(d\right) gives the day of the week for the given date as a number (1 = Monday, 2 = Tuesday, …, 7 = Sunday)
  • weeknumber\left(d\right) gives the number of the week for the specified date (in the range 1 up to including 53)
  • workdays=\left\{ 1,2,3,4,5\right\} is the set of numbers that represent working days (a subset of the domain of the weekday function)

Try to work out a solution first, then expand mine below.

More »


Solution:

There are really two cases to consider.

If both days are in the same week:
weeknumber\left(d_{1}\right)=weeknumber\left(d_{2}\right)\Rightarrow d_{2}-d_{1}\leq7,
the solution might seem simple: w=d_{2}-d_{1}. This will work provided that both dates have a weekday less than or equal to five. If this is not the case, we need a more complex function: w=min\left(weekday\left(d_{2}\right),5\right)-min\left(weekday\left(d_{1}\right),5\right)+1. The min function returns the lesser of its two arguments (or either one in case they are equal). The +1 is to count the end date as a whole day. Note that we exploit here the fact that the workdays form a continuous range (1–5).

If both days are not in the same week:
weeknumber\left(d_{1}\right)\neq weeknumber\left(d_{2}\right),
then we must devise a way to find the number of weekends in between the two dates (each weekend subtracts two days from the interval \left[d_{2},d_{1}\right]). We can do this in three steps:

  1. Determine if the starting day is in a workweek: 5-weekday\left(d_{1}\right)\geq0. If so, then set this as the initial count: c=\left(5-weekday\left(d_{1}\right)\right), otherwise set c=0.
  2. c now covers the number of working days in the first week of the range. To find the number of remaining days we first subtract 7-weekday\left(d_{1}\right) days from the days between the dates d_{2}-d_{1}. We wil define this as: r=\left(d_{2}-d_{1}+1\right)-\left(7-weekday\left(d_{1}\right)\right).
  3. The number of weekends must be div\left(r,7\right), since each weekend consists of two days, we must subtract this twice, giving: w=c+r-\left(2\cdot div\left(r,7\right)\right). The div function is an integer division yielding a whole number. It is equivalent to \left\lfloor x/y\right\rfloor (the floor of a division, in this case between r and 7).

These steps are the full solution for the case where the weeks are not equal. Below are some examples of the method presented. However, perhaps you’ve come up with a simpler or more elegant solution? Let me know in the comments below.

Examples:
For the number of working days between d_{1}=16-02-2009 (weekday 1) and d_{2}=18-02-2009 (weekday 3): since these are within the same week (week 8), the number of working days is:

w=min\left(3,5\right)-min\left(1,5\right)+1=3-1+1=3

For the number of working days between d_{1}=16-02-2009 (weekday 1) and d_{2}=22-02-2009 (weekday 7): these are also within the same week (week 8), but the end date is in the weekend. Still this should work:

w=min\left(7,5\right)-min\left(1,5\right)+1=5-1+1=5

For the number of working days between d_{1}=16-02-2009 (weekday 1) and d_{2}=25-02-2009 (weekday 3): these are not within the same week, thus we must use the three step approach:

c=\left(5-1\right)=4
r=\left(25-16+1\right)-\left(7-1\right)=10-6=4
w=4+4-\left(2\cdot0\right)=8

Here we see that the last part of the formula has no effect. This is always true for adjacent weeks, since div\left(r,7\right)\geq1 only if r\geq7.

So, now for non-adjacent weeks:
For the number of working days between d_{1}=09-02-2009 (weekday 1) and d_{2}=25-02-2009 (weekday 3):

c=\left(5-1\right)=4
r=\left(25-9+1\right)-\left(7-1\right)=17-6=11
w=4+11-\left(2\cdot1\right)=13

read more 3 comments

Puzzle #3: Boys and Girls

Almer S. Tigelaar 13 / 09 / 2011, 09:00

Problem:
Assume a country where every family wants to have a boy and continues having babies until they actually have one. We consider only one generation. The probability of having a boy or a girl is always the same. After some time has passed, what is the ratio between boys and girls in the country? Try to work it out yourself and then expand the solution below.

More »

Solution:
The right answer to this question can be determined by intuition. However, does your intuition tell you that there will be more boys and less girls eventually? That is what most people would say. However, let’s see if we can solve it mathematically.

We know that for each birth the probability of a boy or a girl must be equal and is thus fifty percent for each. It is perhaps best to think of this as flipping a coin with half a chance on heads and the other half on tails.

Let’s first break it down by analysing what must be true at all times:

  1. All families always have at least 1 boy.
  2. All families must have somewhere between (inclusive) 0 and \infty[1] girls.

We can further break down the second question in infinitely many sub-questions of the form: “How many families have n girls?”. Where n ranges between 0 and \infty. A more visual way to express this is by coding girls with a G, boys with a B and then determining the likelihood of existence of each of the following family compositions (not counting the father and mother):

  • B
  • GB
  • GGB

Well, we already established that the probability of having either a boy or girl is fifty percent. Hence, the probability of a family with only a boy is fifty percent (equivalent to a family with zero girls), a family with one girl and one boy: twenty five percent, and a family with two girls and one boy: twelve point five percent, et cetera. Based on this we can calculate the number of girls NG as follows:

NG=\frac{1}{2}\cdot0+\frac{1}{2}\cdot\frac{1}{2}\cdot1+\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{2}\cdot2+\ldots

If you find this difficult to follow, keep in mind that each family that does not have a boy yet will, at the next birth, again have a fifty/fifty chance on a boy. However, as there are more births in the same family the probability of not having a boy goes down with each girl that joins the family. Realise that this is not because the independent probability of having a girl or boy changes, but the probability of having ever longer strings of girls being born in the same family goes down as the family grows larger. This problem is in fact equivalent to flipping a fair coin and determining the probability of heads or tails over an infinite set of trials also known as a Bernoulli trial.

People familiar with mathematics and probability have probably noticed that the above approximation can more accurately be written as an infinite series:

NG=\sum_{n=1}^{\infty}\frac{1}{2^{n}}
where n is the number of families.

So, what is the outcome of this? The number of girls approaches 1 as we let n approach infinity. We also know the number of boys in each family must be at least 1. Hence, the counter-intuitive answer is that most families will consist of one boy and one girl.

Some people will now draw the conclusion that the ratio must be 1:1 as well. However, that conclusion is controversial. Economist Steven Landsburg went as far as to offer a public bet on this. His main point is that even if the main variables (boys and girls) have an expected difference of zero, you can not conclude that they have a ratio of 1 to 1.

Mathematically the expected value of boys and girls are both one:
E\left(B\right)=1
E\left(G\right)=1

However, the question is what is the expected ratio of girls to the child population:
E\left(G/\left(G+B\right)\right)=?

The approximation given by Landsburg is:
E\left(G/\left(G+B\right)\right)=\frac{1}{2}-\frac{1}{4\cdot n}
where n is the number of families.

This seems like a more satisfactory answer then saying the ratio is 1:1, as it captures the value of n very explicitly. Based on this, we can safely say that the right answer is: for a very large population the ratio will approach 1:1, but never actually reach it.

There are many other people that have given perspectives on this problem, and more thorough mathematical background, like here and here. However, some of them also make the original problem much more complicated than it was. Other sources I used to compile this article are here and here.

read more 0 comments

Puzzle #2: The Soccer Trading Cards Problem

Almer S. Tigelaar 02 / 08 / 2011, 09:00

A well known Dutch supermarket chain has on several occasions handed out soccer trading cards: for each € 10,- you would spend at one of their supermarkets you get a blind packet with five trading cards. So, you do not know the contents of the packets beforehand and therefore may end up with duplicates. These cards were a true hype among youngsters, many of whom clung to buyers exiting supermarkets to ask for their packs[1]. Because the number of unique cards is finite, a logical question is: how much money does one have to spend before having the entire card collection?

Problem
There are 270 soccer trading cards in total. For every € 10,- you spend you get a pack of 5 (blind) cards. The cards may overlap between packs[2]. You may not directly buy cards from other parties or trade cards with others (as is common in real life). Your only source of cards is the supermarket itself. How much do you have to spend to get all 270 cards? First try to work out a solution yourself, then click below to expand mine.

More »

Solution
For the case were there is no overlap between the contents of the packs this problem is easy. In this case it would be guaranteed that each pack would contain cards that you do not yet have. You would have to spend: \left(\frac{270}{5}\right)\cdot10= 540,-. Although this is a starting point, it is not the case and we can safely assume that the supermarket wants you to spend more than just € 540,- ;)

The notion of overlap introduces a probabilistic component to this problem. We will call the event in which you get a card you already have O for overlap. There is a probability of overlap P(O), for the time being we will assume that at all times this probability is fixed. This is actually an oversimplification as will become clear later. I also received the cards from the (same) supermarket and based on this I think that an estimate of 10 to 20 percent for P(O) is reasonably accurate as a starting point. Let us assume P\left(O\right)=0.2 (20 percent) and of course P\left(\neg O\right)=0.8. Now, if we receive a number of cards n, we can say that there should be n\cdot P\left(\neg O\right) new cards among these and n\cdot P\left(O\right) duplicates. We are really only interested in the new cards, since our `count’ of new cards must eventually equal the entire collection: 270 cards. The question `how many cards do we need to buy before we have 270 unique ones?’ is trivial to answer:

n\cdot P\left(\neg O\right)=270\\n\cdot0.8=270\\n=\frac{270}{0.8}\\n=337.5

So we need to obtain at least 337.5 cards to have all 270 unique ones. Since half a card does not exist, and cards are always distributed in packets of five, we need to round this up to 340 cards. Since each quintet of cards costs € 10,- this means we must spend at least the amount s defined as:

s=\frac{340}{5}\cdot10=680

Which is indeed somewhat more. One would need to spend € 680,- to obtain these 340 cards and as a result have the complete collection of 270 cards and an excess of 70 duplicates.

The generic formula is of the form:

s\left(P\left(O\right)\right)=\frac{\left\lceil \frac{n}{\left(1-P\left(O\right)\right)}\right\rceil }{5}\cdot10

Where parameter P(O) denotes the probability of obtaining an overlapping card, n the number of cards (270) and the ceiling function (denoted with the symbols \left\lceil \,\right\rceil) rounds up to the nearest value divisible without remainder by 5.

This in turn can be simplified to the following:

s\left(P\left(O\right),c\right)=\left\lceil \frac{n}{\left(1-P\left(O\right)\right)}\right\rceil \cdot c

Where the additional parameter c denotes the costs of one card, for this case: c=2.

So, what is P(O)?
Let’s assume that the supermarket knows the buying behaviour of its customers and bases P(O) largely on that information. Let us further assume that the period in which the cards are `given away’ lasts for about four weeks and that the average family goes to the supermarket once a week spending € 200,-. This means that they will spend € 800,- in the entire period. Based on this the optimal value of P(O) can be calculated:

800=\left\lceil \frac{270}{\left(1-P\left(O\right)\right)}\right\rceil \cdot2\\400=\frac{270}{\left(1-P\left(O\right)\right)}\\400\cdot\left(1-P\left(O\right)\right)=270\\1-P\left(O\right)=\frac{270}{400}\\-P\left(O\right)=\frac{270}{400}-1\\P\left(O\right)=0.325

So, based on this simple model the chance of obtaining duplicates for our average family would be 32.5 percent, instead of the 20 percent used before.

Complications & Trading
The previously described model is a bit simplistic. There are many complicating factors. One is that the probability of duplicates changes over time and that this probability is not uniform for all cards: each card has its own P(o). By varying the probabilities one can create highly sought after cards and cards that everyone already has multiple times. I would assume that cards of popular players are more scarce than those of relatively unknown soccer players.

Interestingly this non-uniform distribution of cards causes cards to have a value relative to each other laying the foundations for an economy. A card with a P(o) of say 0.10 can be said to be `worth’ twice as much as one with a P(o) of 0.20, even though the underlying monetary value for obtaining the cards really is equal. This scarcity principle is seen in many trading card games, where eventually the monetary value changes to reflect the scarcity of the card! For example: some cards in the popular `Magic’ card sets are worth hundreds of Euros.

Another complications arises from the phenomenon of actually trading the cards. Trading is beneficial in the sense that it can reduce the minimum spending needed to obtain all cards. By trading you can also rid yourself of duplicates useless to you. However, people are only willing to trade those cards that they have duplicates of. The previously mentioned scarcity influences the amount of trading that takes place. Hence, if the P(o) is equal for all cards: uniform, then there will be maximum trading opportunity. Whereas any deviation from uniformity likely results in less trading, since scarce cards are not easily traded nor likely to be found as duplicate.

The supermarket chain sold the excess cards it still had for € 0,10 per card after the handout ended. Assuming that this reflects about double the true value of a card, after all: the supermarket wants to make a profit, this would imply that all the products bought in the action period had about 2.5 percent soccer trading card tax (\frac{0.05}{2.00}=0.025).

Conclusion
The only reason for a supermarket to give anything to its customers `for free’ is to influence their buying behaviour. If your child is constantly nagging you for more soccer trading cards, you are more likely to round up your spending at the supermarket to the nearest `ten’ to get extra packets of cards. Rounding down, which means: buying less or cheaper products, is probably less common.

Of course trading cards have an entertainment value as well: you can play games with the cards. The trading process also has a certain educational value for children. Especially since this trading appears to mimic normal economic principles. Nevertheless, there is always one party that definitely wins here which is the supermarket itself. Without spending any extra money of their own, they succeed in getting higher profits.

read more 0 comments

Puzzle #1: The Student Admission Problem

Almer S. Tigelaar 01 / 07 / 2011, 09:00

Background: I will occasionally present a small (mathematical) puzzle here. These are a tribute to the column “Vuiks Verhandelingen”, written by Kees Vuik which regularly appeared in the Dutch PCM Magazine nearly a decade ago. Sometimes I may think of them myself, and sometimes they may come from elsewhere.

Let’s get started: 114 students applied for a specific study, only 100 were eventually admitted by lottery. Among the 114 students were 5 foreign students. However, none of these foreigners were among the 100 that were admitted. Intuitively this seems quite unlikely. But how unlikely is it really? Were the foreigners intentionally not admitted?

Your task: find the probability that the 5 foreign students were left out due to chance. This can be solved with basic knowledge of probability and set theory. First try it yourself, then click below to expand the solution and check your answer.

More »

Tip: if you are not too familiar with mathematical notation, don’t be intimidated by it: it’s just a way to formally write things down, and the notation here should be easy to follow. I’ve added some footnotes to help: hover over them with your mousepointer to see an explanation.

Definitions
We start with some generic definitions:

  • S is the set of all students that applied.
  • F is the subset of S which is foreign.
  • n is the number of students we may select from S (without replacement[1]).
  • The resulting selection we call set L (where L is a subset of S with cardinality[2] n )

Let’s fill out the variables above for this specific case:

  • There are 114 students in total: \#S=114
  • Of which 5 are foreigners: F\subset S\wedge\#F=5[3]
  • We may select 100 students: n=100

Problem:
What is the probability that we select no foreign student when we choose 100 students out of the 114 that applied? Formal: what is the probability that the intersection of F and L is empty?: P\left(F\cap L=\varnothing\right)

Solution:

  1. We know that there are \binom{\#S}{n} ways[4] in which we can select n students from the \#S (without replacement).
  2. There are \binom{\#F}{k} ways to select k students from the foreign subset.
  3. Given (2) there are \binom{\#S-\#F}{n-k} ways to select the remaining n - k students.
  4. Combining (1-3) we get:P\left(F\cap L=\varnothing\right)=\frac{\binom{\#F}{k}\cdot\binom{\#S-\#F}{n-k}}{\binom{\#S}{n}}

In this last step we effectively state that the probability is the number of ways in which we can select k and n - k students, relative to the total number of possible selections we can make. It is a generic solution for any set sizes, for this specific instance we take the problem specific values and simply substitute them:

P\left(F\cap L=\varnothing\right)=\frac{\binom{5}{0}\cdot\binom{114-5}{100-0}}{\binom{114}{100}}=\frac{\binom{5}{0}\cdot\binom{109}{100}}{\binom{114}{100}}=\frac{\binom{109}{100}}{\binom{114}{100}}\thickapprox1.36\cdot10^{-5}

All the values are as specified in the problem specific definition. The value of k is 0, since we are interested in the probability of selecting 0 students from the Foreign group. We see that the problem simplifies, since \binom{5}{0}=1.

We can conclude that the probability of selecting none of the 5 foreign students is quite low: 0.00136 %

The probability that the lottery was unfair can be deduced from this:
P\left(F\cap L\neq\varnothing\right)=1-P\left(F\cap L=\varnothing\right)=1-1.36\cdot10^{-5}=0.9999864

Which is 99.99864%. Indeed, it is very likely that at least one foreign student would have been selected. Hence we can state that it is likely that the lottery applied was unfair in this case and that the probability of a foreign student of being selected must have been much lower than that of a local student, or even zero.

read more 2 comments