20

So I ran across the following problem today on a forum:

"In an attempt to reduce male birth rates, feminists have passed a law which forces families to stop having children after their first male child. After this law is passed, what is the expected ratio of male children to female children?"

It's unclear whether families are supposed to continue having children until their first male child or if they may stop at any earlier point, but let's assume the former. Let's also assume that everybody obeys the law, and that either sex has an equal chance of being born.

I thought I had a simple solution, because the problem is logically equivalent to the following process:

  • Generate N infinite random sequences of boys and girls
  • Cut each sequence after the first occurrence of a boy
  • Concatenate the results and measure the ratio of boys to girls

This is equivalent to generating a random sequence of boys and girls by stopping at each boy but then continuing again (up to N times), so the only difference from a normal random sequence is that this sequence always ends with a boy. However, if N grows to infinity, it seems like this last element (which can never be reached) becomes irrelevant, so the ratio of boys and girls should be 1:1 like in a normal infinite random sequence. This seems perfectly logical to me, but a few people kept insisting that it is wrong, ranting about "biased estimators" and claiming that the real ratio will be biased in favor of females.
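The three-step process described above can be sketched in JavaScript. This is only an illustrative sketch under the question's stated assumptions (every family continues until its first boy, and each sex has probability 1/2). The names `simulateFamilies` and `rng` are made up for this example and do not come from the original problem.

```javascript
// Hypothetical helper: simulate N families that each have children until
// their first boy, then pool ("concatenate") all the children.
// rng is any function returning a uniform number in [0, 1).
function simulateFamilies(N, rng = Math.random) {
  let boys = 0;
  let girls = 0;
  for (let family = 0; family < N; family++) {
    // Walk along this family's random sequence and cut it after the first boy.
    while (rng() < 0.5) {
      girls++;            // a girl: the sequence continues
    }
    boys++;               // the first boy ends the sequence
  }
  return { boys, girls, ratio: boys / girls };
}

const pooled = simulateFamilies(100000);
console.log(pooled.boys, pooled.girls, pooled.ratio.toFixed(3));
```

The number of boys always equals N, so the only random quantity is the number of girls. A simulation like this is evidence for the 1:1 claim, not a proof of it.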

Is my reasoning flawed? If so, why?

[EDIT]

Contrary to some suggestions, I don't think this question is a duplicate. It asks about the validity of a particular approach to solving the problem, not just for a solution.

  • 6
    Can we take biology into account? Because if the probability of a boy being born for any father indeed is $0.5,$ your reasoning is correct. In reality, however, the chromosome passed by the father determines the gender of the child. In some cases, the father only passes the X chromosome, resulting in only female children being born. As soon as one boy is born (50% probability for a father able to pass both X and Y chromosomes), its father can no longer reproduce. This gives fathers passing only the X chromosome more chances to reproduce, hence resulting in more daughters being born. – jvdhooft May 28 '17 at 15:31
  • In both cases the rate of birth will be unaffected (unless like @jvdhooft we take into account that the relevant random variables in reality are not actually independent). – Stefan Mesken May 28 '17 at 15:33
  • @Stefan: if my approach is correct, then almost any condition for cutting the sequences at step 2 would work, including "when the male:female ratio is greater than 1:100". – user3026691 May 28 '17 at 15:39
  • 3
    Well, your approach actually doesn't back up your claim (because it leaves out most relevant details dealing with the probability distribution). It's more like an intuitive reasoning than a valid proof. That's why I deliberately didn't mention it in my first comment. (This isn't meant to be a criticism. It's fine to argue this way to get an intuition but it's far from a proof.) – Stefan Mesken May 28 '17 at 15:45
  • @Stefan: Why isn't it sufficient to show that the sequence generated with this approach is equivalent to a normal infinite random sequence of boys and girls? – user3026691 May 28 '17 at 15:47
  • Well, that's precisely what you want to show - but didn't do. – Stefan Mesken May 28 '17 at 15:49
  • @Stefan: which part of my argument do you consider to be a non-sequitur? – user3026691 May 28 '17 at 15:51
  • 1
    Well, all of it. But you force me to make it sound like I take any issues with your outline - which I don't. – Stefan Mesken May 28 '17 at 15:52
  • @Stefan: Simply insisting that I haven't proven anything doesn't prove anything. What's the first leap that you consider unjustified? – user3026691 May 28 '17 at 15:54
  • 2
    "[..] because the problem is logically equivalent to [..]" – Stefan Mesken May 28 '17 at 15:55
  • @Stefan: Do you agree that "generate N infinite random sequences of boys and girls, cut each sequence after the first occurrence of a boy" is trivially equivalent to "take N families and have each one make children until the first boy is born"? Or does this need some further logical justification? Or is it not sufficiently clear that concatenating the sequences is logically equivalent to considering the children of all N families? – user3026691 May 28 '17 at 15:59
  • 1
    If you actually want to prove this you first have to establish a suitable probability space and then show that your claimed transformations (which do in fact work) actually work for this space. But, assuming that you have at some point taken a probability course, you already know that. So I really don't see the point in arguing with me here. The beauty of mathematics is that we have proofs and don't merely rely on opinions - stated on the internet or elsewhere. – Stefan Mesken May 28 '17 at 16:03
  • 1
    Instead of "If this law is passed" it should say "If this law is obeyed". The rate of compliance with the one-child policy in China was fairly low, if I'm not mistaken. I've heard the Chinese census has a huge undercount because couples who violate the policy report falsely that they have only one child. (I seem to recall that it's been changed to a two-child policy.) – Michael Hardy May 28 '17 at 16:04
  • There is a small but positive probability of no female children whatsoever, in which case the ratio of male children to female children is infinite. Thus the expected ratio of male to female is infinite. Did you mean to ask about the expected proportion of children that are male? Or about the ratio (expected number of male children) / (expected number of female children)? – Julian Rosen May 28 '17 at 16:33
  • @JulianRosen: I didn't formulate the problem, but from what I understand, it's asking to find which number (expected number of male children) / (expected number of female children) approaches as the population grows. – user3026691 May 28 '17 at 16:38
  • Some of the answers (including mine, which I suspect is wrong) assume couples keep having children as long as the law allows them to. One should consider that the probability distribution of the number of children might differ from that. – Michael Hardy May 28 '17 at 16:48
  • @MichaelHardy: While it's unclear in the original problem statement, the question does state that you can assume couples keep having children until they have a boy. – user3026691 May 28 '17 at 16:56
  • 1
    This same question appears on stats (dot) stackexchange (dot) com: https://stats.stackexchange.com/questions/93830/expected-number-of-ratio-of-girls-vs-boys-birth/93890#93890 – Michael Hardy May 28 '17 at 17:23
  • 12
    I think this is an interesting question that would be better off without the portrayal of feminists as authoritarians who want a female dominated society. I don't want to start a debate on the merits of feminism, but the question as stated currently takes a political stance against feminism. – ZachMcDargh May 28 '17 at 18:48
  • 2
    @ZachMcDargh: it also portrays feminists as incompetent and mathematically clueless, because I'm quite convinced the result is still a 1:1 ratio. Not every joke is a strong political statement. – user3026691 May 28 '17 at 19:41
  • Biology is dumb – mtheorylord May 28 '17 at 23:00
  • 1
    Did you search for this? It's been asked many times before: https://math.stackexchange.com/questions/20426/ https://math.stackexchange.com/questions/116706 https://math.stackexchange.com/questions/218674 – BlueRaja - Danny Pflughoeft May 29 '17 at 08:58
  • @user3026691 I'm not sure how that's an argument that this joke is apolitical; you've just made it sound even more political, and pointed out that it relies on sexist stereotypes as well. If you consider that appropriate, by all means, leave the question as it is, but I have voted it down. – ZachMcDargh May 30 '17 at 18:42
  • @ZachMcDargh: I didn't say the joke was apolitical. I said the joke is... you know... a joke. If you're so triggered by it that you have to vote the question down because you feel personally insulted by it, be my guest. – user3026691 May 31 '17 at 16:40

12 Answers

13

families are supposed to continue having children until their first male child


Possible families
b (p = 0.5)
gb (p = 0.25)
ggb (p = 0.125)
gggb (p = 0.0625)
ggggb (p = 0.03125)
gg.... (p = 0.5^length)

So every family will always have exactly one boy (total_p=1). But let's prove that.

Girls

The expected number of girls per family is the sum of these terms (probability * number of girls):

0.5 * 0
0.25 * 1
0.125 * 2
0.0625 * 3
0...... * 4

The contribution of families with n girls is 0.5^(n+1) * n. Summing this formula over all n:

$$\sum_{n=0}^{\infty} 0.5^{\,n+1}\, n = 1$$

Boys

The amount of boys you have is similar:

0.5 * 1
0.25 * 1
0.125 * 1
0.0625 * 1
0...... * 1

So the contribution of families with n girls is 0.5^(n+1). Summing this formula over all n:

$$\sum_{n=0}^{\infty} 0.5^{\,n+1} = 1$$

So the ratio is 1:1!
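As a numerical sanity check on the two sums, here is a small sketch that adds up the terms from the lists above. The function name `expectedChildren` and the cut-off `maxGirls` are assumptions introduced for this illustration only.

```javascript
// Partial sums of the two series above, for families with up to maxGirls girls.
// A family with n girls followed by one boy has probability 0.5^(n+1).
function expectedChildren(maxGirls) {
  let girls = 0;
  let boys = 0;
  for (let n = 0; n <= maxGirls; n++) {
    const p = Math.pow(0.5, n + 1);
    girls += p * n;  // n girls in this family type
    boys += p * 1;   // exactly one boy in every family type
  }
  return { girls, boys };
}

console.log(expectedChildren(5));   // both partial sums are still below 1
console.log(expectedChildren(60));  // both are numerically very close to 1
```

Both partial sums approach 1 as the cut-off grows, which matches the 1:1 conclusion.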

8

A birth is either a new male or a new female, each with probability $1/2$. This is clearly the case regardless of laws or policies, so by linearity of expectation the expected ratio of males to females in the population will remain $1:1$.


Edit: If you disagree with the minor simplifying assumption that boys and girls are equally likely, just replace $1/2$ with some fixed $p$ and $1-p$, and $1:1$ with $p:(1-p)$.

Here is some JavaScript code to simulate this for $1$ million families:

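// "heads" counts girls, "tails" counts boys; a family stops at its first tail.
var heads = 0;
var tails = 0;
for (var i = 0; i < 1E6; i++) {      // one iteration per family
   var gotTails = false;
   while (!gotTails) {               // keep having children until the first boy
      if (Math.random() < 0.5) {
         heads++;
      }
      else {
         tails++;
         gotTails = true;
      }
   }
}
console.log(heads + "," + tails);    // the two counts should be nearly equal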
  • 1
    Your code seems to match my argument about concatenating sub-sequences of random sequences, but I don't quite get the linearity of expectation part. Can you elaborate on that? – user3026691 May 28 '17 at 18:25
  • Indeed, assuming that every family includes a boy, this holds. –  May 28 '17 at 18:28
  • @user3026691 If we place a value of $1$ on heads and $0$ on tails, then each fair coin flip has expected value $0.5 = 0.5(1) + 0.5(0)$. The expected value of a sum of random variables is the sum of the expected values, always, so here the expected value of the sum is just $0.5$ times the number of flips. In the original example, this means that since each birth is equally likely to yield a boy or a girl, across all $N$ births we expect $N/2$ boys and $N/2$ girls. – Eric Tressler May 28 '17 at 18:46
  • @YvesDaoust That assumption isn't necessary – Eric Tressler May 28 '17 at 18:58
  • 2
    Is "got tails" a euphemism?... – FreeElk May 28 '17 at 19:12
  • I'm going with this answer because your code illustrates very simply the point I was trying to make and confirms that the expected ratio is 1:1. – user3026691 May 28 '17 at 19:43
8

The law will change the sexual habits of people and the number of children born, but it will not change the laws of nature. In the assumed model these are as follows: The sex of a child is determined at the moment of conception, and if conception takes place the probability that the child is a boy is ${1\over2}$, independently of social circumstances.

6

I understand the instinct to argue against you but you are right, the ratio should remain 1:1. For an intuitive reason why: consider the before and after for families with three children.

Options before

  • bbb
  • bbg , bgb, gbb
  • bgg , gbg, ggb
  • ggg

Total is 12 boys, 12 girls

Options after the law

  • b
  • b , b, gb
  • b , gb, ggb
  • ggg

Total is 7 girls, 7 boys.

Two children gives:

  • bb
  • bg, gb
  • gg

vs

  • b
  • b, gb
  • gg

We could continue this to all possible sets and the results would be the same.
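The "continue this to all possible sets" step can be checked by brute force. The sketch below assumes the model used in this answer: every family would have had exactly k children without the law, and stops early at its first boy. The function name `countAfterLaw` is made up for this example.

```javascript
// Enumerate all 2^k equally likely birth orders of length k, cut each one
// after its first boy (if any), and count the children that remain.
function countAfterLaw(k) {
  let boys = 0;
  let girls = 0;
  for (let mask = 0; mask < (1 << k); mask++) {
    for (let child = 0; child < k; child++) {
      const isBoy = ((mask >> child) & 1) === 1;
      if (isBoy) {
        boys++;
        break;            // the law: no more children after the first boy
      }
      girls++;
    }
  }
  return { boys, girls };
}

for (let k = 1; k <= 10; k++) {
  const { boys, girls } = countAfterLaw(k);
  console.log(k, boys, girls);
}
```

For k = 2 and k = 3 this reproduces the totals listed above (3 and 3, then 7 and 7), and the two counts stay equal for every k tried.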

However, this is ignoring the possibility that there may be some genetic component (none that I've heard of but possibly the case) which says some couples are more likely to produce a particular sex. This may be the bias people are talking about.

FreeElk
  • 231
  • I understand the intuition, but it neither conclusively proves a 1:1 ratio nor points out a logical error with my argument. – user3026691 May 28 '17 at 16:31
  • 1
    @YvesDaoust Would you like to point out why the options aren't equally likely? Sure the different bullet points are different likelihoods but as far as I'm aware each comma separated event is (and that is what I'm comparing). – FreeElk May 28 '17 at 16:48
  • 3
    @FreeElk ggg, gg are not valid options... the sequence must always end with a boy if I read op's post correctly. – Thomas Wagenaar May 28 '17 at 19:36
  • @ThomasW As I read the question you aren't forced to continue having children if your child is a girl. You have to consider the actual scenario, not just the maths involved. I think a lot of the answers here assume some sort of forced pregnancy rather than applying maths to model a scenario where people choose the number of children they might want. – FreeElk May 29 '17 at 09:52
  • 1
    @FreeElk the question clearly states that "families are supposed to continue having children until their first male child" is assumed – Thomas Wagenaar May 29 '17 at 10:12
  • @ThomasW Ah yes, I had been going off passed a law which forces families to stop having children after their first male child which gives no indication of a forced pregnancy law. But you're right, the asker assumes such a law is also implemented. I suppose this is where we start to detract from reality and incorrectly model the situation. – FreeElk May 29 '17 at 10:18
6

Since the probability of $k$ girls then $1$ male is $2^{-k-1}$, the expected number of $(\text{female},\text{male})$ births is $$ \sum_{k=0}^\infty(k,1)2^{-k-1}=\left(1,1\right) $$ We are ignoring multiple births, or families that cannot have children. Essentially, we are assuming that all families have children and that they keep having children until they have a male.


Note that this is the same as answering the question "flip a fair coin until we get heads; how many heads and how many tails do we expect to see?"
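For readers who want the sum spelled out, one standard way to evaluate both components is via the geometric series. This is a sketch of one possible route, not necessarily the one the answer had in mind.

```latex
\begin{align}
\sum_{k=0}^\infty x^{k} &= \frac{1}{1-x} \quad (|x|<1)
\;\Longrightarrow\;
\sum_{k=0}^\infty k\,x^{k} = x\,\frac{d}{dx}\,\frac{1}{1-x} = \frac{x}{(1-x)^2},\\
\sum_{k=0}^\infty k\,2^{-k-1} &= \frac12\cdot\frac{1/2}{(1-1/2)^2} = \frac12\cdot 2 = 1,\\
\sum_{k=0}^\infty 1\cdot 2^{-k-1} &= \frac12\cdot\frac{1}{1-1/2} = 1.
\end{align}
```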

robjohn
  • 345,667
5

So to take this to a mathematical model and ignore all biological facts, we have the probability space $\Omega = \{g,b\}^{\mathbb{N}}$ with the product $\sigma$-algebra of the components and the product probability measure ($p_i = \frac{1}{2}$). An element of the probability space represents the sequence of children of a single family.

For example, take a set $A = \{\alpha \in \Omega : \alpha_i = \beta_i \text{ for } i=1,\dots,n\}$ for fixed values $\beta_i$; then its probability is $P(A) = \frac{1}{2^{n}}$.

We want a random variable $G_n :\Omega \to \mathbb{R}$ that tells us whether the $n$-th child is a girl, provided there is an $n$-th child according to our rules. So $G_n$ is either $1$ or $0$. \begin{align} G_n(\alpha) = \begin{cases}1 & \text{if } \alpha_i = g \text{ for all } i\leq n \\ 0 & \text{else.} \end{cases} \end{align} The expected value of $G_n$ equals $P(G_n=1) = \frac{1}{2^n}$. The number of girls in a family is represented by the random variable $G = \sum_{n=1}^\infty G_n$. So we want the expected value of $G$: \begin{align} \mathbb{E}(G) = \mathbb{E}\Big(\sum_{n=1}^\infty G_n\Big) = \sum_{n=1}^\infty \mathbb{E}(G_n) = \sum_{n=1}^\infty \frac{1}{2^n} = 1 \end{align} Although the $G_n$ are highly dependent, they are nonnegative, so the swap of expectation and sum is justified by monotone convergence, or just by Fubini for nonnegative functions (Tonelli).

Now to the boys: Let $B_n$ be the random variable that tells us whether the $n$-th child is a boy \begin{align} B_n(\alpha) = \begin{cases}1 & \text{if } \alpha_n = b \text{ and }\alpha_i = g \text{ for } i<n \\ 0 & \text{else.} \end{cases} \end{align} The expected value of $B_n$ is again $P(B_n=1) = \frac{1}{2^n}$. Hence the same game again: the number of boys in a family is $B = \sum_{n=1}^\infty B_n$. \begin{align} \mathbb{E}(B) = \mathbb{E}\Big(\sum_{n=1}^\infty B_n\Big) = \sum_{n=1}^\infty \mathbb{E}(B_n) = \sum_{n=1}^\infty \frac{1}{2^n} = 1 \end{align} By the way, we also showed that the expected number of children is $2$ with this rule. However, we do have more than one family, so we have to multiply the whole result by the number of families, but this doesn't change the ratio.

4

Consider a slightly different law, which allows families to have as many children as they want, but in which all children with an older brother are sent away. This law is equivalent to the one proposed, because in each family only the children up to the first boy remain.

Of all the children born, the proportion of girls is $\frac12$.

Of all the children sent away, the proportion of girls is $\frac12$. (Children born in families in which already one boy has been born are from both sexes with equal probabilities.)

In conclusion, the proportion of girls in the remaining population must also be $\frac12$.

user133281
  • 16,073
2

Here's an answer that assumes couples continue having children until they're no longer allowed to: $$ \begin{array}{ccccccc|cc} & & & & & & & \text{probability} & \text{proportion male} \\ \hline m & & & & & & & 1/2 & 1 \\ f & m & & & & & & 1/4 & 1/2 \\ f & f & m & & & & & 1/8 & 1/3 \\ f & f & f & m & & & & 1/16 & 1/4 \\ f & f & f & f & m & & & 1/32 & 1/5 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & & \vdots & \vdots \end{array} $$

The expected proportion of male children in a family is therefore, writing $x = \frac12$, \begin{align} & \sum_{n=1}^\infty \frac 1 n \left(\frac 1 2 \right)^n = \sum_{n=1}^\infty \frac {x^n} n = \sum_{n=1}^\infty \int_0^x u^{n-1} \, du \\[10pt] = {} & \int_0^x \sum_{n=1}^\infty u^{n-1} \, du = \int_0^x \frac 1 {1-u} \, du = \int_{1/2}^1 \frac1u\, du \\[10pt] = {} & \log(1)-\log \left(\frac12\right) = \log(2) \approx 0.69 \quad \text{(where $\log(\cdot)$ means $\log_e(\cdot)$)} \end{align} The interchange of the integral and the sum is justified because the function is everywhere positive. (It is only when the positive and negative parts both diverge to infinity that one can get two different answers by changing the order.)

However (as pointed out below by "Paul") the fact that the average family has $69\%$ boys doesn't mean $69\%$ of all births will be boys because families with more girls than boys will be larger. If all couples get equal weight then the average is $69\%$ but if all babies get equal weight then it's $50\%$.
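The difference between weighting couples equally and weighting babies equally can be made concrete with a short computation. This is a sketch under the same assumption as the table (every couple continues until its first boy). The name `familyVersusPooled` and the truncation point `maxSize` are introduced here for illustration only.

```javascript
// Family type n (n = 1, 2, 3, ...) has n-1 girls followed by one boy,
// and occurs with probability (1/2)^n.
function familyVersusPooled(maxSize) {
  let meanFamilyProportion = 0; // every couple gets equal weight
  let expectedBoys = 0;
  let expectedSize = 0;
  for (let n = 1; n <= maxSize; n++) {
    const p = Math.pow(0.5, n);
    meanFamilyProportion += p * (1 / n); // this family is 1/n male
    expectedBoys += p * 1;
    expectedSize += p * n;
  }
  // every baby gets equal weight
  const pooledProportion = expectedBoys / expectedSize;
  return { meanFamilyProportion, pooledProportion };
}

console.log(familyVersusPooled(60), Math.LN2);
```

The first number approaches $\log(2)$ and the second approaches $\frac12$, which are the two averages contrasted in the paragraph above.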

  • 3
    This computes the expected proportion of male children in a family, right? I read the question as asking about the ratio of male to female in the society as a whole. – Julian Rosen May 28 '17 at 16:35
  • I understand the purpose of the table and the sum, but unfortunately I don't know enough to tell if the rest is correct or not. – user3026691 May 28 '17 at 16:35
  • 3
    @JulianRosen Exactly, families have on average 69% boys, but if you combine all families you get 50% because the families with more girls are bigger. – Paul May 28 '17 at 19:32
  • Without adapting the argument to address what the question is actually asking about, i.e. the overall proportion of boys and girls (not within a single family), I don't think this answer is particularly useful for this question. – David Z May 29 '17 at 00:28
2

This is a geometric distribution, with $p=0.5$. Let's call $X$ the number of children a family has.

The answer is going to be $\frac{1}{\mathbb{E}(X)-1}= \frac{1}{\frac{1}{0.5}-1} = 1$, since every family has exactly one boy and, on average, $\mathbb{E}(X)-1$ girls.

For a proof of why $\mathbb{E}(X)=\frac{1}{p}$, see https://en.wikipedia.org/wiki/Geometric_distribution#Moments_and_cumulants

1

I like to model such problems as a finite state machine. Just for fun I'll generalize by making the sex ratio at birth variable; let's call the probability of a daughter $r$. Also, let's not assume that every couple keeps on making babies until they have a son; let $q$ be the probability of stopping after a daughter.

A couple begin in state $A$ (fertile), and with each child —

  • with probability $rq$, they have a daughter and move to state $B$ (infertile).
  • with probability $r(1-q)$, they have a daughter and remain in state $A$.
  • with probability $1-r$, they have a son and move to state $B$.

In state $B$ the expected number of future daughters ($d_B$) or sons ($s_B$) is obviously zero. In state $A$, the expected number of future daughters is $$d_A = r(1-q)(1+d_A) + rq(1+d_B) + (1-r)(0+d_B)$$ because in returning to state $A$ you add one to your daughters. Simplifying, $$d_A = r+rd_A-rqd_A $$ $$(1-r+rq)d_A = r$$ $$d_A = r/(1-r+rq)$$

The expected number of future sons in state $A$ is $$s_A = r(1-q)(0+s_A) + rq(0+s_B) + (1-r)(1+s_B)$$ $$s_A = r(1-q)s_A + 1-r $$ $$s_A(1-r+rq) = 1-r$$ $$s_A = (1-r)/(1-r+rq)$$

So the sex ratio is $r : 1-r$, independent of $q$ (to my mild surprise; intuitively I expected fewer boys when $q>0$).
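The closed forms can be cross-checked numerically by iterating the two state-$A$ equations until they settle. The sketch below does that; the function name `expectedSonsAndDaughters` and the `iterations` parameter are assumptions made for this example, and the iteration only converges when $r(1-q) < 1$.

```javascript
// Hypothetical helper: r = probability of a daughter,
// q = probability that a couple stops after a daughter.
function expectedSonsAndDaughters(r, q, iterations = 10000) {
  // Closed forms derived above.
  const closed = {
    daughters: r / (1 - r + r * q),
    sons: (1 - r) / (1 - r + r * q),
  };
  // Independent check: iterate the state-A equations with d_B = s_B = 0.
  let dA = 0;
  let sA = 0;
  for (let i = 0; i < iterations; i++) {
    dA = r * (1 - q) * (1 + dA) + r * q * (1 + 0) + (1 - r) * (0 + 0);
    sA = r * (1 - q) * (0 + sA) + r * q * (0 + 0) + (1 - r) * (1 + 0);
  }
  return { closed, iterated: { daughters: dA, sons: sA } };
}

console.log(expectedSonsAndDaughters(0.5, 0));
console.log(expectedSonsAndDaughters(0.49, 0.2));
```

In both calls the ratio of daughters to sons comes out as $r : 1-r$, whatever value of $q$ is used.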

1

Here is another way to think about it - although you have already had a lot of ways.

First observe that the average size of a family is 2. Hopefully this is obvious; it follows from the rule that you have to wait an average of 1/p attempts to succeed when the probability of success is p.

Second observe that all families will have exactly one boy.

Hence the average number of girls must also be one.

0

Let's rearrange the problem as a coin toss game.

For the purposes of visualisation, there is just one coin.

Each family gets a turn tossing the coin. As long as they toss a head, they can keep playing and toss the coin again. When they toss a tail, they have to hand the coin on to the next family. The family can also choose to stop playing at any point and hand the coin on.

Since we are now simply talking about a long series of coin tosses, it becomes obvious that the ratio of heads:tails will be approximately 1:1

It doesn't matter if there is only one coin that is handed on from one family to the next, or every family has their own coin that they toss; the final ratio will be the same.