A Puzzle

kenberg · May 16, 2014

Edit: It is clear that I have completely misunderstood the problem and I still do. Ignore this post and all other posts by me on this.

He arrives at 11:00? Who knew?

I had misread and thought he arrived randomly, which is not what you said, but you also did not say, or I did not understand you to say, that he arrived at the top of the hour.

Let me try again at understanding this. Suppose that X, Y, Z are three independent random variables uniformly distributed on [0,1]. You are saying that the expected value of min(X,Y,Z) is 1/4? I don't know if this is a fact but I could check it. It isn't what I understood the problem to be.

Edit: yes, the expected value of min(X,Y,Z) is 1/4 when X, Y, Z are three independent random variables uniformly distributed on [0,1], at least that's what I got by direct calculation.

But this does not match my understanding of the original question.

kenberg · May 16, 2014

Edit: It is clear that I have completely misunderstood the problem and I still do. Ignore this post and all other posts by me on this.

For anyone interested in the direct calculation of the expected value of min(X,Y,Z) it goes as follows:

For t in [0,1] let F(t) be the probability that min(X,Y,Z)<t. This happens if and only if X,Y,Z all greater than or equal to t does not happen, and the probability that all three are greater or equal to t is (1-t)^3 by independence (and by the fact that the distribution is uniform). The probability that something happens is 1 minus the probability that it doesn't happen, so F(t)=1-(1-t)^3. The expected value is obtained by integrating t dF(t) from t=0 to t=1, and you get 1/4.
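A quick numerical check of this calculation can be sketched in Python. The function name, trial count and seed are illustrative assumptions, not anything from the thread; the quantity being estimated is the expected value of min(X,Y,Z), which the integral above gives as 1/4.

```python
import random

def mean_min_uniform(n, trials=200_000, seed=0):
    """Monte Carlo estimate of E[min of n independent uniform(0,1) draws]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += min(rng.random() for _ in range(n))
    return total / trials

# With n = 3 the estimate should land close to 1/4 (15 minutes of an hour).
print(mean_min_uniform(3))
```

Changing n to 2 gives an estimate near 1/3, which matches the "lowest of two random numbers" version of the question.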

So if he arrives at the top of the hour it's easy enough. If he arrives at some (uniformly distributed) random time between the top of the hour and the end of the hour, the way I mistakenly interpreted the question, I think the problem is more complicated but can be done in essentially the same way. Integrate some polynomials from somewhere to somewhere. It will be multiple integrals.

gwnn · May 16, 2014

Let us define the problem exactly. The person arrives at 11 pm, and knows that between 11:00:00 pm and 11:59:59 pm inclusive there will be three trains, not an average of three trains, which is an entirely different problem.

Wrong. He got to the station at 11:00:01 and knows that there was a train at 11:00:00, which he missed. As per StevenG's argument, he knows that there is a train at 12:00:00, so there will be exactly two trains between 11:00:01 and 11:59:59, at some unknown time (equivalently: there are three trains between 11:00:00 and 11:59:59, but the first one just left at 11:00:00! He saw it with his own eyes!). That is the double counting I meant, or more accurately, a non-counting. You can define an hour to be between 11:00:00-11:59:59, or 11:00:01-12:00:00, but not 11:00:01-11:59:59, which you insist on doing, especially if you ignore a whole train in the process (technically, two half trains)!!

I repeat my other question: do you really think that if there is only one train per hour and you just missed one, you have to wait only 30 minutes on average to get the next one? How would that work? Oh, OK, now I see what the answer is:

And I will have nothing to add to the above, which will be my last post on this subject.

Thanks, that's a very mature way of handling the situation. I hope if you realise that you are wrong you will find it in you to admit it.

broze · May 16, 2014

And I will have nothing to add to the above, which will be my last post on this subject.

This should be your signature Lamford!

kenberg · May 16, 2014

Having now read the problem, I want to briefly try again.

If we are asking how long a person must wait if he has just missed the train, then we don't really need the person, is that right? We are just asking how long, on average, is the time between trains.

Let's start at 1 am and look at all three trains. There is a wait from the time the first train arrives to the time the second train arrives, a time from the second to the third, and a time from the third train to the first train that arrives after 2. If we add these expected times we get the expected time from the first train after 1 to the first train after 2. That, I trust, is 60 minutes. Not all three of these expected times are the same, but their average value is 20 minutes, no?

So if we assume that his arrival is such that he is as likely to just miss one of these trains as any other, then I guess it's 20 minutes, right? This is not one of those things where he is more likely to arrive during a longer interval because we are stipulating that he comes right at the beginning of an interval.
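The gap argument can be checked with a small simulation. This is only a sketch under the assumption, used elsewhere in the thread, that each hour has three trains at independent uniform times; the function name, trial count and seed are made up for illustration.

```python
import random

def mean_gaps_minutes(trials=200_000, seed=1):
    """Average the three gaps that follow the three trains of one hour.

    Gaps: first->second train, second->third train, and third train ->
    first train of the following hour. Times are in hours from 1 am.
    """
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(trials):
        x1, x2, x3 = sorted(rng.random() for _ in range(3))
        y1 = 1.0 + min(rng.random() for _ in range(3))  # first train after 2
        for i, gap in enumerate((x2 - x1, x3 - x2, y1 - x3)):
            sums[i] += gap
    return [60.0 * s / trials for s in sums]

gaps = mean_gaps_minutes()
print(gaps, sum(gaps) / 3)
```

Under these assumptions the three averages should come out near 15, 15 and 30 minutes, so they are indeed not all the same, while their overall average is near 20 minutes.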

Obviously I was way off base before, working with uniformly distributed arrival times. I just didn't read carefully. But if we prescribe that he arrives at the beginning of one of the intervals then I see no reason he would be more likely to arrive at the beginning of one interval instead of another, so we just take the average of the lengths of the intervals I guess.

I find the problem a little confusing, if what I am saying is still nuts I will be happy to hear why.

nige1 · May 16, 2014

And I will have nothing to add to the above, which will be my last post on this subject.

Thank you Paul, for another interesting puzzle :)

lamford · May 16, 2014

Edit: If you have just missed one, you know there will only be two in the next 60 minutes, so the question is equivalent to "what is the expected value of the lowest of two random numbers in the range [0,60)." This is 20 minutes.

That is indeed the trivial answer to the first part, which I thought was not in dispute. Just for completeness, I wanted to clarify that the simulation and long explanation was for the second part only. That should have been apparent, but obviously I did not explain myself properly!

helene_t · May 16, 2014

That is indeed the trivial answer to the first part, which I thought was not in dispute. Just for completeness, I wanted to clarify that the simulation and long explanation was for the second part only. That should have been apparent, but obviously I did not explain myself properly!

Ah right, so we all agreed all the way, we were just discussing different things. Merry christmas everyone:)

lamford · May 16, 2014

Ah right, so we all agreed all the way, we were just discussing different things. Merry christmas everyone:)

Indeed, I should have heeded the following advice:

Don't write so that you can be understood, write so that you can't be misunderstood - William Howard Taft

kenberg · May 16, 2014

Man am I having problems following this. OK, I gather we are working on the second problem, described as "And what would my average wait have been if I had not just missed one?" This only says what the passenger did not do; we need to pin down what he did do.

May I then go back to my assumption that the passenger arrives between the top of the hour and the end of the hour and his arrival time is uniformly distributed over that hour, say between 2 and 3?

And what about the trains arriving at "random intervals"? Trains arrive at times, not at intervals, but from what you said in an earlier post I take it you mean that the arrival times between 2 and 3 are three independent uniformly distributed random variables, and the arrival times between 3 and 4 are also independent, both of each other and of the arrival times of the earlier trains, and also uniformly distributed. The six arrival times between 2 and 4 are independent, the first three uniformly distributed between 2 and 3, the last three uniformly distributed between 3 and 4.

Is this right?

If it is, I think I can go back to a previous post of mine, one I renounced because I thought that I had misunderstood. It would go:

X1 = arrival time of the first train between 2 and 3

X2 = arrival time of the second train between 2 and 3

X3 = arrival time of the third train between 2 and 3

Y=arrival time of first train between 3 and 4

Z = arrival time of the passenger between 2 and 3.

(Edit: The X1,X2,X3 have distributions as follows: A1,A2,A3 are independent and uniformly distributed, X1 is min(A1,A2,A3) and so on; similarly Y is the minimum of some uniformly distributed B1,B2,B3.)

F(X1,X2,X3,Y,Z) is the time he has to wait given the values of the random variables.

Thus:

F(X1,X2,X3,Y,Z)=

X1-Z if 2<Z<X1

X2-Z if X1<Z<X2

X3-Z if X2<Z<X3

Y-Z if X3<Z<3

The problem, rather a problem, is to find the expected value of F.
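The piecewise definition of F can be written as a short function. This is a sketch only: the name wait_time and the sample clock times are hypothetical, and the arguments are assumed to be in hours on the clock, with x1 <= x2 <= x3 in [2,3], y in [3,4] and z in [2,3].

```python
def wait_time(x1, x2, x3, y, z):
    """F(X1,X2,X3,Y,Z): time from the passenger's arrival z to the next train.

    x1 <= x2 <= x3 are the sorted train times between 2 and 3,
    y is the first train between 3 and 4.
    """
    for train in (x1, x2, x3):
        if z < train:
            return train - z   # next train is still in this hour
    return y - z               # all three missed: wait for the next hour

# Trains at 2.23, 2.57, 2.86 and next-hour train at 3.10; passenger at 2.60
# catches the 2.86 train.
print(wait_time(2.23, 2.57, 2.86, 3.10, 2.60))
```

Averaging this function over random draws of all five arguments is one way to estimate the expected value of F.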

The above is a mathematical problem with a computable solution, although I have not computed it. I had first been thinking that this was the problem then came to understand I was wrong, this was not the problem, now I am getting the idea that maybe this is the problem. Is it?

I am confused.

GreenMan · May 17, 2014

It is because girls who grow up in the Bronx turn out to be interesting, intelligent, witty and beautiful.

Plot twist: The Bronx lover is a man. :rolleyes:

gwnn · May 17, 2014

Sure, then we all agree on everything. A Festivus miracle! For future reference, if one posts two questions and people post only one answer, it is usually because they think the first one has to be settled first or because they didn't see the second question. I will not say which was the case this time.

Fluffy · May 17, 2014

It is quite hard to phrase a probability question unambiguously, but SteveG is right that if there are three trains in every hour, at random intervals, then one would wait 15 minutes.

You are not taking into account that you just missed a train.

hotShot · May 17, 2014

As I understand the scenario, 3 trains start in each hour, perhaps following a schedule.

Once they have started their round, they are randomly delayed or, if the distance between them gets too short, systematically accelerated, because the leading train has already picked up everybody, allowing the following train to make shorter stops.

So a typical scenario would be that 2 trains will have a short interval (e.g. 6 minutes) between them while the 3rd will have a bigger one (e.g. 27 minutes) .

kenberg · May 17, 2014

Bringing to mind the olde song The Bear Missed the Train, sung to the tune of Bei Mir Bistu Shein.

The bear missed the train and now he's walking....

gwnn · May 17, 2014

You are not taking into account that you just missed a train.

Yea, but we are also not taking into account that there were actually two questions in the opening post all along. :)

It's kind of strange, I thought about it a bit today and it seems like the 'You just missed one.' condition somehow enters in our mind as a constant, and it somehow takes more than just another question to over-ride it. I am not trying to diss lamford (and I apologise for my irritated post above) but I think the best solution for stating a problem like this is to put all the common conditions in a paragraph and then have two clear questions. For example :

(3 trains an hour bla bla)
1. You get to the train station and there is no sign of a train. What is the average waiting time?
2. You get to the train station and you just miss a train. What is the average waiting time?

Again, not as a criticism of lamford, just something that I learned through this thread. This way it is clear that there are two parts of the problem and it is clear which conditions are common. Somehow reading comprehension decreases when we need to think of two maths problems at the same time.

kenberg · May 17, 2014

This is what I get out of the discussion so far. Perhaps we agree?

Suppose there are n trains rather than specifically 3. Suppose each arrives randomly (uniform distribution) between 2 and 3, and suppose that between 3 and 4 there are also n trains, arriving randomly in the same way. Then:

A. If a guy arrives at the top of the hour his expected waiting time to catch a train is the fraction 1/(n+1) of an hour. eg if n=3 he waits 15 minutes on average. This sort of works even for n=0, he waits for 1/(0+1) =1 hour, although he doesn't get a train then either.

B. If he arrives during [2,3) just as a train leaves, and if there is no reason to think he is more likely to have just missed one train than another, his expected waiting time is 1/n. The expected waiting time might, I believe it does, depend on which train he just missed. But if the experiment includes a random, equally likely, choice of missed train then it's 1/n.

C. If he arrives randomly (uniform distribution) during the hour from 2 to 3, his expected waiting time can be computed. I haven't thought it through enough to commit myself to an answer.
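For reference, the two closed forms stated in cases A and B can be tabulated with exact fractions. This is just the arithmetic of the formulas as stated above, with hypothetical function names; it does not verify the modelling assumptions behind them.

```python
from fractions import Fraction

def wait_top_of_hour(n):
    """Case A as stated: expected wait, in hours, arriving at the top of the hour."""
    return Fraction(1, n + 1)

def wait_just_missed(n):
    """Case B as stated: expected wait, in hours, having just missed a train."""
    return Fraction(1, n)

for n in (1, 2, 3, 4):
    a = 60 * wait_top_of_hour(n)
    b = 60 * wait_just_missed(n)
    print(n, float(a), float(b))  # minutes for case A and case B
```

For n=3 this gives the 15 and 20 minutes discussed earlier in the thread.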

In case A, we don't care about what happens after 3. In cases B and C we do. For example, suppose we are still on this random schedule from 5 to 6, but starting at 6 a train arrives at 6:03 and every ten minutes after that. Then if he misses the last train between 5 and 6, he waits until 6:03 rather than for some randomly arriving train after 6.

It is not unusual for problems such as this, especially probabilistic ones, to lead to confusion. Problems about an imagined situation have to be stated with great precision or else the analysis is pretty hopeless. In a real world case, we would investigate the mechanism so we clearly understood just how these random arrivals are generated. If the guy arrives at a random time uniformly distributed between 2 and half past 2, the answer will be different than if he arrives between 2 and 3, uniformly distributed. Without a precise statement of the problem, everyone guesses what is meant, not everyone guesses the same.

gszes earlier said "Questions like this without sufficient details are a MENSA specialty so they can claim their interpretation is the best:". I don't know about Mensa one way or the other; a friend suggested I take the test, I declined. But his general point about the need for clarity is right.

kenberg · May 17, 2014

With 3 randomly arriving trains during each hour (from the top of one hour to the top of the next) and a randomly arriving passenger during one of those hours (uniform distribution of arrival time during that hour) I think the expected waiting time is 5/32 (oops, scratch that, 6/32 which is 3/16) of an hour. I don't have time to check this carefully, but try this reasoning:

The train arrivals break the hour into four intervals: before the first train, between the first and second, and so on.

The expected value of the random lengths of each of these intervals is 1/4.

If the passenger arrives during the first interval his expected waiting time is half the length of the interval. So, on average, 1/8.

Same thing if he arrives during the second or third interval.

If he arrives during the fourth interval, he has to first wait until the end of this interval, again with an expected waiting time of 1/8, and then wait for the first train of the next hour, expected waiting time of 1/8 (no, 2/8 since he is at the beginning of the interval).

So we have waiting times of 1/8, 1/8, 1/8, 3/8 depending on the interval of arrival. All intervals are equally likely for arrival so we average these four numbers and get 6/32=3/16.

Maybe I am missing something here, I will think about it while mowing the grass. I have already interrupted my mowing to correct 5/32 to 6/32=3/16. Back to the grass.

Added: I guess I am claiming that the general formula with n trains is an expected waiting time of (n+3)/(2(n+1)^2). In an odd sort of way this makes sense again for n=0. The guy arrives sometime during the first hour, waits on average a half hour, assumes he missed whatever trains there were that hour, waits out the entire next hour, and then realizes there are no trains and goes home. He waits, on average, 3/2 of an hour just as the formula says. With one train per hour the formula says 4/8=1/2. That seems nuts, no, on second thought it seems right. I have to think about it. Spouse is picking up knife and telling me to get off the computer.

I am about to go away for the weekend so I'll check in later.

kenberg · May 18, 2014

Focusing in this post only on the second question:

I will state one interpretation to the second question, and then the answer that I believe to be correct. There is nothing new here from the last post except perhaps I can phrase it more carefully.

I am assuming that the passenger arrives sometime between the top of one hour and the top of the next hour, and that we are then to compute the average length of time he will have to wait. We phrase it assuming the passenger arrives between 2 and 3:

Assumptions: There will be n (with n=3 if you like) trains arriving between 2 and 3, and n more trains arrive between 3 and 4. There will be one passenger arriving between 2 and 3. These 2n+1 arrival times are an independent set of random variables, each of them uniformly distributed over either [2,3] for the first n trains and the passenger or else over [3,4] for the second collection of n trains. We let W be the length of time from the passenger's arrival to the arrival of the first train after he arrives.

The problem is to find the expected value of W, the time the passenger waits for the train. With this interpretation the answer is (n+3)/(2(n+1)^2). In particular, for n=3 the answer is 3/16.

The argument runs: The arrival times of the n trains between 2 and 3 break [2,3] into n+1 intervals, and similarly with the trains arriving between 3 and 4. The expected value of the length of each of those intervals is (one can see fairly easily) 1/(n+1). If the passenger arrives randomly during any of the first n intervals his expected waiting time is half the length of that interval. If he arrives during the last interval he has missed all the trains for that hour. He has to wait until the end of that interval and then he has to wait for the first train of the next hour. His expected waiting time to reach the top of the hour is again half the expected length of this last interval, and his expected waiting time for the first train of the next hour is the full expected length of the first interval after 3.

So his expected waiting time conditioned on him arriving during any specific one of the first n intervals is 1/(2(n+1)) and his expected waiting time conditioned on him arriving during the last interval is 3/(2(n+1)). Since all intervals have the same expected length, we get the expected waiting time by averaging these (n+1) numbers. We get (n+3)/(2(n+1)^2).
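The averaging step in this argument can be reproduced with exact fractions. The sketch below uses a hypothetical function name and follows the post in weighting each of the n+1 intervals equally; it checks only the arithmetic, not whether equal weighting is the right model for a randomly arriving passenger.

```python
from fractions import Fraction

def averaged_wait(n):
    """Average of n values 1/(2(n+1)) and one value 3/(2(n+1)), in hours."""
    inner = Fraction(1, 2 * (n + 1))   # any of the first n intervals
    last = Fraction(3, 2 * (n + 1))    # the final interval of the hour
    return (n * inner + last) / (n + 1)

for n in (0, 1, 3):
    # Each value agrees with the closed form (n+3)/(2(n+1)^2).
    print(n, averaged_wait(n), Fraction(n + 3, 2 * (n + 1) ** 2))
```

For n=0, 1 and 3 this gives 3/2, 1/2 and 3/16 of an hour, the values quoted in the posts.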

The problem is an artificial one and as in many such problems perhaps I have misunderstood the meaning. Even if so, perhaps this will help in pinning down just what the intended interpretation was.

Of course the analysis could be in error, but it seems right.

mike777 · May 19, 2014

Ken as I have stated before you are a wonderful teacher of math. I mean a wonderful teacher of math to us non math guys.

This post shows I think where math teachers lose us.

I understand your English, you lose us in the equation. You lose us in the very first equation, that to you seems easy to understand but it is not.

My guess is it only takes more time, much more time to explain the equations to equal how well you explain the problem in plain English.

gwnn · May 19, 2014

kenberg, I suspect the problem with your reasoning is that there will almost always be intervals with different lengths. The expected length of the four lengths in order of lengths is probably something like 20-16-14-10 or something like that (I guess there is an analytic formula for this but I'm too lazy to look it up). He will be more likely to get in the longest interval than the shortest one and I don't see how you accounted for that. In the case that I outline here, the average length of the interval he gets there is 15.87 minutes (sum(x_i^2)/sum(x_i) where sum(x_i) is 60 here).
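The 15.87 figure is a length-weighted average, which can be reproduced as follows. The function name is made up for illustration, and the lengths 20, 16, 14, 10 are the rough guess from the post, not computed values.

```python
def length_biased_mean(lengths):
    """Mean length of the interval a uniformly arriving passenger lands in.

    An interval of length x is hit with probability x / sum(lengths),
    so the mean is sum(x^2) / sum(x).
    """
    return sum(x * x for x in lengths) / sum(lengths)

print(round(length_biased_mean([20, 16, 14, 10]), 2))  # 15.87
```

The plain average of the same four lengths is 15, so any unevenness in the lengths pushes the length-weighted figure above 15.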

kenberg · May 19, 2014

kenberg, I suspect the problem with your reasoning is that there will almost always be intervals with different lengths. The expected length of the four lengths in order of lengths is probably something like 20-16-14-10 or something like that (I guess there is an analytic formula for this but I'm too lazy to look it up). He will be more likely to get in the longest interval than the shortest one and I don't see how you accounted for that. In the case that I outline here, the average length of the interval he gets there is 15.87 minutes (sum(x_i^2)/sum(x_i) where sum(x_i) is 60 here).

First a note to mike: This will continue to be math heavy. The problem has a vagueness to it that math can eliminate, but this does come at a cost to non-mathematicians. I will try another version later that might be clearer.

My first thought was to look up the expected value of the lengths but instead I computed them, at first for n=3 and then for general n. Let's do n=3. After this I will give the argument, simpler but more abstract, for general n. First we have to be clear about just which intervals we are speaking of. Let's go with n=3. I will measure time from the top of the hour, so at a quarter after 2 I will say the time is 0.25.

I let A1, A2, A3 be the arrival times, X1, X2, X3 be these same three numbers listed in increasing order. So if the arrival times are 0.86, 0.23, 0.57 then X1=0.23, X2=0.57 and X3=0.86.

The intervals I am speaking of are [0,X1], [X1,X2], [X2,X3], [X3,1]. I am claiming that the expected values of X1-0, of X2-X1, of X3-X2, and of 1-X3 are each 1/4.

First some symmetry: The expected value of X1-0 is the same as the expected value of 1-X3 since the minimum of A1, A2, A3 is as likely to be close to 0 as the maximum is to be close to 1. Similarly, the expected value of X3-X2 is the same as the expected value of X2-X1 since in one case we are subtracting second largest from largest, in the other case we are subtracting smallest from second smallest.

Also, the lengths add to one, so the expected values add to 1. So if I show that the expected value of the length of the leftmost interval is 1/4 I am done, since then so is the rightmost, and the two middle ones split what's left.

Now I computed the expected value of X1 a few posts back and got 1/4. Here is how: For any random variable X, the expected value of X (if it exists, for some rvs it doesn't) is given by integrating x dF(x) where F is the cumulative distribution function F(x) = Prob(X<x) and the integral is Riemann-Stieltjes. The function F will be continuous whenever, as here, Prob(X=x)=0 for all x (the probability of arriving exactly at x=0.5 as contrasted with, say, between 0.4999999 and 0.5000001 is 0). Usually, and it happens here, the integral of x dF(x) can be replaced by the integral of xf(x) dx where f is the derivative of F.

Sorry for all that, the short version is that we have to compute F(x), differentiate it to get f, and then integrate xf(x) to get the expected value of X.

Here calculating F directly is easy. P(X1<x) = 1 - P(A1, A2, A3 are all >x) since X1 is not less than x if and only if all of A1, A2, A3 are greater than x. Well, greater or equal, but in a continuous variable it doesn't matter since the probability of exact equality is 0.

Thus, F(x)=1-(1-x)^3, f(x)=3(1-x)^2 and E(X1), the expected value of X1, is the integral of 3x(1-x)^2 from 0 to 1. We can just do this, or more easily we can observe that this is the same as the integral from 0 to 1 of 3x^2(1-x). Since this expression is 3(x^2-x^3) the integral from 0 to 1 is 3(1/3-1/4) which comes to 3/12=1/4.

So, for n=3, all the intervals of interest have an expected length of 1/4.

For general n, the same reasoning applied to the first interval leads us to integrating n(x^(n-1)-x^n) from 0 to 1 and we get n((1/n)-(1/(n+1)))=1/(n+1).
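Written out step by step, the general-n calculation for the first interval looks like this, using the same F and f as in the n=3 case and the substitution u = 1-x:

```latex
\begin{aligned}
F(x) &= 1-(1-x)^{n}, \qquad f(x) = n(1-x)^{n-1},\\
E(X_1) &= \int_0^1 x\,n(1-x)^{n-1}\,dx
        = \int_0^1 n(1-u)\,u^{n-1}\,du \qquad (u = 1-x)\\
       &= n\int_0^1 \left(u^{n-1}-u^{n}\right)du
        = n\left(\frac{1}{n}-\frac{1}{n+1}\right)
        = \frac{1}{n+1}.
\end{aligned}
```

Setting n=3 recovers the value 1/4 found above.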

But for n larger than 3 we need a different approach to see that the other intervals also have the same expected length. A probabilistic approach efficiently (more efficiently than the above) does this. I'll come back to that later. The above is the way I first calculated this; it has the advantage of being a direct calculation from the definitions and so is probably actually right. So is, I think, my general approach for n, but that's another post.

Incidentally, it is also possible to get to the answer as follows. For each t in [0,1] calculate the expected waiting time assuming the passenger arrives at time t, and then integrate this quantity from 0 to 1. I am hopeful that I get the same answer when I do this, I think that I will.

First, some coffee.

lamford · May 19, 2014

For general n, the same reasoning applied to the first interval leads us to integrating n(x^(n-1)-x^n) from 0 to 1 and we get n((1/n)-(1/(n+1)))=1/(n+1).

Which is indeed the same figure as the average of the minimums of sets of three random numbers between 0 and 1.

kenberg · May 19, 2014

Which is indeed the same figure as the average of the minimums of sets of three random numbers between 0 and 1.

Yes. But my point here is that for 3, once I get the value of 1/4 for the first interval then I have the value of 1/4 for the last, and thus the remaining intervals in the middle split up what is left, making 1/4 for all of them.

For general n, we can use the same reasoning to get 1/(n+1) for the first and last intervals but we need to think again to get 1/(n+1) for all the intervals.

gwnn · May 19, 2014

kenberg, I have had serious reading issues these last few days, so correct me if I'm wrong. You are essentially saying:

1. The three trains cut the hour in four intervals.

2. The four intervals each have an expected value of 15 minutes.

3. The probability that you get to any of the four intervals is equal, 25% (i.e., if you did the experiment 100 times, each time with random and different intervals and you noted which of the four intervals you got to, you would rate to get 25 ones, 25 twos, etc).

4. So you are 1/4 to be in the first quarter, which rates to be 15 minutes, and so on.

The first three of these are correct (and I might add, self-evident even without maths), but the fourth does not follow. Did you account for this? I posted a simple guess above about what average intervals would look like (ordered by length, not ordered chronologically!). It is true that on average your chance to be at the station in the first quarter is 25%, but you are much more likely to get to a "long" first quarter than a "short" first quarter. If the first quarter was 5 minutes, you are only 1/12 to get to the station in it. If the first quarter was 30 minutes, you are 1/2.
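The distinction between point 3 and point 4 can be explored with a simulation. The sketch below assumes the model stated earlier in the thread (n trains at independent uniform times in each hour, passenger uniform in the first hour, n at least 1); the function name, trial count and seed are illustrative only.

```python
import random
from bisect import bisect_right

def simulate(n=3, trials=200_000, seed=2):
    """Return (mean wait, mean length of the interval hit, share per interval).

    Times are in hours from the top of the hour; results are in minutes
    except for the shares, which are fractions of all trials.
    """
    rng = random.Random(seed)
    total_wait = 0.0
    total_len = 0.0
    counts = [0] * (n + 1)
    for _ in range(trials):
        trains = sorted(rng.random() for _ in range(n))
        next_first = 1.0 + min(rng.random() for _ in range(n))
        z = rng.random()                  # passenger arrival
        k = bisect_right(trains, z)       # chronological interval containing z
        cuts = [0.0] + trains + [1.0]
        counts[k] += 1
        total_len += cuts[k + 1] - cuts[k]
        next_train = trains[k] if k < n else next_first
        total_wait += next_train - z
    shares = [c / trials for c in counts]
    return 60 * total_wait / trials, 60 * total_len / trials, shares

print(simulate())
```

Under these assumptions each chronological interval should be hit about 25% of the time, in line with point 3, while the interval actually hit should average about 24 minutes rather than 15. The simulated wait should come out near 15.75 minutes (a direct calculation under the same assumptions gives 21/80 of an hour), noticeably above the 11.25 minutes obtained by weighting the intervals equally.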

edit: to clarify, this has been discussed already by helene_t BTW upthread (that the interval you get to is more likely than not to be more than 15 minutes long): http://www.bridgebase.com/forums/topic/66432-a-puzzle/page__view__findpost__p__794130

Edited May 19, 2014 by gwnn
