| T.R | Title | User | Personal Name
 | Date | Lines | 
|---|
| 750.1 |  | CLT::GILBERT | Builder | Sat Aug 15 1987 21:36 | 16 | 
|  | >   I'm looking for an exact solution, a function of F, M, and N.
    I think that this has no closed-form solution.  If you have some
    rough idea of the values of F, M, and N, perhaps an approximate
    solution would suffice?
    I did find that
	p(M,F,N) = ((M+1-F) * choose(M-N,F-N)) / choose(M,F), if F < 2*N,
    where choose(N,K) is a binomial coefficient, i.e., N!/( (N-K)! K! ).
    The problem with the above is that if P >= 2*N, it counts some things
    too many times -- the probability is too high.  However, if the
    probability is known to be low, the overcounting can be approximated
    and removed. 
 | 
| 750.2 |  | VMSINT::THIEL | Dave Thiel -- VMS Development | Sun Aug 16 1987 21:34 | 19 | 
|  |     The limited domain solution of .1 seems to fail a basic sanity test,
    e.g.
    
	P(M,F,1) = 1 for F > 0
    
        
   The offered limted solution is:
    
	((M+1-F) * choose(M-N,F-N)) / choose(M,F), if F < 2*N,
    When N = 1, this is:
    
    	((M+1-F) * choose(M-1,F-1)) / choose(M,F)
    
    =   ((M+1-F) * (M-1)! * F! * (M-F)!) / ((F-1)! * (M-F)! * M!)
    
    =   (M+1-F) * F / M
    
    
 | 
| 750.3 |  | BEING::POSTPISCHIL | Always mount a scratch monkey. | Mon Aug 17 1987 09:23 | 25 | 
|  |     Re .2:
    
    >     =   (M+1-F) * F / M
    Since N <= F < 2N and N=1, then F=1, so the expression is
    
    	= (M+1-1)*1/M = 1, which is expected.
    
    However, I think the expression in .1 needs adjustment for another
    reason.  It basically enumerates the ways to pick a sequence and then
    pick the F-N remaining numbers.  This counts some sequences twice for
    some selections of the remaining numbers.  (E.g., picking 3, 4 and then
    2 is the same as picking 2, 3, and then 4.) 
    
    There are C(M-N,F-N) selections which contain a sequence starting with
    1.  For the other numbers 2 through M+1-F, there are C(M-N-1,F-N)
    selections, obtained by picking a sequence and then selecting the rest
    of the numbers except for the number right before the sequence, which
    would give us a sequence starting with a lower number, which we already
    counted.  So we have:
    
    	p(M,F,N) = (C(M-N,F-N) + (M-F) C(M-N-1,F-N)) / C(M,F) if F <= 2N
    
    
    				-- edp 
 | 
| 750.4 |  | CLT::GILBERT | Builder | Mon Aug 17 1987 17:14 | 8 | 
|  | >   For the other numbers 2 through M+1-F, there are [...]
    Don't you mean "For the other numbers 2 through M-N+1, there are ..."?
    You'll find that this simplifies to the expression given in .1.
>    	p(M,F,N) = (C(M-N,F-N) + (M-F) C(M-N-1,F-N)) / C(M,F) if F <= 2N
    Here, you must intend "if F < 2N".
 | 
| 750.6 |  | CLT::GILBERT | Builder | Tue Aug 18 1987 12:34 | 75 | 
|  |     The approach taken in .3 is useful, but in general it overaccounts for
    things.  The following takes the corrections into consideration, and
    yields an answer which, while not a closed expression, is the sum of
    only floor(f/n) terms.
    Let z(m,n,p) be the number of ways to place p distinct runs of length n
    over a range of m.  With a little work we discover that:
	z(m,n,p) = C(m-p*n+1, p)
    Note .3 had
	w(m,n,f,1) = C(m-n,f-n) + (m-n)*C(m-n-1,f-n)
    More generally,
	w(m,n,f,p) = z(m-(n+1),n,p-1) * C(m-p*(n+1)+1,f-p*n)
			+ (z(m,n,p) - z(m-(n+1),n,p-1)) * C(m-p*(n+1),f-p*n)
    Note that this is roughly z(m,n,p) * C(m-p*(n+1),f-p*n); that is,
    the number of ways to place p runs of length n, times the number
    of ways to place the rest of the numbers.  The exact expression
    accounts for combinations that have a run starting at position 1.
    We can tediously expand the equation for w(m,n,f,p), to get
				      (m-n*p)!
	w(m,n,f,p) = (m-f+1) * ----------------------
			       (m-p-f+1)! p! (f-n*p)!
    
    But w is not what we want, because it counts arrangements having multiple
    runs too many times.  For example, w(m,n,f,1) counts arrangements having
    two runs twice, arrangements having three runs three times, and so on.
    More generally, w(m,n,f,p) counts arrangements having k runs C(k,p) times. 
    Letting x(m,n,f,p) be the exact count, we have:
		     inf
    		     ---
	w(m,n,f,p) = >   C(k,p) * x(m,n,f,k)
		     ---
		     k=p
    Of course, the upper bound on the summation need only go to floor(f/n).
    What we want is the exact number arrangements having at least one run:
	inf
	---
	>   x(m,n,f,k)
	---
	k=1
    But this is equal (!) to the following:
	inf              inf
	---              ---     k+1
	>   x(m,n,f,k) = >   (-1)    w(m,n,f,k)
	---              ---
	k=1              k=1
    And so the number of ways to choose f numbers from a set of m such
    that there is at least one run of length n is:
     floor(f/n)
	---     p+1    (m-f+1) (m-n*p)!
	>   (-1)    ----------------------
	---         (m-p-f+1)! p! (f-n*p)!
	p=1
    Of course, the probability is this divided by C(m,f).
					- Gilbert
 | 
| 750.7 |  | CLT::GILBERT | Builder | Tue Aug 18 1987 12:54 | 7 | 
|  | Or to make it a tad prettier,
floor(f/n)                              floor(f/n)
   ---     p+1    (m-f+1) (m-n*p)!         ---     p+1
   >   (-1)    ---------------------- =    >   (-1)    C(m-f+1,p) C(m-n*p,m-f)
   ---         (m-p-f+1)! p! (f-n*p)!      ---
   p=1                                     p=1
 | 
| 750.8 |  | SSDEVO::LARY |  | Thu Aug 20 1987 14:41 | 20 | 
|  | 
In the summation
floor(f/n)                           
   ---     p+1    (m-f+1) (m-n*p)!   
   >   (-1)    ----------------------
   ---         (m-p-f+1)! p! (f-n*p)!
   p=1                               
The quantity (m-p-f+1) can become negative if f is greater than ~ mn/(n+1).
Computationally, I don't think this is a problem, as by the time this happens
in the summation the terms have become vanishingly small.
I believe the problem arises from the fact that the domain of w(m,n,f,p) in p
is smaller than the domain of the final summation in p; i.e. sometimes there
aren't any "rest of the numbers" to place. It can be fixed by changing the
upper limit of the sum to min(floor(f/n),m-f+1).
Nice derivation! especially the combinatorial sum trick....
 | 
| 750.9 |  | CLT::GILBERT | Builder | Thu Aug 20 1987 17:48 | 2 | 
|  |     Thanks for pointing out the problem.  I think your explanation doesn't
    quite suffice, but agree that w(m,n,f,p) is the likely culprit.
 | 
| 750.10 |  | VMSINT::THIEL | Dave Thiel -- VMS Development | Fri Aug 21 1987 00:06 | 70 | 
|  |     An urn contains M balls uniquely numbered 1 through M.  A set of
    F balls is drawn at random from the urn.  M >= F >= N > 0.
    
    What is the probability that the set of F balls contains at least
    one set of N sequentially numbered balls?
    
    There are C(M,F) ways to choose a set of F balls from the M balls.
    
    Let Q (M,F,N), M >= F >= N > 0, be the number of ways to choose F balls
    from the M balls such that there are at least N consecutively numbered
    balls among the F.
                Q (M,F,N)
    P (M,F,N) = ---------
                 C(M,F)
    
    Determine Q by enumerating all possibilities (stay with me, it's
    not as tedious as it may sound).
    
    Use the notation 1 (2) 3 ... to indicate a set of balls containing
    the balls numbered 1 and 3, omitting the ball numbered 2, and and
    arbitrary collection of balls with numbers greater than 3.
    
    Q (M,F,N) =
    
    	if M = F, then 1	(there is only one way to choose F,
    				 and F must contain N consecutively
    				 numbered balls)
    
    	otherwise, M > F.
	(1) ...  then  Q(M-1,F,N)
    
	1 (2) ...  if N > 1 then  Q(M-2,F-1,N) if F-1 >= N else 0
    
	1 2 (3) ...  if N > 2 then Q(M-3,F-2,N) if F-2 >= N else 0
    	...
    
        1 2 ... N-1 (N)  then Q(M-N,F-N+1,N) if F-N+1 >= N else 0
        1 2 ... N-1 N ...  then C(M-N,F-N)
    Adding all of these up:
    
    Q(M,F,N) (M >= F >= N > 0)  =
    
        if M = F    then   1
    
                                 min(N,F+1-N)
                                      ---
	otherwise    	C(M-N,F-N) +  >    Q(M-i,F-i+1,N)
                                      ---
                                      i=1
    and therefore:
    
    P(M,F,N) (M >= F >= N > 0)  =
    
        if M = F    then   1 / C(M,F)
    
                                 min(N,F+1-N)
                                      ---
	             	C(M-N,F-N) +  >    Q(M-i,F-i+1,N)
                                      ---
                                      i=1
        otherwise      -----------------------------------
                                    C(M,F)
 | 
| 750.11 | Ayup, that counts all the possibilities | CLT::GILBERT | Builder | Fri Aug 21 1987 13:53 | 30 | 
|  | re .10
	How many evaluations of distinct Qs are needed?  Assuming
	you store each computed Q in a two-dimensional array (the
	N parameter is the same for all of them), as reuse these
	saved values as needed, then roughly 2*N*(M-N)*(M-F)/(N-1)
	different values of the Q function must be evaluated.
	And, if F > 2N, most of these Qs require N additions; this
	approach is never much better than that given in .6, and
	may be considerably worse.
	Note that the calculations can be done without recursion:
	f := N;
	for m := f to M - ((F-f)*N) div (N-1) do
	    Q[m,f] := C(m,f);
	for f := N+1 to F do
	    begin
	    ibound := min(N,f+1-N);
	    for m := f to M - ((F-f)*N) div (N-1) do
		begin
		temp := C(m-N,f-N);
		for i := 1 to ibound do
		    temp := temp + Q[m-i,f-i+1];
		Q[m,f] := temp;
		end;
	    end;
Note that m, M, f, and F are distinct variables.  For simplicity, and
only slightly more computation, the expression "M - ((F-f)*N) div (N-1)"
could be replaced with "M - (F-f)".
 | 
| 750.12 | Demonstration of equivalent results | CLT::GILBERT | Builder | Sun Aug 23 1987 20:03 | 130 | 
|  |     The following Pascal program computes Q(m,f,n) by three different
    techniques, compares the results (to ensure they are equal), and
    displays the amount of CPU time consumed by each calculation.
					- Gilbert
program y (input, output);
function C(n,k: integer): integer;
var
    i,t:	integer;
begin
if n < k
then
    t := 0
else
    begin
    t := 1;
    for i := 1 to k do t := (t*(n+1-i)) div i;
    end;
C := t;
end;
function Q1 (m,f,n: integer): integer;
var
    p,t,u:	integer;
begin
t := 0;
for p := 1 to f div n do
    begin
    u := C(m-f+1,p) * C(m-n*p,m-f);
    if not odd(p) then u := -u;
    t := t + u;
    end;
Q1 := t;
end;
{ }
{floor(f/n)                              floor(f/n)
{   ---     p+1    (m-f+1) (m-n*p)!         ---     p+1
{   >   (-1)    ---------------------- =    >   (-1)    C(m-f+1,p) C(m-n*p,m-f)
{   ---         (m-p-f+1)! p! (f-n*p)!      ---
{   p=1                                     p=1
{ }
function Q2 (m,f,n: integer): integer;
var
    i,t:	integer;
begin
if m = f
then
    t := 1
else
    begin
    t := C(m-n,f-n);
    for i := 1 to min(n,f+1-n) do
	t := t + Q2(m-i,f-i+1,n);
    end;
Q2 := t;
end;
function Q3 (M,F,N: integer): integer;
var
    mm,ff,i,ibound,t: integer;
    Q: array [1..100,1..100] of integer;
begin
ff := N;
for mm := ff to M - ((F-ff)*N) div (N-1) do
    Q[mm,ff] := mm - ff + 1;
for ff := N+1 to F do
    begin
    ibound := min(N,ff+1-N);
    for mm := ff to M - ((F-ff)*N) div (N-1) do
	begin
	t := C(mm-N,ff-N);
	if mm > ff then
	for i := 1 to ibound do
	    t := t + Q[mm-i,ff-i+1];
	Q[mm,ff] := t;
	end;
    end;
Q3 := Q[M,F];
end;
procedure check (m,f,n: integer);
var
    E1,E2,E3:	integer;
    t0,t1,t2,t3:    integer;
begin
t0 := CLOCK;
E1 := Q1(m,f,n);
t1 := CLOCK;
E2 := Q2(m,f,n);
t2 := CLOCK;
E3 := Q3(m,f,n);
t3 := CLOCK;
if (E1 = E2) and (E2 = E3) then else write ('*** ');
write ('Q(',m:0,',',f:0,',',n:0,') = ',E1:0);
if (E1 = E2) and (E2 = E3) then else write (' vs ',E2:0,' vs ',E3:0);
writeln;
writeln (t1-t0:0,' ',t2-t1:0,' ',t3-t2:0);
end;
{ }
{    Q(M,F,N) (M >= F >= N > 0)  =
{        if M = F    then   1
{                                 min(N,F+1-N)
{				      ---
{	otherwise    	C(M-N,F-N) +  >    Q(M-i,F-i+1,N)
{				      ---
{                                     i=1
{ }
procedure main;
var
    m,f,n:	integer;
begin
while true do
    begin
    readln (m,f,n);
    check (m,f,n);
    end;
end;
begin
main;
end.
 |