Documentation

BiasedBisect.Basic

Basic definition and solution to the biased bisect problem #

We construct a solution to the following problem:

A software has $n+1$ versions $v_0,v_1,\dots,v_n$, and $n$ changes in between. It is discovered one of the change broke a feature of the software, but we don't know which one did. We only know $v_0$ is fine and $v_n$ is broken. How do we quickly find the broken change?

The answer to this classic question is to do binary search. But what if the cost of performing a software test depends on the outcome? For example, a broken software will result in a system crash and requires much longer time to reboot. Which version should we test first, and next?

Let s and t be the cost of a successful and a failed test. The expected cost $F(n)$ is

$$ F(n) = \begin{cases} 0 & n = 1 \\ \min_{1 ≤ w ≤ n - 1} \{\frac{w}{n} (F(w) + t) + \frac{n - w}{n} (F(n - w) + s)\} & n \ge 2 \end{cases} $$

where $w$ is the first version to test. To simplify a little bit, we normalize $F(n)$ with

$$ E(n) = n F(n) $$

where $E(n)$ satisfies

$$ E(n) = \begin{cases} 0 & n = 1 \\ \min_{1 ≤ w ≤ n - 1} \{E(w) + E(n - w) + w t + (n - w) s\} & n \ge 2 \end{cases} $$

Main statements #

IsOptimalCostℤ and IsOptimalStratℤ are the target properties of the solution.
IsOptimalCost and IsOptimalStrat are the modified properties that extend the domain to ℝ.
E is the optimal cost function of the solution with domain in ℝ.
- E_IsOptimalCost verifies this is the solution to IsOptimalCost.
wₛₑₜ is the optimal strategy function of the solution with domain in ℝ.
- wₘᵢₙ is the minimal optimal strategy.
- wₘₐₓ is the maximal optimal strategy.
- wₗᵢ is the principal strategy.
- W_IsOptimalStrat verifies this is the solution to IsOptimalStrat.
Eℤ restricts E to ℤ.
- Eℤ_IsOptimalCost verifies this is the solution to IsOptimalCostℤ.
wℤ restricts wₛₑₜ to ℤ
- Wℤ_IsOptimalStrat verifies this is the solution to IsOptimalStratℤ.

theorem sum_to_zero (a b : ℝ) (ha : a ≥ 0) (hb : b ≥ 0) (h : a + b ≤ 0) :

a = 0 ∧ b = 0

theorem finset_max_eq {α : Type u_1} [LinearOrder α] (s : Finset α) {m : α} (mem : m ∈ s) (other : ∀ n ∈ s, n ≤ m) :

s.max = ↑m

A lot of statements requires positive numbers, so we define a convenient class to pass them around.

class PosReal (x : ℝ) :

pos : x > 0

Instances

instance instPosRealHMulReal (a b : ℝ) [PosReal a] [PosReal b] :

PosReal (a * b)

instance instPosRealHDivReal (a b : ℝ) [PosReal a] [PosReal b] :

PosReal (a / b)

instance instPosRealHAddReal (a b : ℝ) [PosReal a] [PosReal b] :

PosReal (a + b)

instance instPosRealInvReal (a : ℝ) [PosReal a] :

instance instPosRealOfNatReal :

instance instPosRealCastRealVal (s : ℕ+) :

PosReal ↑↑s

Throughout the file, we will use a pair of real positive parameters $s$ and $t$.

We start with the lattic Λ = ℕ × ℕ and assign each lattice point $(p, q)$ a value $δ = ps + qt$. Visually, this is drawing a line passing the point with a fixed slope (namely -s/t) and measures how far away it is from the origin.

All possible $δ$ makes up the set Δ. One can notice that the "shape" of this set is different depending on whether $s/t$ is rational or not:

For irrational $s/t$, each lattice point will get a assigned a unique $δ$, and Δ get more dense when we are futher away from the origin.
For rational $s/t$, a line of slope $-s/t$ can pass multiple lattice points, and eventually the gap between $δ$ is stabilized at a fixed value $\gcd(s, t)$.

def Δ (s t : ℝ) :

Equations

Δ s t = {δ : ℝ | ∃ (p : ℕ) (q : ℕ), ↑p * s + ↑q * t = δ}

Instances For

The set Δ is symmetric for $s$ and $t$. We will explore this symmetry a lot later on.

theorem Δ_symm (s t : ℝ) :

Δ s t = Δ t s

Another property we will explore is homogeneity: parameters $(l s, l t)$ is closely related to $(s, t)$, and the associated objects is either the same, or scaled by $l$.

theorem Δ_homo (s t l : ℝ) [lpos : PosReal l] (δ : ℝ) :

δ ∈ Δ s t ↔ l * δ ∈ Δ (l * s) (l * t)

For each lattice point, we can assign a $δ$. As previously mentioned, this is injective only when $s/t$ is irrational.

noncomputable def δₚ (s t : ℝ) (pq : ℕ × ℕ) :

Equations

δₚ s t (p, q) = ↑p * s + ↑q * t

Instances For

Similarly, δₚ is also symmetric, but one needs to swap the coordinates of the input.

theorem δₚ_symm (s t : ℝ) (p q : ℕ) :

δₚ s t (p, q) = δₚ t s (q, p)

We can draw a line with slope $-s/t$ and only consider lattice points enveloped by the line, including those on the line. Equalently, this is considering only points whose assigned $δ$ is less or equal to a given value. We call these subsets as "ceiled".

def Δceiled (s t ceil : ℝ) :

Equations

Δceiled s t ceil = Δ s t ∩ {δ : ℝ | δ ≤ ceil}

Instances For

def Λceiled (s t ceil : ℝ) :

Set (ℕ × ℕ)

Equations

Λceiled s t ceil = {(p, q) : ℕ × ℕ | ↑p * s + ↑q * t ≤ ceil}

Instances For

theorem Λceiled_symm (s t δ : ℝ) (p q : ℕ) (h : (p, q) ∈ Λceiled s t δ) :

(q, p) ∈ Λceiled t s δ

theorem Λceiled_homo (s t δ l : ℝ) [PosReal l] :

Λceiled s t δ = Λceiled (l * s) (l * t) (l * δ)

As an important example, the subset ceiled by $0$ only includes the point $(0, 0)$

theorem Λceiled₀ (s t : ℝ) [PosReal s] [PosReal t] :

Λceiled s t 0 = {(0, 0)}

If the ceiling is negative, Λceiled is the empty set.

theorem Λceiled_neg (s t δ : ℝ) (neg : δ < 0) [PosReal s] [PosReal t] :

Λceiled s t δ = ∅

δₚ maps all ceiled lattice points to all ceiled $δ$

theorem Λ_map_ceiled (s t ceil : ℝ) :

δₚ s t '' Λceiled s t ceil = Δceiled s t ceil

We would like to prove that Δceiled is finite. We first introduce bounded natural numbers, and their products and show their finiteness.

def ℕceiled (ceil : ℝ) :

Equations

ℕceiled ceil = {p : ℕ | ↑p ≤ ceil}

Instances For

instance ℕceiled_finite (ceil : ℝ) :

Finite ↑(ℕceiled ceil)

def ΛRec (pbound qbound : ℝ) :

Set (ℕ × ℕ)

Equations

ΛRec pbound qbound = (ℕceiled pbound).prod (ℕceiled qbound)

Instances For

instance ΛRec_finite (pbound qbound : ℝ) :

Finite ↑(ΛRec pbound qbound)

Λceiled is always inside a rectangle region, hence finite

theorem Λceiled_in_rec (s t ceil : ℝ) [PosReal s] [PosReal t] :

Λceiled s t ceil ⊆ ΛRec (ceil / s) (ceil / t)

instance Λceiled_finite (s t ceil : ℝ) [PosReal s] [PosReal t] :

Finite ↑(Λceiled s t ceil)

noncomputable instance instFintypeElemProdNatΛceiledOfPosReal (s t ceil : ℝ) [PosReal s] [PosReal t] :

Fintype ↑(Λceiled s t ceil)

Equations

instFintypeElemProdNatΛceiledOfPosReal s t ceil = Fintype.ofFinite ↑(Λceiled s t ceil)

As the image of δₚ, Δceiled is therefore also finite.

instance Δceiled_finite (s t ceil : ℝ) [PosReal s] [PosReal t] :

Finite ↑(Δceiled s t ceil)

Consequently Δceiled well-ordered.

theorem Δceiled_WF (s t ceil : ℝ) [PosReal s] [PosReal t] :

(Δceiled s t ceil).IsWF

We now can show the whole set Δ is also well-ordered. Although Δ is an infinite set and can become arbitrarily dense for larger elements, its base, as indicated by the ceiled variation, behaves friendly for the order.

theorem Δ_WF (s t : ℝ) [PosReal s] [PosReal t] :

(Δ s t).IsWF

Δ always has the smallest element 0

theorem δ0 (s t : ℝ) :

0 ∈ Δ s t

theorem Δ_nonempty (s t : ℝ) :

(Δ s t).Nonempty

theorem Δ_min_element {δ : ℝ} (s t : ℝ) (δin : δ ∈ Δ s t) [PosReal s] [PosReal t] :

0 ≤ δ

theorem Δ_min (s t : ℝ) [PosReal s] [PosReal t] :

⋯.min ⋯ = 0

We also introduce "floored" subsets, the complement of ceiled ones. These subsets contain elements where $δ$ is larger than a certain threshold.

def Δfloored (s t floor : ℝ) :

Equations

Δfloored s t floor = Δ s t ∩ {δ : ℝ | δ > floor}

Instances For

Obviously, floored sets are also symmetric.

theorem Δfloored_symm (s t floor : ℝ) :

Δfloored s t floor = Δfloored t s floor

... and homogeneous.

theorem Δfloored_homo (s t floor l : ℝ) [PosReal l] (δ : ℝ) :

δ ∈ Δfloored s t floor ↔ l * δ ∈ Δfloored (l * s) (l * t) (l * floor)

Floored sets are still infinite, but are well-ordered as subsets.

theorem Δfloored_WF (s t floor : ℝ) [PosReal s] [PosReal t] :

(Δfloored s t floor).IsWF

Floored sets are always non-empty due to the unboundness of Δ.

theorem Δfloored_nonempty (s t floor : ℝ) [PosReal s] [PosReal t] :

(Δfloored s t floor).Nonempty

Since Δ is well-ordered, it is possible to sort all elements and enumerate them starting from the smallest one (0).

We first define the find the next $δ'$ given an element $δ$ using floored subsets. Note that this function also accepts input outside of Δ. It simply finds the smallest $δ$ that's larger than the input.

noncomputable def δnext (s t : ℝ) [PosReal s] [PosReal t] (floor : ℝ) :

Equations

δnext s t floor = ⋯.min ⋯

Instances For

Again the symmetry is passed on to δnext.

theorem δnext_symm (s t floor : ℝ) [PosReal s] [PosReal t] :

δnext s t floor = δnext t s floor

δnext is homogeneous.

theorem δnext_homo (s t floor l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

l * δnext s t floor = δnext (l * s) (l * t) (l * floor)

δnext is weakly increasing w.r.t floor

theorem δnext_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (δnext s t)

δnext will always output an element in Δ.

theorem δnext_in_Δ (s t floor : ℝ) [PosReal s] [PosReal t] :

δnext s t floor ∈ Δ s t

δnext will always output an element larger than the input.

theorem δnext_larger (s t floor : ℝ) [PosReal s] [PosReal t] :

δnext s t floor > floor

δnext also effectively gives the "gap" between the input δ and the next δ'. There is no additional lattice point between this gap, which means Λceiled is inert for any bound given between the gap.

theorem Λceiled_gap (s t δ β : ℝ) [PosReal s] [PosReal t] (leftBound : δ ≤ β) (rightBound : β < δnext s t δ) :

Λceiled s t δ = Λceiled s t β

We can define the sequence δₖ by sorting all elements in Δ. The index $k$ will also be used a lot for other related sequences.

noncomputable def δₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

δₖ s t 0 = 0
δₖ s t n.succ = δnext s t (δₖ s t n)

Instances For

δₖ is obviously strictly increasing.

theorem δₖ_mono (s t : ℝ) [PosReal s] [PosReal t] :

StrictMono (δₖ s t)

δₖ covers all elements in Δ.

theorem δₖ_surjΔ (s t δ : ℝ) (mem : δ ∈ Δ s t) [PosReal s] [PosReal t] :

∃ (k : ℕ), δₖ s t k = δ

δₖ is also symmetric.

theorem δₖ_symm (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

δₖ s t k = δₖ t s k

δₖ always starts with 0.

theorem δ₀ (s t : ℝ) [PosReal s] [PosReal t] :

δₖ s t 0 = 0

δₖ is homogeneous.

theorem δₖ_homo (s t l : ℝ) (k : ℕ) [PosReal s] [PosReal t] [PosReal l] :

l * δₖ s t k = δₖ (l * s) (l * t) k

All δₖ are obviously elements in Δ. Together with δₖ_surjΔ, this shows δₖ is a bijection between Δ and ℕ.

theorem δₖ_in_Δ (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

δₖ s t k ∈ Δ s t

We introduce a new kind of subset of the lattice: lattice points whose assigned $δ$ is exactly a given constant. It can be empty if the given constant is not in Δ.

As one can notice, this subset is a sub-singleton when $s/t$ is irrational, but we won't expand on it here.

def Λline (s t δ : ℝ) :

Set (ℕ × ℕ)

Equations

Λline s t δ = δₚ s t ⁻¹' Set.singleton δ

Instances For

This subset is again symmetric with lattice coordinates swapped.

theorem Λline_symm (s t δ : ℝ) (p q : ℕ) (h : (p, q) ∈ Λline s t δ) :

(q, p) ∈ Λline t s δ

If the line is negative, it won't cover any lattice points.

theorem Λline_neg (s t δ : ℝ) (neg : δ < 0) [PosReal s] [PosReal t] :

Λline s t δ = ∅

Elements in Λline is allowed to shift in coordinates and change their $δ$ by $s$.

Note that this is not saying Λline of $δ$ and of $δ + s$ are one-to-one. When shifting $δ$ by $s$, it can potentially introduce a new element with $p' = 0$. This element is ruled out by the $p' = p + 1 ≥ 1$ in the statement.

theorem Λline_s (s t δ : ℝ) [PosReal s] [PosReal t] (p q : ℕ) :

(p, q) ∈ Λline s t δ ↔ (p + 1, q) ∈ Λline s t (δ + s)

By symmetry, we can state similarly for $t$ and $q$.

theorem Λline_t (s t δ : ℝ) [PosReal s] [PosReal t] (p q : ℕ) :

(p, q) ∈ Λline s t δ ↔ (p, q + 1) ∈ Λline s t (δ + t)

The line subset at $δ = 0$ gives the singleton $(0, 0)$.

theorem Λline₀ (s t : ℝ) [PosReal s] [PosReal t] :

Λline s t 0 = {(0, 0)}

Λline is not empty when the input is from Δ.

theorem Λline_nonempty (s t δ : ℝ) (δinΩ : δ ∈ Δ s t) :

(Λline s t δ).Nonempty

Λline is a subset of the corresponding Λceiled, and therefore is also finite.

theorem Λline_in_Λceiled (s t δ : ℝ) :

Λline s t δ ⊆ Λceiled s t δ

instance instFiniteElemProdNatΛlineOfPosReal (s t δ : ℝ) [PosReal s] [PosReal t] :

Finite ↑(Λline s t δ)

noncomputable instance instFintypeElemProdNatΛlineOfPosReal (s t δ : ℝ) [PosReal s] [PosReal t] :

Fintype ↑(Λline s t δ)

Equations

instFintypeElemProdNatΛlineOfPosReal s t δ = Fintype.ofFinite ↑(Λline s t δ)

Now we assign each lattice point with another value $J$, which is the Pascal triangle where $p$- and $q$-axies are the sides of the triangle.

def Jₚ :

ℕ × ℕ → ℕ

Equations

Jₚ (p, q) = (p + q).choose p

Instances For

Just like the Pascal triangle, Jₚ follows the recurrence relation.

It should be noted that if we embed $Λ$ in ℤ × ℤ and assign $J = 0$ to the rest of the points, all points still follow this recurrence relation except at $(0, 0)$. This defect will show up again later.

theorem Jₚ_rec (p q : ℕ) :

Jₚ (p + 1, q + 1) = Jₚ (p + 1, q) + Jₚ (p, q + 1)

A gross bound for Jₚ to decompose it to a product of $f(p)$ and $g(q)$.

theorem Jₚ_bound (p q : ℕ) :

Jₚ (p, q) ≤ 2 ^ p * 2 ^ q

On Λ, $J$ are all nonzero.

theorem Jₚ_nonzero (pq : ℕ × ℕ) :

Jₚ pq > 0

$J$ itself is symmatrical for swapped coordinates.

theorem Jₚ_symm (p q : ℕ) :

Jₚ (p, q) = Jₚ (q, p)

We can evaluate $J$ for a given $δ$, by summing up $J$ of all points passed by the line.

noncomputable def Jline (s t δ : ℝ) [PosReal s] [PosReal t] :

Equations

Jline s t δ = ∑ pq ∈ (Λline s t δ).toFinset, Jₚ pq

Instances For

The evaluation on the line is symmetric for $s$ and $t$.

theorem Jline_symm (s t δ : ℝ) [PosReal s] [PosReal t] :

Jline s t δ = Jline t s δ

A helper function to zero the value if the input is zero.

def shut (p value : ℕ) :

Equations

shut Nat.zero value = 0
shut n.succ value = value

Instances For

Jline can be shifted by $s$. The sum will however be affected by the potential point on the $p = 0$ boundary, hence the equality needs to remove such point.

theorem Jline_s (s t δ : ℝ) [PosReal s] [PosReal t] :

Jline s t (δ - s) = ∑ x ∈ (Λline s t δ).toFinset, match x with | (p, q) => shut p (Jₚ (p - 1, q))

A similar statement can be said for $t$

theorem Jline_t (s t δ : ℝ) [PosReal s] [PosReal t] :

Jline s t (δ - t) = ∑ x ∈ (Λline s t δ).toFinset, match x with | (p, q) => shut q (Jₚ (p, q - 1))

Derived from the recurrence of binomial coefficents, Jline is also recurrent, except for at $δ = 0$.

theorem Jline_rec (s t δ : ℝ) (δ0 : δ ≠ 0) [PosReal s] [PosReal t] :

Jline s t δ = Jline s t (δ - s) + Jline s t (δ - t)

At $δ = 0$, Jline gives the "seed" 1 that induces all other values.

theorem Jline₀ (s t : ℝ) [PosReal s] [PosReal t] :

Jline s t 0 = 1

For all elements of Δ, Jline is nonzero.

theorem Jline_nonzero (s t δ : ℝ) [PosReal s] [PosReal t] (δinΔ : δ ∈ Δ s t) :

Jline s t δ > 0

Since we have defined the sequence δₖ for all elements in Δ, we can map them to a sequence Jₖ by Jline

noncomputable def Jₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

Jₖ s t k = Jline s t (δₖ s t k)

Instances For

The sequence Jₖ is symmetric.

theorem Jₖ_symm (s t : ℝ) [PosReal s] [PosReal t] :

Jₖ s t = Jₖ t s

The sequence Jₖ is non-zero.

theorem Jₖ_nonzero (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

Jₖ s t k > 0

We also define a pair of sequence Jsₖ and Jtₖ similar to Jₖ, but the line is shifted by $s$ or $t$. The shifting can make some line no longer pass any lattice points, so some Jsₖ and Jtₖ are zero.

noncomputable def Jsₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

Jsₖ s t k = Jline s t (δₖ s t k - s)

Instances For

noncomputable def Jtₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

Jtₖ s t k = Jline s t (δₖ s t k - t)

Instances For

Jsₖ and Jtₖ are symmetric to each other.

def Jstₖ_symm (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

Jsₖ s t k = Jtₖ t s k

Equations

⋯ = ⋯

Instances For

Derived from Jline recurrence formula, Jₖ can be decomposed into Jsₖ and Jtₖ

theorem Jstₖ_rec (s t : ℝ) (k : ℕ) (k0 : k ≥ 1) [PosReal s] [PosReal t] :

Jₖ s t k = Jsₖ s t k + Jtₖ s t k

Just like Jline for Λline, we can define Jceiled for Λceiled which sums over all lattices bounded by $δ$.

noncomputable def Jceiled (s t : ℝ) [PosReal s] [PosReal t] (δ : ℝ) :

Equations

Jceiled s t δ = ∑ pq ∈ (Λceiled s t δ).toFinset, Jₚ pq

Instances For

Jceiled is symmetric.

theorem Jceiled_symm (s t δ : ℝ) [PosReal s] [PosReal t] :

Jceiled s t δ = Jceiled t s δ

... and homogeneous.

theorem Jceiled_homo (s t δ l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

Jceiled s t δ = Jceiled (l * s) (l * t) (l * δ)

Jceiled is weakly increasing with regard to $δ$. As $δ$ grows, Λceiled can either remain unchanged for include new points.

theorem Jceiled_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (Jceiled s t)

While Jceiled is only weakly increasing, and one can't deduce the relation of two $δ$ from their Jceiled, it is possible give a relation between $δ$ if one of them is δₖ

theorem Jceiled_mono' (s t δ : ℝ) (k : ℕ) [PosReal s] [PosReal t] (h : Jceiled s t δ = Jceiled s t (δₖ s t k)) :

δₖ s t k ≤ δ

The growth of Jceiled is precisely described by Jline. Another way to view this is to say Jceiled = ΣJline for all lines in the bound.

theorem Jceiled_accum (s t δ : ℝ) [PosReal s] [PosReal t] :

Jceiled s t δ + Jline s t (δnext s t δ) = Jceiled s t (δnext s t δ)

Since there are gaps between $δ$, Jceiled stops growing when inside these gaps. We can also derive a few variants of this lemma:

As long as $β$ is less than δnext$(δ)$, Jceiled$(β)$ is no larger than Jceiled$(δ)$.
or the contrapose: if Jceiled$(β)$ is larger than Jceiled$(δ)$, $β$ must have passed δnext$(δ)$.

theorem Jceiled_gap (s t δ β : ℝ) [PosReal s] [PosReal t] (leftBound : δ ≤ β) (rightBound : β < δnext s t δ) :

Jceiled s t δ = Jceiled s t β

theorem Jceiled_gap' (s t δ β : ℝ) [PosReal s] [PosReal t] (rightBound : β < δnext s t δ) :

Jceiled s t δ ≥ Jceiled s t β

theorem Jceiled_gap'' (s t δ β : ℝ) [PosReal s] [PosReal t] (jump : Jceiled s t δ < Jceiled s t β) :

δnext s t δ ≤ β

theorem Jceiled_neg (s t δ : ℝ) (neg : δ < 0) [PosReal s] [PosReal t] :

Jceiled s t δ = 0

theorem Jceiled_pos (s t δ : ℝ) (neg : 0 ≤ δ) [PosReal s] [PosReal t] :

0 < Jceiled s t δ

Now we can define the sequence nₖ as partial sums of Jₖ.

The first element n₀ starts at $1$ for reasons we will see later. This essentially comes from the defect of binomial coefficient at $(0, 0)$.

nₖ will be the n-coordinate of the vertices of several piecewise functions we will introduce.

noncomputable def nₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

nₖ s t 0 = 1
nₖ s t n.succ = nₖ s t n + Jₖ s t n

Instances For

Since nₖ is the partial sum, we can alternatively express it using Jceiled.

theorem nₖ_accum (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

nₖ s t k = if k = 0 then 1 else 1 + Jceiled s t (δₖ s t (k - 1))

nₖ is also symmetric.

theorem nₖ_symm (s t : ℝ) [PosReal s] [PosReal t] :

nₖ s t = nₖ t s

... and homogeneous.

theorem nₖ_homo (s t l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

nₖ s t = nₖ (l * s) (l * t)

The first two elements of nₖ are always 1 and 2.

theorem n₀ (s t : ℝ) [PosReal s] [PosReal t] :

nₖ s t 0 = 1

theorem n₁ (s t : ℝ) [PosReal s] [PosReal t] :

nₖ s t 1 = 2

nₖ grows faster than $k$ it self.

theorem nₖ_grow (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

nₖ s t k > k

And obviously, nₖ is strictly increasing.

theorem nₖ_mono (s t : ℝ) [PosReal s] [PosReal t] :

StrictMono (nₖ s t)

As a quick corollary, nₖ are all positive.

theorem nₖ_pos (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

nₖ s t k ≠ 0

Just as we used Jₖ to define nₖ, we also use Jsₖ and Jtₖ to define partial sum sequences wₖ' and wₖ, respectively. (The reason wₖ corresponds to $t$ is mostly historical)

The starting point w₀ = 1 is an artifact, as we will see it doesn't follow nice properties we will soon see. The real starting point of this sequence is w₁ = 1.

noncomputable def wₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

wₖ s t 0 = 1
wₖ s t n.succ = wₖ s t n + Jtₖ s t n

Instances For

noncomputable def wₖ' (s t : ℝ) [PosReal s] [PosReal t] :

Equations

wₖ' s t 0 = 1
wₖ' s t n.succ = wₖ' s t n + Jsₖ s t n

Instances For

wₖ and wₖ' are symmetric to each other.

theorem wₖ_symm (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₖ s t k = wₖ' t s k

theorem wₖ_accum (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₖ s t k = if k = 0 then 1 else 1 + Jceiled s t (δₖ s t (k - 1) - t)

theorem wₖ'_accum (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₖ' s t k = if k = 0 then 1 else 1 + Jceiled s t (δₖ s t (k - 1) - s)

Similar to nₖ, wₖ and wₖ' are homogeneous

theorem wₖ_homo (s t l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

wₖ s t = wₖ (l * s) (l * t)

theorem wₖ'_homo (s t l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

wₖ' s t = wₖ' (l * s) (l * t)

w₁ = w₁' = 1 is the real starting point of this sequence.

theorem w₁ (s t : ℝ) [PosReal s] [PosReal t] :

wₖ s t 1 = 1

theorem w₁' (s t : ℝ) [PosReal s] [PosReal t] :

wₖ' s t 1 = 1

Recurrence formula of wₖ: by swapping $s$ and $t$, $w$ becomes $n - w$ This is the first property that shows w₀ doesn't follow the pattern. A more sensible definition of w₀ that follows the Symmetry can be

w₀ = 1/2 when $s = t$
w₀ = c if $s > t$ else 1 - c But these definitions doesn't add much value to our further arguments, so we will just leave w₀ semantically undefined.

(The equivalent formula wₖ s t k + wₖ' s t k = nₖ s t k might be more suitable to be the recurrence formula. This is stated this way for historical reasons)

theorem wₖ_rec (s t : ℝ) (k : ℕ) (kh : k ≥ 1) [PosReal s] [PosReal t] :

wₖ s t k + wₖ t s k = nₖ s t k

wₖ is always bounded between $[1,$nₖ$ - 1]$. Because w₀ is undefined, we require $k ≥ 1$.

theorem wₖ_min' (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₖ s t k ≥ 1

theorem wₖ_min (s t : ℝ) (k : ℕ) :

k ≥ 1 → ∀ [inst : PosReal s] [inst✝ : PosReal t], wₖ s t k ≥ 1

theorem wₖ_max (s t : ℝ) (k : ℕ) (kh : k ≥ 1) [PosReal s] [PosReal t] :

wₖ s t k ≤ nₖ s t k - 1

wₖ is also increasing but only weakly. (The same is true for wₖ' but we omit the proof)

theorem wₖ_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (wₖ s t)

Here is a pretty important property of wₖ and wₖ': Elements of wₖ and wₖ' sequence all come from nₖ. This means wₖ and wₖ' effectively sets up mapping from nₖ to itself. It can be showed that this mapping is weakly monotone and contracting, and we will prove a weaker version of this later.

theorem wₖ_is_nₖ (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

∃ (k' : ℕ), wₖ s t k = nₖ s t k'

theorem wₖ'_is_nₖ (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

∃ (k' : ℕ), wₖ' s t k = nₖ s t k'

With sequence δₖ, nₖ, and wₖ introduced, we will construct the following functions:

First, the "cost differential" function $dE(n): [1, ∞) → ℝ$

     ↑ dE(n)
     |
     |     |-J₀-|-J₁--|---J₂---|-------J₃------|
     |
     |                                         |
δ₃ --|--                       *===============∘
     |                         |
δ₂ --|--              *========∘
     |                |
     |                |
δ₁ --|--        *=====∘
     |          |
     |          |
δ₀ --+-----*====∘-----|--------|---------------|--------→ n
     0     n₀   n₁    n₂       n₃              n₄
           (=1)

The function is defined like a stair case. By convension, each interval is defined with left point closed: $$ dE( [n_k, n_{k+1}) ) = δ_k $$

Second, the "strategy" function $w(n): [2, ∞) → P(ℝ)$.

     ↑ w(n)
     |
     |     |-J₀-|-J₁--|---J₂---|-------J₃------|
     |                                          /
w₄ --|--                            *----------*-  --|--
     |                             /##########/      |
     |                            /##########/       |
     |                           /##########/        | Jt₃
     |                          /##########/         |
w₃ --|--                *------*----------*        --|--
     |                 /######/                      | Jt₂
w₂ --|--          *---*------*                     --|--
     |           /###/                               | Jt₁
w₁ --|--        *---*                              --|--
     +----------|-----|--------|---------------|--------→ n
     0     n₀   n₁    n₂       n₃              n₄
           (=1) (=2)

We first anchor all points $(n₁, w₁)$, $(n₂, w₂)$, ... and then connect them with parallelogram with an angle of 45° The parallelogram can be degenerated if Jt$ = 0$ or Jt$ = J$. Then all points enveloped, including the boundary, are in w(n)

Again, because w₀ is semantically undefined, $w(n)$ is only defined starting from n₁$ = 2$.

We also write w(n) = [wₘᵢₙ(n), wₘₐₓ(n)]

But before we can define these functions, we need to define how to find $k$ for a given real input $n$.

We define kceiled as the set of natural numbers $k$ for which nₖ$ ≤ n$.

noncomputable def kceiled (s t n : ℝ) [PosReal s] [PosReal t] :

Equations

kceiled s t n = {k : ℕ | ↑(nₖ s t k) ≤ n}

Instances For

kceiled is also obviously symmetric and finite.

theorem kceiled_symm (s t n : ℝ) [PosReal s] [PosReal t] :

kceiled s t n = kceiled t s n

... and homogeneous.

theorem kceiled_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

kceiled s t n = kceiled (l * s) (l * t) n

kceiled is finite, which allows us to take maximum value later.

instance kceiled_finite (s t n : ℝ) [PosReal s] [PosReal t] :

Finite ↑(kceiled s t n)

noncomputable instance instFintypeElemNatKceiled (s t n : ℝ) [PosReal s] [PosReal t] :

Fintype ↑(kceiled s t n)

Equations

instFintypeElemNatKceiled s t n = Fintype.ofFinite ↑(kceiled s t n)

We can now find kₙ, the closest $k$ for which nₖ$ ≤ n$. We can always find such $k$ for $n ≥ 1$.

noncomputable def kₙ (s t n : ℝ) [PosReal s] [PosReal t] :

Equations

kₙ s t n = (kceiled s t n).toFinset.max

Instances For

And obviously, it is also symmetrical.

theorem kₙ_symm (s t n : ℝ) [PosReal s] [PosReal t] :

kₙ s t n = kₙ t s n

... and homogeneous.

theorem kₙ_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

kₙ s t n = kₙ (l * s) (l * t) n

kₙ and nₖ are basically inverse functions to each other. One can recover the $k$ by composing kₙ and nₖ.

theorem kₙ_inv (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

kₙ s t ↑(nₖ s t k) = some k

theorem kₙ_inv' (s t n : ℝ) (k : ℕ) [PosReal s] [PosReal t] (low : n ≥ ↑(nₖ s t k)) (high : n < ↑(nₖ s t (k + 1))) :

kₙ s t n = some k

theorem nₖ_inv (s t n : ℝ) (k : ℕ) [PosReal s] [PosReal t] (keq : kₙ s t n = some k) :

↑(nₖ s t k) ≤ n ∧ n < ↑(nₖ s t (k + 1))

k₁$ = 0$ is the first non-empty kₙ. This corresponds to the fact n₀$ = 1$.

theorem k₁ (s t : ℝ) [PosReal s] [PosReal t] :

kₙ s t 1 = some 0

Any $n ≥ 1$ should give a valid $k$.

theorem kₙ_exist (s t n : ℝ) (np : n ≥ 1) [PosReal s] [PosReal t] :

∃ (k : ℕ), kₙ s t n = some k

Mean while, $n < 1$ never gives a valid $k$.

theorem kₙ_not_exist (s t n : ℝ) (np : n < 1) [PosReal s] [PosReal t] :

kₙ s t n = none

Now the cost differential function is defined by clamping to the nearest $k$ and find δₖ.

noncomputable def dE (s t : ℝ) [PosReal s] [PosReal t] :

Equations

dE s t n = match kₙ s t n with | some k => δₖ s t k | none => 0

Instances For

... which is symmetric.

theorem dE_symm (s t n : ℝ) [PosReal s] [PosReal t] :

dE s t n = dE t s n

... homogeneous.

theorem dE_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

l * dE s t n = dE (l * s) (l * t) n

... and weakly increasing.

theorem dE_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (dE s t)

theorem dE₁ (s t : ℝ) [PosReal s] [PosReal t] :

dE s t 1 = 0

The following three lemma show the nice property of wₖ when applied to dE: The domain $n ∈ [1, ∞)$ is divided by wₖ k and wₖ (k + 1) into three regions:

dE( [1,          wₖ k      ) ) < δₖ - t
dE( [wₖ k,       wₖ (k + 1)) ) = δₖ - t
dE( [wₖ (k + 1), ∞         ) ) > δₖ - t

In other words, wₖ captures exactly where dE = δₖ - t (while nₖ captures where dE = δₖ)

Note that because 1 ≤ wₖ k ≤ wₖ (k + 1) are week inequalities, the intervals listed above can degenerate.

There are similar properties with wₖ' and δₖ - s, but the proof is omitted.

theorem w_eq (s t w : ℝ) (k : ℕ) (kh : k ≥ 1) [PosReal s] [PosReal t] (low : w ≥ ↑(wₖ s t k)) (r : w < ↑(wₖ s t (k + 1))) :

dE s t w = δₖ s t k - t

theorem w_lt (s t w : ℝ) (k : ℕ) (kh : k ≥ 1) [PosReal s] [PosReal t] (low : w ≥ 1) (high : w < ↑(wₖ s t k)) :

dE s t w < δₖ s t k - t

theorem w_gt (s t w : ℝ) (k : ℕ) [PosReal s] [PosReal t] (low : w ≥ ↑(wₖ s t (k + 1))) :

dE s t w > δₖ s t k - t

As a corollary, we show that wₖ is not only a monotone mapping from nₖ to itself, but also under the mapping, wₖ (k) and wₖ (k + 1) are either same, or two nₖ (k') next to each other.

theorem wₖ_is_nₖ_p1 (s t : ℝ) (k k' : ℕ) [PosReal s] [PosReal t] (keq : wₖ s t k = nₖ s t k') (wne : wₖ s t k ≠ wₖ s t (k + 1)) :

wₖ s t (k + 1) = nₖ s t (k' + 1)

By symmetry, the same holds for wₖ'.

theorem wₖ'_is_nₖ_p1 (s t : ℝ) (k k' : ℕ) [PosReal s] [PosReal t] (keq : wₖ' s t k = nₖ s t k') (wne : wₖ' s t k ≠ wₖ' s t (k + 1)) :

wₖ' s t (k + 1) = nₖ s t (k' + 1)

The strategy function $w$ is defined by finding wₖ after clamping to the nearest $k$ The parallelogram is formed by taking certain min and max.

noncomputable def wₘᵢₙ (s t : ℝ) [PosReal s] [PosReal t] (n : ℝ) :

Equations

wₘᵢₙ s t n = match kₙ s t n with | some k => max (↑(wₖ s t k)) (↑(wₖ s t (k + 1)) + n - ↑(nₖ s t (k + 1))) | none => 0

Instances For

noncomputable def wₘₐₓ (s t : ℝ) [PosReal s] [PosReal t] (n : ℝ) :

Equations

wₘₐₓ s t n = match kₙ s t n with | some k => min (↑(wₖ s t (k + 1))) (↑(wₖ s t k) + n - ↑(nₖ s t k)) | none => 0

Instances For

wₘᵢₙ and wₘₐₓ agree with wₖ at n = nₖ.

def wₘᵢₙnₖ (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₘᵢₙ s t ↑(nₖ s t k) = ↑(wₖ s t k)

Equations

⋯ = ⋯

Instances For

def wₘₐₓnₖ (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

wₘₐₓ s t ↑(nₖ s t k) = ↑(wₖ s t k)

Equations

⋯ = ⋯

Instances For

Derived from wₖ_rec, we have "recurrence formula" between wₘᵢₙ and wₘₐₓ.

theorem wₘₘ_rec (s t n : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] :

wₘᵢₙ s t n + wₘₐₓ t s n = n

Just like wₖ, $w(n)$ is bounded within $[1, n - 1]$.

theorem wₘᵢₙ_min (s t n : ℝ) (h : n ≥ 2) [PosReal s] [PosReal t] :

wₘᵢₙ s t n ≥ 1

theorem wₘₐₓ_max (s t n : ℝ) (h : n ≥ 2) [PosReal s] [PosReal t] :

wₘₐₓ s t n ≤ n - 1

We also define a third kind of $w$ function wₗᵢ, which is the diagonals of parallelograms formed by wₘᵢₙ and wₘₐₓ.

noncomputable def wₗᵢ (s t : ℝ) [PosReal s] [PosReal t] (n : ℝ) :

Equations

wₗᵢ s t n = match kₙ s t n with | some k => have a := (n - ↑(nₖ s t k)) / (↑(nₖ s t (k + 1)) - ↑(nₖ s t k)); (1 - a) * ↑(wₖ s t k) + a * ↑(wₖ s t (k + 1)) | none => 0

Instances For

We also define the dual version wₗᵢ' We could have done the same for wₘᵢₙ and wₘₐₓ, but we omitted them as they don't add much value.

noncomputable def wₗᵢ' (s t : ℝ) [PosReal s] [PosReal t] (n : ℝ) :

Equations

wₗᵢ' s t n = match kₙ s t n with | some k => have a := (n - ↑(nₖ s t k)) / (↑(nₖ s t (k + 1)) - ↑(nₖ s t k)); (1 - a) * ↑(wₖ' s t k) + a * ↑(wₖ' s t (k + 1)) | none => 0

Instances For

wₗᵢ as the diagnonal, is always between wₘᵢₙ and wₘₐₓ. With this, we have the complete ordering: 1 ≤ wₘᵢₙ ≤ wₗᵢ ≤ wₘₐₓ ≤ n - 1

def wₗᵢ_range (s t n : ℝ) [PosReal s] [PosReal t] :

wₘᵢₙ s t n ≤ wₗᵢ s t n ∧ wₗᵢ s t n ≤ wₘₐₓ s t n

Equations

⋯ = ⋯

Instances For

theorem wₘₘ_order (s t n : ℝ) [PosReal s] [PosReal t] :

wₘᵢₙ s t n ≤ wₘₐₓ s t n

As usual, wₗᵢ is symmetric

theorem wₗᵢ_symm (s t n : ℝ) [PosReal s] [PosReal t] :

wₗᵢ s t n = wₗᵢ' t s n

... and has recurrence formula.

theorem wₗᵢ_rec (s t n : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] :

wₗᵢ s t n + wₗᵢ t s n = n

wₘᵢₙ, wₘₐₓ, and wₗᵢ are all homogeneous

theorem wₘᵢₙ_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

wₘᵢₙ s t n = wₘᵢₙ (l * s) (l * t) n

theorem wₘₐₓ_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

wₘₐₓ s t n = wₘₐₓ (l * s) (l * t) n

theorem wₗᵢ_homo (s t n l : ℝ) [PosReal s] [PosReal t] [PosReal l] :

wₗᵢ s t n = wₗᵢ (l * s) (l * t) n

wₘᵢₙ, wₘₐₓ, and wₗᵢ are all weakly monotone

theorem wₘᵢₙ_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (wₘᵢₙ s t)

theorem wₘₐₓ_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (wₘₐₓ s t)

theorem wₗᵢ_mono (s t : ℝ) [PosReal s] [PosReal t] :

Monotone (wₗᵢ s t)

wₘᵢₙ, wₘₐₓ, and wₗᵢ never grow faster than $n$

theorem wₘᵢₙ_growth (s t m n : ℝ) (hm : 2 ≤ m) (hn : m ≤ n) [PosReal s] [PosReal t] :

wₘᵢₙ s t n - wₘᵢₙ s t m ≤ n - m

theorem wₘₐₓ_growth (s t m n : ℝ) (hm : 2 ≤ m) (hn : m ≤ n) [PosReal s] [PosReal t] :

wₘₐₓ s t n - wₘₐₓ s t m ≤ n - m

theorem wₗᵢ_growth (s t m n : ℝ) (hm : 2 ≤ m) (hn : m ≤ n) [PosReal s] [PosReal t] :

wₗᵢ s t n - wₗᵢ s t m ≤ n - m

We define the "strategy evaluation differential function" dD (n, w) = dE (w) - dE (n - w) + t - s

noncomputable def dD (s t n : ℝ) [PosReal s] [PosReal t] :

Equations

dD s t n w = dE s t w - dE s t (n - w) + t - s

Instances For

It is symmetric

theorem dD_symm (s t n w : ℝ) [PosReal s] [PosReal t] :

dD s t n w = -dD t s n (n - w)

... and weakly increasing w.r.t $w$

theorem dD_mono (s t n : ℝ) [PosReal s] [PosReal t] :

Monotone (dD s t n)

We show that wₘᵢₙ and wₘₐₓ indicates where dD is negative, zero, or positive.

In these theorems, we conviniently ignored boundary points at w = wₘᵢₙ or w = wₘₐₓ. dD value at those points can be found, but it doesn't add much value for our further arguments.

theorem dD_zero (s t n w : ℝ) (h : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w > wₘᵢₙ s t n) (rightBound : w < wₘₐₓ s t n) :

dD s t n w = 0

theorem dD_neg (s t n w : ℝ) (h : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w > 1) (rightBound : w < wₘᵢₙ s t n) :

dD s t n w < 0

theorem dD_pos (s t n w : ℝ) (h : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w > wₘₐₓ s t n) (rightBound : w < n - 1) :

dD s t n w > 0

Let's also show that dE and dD are integrable, which will be soon used

theorem dE_integrable (s t m n : ℝ) [PosReal s] [PosReal t] :

IntervalIntegrable (dE s t) MeasureTheory.volume m n

Here is a more useful version with the correction term $s + t$

theorem dE_integrable' (s t m n : ℝ) [PosReal s] [PosReal t] :

IntervalIntegrable (fun (x : ℝ) => dE s t x + s + t) MeasureTheory.volume m n

theorem dD_integrable (s t n w1 w2 : ℝ) [PosReal s] [PosReal t] :

IntervalIntegrable (dD s t n) MeasureTheory.volume w1 w2

Now we can construct our main function, the cost function E (n)

     ↑ E (n)
     |
     |     |-J₀-|-J₁--|---J₂---|-------J₃------|
     |
     |                                        ·*   --|--
     |                                      ··       |
     |                                     ·         |
     |                                    ·          |
     |                                  ··           |
     |                                 ·             |
     |                                ·              | (δ₃+s+t)*J₃
     |                              ··               |
     |     |    |     |        |   ·                 |
     |     |    |     |        |  ·                  |
     |     |    |     |        | ·                   |
E₃ --|--   |    |     |      ··*·---               --|--
     |     |    |     |     ·                        |
     |     |    |     |   ··                         | (δ₂+s+t)*J₂
     |     |    |     | ··                           |
E₂ --|--   |    |    ·*·------------               --|--
     |     |    |   ·                                |
     |     |    | ··                                 | (δ₁+s+t)*J₁
E₁ --|--   |   ·*·------------------               --|--
     |     | ··                                      | (δ₀+s+t)*J₀
E₀ --+-----*·---|-----|--------|---------------|-----|--→ n
     0     n₀   n₁    n₂       n₃              n₄
           (=1)

We first pin the vertices Eₖ on this function.

noncomputable def Eₖ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

Eₖ s t 0 = 0
Eₖ s t n.succ = Eₖ s t n + ↑(Jₖ s t n) * (δₖ s t n + s + t)

Instances For

... which is symmetric.

theorem Eₖ_symm (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

Eₖ s t k = Eₖ t s k

Eₖ can be alternatively expressed as integrating dE between vertices.

theorem Eₖ_integral (s t : ℝ) (k : ℕ) [PosReal s] [PosReal t] :

Eₖ s t k = ∫ (x : ℝ) in 1..↑(nₖ s t k), dE s t x + s + t

We then define E (n) as linear interpolation between Eₖ.

noncomputable def E (s t : ℝ) [PosReal s] [PosReal t] :

Equations

E s t n = match kₙ s t n with | some k => Eₖ s t k + (n - ↑(nₖ s t k)) * (δₖ s t k + s + t) | none => 0

Instances For

... which is symmetric.

theorem E_symm (s t n : ℝ) [PosReal s] [PosReal t] :

E s t n = E t s n

... and can be expressed as an integral.

theorem E_integral (s t n : ℝ) (n1 : n ≥ 1) [PosReal s] [PosReal t] :

E s t n = ∫ (x : ℝ) in 1..n, dE s t x + s + t

While E (n) itself is defined as partial sum of Jₖ * (δₖ + s + t), we can also show the composed mapping E (w(n)) is the partial sum of Jtₖ * (δₖ + s).

theorem Ew_accum (s t : ℝ) (k : ℕ) (k1 : k ≥ 1) [PosReal s] [PosReal t] :

E s t ↑(wₖ s t k) + ↑(Jtₖ s t k) * (δₖ s t k + s) = E s t ↑(wₖ s t (k + 1))

Symmetrically E (w'(n)) is the partial sum of Jsₖ * (δₖ + t).

theorem Ew'_accum (s t : ℝ) (k : ℕ) (k1 : k ≥ 1) [PosReal s] [PosReal t] :

E s t ↑(wₖ' s t k) + ↑(Jsₖ s t k) * (δₖ s t k + t) = E s t ↑(wₖ' s t (k + 1))

And here is the strategy evaluation function.

noncomputable def D (s t n w : ℝ) [PosReal s] [PosReal t] :

Equations

D s t n w = E s t w + E s t (n - w) + t * w + s * (n - w)

Instances For

... which is symmetric.

theorem D_symm (s t n w : ℝ) [PosReal s] [PosReal t] :

D s t n w = D t s n (n - w)

... and is the integral of the strategy evaluation differential function.

theorem D_integral (s t n w1 w2 : ℝ) (w1low : w1 ≥ 1) (w1high : w1 ≤ n - 1) (w2low : w2 ≥ 1) (w2high : w2 ≤ n - 1) [PosReal s] [PosReal t] :

D s t n w2 - D s t n w1 = ∫ (w : ℝ) in w1..w2, dD s t n w

We will now prove several version of the recurrence formula on E.

theorem Eₖ_rec (s t : ℝ) [PosReal s] [PosReal t] (k : ℕ) :

1 ≤ k → Eₖ s t k = E s t ↑(wₖ s t k) + E s t ↑(wₖ' s t k) + t * ↑(wₖ s t k) + s * ↑(wₖ' s t k)

theorem Eₖ_lerp (s t : ℝ) (k : ℕ) (a : ℝ) (low : a ≥ 0) (high : a ≤ 1) [PosReal s] [PosReal t] :

E s t ((1 - a) * ↑(nₖ s t k) + a * ↑(nₖ s t (k + 1))) = (1 - a) * Eₖ s t k + a * Eₖ s t (k + 1)

theorem Ewₖ_lerp (s t : ℝ) (k : ℕ) (a : ℝ) (low : a ≥ 0) (high : a ≤ 1) [PosReal s] [PosReal t] :

E s t ((1 - a) * ↑(wₖ s t k) + a * ↑(wₖ s t (k + 1))) = (1 - a) * E s t ↑(wₖ s t k) + a * E s t ↑(wₖ s t (k + 1))

theorem Ewₖ'_lerp (s t : ℝ) (k : ℕ) (a : ℝ) (low : a ≥ 0) (high : a ≤ 1) [PosReal s] [PosReal t] :

E s t ((1 - a) * ↑(wₖ' s t k) + a * ↑(wₖ' s t (k + 1))) = (1 - a) * E s t ↑(wₖ' s t k) + a * E s t ↑(wₖ' s t (k + 1))

theorem Eₖ_rec_lerp (s t : ℝ) (k : ℕ) (a : ℝ) (k1 : k ≥ 1) (low : a ≥ 0) (high : a ≤ 1) [PosReal s] [PosReal t] :

E s t ((1 - a) * ↑(nₖ s t k) + a * ↑(nₖ s t (k + 1))) = E s t ((1 - a) * ↑(wₖ s t k) + a * ↑(wₖ s t (k + 1))) + E s t ((1 - a) * ↑(wₖ' s t k) + a * ↑(wₖ' s t (k + 1))) + t * ((1 - a) * ↑(wₖ s t k) + a * ↑(wₖ s t (k + 1))) + s * ((1 - a) * ↑(wₖ' s t k) + a * ↑(wₖ' s t (k + 1)))

Eventually, we reached the major conclusion: The cost equals the strategy evaluation at the optimal strategy wₗᵢ

theorem E_wₗᵢ (s t n : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] :

E s t n = D s t n (wₗᵢ s t n)

But because D has zero derivative dD between wₘᵢₙ and wₘₐₓ all $w$ in between gives cost = strategy evaluation

theorem E_w (s t n w : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w ≥ wₘᵢₙ s t n) (rightBound : w ≤ wₘₐₓ s t n) :

E s t n = D s t n w

And using the fact that the derivative dD is negative/positive outside the range, we conclude that the strategy evaluation is larger than the cost everywhere else.

theorem E_wₘᵢₙ (s t n w : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w ≥ 1) (rightBound : w < wₘᵢₙ s t n) :

E s t n < D s t n w

theorem E_wₘₐₓ (s t n w : ℝ) (n2 : n ≥ 2) [PosReal s] [PosReal t] (leftBound : w > wₘₐₓ s t n) (rightBound : w ≤ n - 1) :

E s t n < D s t n w

Therefore, the interval bounded by wₘᵢₙ and wₘₐₓ idicates where E = D. Let's make it its own function

def wₛₑₜ (s t : ℝ) [PosReal s] [PosReal t] :

ℝ → Set ℝ

Equations

wₛₑₜ s t n = Set.Icc (wₘᵢₙ s t n) (wₘₐₓ s t n)

Instances For

Let's summarize our result in a high level

For any possible cost function $E(n): [1, ∞) → ℝ$ We can define a strategy evaluation function StratEval$\{E\}(n, w)$

A cost function $E$ is called optimal if the min value of StratEval is $E$ itself, and a strategy function $w$ is called optimal if it is the set for StratEval to reach $E$.

def StratEval (Efun : ℝ → ℝ) (s t n w : ℝ) :

Equations

StratEval Efun s t n w = Efun w + Efun (n - w) + t * w + s * (n - w)

Instances For

def IsOptimalCost (Efun : ℝ → ℝ) (s t : ℝ) :

Equations

IsOptimalCost Efun s t = ∀ n ≥ 2, IsLeast (StratEval Efun s t n '' Set.Icc 1 (n - 1)) (Efun n)

Instances For

def IsOptimalStrat (Efun : ℝ → ℝ) (wfun : ℝ → Set ℝ) (s t : ℝ) :

Equations

IsOptimalStrat Efun wfun s t = ∀ n ≥ 2, ∀ w ∈ Set.Icc 1 (n - 1), StratEval Efun s t n w = Efun n ↔ w ∈ wfun n

Instances For

Then obviously the E and wₛₑₜ function we have constructed are optimal.

theorem E_IsOptimalCost (s t : ℝ) [PosReal s] [PosReal t] :

IsOptimalCost (E s t) s t

theorem W_IsOptimalStrat (s t : ℝ) [PosReal s] [PosReal t] :

IsOptimalStrat (E s t) (wₛₑₜ s t) s t

Finally, we want to lift our E and wₛₑₜ to integers, which is the domain of the original question.

noncomputable def Eℤ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

Eℤ s t n = E s t ↑n

Instances For

noncomputable def wₘᵢₙℤ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

wₘᵢₙℤ s t n = match kₙ s t ↑n with | some k => max (↑(wₖ s t k)) (↑(wₖ s t (k + 1)) + n - ↑(nₖ s t (k + 1))) | none => 0

Instances For

noncomputable def wₘₐₓℤ (s t : ℝ) [PosReal s] [PosReal t] :

Equations

wₘₐₓℤ s t n = match kₙ s t ↑n with | some k => min (↑(wₖ s t (k + 1))) (↑(wₖ s t k) + n - ↑(nₖ s t k)) | none => 0

Instances For

While Eℤ is easy to understand, we need to show that wₘᵢₙℤ and wₘₐₓℤ remains the same value when lifted.

theorem wₘᵢₙℤeq (s t : ℝ) (n : ℤ) [PosReal s] [PosReal t] :

↑(wₘᵢₙℤ s t n) = wₘᵢₙ s t ↑n

theorem wₘₐₓℤeq (s t : ℝ) (n : ℤ) [PosReal s] [PosReal t] :

↑(wₘₐₓℤ s t n) = wₘₐₓ s t ↑n

noncomputable def wℤ (s t : ℝ) [PosReal s] [PosReal t] :

ℤ → Set ℤ

Equations

wℤ s t n = Set.Icc (wₘᵢₙℤ s t n) (wₘₐₓℤ s t n)

Instances For

We can then define the integer version of the optimal criteria, and proof the optimality of Eℤ and Wℤ.

def StratEvalℤ (Efun : ℤ → ℝ) (s t : ℝ) (n w : ℤ) :

Equations

StratEvalℤ Efun s t n w = Efun w + Efun (n - w) + t * ↑w + s * (↑n - ↑w)

Instances For

def IsOptimalCostℤ (Efun : ℤ → ℝ) (s t : ℝ) :

Equations

IsOptimalCostℤ Efun s t = ∀ n ≥ 2, IsLeast (StratEvalℤ Efun s t n '' Set.Icc 1 (n - 1)) (Efun n)

Instances For

def IsOptimalStratℤ (Efun : ℤ → ℝ) (wfun : ℤ → Set ℤ) (s t : ℝ) :

Equations

IsOptimalStratℤ Efun wfun s t = ∀ n ≥ 2, ∀ w ∈ Set.Icc 1 (n - 1), StratEvalℤ Efun s t n w = Efun n ↔ w ∈ wfun n

Instances For

theorem Eℤ_IsOptimalCost (s t : ℝ) [PosReal s] [PosReal t] :

IsOptimalCostℤ (Eℤ s t) s t

theorem Wℤ_IsOptimalStrat (s t : ℝ) [PosReal s] [PosReal t] :

IsOptimalStratℤ (Eℤ s t) (wℤ s t) s t

And finally, Eℤ is the unique optimal function with starting point of Eℤ (1) = 0

theorem Eℤ₁ (s t : ℝ) [PosReal s] [PosReal t] :

Eℤ s t 1 = 0

theorem Eℤ_Unique (s t : ℝ) (Efun : ℤ → ℝ) [PosReal s] [PosReal t] (E1 : Efun 1 = 0) (opt : IsOptimalCostℤ Efun s t) (n : ℤ) :

n ≥ 1 → Efun n = Eℤ s t n