Divisions of everything
Statistical concepts
- Think through statistics in terms of adjunctions that describe approximations which arise through sampling. Think of populations as the related category. Think of the example of reals approximated by integers.
- Understand Patterson's paper's statistical concepts (such as nondeterminism, Markov kernels) in terms of the Curry-Howard-Lambek correspondence.
- How is nondeterminism in statistics related to nondeterminism in automata theory? And how are both expressed in category theory?
- How does knowledge relate to validation? And how does that relate to the Yoneda lemma?
- How does the null hypothesis relate to the identity morphism in the Yoneda lemma and to the characterization in terms of step-by-step Homsets?
- Develop the relevant notion of equivalence for samples (up to a particular grade).
- Are frequentist statistics and Bayesian statistics related as adjoint functors?
- What is the relationship between rejecting the null hypothesis and applying the falsification principle?
- How do Bayesian statistics and the usual statistics differ as regards the fivesome and causalities?
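The example of reals approximated by integers can be made concrete. This is a minimal sketch, not a statement about sampling itself: it only checks the standard fact that ceiling is left adjoint, and floor right adjoint, to the inclusion of the integers into the ordered numbers. Exact fractions stand in for the reals, and the names include and check_adjunctions are illustrative assumptions.

```python
import math
from fractions import Fraction

def include(n: int) -> Fraction:
    """Inclusion of the integers into the rationals (a stand-in for the reals)."""
    return Fraction(n)

def check_adjunctions(x: Fraction, n: int) -> bool:
    # Ceiling is left adjoint to inclusion:  ceil(x) <= n   iff  x <= include(n)
    left = (math.ceil(x) <= n) == (x <= include(n))
    # Floor is right adjoint to inclusion:   n <= floor(x)  iff  include(n) <= x
    right = (n <= math.floor(x)) == (include(n) <= x)
    return left and right

# Check both equivalences on a grid of quarter-steps and nearby integers.
samples = [Fraction(k, 4) for k in range(-12, 13)]
all_hold = all(check_adjunctions(x, n) for x in samples for n in range(-4, 5))
print(all_hold)
```

Floor and ceiling are then the two best integer approximations of a number, from below and from above, which is the sense in which an adjunction describes an approximation.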
Study statistics
- Norman Anderson's book on Empirical Design
Statistics
- To discern and manage unclarity.
- The indeterminacy of the answer should be supplemented by the indeterminacy of the question.
Readings
Videos
Wikipedia
Algebra of Statistical Models
- Monoidal category (tensor product)
- Symmetric monoidal category (maximally symmetric braided tensor product)
- Cartesian category
- Regular category (regular logic: exists, and, true)
- Coherent category (coherent logic: exists, and, true, or, false)
- Elementary topos (first-order logic)
- Cartesian category (tensor product is categorical product)(finite products)
- Cartesian closed category (lambda calculus, product types)(terminal object, products, exponentials)(internal Hom functor is adjoint functor to the product)
- Bicartesian closed category (lambda calculus, product types, sum types)
- Elementary topos
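The adjunction between the product and the internal Hom in a Cartesian closed category shows up in programming as currying. The sketch below is only an illustration in Python, with functions standing in for morphisms; curry, uncurry and add are illustrative names.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def curry(f: Callable[[Tuple[A, B]], C]) -> Callable[[A], Callable[[B], C]]:
    """Turn a map out of a product A x B into a map from A into the exponential C^B."""
    return lambda a: lambda b: f((a, b))

def uncurry(g: Callable[[A], Callable[[B], C]]) -> Callable[[Tuple[A, B]], C]:
    """Inverse direction: a map into the exponential becomes a map out of the product."""
    return lambda pair: g(pair[0])(pair[1])

def add(pair: Tuple[int, int]) -> int:
    return pair[0] + pair[1]

curried_add = curry(add)          # int -> (int -> int)
round_trip = uncurry(curried_add) # back to (int, int) -> int

print(curried_add(2)(3), round_trip((2, 3)))
```

That curry and uncurry are mutually inverse is the bijection Hom(A x B, C) = Hom(A, C^B) that defines the adjunction.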
Concepts
- Whether it is an object: its identity morphism.
- What it is as an object: the functor F. (Builds on validation of the rest.)
- How it is as an object: the Homset of morphisms. (Builds on validation of the rest.)
- Why it is an object: the entire category of morphisms along with their composition rules and identity rules.
Thoughts
- You have data - you have evidence of a probability distribution.
- If the model is known - if it is a well understood process - then you want to apply the given model.
- If the model is not known - if you want to find the most illuminating model - then you need to have a procedure for choosing the most meaningful model, that model which best balances insightfulness and faithfulness. You need to identify candidate models, distinguish competing models and have criteria for deciding amongst them.
- Patterson's goals: Understanding the internal structure of statistical models, comparing competing models, arguing that a certain model supports or fails to support some scientific theory.
- Extraction problem: Given a set of measurements of a phenomenon - a recurring activity - how can you extract the model of the process that is generating them? Note that there is likewise a guided coding by the process into what may be quite a simple model (binomial distribution) even though it is generated by an extremely complicated implementation (such as the factors determining the height of women, or the factors underlying the shapes of coins).
- Hierarchy of reproductive advantages. Reproduction is facilitated by a variety of advantages which order themselves in a hierarchy according to how directly they impact the probability of reproduction. Thus the properties that attract or repel a mate are perhaps highest, followed by those that facilitate competition with rivals, and so on.
Ideas
- A type indicates the range of values that a variable may have. A probability distribution assigns a probability to every subrange, as well as its complement, thus to every subtype. See Subtyping.
- Questions-covariates-conscious. Answers-responses-unconscious. Model should relate the language of questions and the language of answers.
- Learn about relational frame theory.
- Consider game theory equilibria.
- Leibniz's Identity of indiscernibles.
- "Those things are which show themselves to be."
- Linguistics: things are the same if they always occur together. Things are the same if they never occur together.
Statistical concepts
Measure
- Lebesgue integration is important for measure theory and probability. It partitions the range, not the domain, and so reads a function backwards, through preimages. What is the basis for this asymmetry? What does that say? How does it relate to the asymmetry of the category Set?
- A probability measure is additive (summing to 1); it is internal structure.
Probability distribution
- The relation between {$F_q$} and {$F_1$} is that between a list and a set, between a probability distribution (where there is choice) and a function (where there is a single choice of 1, a choice that is no choice).
Nondeterminism
- Relationships between nondeterminism and forgetting: elimination of variable (in automata theory), forgetful functor.
Randomness
- Randomness as an inverse functor.
Monad
- Choice as a wrapper.
- Coordinate system (observer) as a wrapper.
- Endofunctor. Introduction of order.
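"Choice as a wrapper" can be sketched with finite probability distributions treated as a monad. This is a minimal illustration under my own assumptions, not Patterson's construction: a distribution is a dict from outcomes to exact probabilities, unit wraps a value as a certain choice, and bind composes a distribution with a Markov kernel (a function from values to distributions). The names unit, bind, coin and add_flip are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def unit(x):
    """Wrap a value as the distribution that chooses it with certainty."""
    return {x: Fraction(1)}

def bind(dist, kernel):
    """Feed each outcome of dist into kernel (value -> distribution) and flatten."""
    out = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for x, p in dist.items():
        for y, q in kernel(x).items():
            out[y] += p * q
    return dict(out)

# A fair coin, and a kernel that adds one more fair flip to a running total.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def add_flip(total):
    return {total + face: p for face, p in coin.items()}

two_flips = bind(coin, add_flip)  # distribution of the sum of two flips
print(two_flips)
```

A deterministic function is the special case where every kernel returns unit of something, which matches the note above that a function is a choice that is no choice.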
Bayesian statistics - only one method of fitting a model (Bayesian inference)
Frequentist statistics - a single model may be fit by many different methods
Bayesian statistics and Frequentist statistics are initial and terminal objects.
Frequentist estimators are usually defined as solutions to optimization problems. The efficient algorithms used to compute them are often not guaranteed to converge to the intended estimator.
Fitting a statistical model to data
- 1) Specification of a statistical model.
- 2) Method of estimating the model's parameters.
- 3) Algorithm for computing the estimator.
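The three steps can be kept visibly separate in a small example. This is a sketch under assumptions of my own choosing: the model is a Bernoulli distribution with unknown p, the estimation method is maximum likelihood, and the algorithm is plain gradient ascent on the log-odds, compared against the closed-form answer (the sample mean). The data and the names sigmoid and fit_by_gradient_ascent are illustrative.

```python
import math

# 1) Specification: observations are independent Bernoulli(p) draws, p unknown.
data = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]

# 2) Estimation method: maximum likelihood, i.e. the p maximizing
#    sum(x * log(p) + (1 - x) * log(1 - p)). Its exact solution is the sample mean.
closed_form = sum(data) / len(data)

def sigmoid(theta: float) -> float:
    return 1.0 / (1.0 + math.exp(-theta))

# 3) Algorithm: gradient ascent on theta = log(p / (1 - p)).
#    The gradient of the average log-likelihood in theta is mean(x) - sigmoid(theta).
def fit_by_gradient_ascent(xs, steps: int = 200, rate: float = 1.0) -> float:
    mean = sum(xs) / len(xs)
    theta = 0.0  # start from p = 0.5
    for _ in range(steps):
        theta += rate * (mean - sigmoid(theta))
    return sigmoid(theta)

p_hat = fit_by_gradient_ascent(data)
print(closed_form, p_hat)
```

Here the algorithm does reach the intended estimator because the problem is concave in theta. For harder models the same three-part split applies, but step 3 may stop at something other than the estimator defined in step 2.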
Notes
- Statistics relates the ideal and the empirical, thus bridges the ideal and empirical Houses of knowledge.
- https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
- Checking model fit: Could you simulate fake data for your model and see if it looks like your real data?
- Andrew Gelman: Bayes, statistics and reproducibility
- Gelman about Cantor: "All the action is on the diagonal." The diagonal is the fixed point, the identity morphism.
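The model-checking question above (simulate fake data and see whether it looks like the real data) can be sketched as follows. Everything here is an illustrative assumption: the "real" data is an invented binary sequence with one long run of each value, the model is independent Bernoulli draws fitted by the sample mean, and the comparison uses the number of switches between neighbouring values.

```python
import random

def switches(xs):
    """Count positions where neighbouring values differ."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a != b)

# Invented "real" data: ten ones followed by ten zeros (strongly dependent in time).
real = [1] * 10 + [0] * 10

# Fit the independent-Bernoulli model by its sample mean.
p_hat = sum(real) / len(real)

rng = random.Random(0)  # fixed seed so the run is reproducible

def simulate(n):
    """Draw n fake observations from the fitted model."""
    return [1 if rng.random() < p_hat else 0 for _ in range(n)]

fake_stats = [switches(simulate(len(real))) for _ in range(1000)]
observed = switches(real)

# Fraction of fake data sets that switch as rarely as the real data does.
tail_fraction = sum(s <= observed for s in fake_stats) / len(fake_stats)
print(observed, tail_fraction)
```

The fitted model reproduces the proportion of ones exactly, yet its fake data switches far more often than the real sequence, so the check exposes a feature (dependence between neighbours) that the model ignores.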
Gelman: Three fundamental problems in statistics
- Generalizing from sample to population
- Generalizing from treatment to control group
- Generalizing from your measurement to the underlying construct of interest
Thermodynamics and statistical physics
- Adiabatic process
- Brownian motion in time
- (small deviations) Itô-Stratonovich (big deviations) fluctuations, stochastic processes
- Lévy process, Lévy signal
Random matrices
Sevensome
- I suspect that the seven-fold equation (fraction of differences, as with Möbius transformations) in Shu-Hong's thesis organizes seven natural bases that express different ways of looking at probability and confidence in a self-standing system.