The narrower question, focused on causal inference, is: how does it help uncover causal relationships among variables and build applications in a particular problem domain? This is what the UofT causal inference workshop I will attend is about.
My answer (not tested, based on my reading of Spirtes' paper): traditional causal inference provides the framework. A particular problem can be modeled as Problem 3 (i.e., constructing causal models from sample data) and Problem 4 (i.e., predicting the effects of manipulations from causal models). Problem 3 is then solved by search (i.e., discovering a causal model, a DAG G together with a density P), and Problem 4 is solved by applying the manipulation rule.
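To make the manipulation rule concrete, here is a minimal sketch in plain Python (my own toy example with made-up numbers, not from Spirtes' paper). On the confounded DAG Z → X, Z → Y, X → Y, manipulating X deletes the factor P(X | Z) from the factorization, so P(Y=1 | do(X=1)) = Σ_z P(Y=1 | X=1, Z=z) P(Z=z), which in general differs from the observational P(Y=1 | X=1):

```python
# Made-up conditional probability tables for the DAG Z -> X, Z -> Y, X -> Y.
p_z = {0: 0.5, 1: 0.5}                      # P(Z = z)
p_x1_given_z = {0: 0.2, 1: 0.8}             # P(X = 1 | Z = z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,  # P(Y = 1 | X = x, Z = z)
                 (1, 0): 0.4, (1, 1): 0.9}

# Manipulation rule: the factor P(X | Z) is removed; Z keeps its own distribution.
do_x1 = sum(p_y1_given_xz[(1, z)] * p_z[z] for z in (0, 1))

# Observational conditioning on X = 1 instead shifts the distribution of Z.
p_x1 = sum(p_x1_given_z[z] * p_z[z] for z in (0, 1))
obs_x1 = sum(p_y1_given_xz[(1, z)] * p_x1_given_z[z] * p_z[z]
             for z in (0, 1)) / p_x1

print(do_x1)   # 0.65: the effect of setting X = 1 by intervention
print(obs_x1)  # 0.80: confounding through Z inflates the observed association
```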
Secondary questions
What are maximum likelihood estimates? (study material)
My answer: this is a method for estimating the value of a parameter in a model from observed data.
Material’s answer: This method asks the question: for which parameter value does the observed data have the biggest probability?
An interesting quote from the study material: "In the coming weeks we will see how parameter values are naturally viewed as hypotheses, so we are in fact estimating the probability of various hypotheses given the data."
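To make this concrete, here is a minimal sketch in Python (my own toy example, not from the study material): for a coin with unknown heads probability p, we pick the value of p under which the observed flips are most probable.

```python
import numpy as np

# Toy data: 8 coin flips, 1 = heads (made-up numbers for illustration).
flips = np.array([1, 0, 1, 1, 0, 1, 1, 1])

def log_likelihood(p, data):
    """Log-probability of the observed flips if the heads probability is p."""
    return np.sum(data * np.log(p) + (1 - data) * np.log(1 - p))

# "For which parameter value does the observed data have the biggest
# probability?" Answer it by brute force over a grid of candidate p values.
grid = np.linspace(0.01, 0.99, 99)
mle = grid[np.argmax([log_likelihood(p, flips) for p in grid])]

print(mle)           # ~0.75: the grid point maximizing the likelihood
print(flips.mean())  # matches the closed-form Bernoulli MLE: the sample frequency
```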
What is the relation between the Causal Markov Assumption and the Causal Faithfulness Assumption?
My attempt: I think they are the same thing, because the Causal Faithfulness Assumption states that the local directed Markov condition entails every conditional independence relation true in the density over V, and the local directed Markov condition amounts to each vertex being conditionally independent, given its parents, of the vertices that are neither its parents nor its descendants in the DAG.
My answer is wrong. Here is ChatGPT’s answer:
No, the Causal Markov Assumption and the Causal Faithfulness Assumption are not the same thing, although they are both important concepts in the study of causal inference and graphical models. Here’s a brief overview of each:
- Causal Markov Assumption:
  - This assumption is about the relationship between a causal graph (like a Directed Acyclic Graph, or DAG) and the joint probability distribution of the variables in that graph.
  - It states that a variable is conditionally independent of its non-effects given its direct causes. In simpler terms, once you know the state of a variable's direct causes, knowing the state of any other non-effect variable doesn't provide any additional information about it.
  - Essentially, it uses the structure of the causal graph to make assertions about the conditional independencies in the associated probability distribution.
- Causal Faithfulness (or Stability) Assumption:
  - The faithfulness assumption deals with the relationship between the observed statistical relationships (like correlations or associations) and the causal relationships.
  - It posits that all and only the conditional independencies true in the probability distribution are entailed by the causal graph. This means that if two variables are conditionally independent given some set of variables, then they must be d-separated given that set in the graph (a purely graphical criterion), and vice versa.
  - The assumption essentially ensures that the observed data do not show any 'accidental' independencies that aren't supported by the underlying causal structure.

In practice, these assumptions are often used together to learn about causal structures from data:

- The Markov Assumption allows us to infer certain independencies from the structure of the causal graph.
- The Faithfulness Assumption ensures that these inferred independencies (and no others) are the ones that actually show up in the observed data.
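Here is a minimal simulation sketch of the two assumptions working together (my own illustration, not from the workshop material). Data are generated from the chain X → Y → Z; under the Markov assumption the graph entails X ⊥ Z | Y, and under faithfulness no extra "accidental" independencies (such as a marginal X ⊥ Z) should appear.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Linear-Gaussian data from the chain X -> Y -> Z (coefficients are made up).
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(size=n)
z = 0.8 * y + rng.normal(size=n)

def partial_corr(a, b, given):
    """Correlate the residuals of a and b after linearly regressing out `given`
    (a simple conditional-independence test for linear-Gaussian data)."""
    ra = a - np.polyval(np.polyfit(given, a, 1), given)
    rb = b - np.polyval(np.polyfit(given, b, 1), given)
    return np.corrcoef(ra, rb)[0, 1]

# Faithfulness: the dependence the graph allows really shows up in the data.
print(np.corrcoef(x, z)[0, 1])  # clearly nonzero: X and Z are marginally dependent

# Markov: the independence the graph entails, X _||_ Z | Y, holds in the data.
print(partial_corr(x, z, y))    # approximately 0
```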
What is this paragraph really saying: “For disjoint sets of vertices, X, Y, and Z, X is d-separated from Y given Z if and only if X is not d-connected to Y given Z. X is d-separated from Y conditional on Z in DAG G if and only if X is independent of Y conditional on Z in every density that satisfies the local directed Markov condition for G.”?
ChatGPT's answer: what the statement means is that if X and Y are d-separated given Z in the graph, then in every probability distribution that correctly reflects the conditional independencies implied by the graph (i.e., satisfies the local directed Markov condition), X and Y will be conditionally independent given Z. This is a cornerstone of the theory underlying Bayesian networks and other probabilistic graphical models, which allows us to infer conditional independencies from the structure of the graph and vice versa.
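To make d-separation concrete, here is a minimal sketch of a d-separation checker (my own illustration; the function names and the DAG-as-parent-dict representation are my assumptions, not from the paper). It uses the standard moralization criterion: X is d-separated from Y given Z iff X and Y are disconnected, after deleting Z, in the moralized graph over the ancestors of X ∪ Y ∪ Z.

```python
from itertools import combinations

def ancestors(dag, nodes):
    """Return `nodes` plus all their ancestors; `dag` maps node -> set of parents."""
    found, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in found:
            found.add(n)
            stack.extend(dag.get(n, ()))
    return found

def d_separated(dag, xs, ys, zs):
    """Moralization criterion: build the moral graph over the ancestors of
    X, Y, and Z ('marrying' co-parents of a common child), delete Z, and
    report d-separation iff no path remains from X to Y."""
    anc = ancestors(dag, set(xs) | set(ys) | set(zs))
    adj = {n: set() for n in anc}
    for child in anc:
        parents = [p for p in dag.get(child, ()) if p in anc]
        for p in parents:                      # keep parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(parents, 2):  # marry co-parents
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = set(), [x for x in xs if x not in zs]
    while stack:                               # reachability avoiding Z
        n = stack.pop()
        if n in seen:
            continue
        seen.add(n)
        if n in ys:
            return False                       # reached Y: d-connected
        stack.extend(m for m in adj[n] if m not in zs)
    return True

# Collider example X -> Z <- Y (dag maps each node to its set of parents):
dag = {"X": set(), "Y": set(), "Z": {"X", "Y"}}
print(d_separated(dag, {"X"}, {"Y"}, set()))  # True: X _||_ Y marginally
print(d_separated(dag, {"X"}, {"Y"}, {"Z"}))  # False: conditioning on the collider connects them
```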