Information Theory in Neuroscience Flashcards
What is the aim of information theory?
It helps us understand how to efficiently encode, transmit, and decode data.
Define Entropy
A measure of uncertainty or unpredictability in a set of possible outcomes.
Define Information
The reduction of uncertainty: when a message is received, it provides information by reducing the number of possible unknowns.
Define “Compression”
Technique to represent information with fewer bits by eliminating redundancy.
Define “Channel capacity”
The maximal amount of information that can be transmitted over a communication channel without error.
Define “Error correction”
Methods to detect and correct errors in data transmission or storage.
In which unit is information measured?
In bits.
How much information is contained in a heads-or-tails (coin flip) experiment?
1 bit (heads and tails each have probability 0.5, and -log2(0.5) = 1).
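A minimal Python sketch (an illustration added here, not part of the original cards) confirming this with the standard Shannon entropy formula H = -Σ p·log2(p):
import math

# Fair coin: two outcomes with probability 0.5 each
p = [0.5, 0.5]

# Shannon entropy H = -sum(p * log2(p)), in bits
H = -sum(pi * math.log2(pi) for pi in p)
print(H)  # 1.0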
Define what “bits” refers to in information theory
The average number of yes/no questions required to ascertain the value of a variable.
Cite advantages of information theory (5)
- Model-free (it is not necessary to hypothesize a specific structure for the interactions between variables in a data set to use information theory).
- You can use any mixture of data types.
- Detects linear and nonlinear interactions.
- Multivariate (possesses several metrics designed to quantify the behavior of systems with any number of variables).
- Results in bits (facilitates straightforward comparison).
What can information theory tell you?
It can quantify the uncertainty of one or more variables, as well as the influence of one or more variables on one or more other variables.
What is the main limit of information theory analysis?
It can't produce a model that describes how a system works; it can only be used to restrict the space of possible models.
Define what a probability distribution is
A distribution that describes the likelihood of certain outcomes of a random variable or group of variables.
How is a probability distribution denoted?
p(A)
Is p(A) discrete or continuous?
It can be either.
What must the sum of probabilities over all possible states (discrete distribution) or the integral over all possible values (continuous distribution) equal?
1
With what type of distribution can we describe a system of more than one variable?
A Joint probability distribution
p(c1, c2) = p(c1)p(c2) when the variables are independent
e.g., two independent coins
What is a joint probability distribution?
It describes a system with more than one variable.
p(c1, c2) = p(c1)p(c2) for independent variables
What is a marginal probability distribution?
It represents the likelihood of the outcomes of a subset of variables in the joint distribution.
It describes the probabilities of one or more variables while ignoring (or marginalizing over) the others.
How is the marginal probability distribution calculated?
By summing across certain variables in the joint distribution.
p(c1) = ∑_{c2} p(c1, c2)
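A minimal Python sketch of marginalization, using a made-up joint table for two independent fair coins:
import numpy as np

# Hypothetical joint distribution p(c1, c2) for two independent fair coins
# (rows index c1, columns index c2)
p_joint = np.array([[0.25, 0.25],
                    [0.25, 0.25]])

# Marginal p(c1): sum over all states of c2 (axis 1)
p_c1 = p_joint.sum(axis=1)
print(p_c1)  # [0.5 0.5]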
If we calculate the probability distribution of a system with two magically linked (dependent) variables, will the distribution be uniform?
No
What is a conditional probability distribution?
It is another way to represent probabilities in a system with multiple variables. It describes the likelihood of obtaining the outcomes of certain variables, assuming the other variables are known.
How is conditional probability denoted?
It is noted as:
p(A|B) (“the probability of A given B”).
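Conditional probability can be computed from a joint distribution via p(A|B) = p(A, B) / p(B); here is a minimal Python sketch using a made-up joint table:
import numpy as np

# Hypothetical joint distribution p(A, B): rows index A, columns index B
p_joint = np.array([[0.4, 0.1],
                    [0.1, 0.4]])

p_B = p_joint.sum(axis=0)    # marginal p(B): sum over A
p_A_given_B = p_joint / p_B  # p(A|B) = p(A, B) / p(B), column by column
print(p_A_given_B)           # each column sums to 1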
What is Data Binning?
Grouping multiple observations of some variables (across time or trials). It is a preprocessing technique used to reduce the effect of minor observation errors.
E.g., when measuring voltage, we could bin the data into these three categories:
> 1, < -1, or between -1 and 1.
How is the probability of a state estimated?
The number of observations of that state divided by the total number of observations across all states.
s: a state
N(s): number of experimental observations of state s
Nobs: total number of experimental observations
p(s) = N(s) / Nobs
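A minimal Python sketch of this estimator, with invented observations:
from collections import Counter

# Hypothetical discretized observations of one variable
observations = ["low", "high", "low", "mid", "low", "high", "mid", "low"]

counts = Counter(observations)  # N(s) for each state s
n_obs = len(observations)       # Nobs
p = {s: n / n_obs for s, n in counts.items()}  # p(s) = N(s) / Nobs
print(p)  # {'low': 0.5, 'high': 0.25, 'mid': 0.25}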
What guideline can we follow to know how many observations to perform in order to adequately sample the space of possible joint states?
- More than 10 observations per possible state is ideal.
What is meant by the assumption of stationarity?
The assumption that each observation contributes to a picture of the same probability distribution.
What is discretization?
Converting continuous data into discrete data.
What are the two main binning procedures?
Uniform width and uniform count.
What is the principle of uniform width binning?
Dividing the total range of the data into Nbins equal-WIDTH bins.
What is the principle of uniform count binning?
Dividing the total range of the data into Nbins equal-COUNT bins: the probability of falling into any one bin is the same as for the others.
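A minimal Python sketch contrasting the two procedures on made-up data; bin counts differ under uniform width but are roughly equal under uniform count:
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)  # hypothetical continuous recordings
n_bins = 4

# Uniform width: edges equally spaced across the data range
width_edges = np.linspace(data.min(), data.max(), n_bins + 1)

# Uniform count: edges at quantiles, so each bin holds about the same
# number of observations
count_edges = np.quantile(data, np.linspace(0, 1, n_bins + 1))

print(np.histogram(data, bins=width_edges)[0])  # unequal counts
print(np.histogram(data, bins=count_edges)[0])  # ~250 per bin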
What is parameter fishing?
Testing different parameters until the result becomes statistically significant. It leads to false positives and misleading conclusions. It is a form of p-hacking: manipulating the analysis to get a desirable finding.
What is a null model?
The default model used in scientific experiments to represent what you would expect to happen if nothing special is going on. It is used as a comparison.
What are kernel-based or binless strategies?
Other methods for handling continuous values.
How is entropy denoted?
H(x)
What is H(X)?
Entropy.
Does the uniform count binning procedure minimize or maximize entropy?
It maximizes the entropy.
How many bits is a byte?
8
What is joint entropy? How is it denoted?
H(X, Y); the entropy of a system with more than one variable.
What is H(X, Y)?
Joint entropy.
What is conditional entropy? How is it denoted?
Quantifies the average uncertainty in a variable given the state of another variable.
H(X | Y)
Express joint entropy as a function of entropy and conditional entropy.
H(X, Y) = H(X) + H(Y | X)
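A minimal Python sketch verifying this chain rule on a made-up joint distribution:
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Hypothetical joint distribution p(x, y): rows index X, columns index Y
p_xy = np.array([[0.3, 0.2],
                 [0.1, 0.4]])

H_xy = entropy(p_xy.ravel())  # joint entropy H(X, Y)
p_x = p_xy.sum(axis=1)        # marginal p(x)
H_x = entropy(p_x)            # H(X)

# H(Y|X) = sum over x of p(x) * H(Y | X = x)
H_y_given_x = sum(px * entropy(row / px) for px, row in zip(p_x, p_xy))

print(np.isclose(H_xy, H_x + H_y_given_x))  # True: H(X,Y) = H(X) + H(Y|X)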
Explain this equation:
H(X) = H(X|Y) + I(X;Y)
The entropy of X is the sum of the conditional entropy (the uncertainty in X given knowledge of Y) and the information provided by Y about X (= I(X;Y)).
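A minimal Python sketch checking this identity on a made-up joint distribution, computing I(X;Y) and H(X|Y) independently of each other:
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

# Hypothetical joint distribution p(x, y): rows index X, columns index Y
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# I(X;Y) from its definition: sum p(x,y) * log2(p(x,y) / (p(x)p(y)))
I_xy = sum(p_xy[i, j] * np.log2(p_xy[i, j] / (p_x[i] * p_y[j]))
           for i in range(2) for j in range(2) if p_xy[i, j] > 0)

# H(X|Y) = sum over y of p(y) * H(X | Y = y)
H_x_given_y = sum(py * entropy(p_xy[:, j] / py) for j, py in enumerate(p_y))

print(np.isclose(entropy(p_x), H_x_given_y + I_xy))  # True: H(X) = H(X|Y) + I(X;Y)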
In what unit is I(X;Y) measured, and what is it?
Bits.
The information provided by Y about X (the mutual information).
What is p(x, y) if the two variables are independent?
p(x, y) = p(x)p(y)