Unanswered Questions

72,730 questions with no upvoted or accepted answers
24 votes
0 answers
19k views

When should I use the Normal distribution or the Uniform distribution when using Xavier initialization?

Xavier initialization seems to be used quite widely now to initialize connection weights in neural networks, especially deep ones (see What are good initial weights in a neural network?). The ...
23 votes
0 answers
1k views

Is there a general expression for ancillary statistics in exponential families?

An i.i.d. sample $X_1,\dots,X_n$ from a scale family with c.d.f. $F(\frac{x}{\sigma})$ has $S(X)$ as an ancillary statistic if $S(X)$ depends on the sample only through $\frac{X_1}{X_n},\cdots,\frac{X_{...
21 votes
1 answer
1k views

Physical/pictorial interpretation of higher-order moments

I'm preparing a presentation about parallel statistics. I plan to illustrate the formulas for distributed computation of the mean and variance with examples involving center of gravity and moment of ...
20 votes
0 answers
732 views

Is the Wilcoxon two-sample test maximally powered to detect proportional odds alternatives?

We know from the literature that the Wilcoxon-Mann-Whitney two-sample rank sum test is optimal for detecting simple location shifts when comparing two continuous random variables that each have a ...
20 votes
0 answers
2k views

Implementation of CoVaR (a systemic risk measure) in R

I'm trying to estimate CoVaR using a bivariate DCC-GARCH model in R. CoVaR is a dependence-adjusted version of VaR, first introduced by Adrian and Brunnermeier (2011). However, this ...
19 votes
0 answers
500 views

Empirical Bayes (In)Admissibility

Most of the time, sticking to a pure Bayesian approach to statistics with proper priors leads to admissible estimators. Nevertheless, there is a good reason to use Empirical Bayes in many cases, and ...
18 votes
0 answers
1k views

What is Shannon's source entropy?

Suppose that $\{X_n; Y_n\}$ is a random process with a discrete alphabet, that is, taking on values in a discrete set, for data of length $n$. They correspond to the input and output of a communication ...
17 votes
0 answers
2k views

Rademacher complexity of logistic regression

Consider logistic regression. We have the logistic loss function, $\phi: R\rightarrow [0,1], \phi(u)=\log(1+\exp(-u))$, which is Lipschitz, and we have the linear function class $F=\{f_w:R^d \...
17 votes
0 answers
5k views

How to compare two distance matrices?

Suppose that I have two distance matrices for the same set of items. By a distance matrix I mean a square matrix whose (i,j)th entry holds the distance (in terms of cosine similarity) between ith and ...
17 votes
0 answers
13k views

Time series regression with overlapping data

I am looking at a regression model that regresses Year-on-Year stock index returns on lagged (12 months) Year-on-Year returns of the same stock index, credit spread (difference between monthly mean ...
17 votes
1 answer
996 views

How can I measure model performance with weighted logistic regression?

I am working with some survey data that uses probability weights. A number of sources explain that likelihood-based tests and fit statistics like likelihood-ratio, AIC, and BIC are not valid in the ...
16 votes
0 answers
415 views

What is tantile regression?

My question follows on this discussion of medials and tantiles vs medians and quantiles from earlier this year: When would we use tantiles and the medial, rather than quantiles and the median? As ...
16 votes
0 answers
1k views

Understanding Sequential Probability Ratio Test (SPRT) Likelihood Ratio

I am a software developer looking to develop an alternative for the simple hypothesis testing scheme described here. In short, the test works as follows: Two URLs are compared for their ability to ...
16 votes
0 answers
568 views

Asymptotic property of tuning parameter in penalized regression

I'm currently working on asymptotic properties of penalized regression. I've read a myriad of papers by now, but there is an essential issue that I cannot get my head around. To keep things simple, I'...
15 votes
0 answers
2k views

How to calculate percent partial deviance explained by each predictor variable in a GAM model?

I am trying to find a sensible way to calculate the deviance explained by each predictor variable in a GAM model and need some input on my calculations. Following Simon Wood's example on the thread ...
