In the context of the Student’s t-distribution cumulative distribution function, the ?dt documentation in R version 4.3.1 gives a formula that expresses pt in terms of pbeta.
However, when I try to verify this formula, I find an inconsistency, as the following code snippet shows:
v <- 5
t <- -1
## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087
## Application of the theoretical result, which shows a discrepancy,
## based on what is mentioned in R Version 4.3.1's ?dt documentation
1 - pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2,
          ncp = 0, lower.tail = TRUE)
#> [1] 0.6367825
Created on 2023-10-09 with reprex v2.0.2
This issue raises questions about the accuracy of the documentation. I am seeking clarification to determine whether the problem lies in the documentation itself before reporting a potential mistake to the R project. This inquiry is related to a theoretical concept, where a detailed explanation can be found here
2 Answers
Hmm looks like an error. Here is a valid identity:
v <- 5
t <- -1
## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087
x = (t + sqrt(t * t + v)) / (2.0 * sqrt(t * t + v))
pbeta(q = x, shape1 = v/2, shape2 = v/2, ncp = 0, lower.tail = TRUE)
#> [1] 0.1816087
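To see that this identity is not specific to t = -1 and v = 5, it can be compared with pt() over a grid of values. This is only a sketch: pt_sym_beta is a hypothetical helper name introduced for illustration, and the grid and the degrees of freedom are arbitrary choices.

```r
# Hypothetical helper (not part of base R): central t CDF via a
# symmetric beta distribution with both shape parameters equal to v / 2.
pt_sym_beta <- function(t, v) {
  x <- (t + sqrt(t^2 + v)) / (2 * sqrt(t^2 + v))
  pbeta(x, shape1 = v / 2, shape2 = v / 2)
}

t_grid <- seq(-4, 4, by = 0.5)
for (v in c(1, 2, 5, 30)) {
  # Largest absolute difference from pt() for this number of degrees of freedom
  print(max(abs(pt(t_grid, df = v) - pt_sym_beta(t_grid, v))))
}
```

The printed differences should be at the level of floating-point rounding, for negative and positive t alike.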
And another one, closer to the claim of the R doc:
pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2,
      ncp = 0, lower.tail = TRUE) / 2
#> [1] 0.1816087
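The halved expression only depends on t through t^2, so it cannot tell t = -1 from t = 1. A short check, under the assumption that pbeta(v / (v + t^2), v/2, 1/2) is the two-sided tail probability P(|T| >= |t|), suggests why the expression in the question returns 0.6367825 rather than the CDF:

```r
v <- 5
t <- -1
x <- v / (v + t^2)

# Assumption being checked: this is the two-sided tail probability P(|T| >= |t|)
two_sided <- pbeta(x, shape1 = v / 2, shape2 = 1 / 2)
all.equal(two_sided, 2 * pt(-abs(t), df = v))

# Then 1 - pbeta(...) is the central probability P(|T| <= |t|), not the CDF
all.equal(1 - two_sided, pt(abs(t), df = v) - pt(-abs(t), df = v))

# Halving gives the lower tail, which matches pt(t) only when t <= 0
all.equal(two_sided / 2, pt(t, df = v))
```

Each comparison should return TRUE. For t > 0 the halved value is the upper tail, so the CDF would be one minus it.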
First of all, I don’t think that the formula in R’s manual is correct!
You should use the following formula to calculate the CDF:
> 1 / 2 + 1 / 2 * (pbeta(1, v / 2, 1 / 2) - pbeta(v / (v + t^2), v / 2, 1 / 2)) * sign(t)
[1] 0.1816087
The formula comes from https://mathworld.wolfram.com/Studentst-Distribution.html, see equation (7).
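In mathematical notation, the R expression above can be read as follows. This is a transcription of the code rather than a quotation of the cited page, and it assumes that I_x(a, b) denotes the regularized incomplete beta function, which is what pbeta(x, a, b) computes:

```latex
F_v(t) = \frac{1}{2} + \frac{1}{2}
  \left[ I_{1}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right)
       - I_{\frac{v}{v + t^2}}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right) \right]
  \operatorname{sgn}(t)
```

The first term inside the brackets equals 1 for any positive shape parameters, so its second parameter does not affect the value.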
Break-down into the function
As we can see, the CDF depends on the sign of t. Note that pbeta(1, a, b) = 1 always holds, so the expression can be simplified further and written in a piecewise manner in terms of t:
- If t > 0
1 - 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
- Otherwise
0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
Note that you have t <- -1, which is negative, so you should apply the second expression (be aware of the factor 0.5 before pbeta), and you will see that
> 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
[1] 0.1816087
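The two cases can be wrapped into one vectorized function and compared with pt() for both signs of t. This is a minimal sketch: pt_via_pbeta is a hypothetical name introduced for illustration, and it covers only the central case (ncp = 0).

```r
# Hypothetical helper (not part of base R): central t CDF via pbeta,
# using the piecewise form described above.
pt_via_pbeta <- function(t, v) {
  half_tail <- 0.5 * pbeta(v / (v + t^2), shape1 = v / 2, shape2 = 1 / 2)
  # Upper branch for t > 0, lower branch otherwise (t = 0 gives 0.5)
  ifelse(t > 0, 1 - half_tail, half_tail)
}

v <- 5
t_grid <- c(-3, -1, 0, 1, 3)
cbind(t = t_grid,
      pt = pt(t_grid, df = v),
      via_pbeta = pt_via_pbeta(t_grid, v))
```

The pt and via_pbeta columns should agree row by row.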
- Thank you ThomasIsCoding for pointing out another alternative. – luifrancgom, 7 hours ago
- @luifrancgom I added more explanation in my answer. – ThomasIsCoding, 5 hours ago
- Thank you again ThomasIsCoding. Now I understand more!!! I will try to deduce the case when t > 0 using the probability distribution function. I was able initially to deduce only the case where t < 0. – luifrancgom, 1 hour ago
I don't think the derivation is correct. You should refer to equation (7) in the webpage mathworld.wolfram.com/Studentst-Distribution.html
11 hours ago
Maybe worth noting that this section was only added a year ago: github.com/r-devel/r-svn/commit/…
9 hours ago
Thank you Ben Bolker for pointing this out. I appreciate your help.
7 hours ago