Student’s t-distribution CDF R base documentation

Question

In the context of the Student’s t-distribution cumulative distribution function, R Version 4.3.1’s ?dt documentation highlights the following result:

However, upon attempting to verify the accuracy of this formula, an inconsistency arises, as illustrated in the following code snippet:

v <- 5
t <- -1

## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087

## Application of the theoretical result where there is a discrepancy
## based on what is mentioned in R Version 4.3.1's ?dt documentation
1 - pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2, 
          ncp = 0,lower.tail = TRUE)
#> [1] 0.6367825

Created on 2023-10-09 with reprex v2.0.2

This issue raises questions about the accuracy of the documentation. I am seeking clarification to determine whether the problem lies in the documentation itself before reporting a potential mistake to the R project. This inquiry is related to a theoretical concept, where a detailed explanation can be found here

I don't think the derivation is correct. You should refer to equation (7) in the webpage mathworld.wolfram.com/Studentst-Distribution.html — 11 hours ago
Maybe worth noting that this section was only added a year ago: github.com/r-devel/r-svn/commit/… — 9 hours ago
Thank you Ben Bolker for pointing this out. I appreciate your help. — 7 hours ago

score 6 · Accepted Answer · 2023-10-09 13:47:17Z

Hmm, this looks like an error in the documentation. Here is a valid identity:

v <- 5
t <- -1

## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087

x <- (t + sqrt(t^2 + v)) / (2 * sqrt(t^2 + v))
pbeta(q = x, shape1 = v/2, shape2 = v/2, ncp = 0, lower.tail = TRUE)
#> [1] 0.1816087

And another one, closer to the claim of the R doc (it holds for t <= 0, as is the case here):

pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2, 
            ncp = 0,lower.tail = TRUE) / 2
#> [1] 0.1816087
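
As a rough check beyond the single point t = -1, both identities can be compared with pt() over several quantiles. This is only a sketch: the grids t_neg and t_all and the names id1 and id2 are illustrative choices, not part of the original answer. The halved Beta(v/2, 1/2) form is checked for t <= 0 only, since for t > 0 it yields the upper tail rather than the CDF.

```r
v <- 5
t_neg <- c(-3, -1, -0.25, 0)       # non-positive quantiles only (illustrative grid)
t_all <- c(t_neg, 0.25, 1, 3)      # both signs (illustrative grid)

# Identity 1: symmetric Beta(v/2, v/2) form, checked for both signs of t
x <- (t_all + sqrt(t_all^2 + v)) / (2 * sqrt(t_all^2 + v))
id1 <- pbeta(x, shape1 = v / 2, shape2 = v / 2)

# Identity 2: halved Beta(v/2, 1/2) form, checked only for t <= 0
id2 <- pbeta(v / (v + t_neg^2), shape1 = v / 2, shape2 = 1 / 2) / 2

# Compare with the built-in CDF up to all.equal()'s default tolerance
all.equal(id1, pt(t_all, df = v))
all.equal(id2, pt(t_neg, df = v))
```

If the identities hold as stated, both all.equal() calls should report agreement.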

score 2 · Accepted Answer · 2023-10-09 19:42:24Z

First of all, I don’t think that the formula in R’s manual is correct!

You should use the following formula to calculate the CDF:

> 1 / 2 + 1 / 2 * (pbeta(1, v / 2, v) - pbeta(v / (v + t^2), v / 2, 1 / 2)) * sign(t)
[1] 0.1816087

The formula comes from equation (7) at https://mathworld.wolfram.com/Studentst-Distribution.html

Breaking down the formula

As we can see, the CDF depends on the sign of t. Note that pbeta(1, a, b) = 1 always holds, so the expression can be simplified further and written piecewise in terms of t:

If t>0

1 - 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)

Otherwise

0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
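
For readers who want to see where the two branches come from, here is a short sketch of one standard argument. The symbols Z and V are introduced here for illustration and do not appear in the original answer; I_x(a, b) denotes the regularized incomplete beta function, which is what pbeta(x, a, b) computes.

```latex
\begin{aligned}
T &= \frac{Z}{\sqrt{V/v}}, \qquad Z \sim N(0,1),\quad V \sim \chi^2_v \ \text{independent},\\[4pt]
\frac{v}{v+T^2} &= \frac{V}{V+Z^2} \sim \mathrm{Beta}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right),\\[4pt]
P\bigl(|T| \ge |t|\bigr) &= P\!\left(\frac{v}{v+T^2} \le \frac{v}{v+t^2}\right)
  = I_{v/(v+t^2)}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right),\\[4pt]
P(T \le t) &=
\begin{cases}
\tfrac{1}{2}\, I_{v/(v+t^2)}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right), & t \le 0,\\[4pt]
1 - \tfrac{1}{2}\, I_{v/(v+t^2)}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right), & t > 0.
\end{cases}
\end{aligned}
```

The last line uses only the symmetry of the t-distribution about zero: each tail carries half of the two-sided probability.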

Since you have t <- -1, which is negative, you should apply the second expression (note the factor 0.5 in front of pbeta), and you will see that

> 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
[1] 0.1816087
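
The two branches can also be wrapped in a small vectorized function and compared with pt() on a grid. This is a minimal sketch: pt_via_beta and t_grid are hypothetical names chosen for illustration, not base R functions or anything from the documentation.

```r
# Hypothetical helper (not part of base R): central t CDF written via pbeta()
pt_via_beta <- function(t, v) {
  # Half of the two-sided tail probability P(|T| >= |t|)
  half_tail <- 0.5 * pbeta(v / (v + t^2), shape1 = v / 2, shape2 = 1 / 2)
  # Lower tail for t <= 0, complement of the upper tail for t > 0
  ifelse(t > 0, 1 - half_tail, half_tail)
}

t_grid <- seq(-4, 4, by = 0.5)   # illustrative grid covering both signs
all.equal(pt_via_beta(t_grid, v = 5), pt(t_grid, df = 5))

pt_via_beta(-1, v = 5)           # same case as in the question
```

For t = -1 and v = 5 this evaluates the same expression as the line above, so it should reproduce 0.1816087.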

Thank you ThomasIsCoding for pointing out another alternative. — 7 hours ago
Thank you again ThomasIsCoding. Now I understand more!!! I will try to deduce the case when t > 0 using the probability distribution function. I was able initially to deduce only the case where t < 0. — 1 hour ago
