Student’s t-distribution CDF R base documentation


In the context of the Student’s t-distribution cumulative distribution function, R Version 4.3.1’s ?dt documentation highlights the following result:

[Image: the CDF formula as shown in the ?dt documentation]

However, upon attempting to verify the accuracy of this formula, an inconsistency arises, as illustrated in the following code snippet:

v <- 5
t <- -1

## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087

## Application of the theoretical result, which shows a discrepancy
## with what is stated in R Version 4.3.1's ?dt documentation
1 - pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2,
          ncp = 0, lower.tail = TRUE)
#> [1] 0.6367825

Created on 2023-10-09 with reprex v2.0.2

This issue raises questions about the accuracy of the documentation. I am seeking clarification to determine whether the problem lies in the documentation itself before reporting a potential mistake to the R project. This inquiry is related to a theoretical concept, where a detailed explanation can be found here


2 Answers

Hmm, that looks like an error in the documentation. Here is a valid identity:

v <- 5
t <- -1

## Student's t-distribution cumulative distribution function
pt(q = t, df = v, lower.tail = TRUE, ncp = 0)
#> [1] 0.1816087

x = (t + sqrt(t * t + v)) / (2.0 * sqrt(t * t + v))
pbeta(q = x, shape1 = v/2, shape2 = v/2, ncp = 0, lower.tail = TRUE)
#> [1] 0.1816087

And another one, closer to the claim of the R doc (it holds for t <= 0, which is the case here):

pbeta(q = v / (v + t^2), shape1 = v/2, shape2 = 1/2,
      ncp = 0, lower.tail = TRUE) / 2
#> [1] 0.1816087
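
As a rough check of how far each identity reaches, the sketch below compares both forms with pt() at a few test points. The values in t_vals are arbitrary choices for illustration, not taken from the question. The symmetric beta form is expected to match for every t, while the halved form is expected to match only for t <= 0 and to give the upper tail for t > 0.

```r
v <- 5
t_vals <- c(-3, -1, 0, 1, 3)  # arbitrary test points (assumption, for illustration)

# Reference values from R's own implementation
ref <- pt(q = t_vals, df = v)

# Identity 1: symmetric beta form, expected to hold for every t
x <- (t_vals + sqrt(t_vals^2 + v)) / (2 * sqrt(t_vals^2 + v))
id1 <- pbeta(q = x, shape1 = v / 2, shape2 = v / 2)

# Identity 2: halved incomplete-beta form, expected to hold only for t <= 0
id2 <- pbeta(q = v / (v + t_vals^2), shape1 = v / 2, shape2 = 1 / 2) / 2

all.equal(id1, ref)
all.equal(id2[t_vals <= 0], ref[t_vals <= 0])
# For t > 0 the halved form should equal the upper tail, 1 - pt()
all.equal(id2[t_vals > 0], 1 - ref[t_vals > 0])
```

So the second identity needs a sign-dependent correction before it can serve as a general CDF.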



First of all, I don't think the formula in R's manual is correct.

You should use the following formula to calculate the CDF:

> 1 / 2 + 1 / 2 * (pbeta(1, v / 2, 1 / 2) - pbeta(v / (v + t^2), v / 2, 1 / 2)) * sign(t)
[1] 0.1816087

The formula comes from https://mathworld.wolfram.com/Studentst-Distribution.html, see equation (7).


Break-down into the function

As we can see, the CDF depends on the sign of t. Note that pbeta(1, a, b) = 1 always holds, so the expression can be simplified further and written piecewise in t:

  • If t > 0
1 - 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
  • Otherwise
0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
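
One way to see where the factor 0.5 and the two cases come from is sketched below. It assumes the standard representation T = Z / sqrt(W / v), with Z standard normal and W chi-squared with v degrees of freedom, independent of each other. That representation is not stated in this thread, so treat this as an outline rather than a full proof.

```latex
% Sketch, assuming T = Z / \sqrt{W/v} with Z ~ N(0,1), W ~ \chi^2_v independent
\begin{align*}
X &= \frac{v}{v + T^2} = \frac{W}{W + Z^2}
     \sim \mathrm{Beta}\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right), \\
P(|T| > |t|) &= P\!\left(X < \frac{v}{v + t^2}\right)
     = I_x\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right),
     \qquad x = \frac{v}{v + t^2}, \\
F_v(t) &=
\begin{cases}
  \tfrac{1}{2}\, I_x\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right), & t \le 0, \\[4pt]
  1 - \tfrac{1}{2}\, I_x\!\left(\tfrac{v}{2}, \tfrac{1}{2}\right), & t > 0.
\end{cases}
\end{align*}
```

Here I_x(a, b) is the regularized incomplete beta function, which is what pbeta(x, a, b) returns. The last line uses the symmetry of the t-distribution around zero: each tail carries half of P(|T| > |t|).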

Since you have t <- -1, which is negative, you should apply the second expression (note the factor 0.5 in front of pbeta), and you will see that

> 0.5 * pbeta(v / (v + t^2), v / 2, 1 / 2)
[1] 0.1816087
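
The piecewise rule can also be wrapped in a small vectorized function and compared with pt() over a grid. This is only a sketch: pt_via_pbeta is a hypothetical helper name, not something from base R, and the grid and degrees of freedom are arbitrary choices.

```r
# Hypothetical helper (not part of base R): central t CDF built from pbeta
pt_via_pbeta <- function(t, v) {
  # Half of the two-sided tail probability P(|T| > |t|)
  half_tail <- 0.5 * pbeta(q = v / (v + t^2), shape1 = v / 2, shape2 = 1 / 2)
  # Lower tail for t <= 0, complement of the upper tail for t > 0
  ifelse(t > 0, 1 - half_tail, half_tail)
}

t_grid <- seq(-4, 4, by = 0.5)  # arbitrary grid (assumption, for illustration)
for (df in c(1, 2, 5, 30)) {
  print(all.equal(pt_via_pbeta(t_grid, df), pt(q = t_grid, df = df)))
}
```

In practice pt() itself remains the better choice; the wrapper is only meant to confirm that the piecewise expression agrees with it on both sides of zero.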


  • Thank you ThomasIsCoding for pointing out another alternative.

    – luifrancgom

    7 hours ago

  • @luifrancgom I added more explanation in my answer.

    – ThomasIsCoding

    5 hours ago

  • Thank you again ThomasIsCoding. Now I understand more!!! I will try to deduce the case when t > 0 using the probability distribution function. I was able initially to deduce only the case where t < 0.

    – luifrancgom

    1 hour ago



