Augmented Dickey-Fuller test in Python with statsmodels

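The Augmented Dickey-Fuller (ADF) test is a statistical test commonly used in econometrics to check whether a time series is stationary. Strictly speaking, its null hypothesis is that the series has a unit root (and is therefore non-stationary), so a small p-value is evidence in favor of stationarity. In Python, we can perform the ADF test using the statsmodels library. In this article, we will explore three different ways to conduct the ADF test in Python.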

Option 1: Using the adfuller() function

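The statsmodels library provides the adfuller() function, which allows us to perform the ADF test. This function takes a one-dimensional time series as input. With its default arguments it returns a tuple of six items: the test statistic, the p-value, the number of lags used, the number of observations used, a dictionary of critical values, and the best information criterion value found during lag selection. That ordering is why the critical values are read from index 4 below.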

import statsmodels.api as sm

# Assuming 'data' is the time series data
result = sm.tsa.stattools.adfuller(data)

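# Extracting the test statistic, p-value, and critical values
# (result[2] and result[3] hold the number of lags and observations used)
test_statistic = result[0]
p_value = result[1]
critical_values = result[4]  # dict keyed by '1%', '5%', '10%'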

# Printing the results
print("Test Statistic:", test_statistic)
print("P-value:", p_value)
print("Critical Values:", critical_values)

This approach is straightforward and provides the key outputs of the ADF test. However, the results must be picked out of the returned tuple by position, and that extraction code has to be repeated for every series tested.

Option 2: Using the adfuller() function with a custom function

To simplify the process and make the code more reusable, we can create a custom function that encapsulates the ADF test and result extraction.

import statsmodels.api as sm

def adf_test(data):
    result = sm.tsa.stattools.adfuller(data)
    test_statistic = result[0]
    p_value = result[1]
    critical_values = result[4]
    return test_statistic, p_value, critical_values

# Assuming 'data' is the time series data
test_statistic, p_value, critical_values = adf_test(data)

# Printing the results
print("Test Statistic:", test_statistic)
print("P-value:", p_value)
print("Critical Values:", critical_values)

This approach encapsulates the ADF test in a function, making it easier to reuse the code and obtain the results in a more concise manner.

Option 3: Using the adfuller() function with pandas DataFrame

If the time series data is stored in a pandas DataFrame, we can pass a single column (a pandas Series) directly to the adfuller() function. The column should not contain missing values, so drop or fill them first.

import pandas as pd
import statsmodels.api as sm

# Assuming 'df' is the pandas DataFrame with a column named 'data'
result = sm.tsa.stattools.adfuller(df['data'])

# Extracting the test statistics, p-value, and critical values
test_statistic = result[0]
p_value = result[1]
critical_values = result[4]

# Printing the results
print("Test Statistic:", test_statistic)
print("P-value:", p_value)
print("Critical Values:", critical_values)

This approach is useful when working with pandas DataFrames as it allows us to directly apply the ADF test to a specific column.

All three options call the same adfuller() function and therefore produce the same results for the same data; they differ only in how the code is organized. Option 1 is the most direct but repeats the extraction code each time. Option 2 wraps the test in a reusable function, which keeps the calling code concise. Option 3 shows that a pandas DataFrame column can be passed in directly. The best option therefore depends on how the data is stored and how often the test needs to be run in the project.

11 Responses

    1. I couldn't disagree more! Option 2 is clearly the superior choice. Why waste time with a pandas DataFrame when you can achieve the same results with a simpler and more efficient solution? Don't overcomplicate things for no reason.

    1. Sorry, but I have to disagree. Option 2 is far superior. It offers more flexibility and customization. Just like my morning coffee, I prefer to have choices and make it exactly how I want it. Cheers! ☕️
