Calculating the root mean square deviation (RMSD) with NumPy in Python

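The root mean square deviation (RMSD) is a measure of the average distance between corresponding atoms (or other particles) of two superimposed structures. In Python, we can calculate the RMSD using the numpy library. In this article, we will explore three different ways to do so. All three take two coordinate arrays of the same shape, with one row per point, and assume that the structures have already been superimposed: the functions only subtract the coordinates and do not align them.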
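For reference, the definition described above can be written as a formula. The symbols are chosen here for illustration: N is the number of points, and a_i and b_i are the position vectors of the i-th point in the first and second structure.

```latex
\mathrm{RMSD}(A, B) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \lVert \mathbf{a}_i - \mathbf{b}_i \rVert^{2}}
```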

Method 1: Using numpy’s built-in functions


import numpy as np

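def calculate_rmsd(coords1, coords2):
    # Per-coordinate differences between corresponding points
    diff = coords1 - coords2
    # Squared distance for each point (sum over x, y, z in each row)
    squared_dist = np.sum(np.square(diff), axis=1)
    # Average over the points, then take the square root
    mean_squared_dist = np.mean(squared_dist)
    rmsd = np.sqrt(mean_squared_dist)
    return rmsd

# Example usage
coords1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
coords2 = np.array([[2, 3, 4], [5, 6, 7], [8, 9, 10]])
rmsd = calculate_rmsd(coords1, coords2)
print("RMSD:", rmsd)

In this method, we first calculate the difference between the two sets of coordinates using numpy’s element-wise subtraction. Then we square the differences, sum them along each row to get the squared distance for each point, take the mean over the points, and finally take the square root to obtain the RMSD. Summing per point before averaging matters: calling np.mean on the whole array of squared differences would average over every individual x, y and z value rather than over the points, which gives a result smaller by a factor of √3 for 3D coordinates.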
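As a quick check on the normalization, the sketch below compares averaging over the points with averaging over every individual coordinate value, using the same example arrays. The variable names `rmsd_per_point` and `rms_per_value` are illustrative only and are not part of the functions in this article.

```python
import numpy as np

# Same example arrays as in the article
coords1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
coords2 = np.array([[2, 3, 4], [5, 6, 7], [8, 9, 10]])
diff = coords1 - coords2

# Average over the points: sum x, y, z per row first, then take the mean
rmsd_per_point = np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))

# Average over every individual value (N * 3 numbers)
rms_per_value = np.sqrt(np.mean(diff ** 2))

n_dims = coords1.shape[1]
print("per point:", rmsd_per_point)
print("per value:", rms_per_value)
print("ratio equals sqrt(n_dims):",
      np.isclose(rmsd_per_point / rms_per_value, np.sqrt(n_dims)))
```

Every point in the example is displaced by one unit along each axis, so each squared distance is 3 and the per-point value is √3, while the per-value figure is 1.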

Method 2: Using numpy’s dot product


import numpy as np

def calculate_rmsd(coords1, coords2):
    # Flatten the differences into a one-dimensional vector
    diff = (coords1 - coords2).ravel()
    # The dot product of a vector with itself is the sum of its squares
    sum_squared_diff = np.dot(diff, diff)
    # Divide by the number of points (rows), not the number of values
    rmsd = np.sqrt(sum_squared_diff / len(coords1))
    return rmsd

# Example usage
coords1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
coords2 = np.array([[2, 3, 4], [5, 6, 7], [8, 9, 10]])
rmsd = calculate_rmsd(coords1, coords2)
print("RMSD:", rmsd)

In this method, we flatten the differences into a one-dimensional vector and take its dot product with itself, which gives the sum of squared differences. We divide the sum by the number of points (the number of rows in coords1) and take the square root to obtain the RMSD.
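To see why a dot product can stand in for the sum of squares, here is a minimal sketch on a small, made-up array of differences. It checks that the dot product of the flattened array with itself matches `np.sum(np.square(...))`.

```python
import numpy as np

# Made-up differences for two points in 3D (illustrative values only)
diff = np.array([[-1.0, -1.0, -1.0],
                 [0.5, 0.0, -2.0]])

flat = diff.ravel()                       # shape (6,)
sum_via_dot = np.dot(flat, flat)          # dot product of the vector with itself
sum_via_square = np.sum(np.square(diff))  # explicit square-and-sum

print(np.isclose(sum_via_dot, sum_via_square))
```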

Method 3: Using numpy’s einsum function


import numpy as np

def calculate_rmsd(coords1, coords2):
    diff = coords1 - coords2
    squared_diff = np.einsum('ij,ij->i', diff, diff)
    mean_squared_diff = np.mean(squared_diff)
    rmsd = np.sqrt(mean_squared_diff)
    return rmsd

# Example usage
coords1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
coords2 = np.array([[2, 3, 4], [5, 6, 7], [8, 9, 10]])
rmsd = calculate_rmsd(coords1, coords2)
print("RMSD:", rmsd)

In this method, we use numpy’s einsum function to calculate the squared distance for each point. The 'ij,ij->i' notation multiplies diff by itself element-wise and sums along the second axis, leaving one value per row. We then take the mean of these values and the square root to obtain the RMSD.
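The subscript string is easier to read with a concrete array. The sketch below, using made-up values, shows that 'ij,ij->i' yields one sum per row, and that dropping the output index ('ij,ij->') sums over everything.

```python
import numpy as np

# Made-up differences for two points in 3D (illustrative values only)
diff = np.array([[1.0, 2.0, 2.0],
                 [0.0, 3.0, 4.0]])

# Keep index i in the output: one squared distance per row
per_row = np.einsum('ij,ij->i', diff, diff)

# No output index: sum over both axes, giving a single number
total = np.einsum('ij,ij->', diff, diff)

print(per_row)  # squared distances 9 and 25
print(total)    # their sum, 34
```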

Among the three options, Method 1 using numpy’s built-in functions is the most straightforward and readable. It follows a step-by-step approach and is easy to understand for beginners. Method 3 using numpy’s einsum function is more concise and can be faster on large arrays, because einsum multiplies and sums in a single call without first building a temporary array of squared differences. Any speed-up depends on the array size and on the numpy build, so it is worth timing the functions on your own data before choosing on performance grounds.

Ultimately, the choice between the methods depends on the specific requirements of your project. If simplicity and readability are crucial, go with Method 1. If performance is a priority, especially for larger datasets, consider using Method 3.
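If performance matters, it is better to measure than to assume. The following is a rough timing sketch using the standard-library timeit module. The function names `rmsd_sum` and `rmsd_einsum`, the array size and the repeat count are all arbitrary choices made for illustration, and the timings will vary between machines.

```python
import timeit
import numpy as np

def rmsd_sum(a, b):
    # Square, sum per point, average, square root
    diff = a - b
    return np.sqrt(np.mean(np.sum(np.square(diff), axis=1)))

def rmsd_einsum(a, b):
    # Same calculation with einsum producing the per-point sums
    diff = a - b
    return np.sqrt(np.mean(np.einsum('ij,ij->i', diff, diff)))

# Random test data: 100,000 points in 3D (arbitrary size)
rng = np.random.default_rng(0)
a = rng.normal(size=(100_000, 3))
b = rng.normal(size=(100_000, 3))

# Both variants should agree before their speed is compared
assert np.isclose(rmsd_sum(a, b), rmsd_einsum(a, b))

repeats = 20
for fn in (rmsd_sum, rmsd_einsum):
    seconds = timeit.timeit(lambda: fn(a, b), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats * 1e3:.3f} ms per call")
```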
