As discussed in our post on adaptive control, estimating the unknown inertial parameters of a robot is a challenging problem. One of the key issues that complicates inertial parameter estimation is the physical consistency condition on the inertial parameters.
In detail, consider the manipulator equation of an open-chain \(n\)-DOF robot [1], [2]: \[ \mathbf{M(q)\ddot{q}+C(q,\dot{q})\dot{q}+g(q)}=\boldsymbol{\tau}_{in} \tag{1} \] Here, \(\mathbf{M(q)}\) is the mass matrix, \(\mathbf{C(q,\dot{q})\dot{q}}\) collects the Coriolis and centrifugal terms, \(\mathbf{g(q)}\) is the gravity vector, and \(\boldsymbol{\tau}_{in}\) is the joint torque input. We assume no external forces are applied to the robot.
It is well-known that the left-hand side of Equation 1 can be represented linearly in the system inertial parameters [3], [4]: \[ \mathbf{M(q)\ddot{q}+C(q,\dot{q})\dot{q}+g(q)}=\mathbf{Y}(\mathbf{q},\dot{\mathbf{q}},\ddot{\mathbf{q}})\,\mathbf{a} = \boldsymbol{\tau}_{in} \tag{2} \] In this equation, \(\mathbf{Y(q,\dot{q}, \ddot{q})}\in\mathbb{R}^{n\times 10n}\) is the known regressor matrix; \(\mathbf{a}:=(\mathbf{a}_1, \mathbf{a}_2, \cdots, \mathbf{a}_n)\in\mathbb{R}^{10n}\) is the vector of unknown parameters of the \(n\)-DOF robot manipulator; \(\mathbf{a}_i\in\mathbb{R}^{10}\) collects the 10 unknown inertial parameters of the \(i\)-th link, which consist of: \[ \mathbf{a}_i = (m_i, ~ mc_{x,i}, ~mc_{y,i}, ~mc_{z,i}, ~I_{xx,i}, ~I_{xy,i}, ~I_{xz,i}, ~I_{yy,i}, ~I_{yz,i}, ~I_{zz,i} ) \] A detailed explanation of each inertial parameter is given in a later section. Our key goal is to estimate \(\mathbf{a}\) with sufficient accuracy.
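To make the linear-in-parameters structure concrete, consider a single planar link rotating about a fixed \(z\)-axis under gravity, for which \(\tau = I_{zz}\ddot{q} + mc_{x}\,g\cos q - mc_{y}\,g\sin q\). A minimal numerical sketch follows; the link values and the reduced three-parameter regressor are illustrative assumptions, not the full 10-parameter form:

```python
import numpy as np

# Single planar link rotating about a fixed z-axis under gravity:
#   tau = I_zz * ddq + (m c_x) * g * cos(q) - (m c_y) * g * sin(q),
# which is linear in the reduced parameter vector a = (I_zz, m c_x, m c_y).
g = 9.81

def regressor(q, ddq):
    """Reduced regressor Y(q, ddq) in R^{1 x 3} for this single link."""
    return np.array([[ddq, g * np.cos(q), -g * np.sin(q)]])

# Illustrative link: m = 2 kg, COM at (0.3, 0.1) m, I_zz = 0.5 kg m^2
a = np.array([0.5, 2.0 * 0.3, 2.0 * 0.1])
tau = regressor(q=0.4, ddq=1.2) @ a    # torque predicted by Y(q, ddq) a
```

The same structure carries over to the full \(n\)-DOF case, with \(\mathbf{Y}\) of size \(n\times 10n\) and \(\mathbf{a}\) of size \(10n\).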
Based on Equation 2, one might wonder whether a naive least-square method suffices. Assume we have \(K\) data sets of torque inputs \((\boldsymbol{\tau}_{1}, \boldsymbol{\tau}_{2}, \cdots, \boldsymbol{\tau}_{K})\in\mathbb{R}^{nK}\) and the corresponding joint positions \((\mathbf{q}_{1}, \mathbf{q}_{2}, \cdots, \mathbf{q}_{K})\in\mathbb{R}^{nK}\), joint velocities \((\mathbf{\dot{q}}_{1}, \mathbf{\dot{q}}_{2}, \cdots, \mathbf{\dot{q}}_{K})\in\mathbb{R}^{nK}\), and joint accelerations \((\mathbf{\ddot{q}}_{1}, \mathbf{\ddot{q}}_{2}, \cdots, \mathbf{\ddot{q}}_{K})\in\mathbb{R}^{nK}\). This can be concisely denoted as: \[ \underbrace{ \begin{bmatrix} \mathbf{Y}(\mathbf{q_1}, \mathbf{\dot{q}_1}, \mathbf{\ddot{q}_1}) \\ \mathbf{Y}(\mathbf{q_2}, \mathbf{\dot{q}_2}, \mathbf{\ddot{q}_2}) \\ \vdots \\ \mathbf{Y}(\mathbf{q_K}, \mathbf{\dot{q}_K}, \mathbf{\ddot{q}_K}) \\ \end{bmatrix} }_{:=\mathbf{\bar{Y}}} \mathbf{a} = \underbrace{ \begin{bmatrix} \boldsymbol{\tau}_1 \\ \boldsymbol{\tau}_2 \\ \vdots \\ \boldsymbol{\tau}_K \\ \end{bmatrix} }_{:=\boldsymbol{\bar{\tau}}} ~~~~~~~~~ \mathbf{\bar{Y}}~\mathbf{a} =\boldsymbol{\bar{\tau}} \] In this equation, the sizes of the matrices are \(\mathbf{\bar{Y}}\in\mathbb{R}^{nK\times 10n}\) and \(\boldsymbol{\bar{\tau}}\in\mathbb{R}^{nK}\).
From this data set, we can define the following least-square regression problem (a tribute to Gauss!). The optimal \(\mathbf{a}^{*}\) that best fits the \(K\) data points is [5]: \[ \mathbf{a}^{*} = \underset{\mathbf{a}\in\mathbb{R}^{10n}}{\operatorname{arg}\,\operatorname{min}}\;~~ (\mathbf{\bar{Y}}~\mathbf{a} -\boldsymbol{\bar{\tau}} )^{\text{T}}\boldsymbol{\Sigma}^{-1}(\mathbf{\bar{Y}}~\mathbf{a} -\boldsymbol{\bar{\tau}} ), ~~~~~~~~~~ \mathbf{a}^{*} = (\mathbf{\bar{Y}^T}\boldsymbol{\Sigma}^{-1}\mathbf{\bar{Y}})^{-1}\mathbf{\bar{Y}^T}\boldsymbol{\Sigma}^{-1}\boldsymbol{\bar{\tau}} \] Here, \(\boldsymbol{\Sigma}\in\mathbb{R}^{nK\times nK}\) is a weighting matrix, typically the covariance of the torque measurement noise. While this problem is well defined and solvable, the optimal \(\mathbf{a}^{*}\) is not guaranteed to be physically feasible [5]. In other words, the physically feasible values of \(\mathbf{a}\) form only a subset of \(\mathbb{R}^{10n}\), and the unconstrained least-square solution may fall outside this set.
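The closed-form solve can be sketched numerically; all dimensions and data below, including the random stand-in for the stacked regressor \(\mathbf{\bar{Y}}\), are illustrative assumptions:

```python
import numpy as np

# Weighted least-square estimate
#   a* = (Ybar^T Sigma^{-1} Ybar)^{-1} Ybar^T Sigma^{-1} taubar,
# using a random stand-in for the stacked regressor Ybar.
rng = np.random.default_rng(0)
n, K = 2, 40                          # 2-DOF robot, K = 40 samples
Ybar = rng.normal(size=(n * K, 10 * n))
a_true = rng.normal(size=10 * n)
taubar = Ybar @ a_true + 0.01 * rng.normal(size=n * K)  # noisy torques
Sigma_inv = np.eye(n * K)             # unit noise covariance for simplicity

# Solve the normal equations (explicit inverses are numerically fragile)
a_star = np.linalg.solve(Ybar.T @ Sigma_inv @ Ybar,
                         Ybar.T @ Sigma_inv @ taubar)
# Note: nothing in this solve constrains a_star to be physically consistent.
```

As the final comment notes, the estimate can land outside the physically feasible set, which is exactly the issue discussed next.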
Consider a rigid object, modeled as a collection of \(N\) point masses \(m_i\) located at \(\mathbf{r}_i\) relative to point \(A\). As discussed in this post, the moment of inertia of the object about point \(A\) is defined by: \[ \mathbf{I}_A = \sum_{i=1}^{N} m_i\big(-[\mathbf{r}_i]^2\big) = \sum_{i=1}^{N} m_i \big( ||\mathbf{r}_i||^2\mathbf{I}_3-\mathbf{r_ir_i^T} \big) = \sum_{i=1}^{N} m_i\Big( \text{tr}(\mathbf{r_ir_i^T})\mathbf{I}_3 -\mathbf{r_ir_i^T} \Big) \tag{3} \] where \([\mathbf{r}_i]\) denotes the skew-symmetric matrix form of \(\mathbf{r}_i\). Since \(\mathbf{I}_A\) is a symmetric, positive definite matrix, there exists a special orthogonal matrix \(\mathbf{R}\in SO(3)\) which satisfies \[ \mathbf{I}_A = \mathbf{R}\mathbf{I}_A'\mathbf{R^{T}} \tag{4} \] where \(\mathbf{I}_A'\) is a diagonal matrix whose diagonal elements are the principal moments of inertia \(I_1\), \(I_2\), \(I_3\). For a physically consistent symmetric positive definite inertia matrix \(\mathbf{I}_A\), the triangle inequality must be satisfied: \[ I_1+I_2>I_3, ~~ I_2 +I_3 > I_1, ~~ I_3 + I_1 > I_2 ~~ \Longrightarrow ~~ I_1+I_2+I_3 > 2\lambda_{max}(\mathbf{I}_A) \tag{5} \] In these inequalities, \(\lambda_{max}(\cdot)\) denotes the maximum eigenvalue of the matrix.
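The triangle inequality can be checked numerically from the eigenvalues of an inertia matrix; the matrix below is an illustrative example, not data from a real link:

```python
import numpy as np

# Check the triangle inequality on the principal moments of inertia.
# The inertia matrix below is an illustrative example.
I_A = np.array([[0.30, 0.02, 0.01],
                [0.02, 0.25, 0.03],
                [0.01, 0.03, 0.40]])
I1, I2, I3 = np.linalg.eigvalsh(I_A)   # principal moments, ascending order

feasible = (I1 + I2 > I3) and (I2 + I3 > I1) and (I3 + I1 > I2)
# Equivalent scalar form: tr(I_A) > 2 * lambda_max(I_A)
feasible_alt = np.trace(I_A) > 2 * max(I1, I2, I3)
```

Both checks agree because the trace of a symmetric matrix equals the sum of its eigenvalues.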
It is quick to check that the eigenvalues of \(\mathbf{I}_A\) and \(\mathbf{I}_A'\) are identical, since \(\mathbf{R}\) is a (special) orthogonal matrix and Equation 4 is a similarity transformation. Moreover, since both \(\mathbf{I}_A\) and \(\mathbf{I}_A'\) are symmetric positive-definite matrices: \[ \forall \mathbf{v}\in\mathbb{R}^3: \lambda_{min}(\mathbf{I}_A)\mathbf{v^Tv} \le \mathbf{v^{T}}\mathbf{I}_A\mathbf{v} \le \lambda_{max}(\mathbf{I}_A)\mathbf{v^Tv}, ~~~~~~~~~ \lambda_{max}(\mathbf{I}_A)\mathbb{I}_3 \succeq \mathbf{I}_A \tag{6} \] In these inequalities, \(\mathbb{I}_3\in\mathbb{R}^{3\times 3}\) is the identity matrix; note that the second relation holds with \(\succeq\) rather than \(\succ\), since equality is attained along the eigenvector of \(\lambda_{max}\).
The excellent papers by Wensing et al. [6] and Sousa et al. [7] define the density-weighted covariance. Both (1) the positive-definiteness and (2) the triangle inequality of the inertia matrix are replaced by simply the positive-definiteness of the density-weighted covariance matrix. In this section, we derive the density-weighted covariance matrix corresponding to the inertia matrix.
First, from Equation 4: \[ \require{cancel} \text{tr}( \mathbf{I}_A ) = \text{tr}( \mathbf{R}\mathbf{I}_A'\mathbf{R^{T}} ) = \text{tr}( \mathbf{I}_A'\cancel{\mathbf{R^{T}}\mathbf{R}} )=\text{tr}(\mathbf{I}_A') = I_1+I_2+I_3 \] In this derivation, the cyclic property of the trace operator is used.
The inequalities 5 and 6 can be reformulated as:1 \[ \frac{1}{2}(I_1+I_2+I_3) > \lambda_{max}(\mathbf{I}_A), ~~~~ \frac{1}{2}\text{tr}(\mathbf{I}_A)\mathbb{I}_3 \succ \lambda_{max}(\mathbf{I}_A)\mathbb{I}_3 \succeq \mathbf{I}_A, ~~~~~~ \frac{1}{2}\text{tr}(\mathbf{I}_A)\mathbb{I}_3 - \mathbf{I}_A \succ 0 \tag{7} \] From inequality 7, we define the density-weighted covariance matrix \(\boldsymbol{\Sigma}_A\), which has a one-to-one correspondence with the inertia matrix \(\mathbf{I}_A\): \[ \boldsymbol{\Sigma}_A :=\frac{1}{2}\text{tr}(\mathbf{I}_A)\mathbb{I}_3 - \mathbf{I}_A, ~~~~~~~~~~ \mathbf{I}_A = \text{tr}(\boldsymbol{\Sigma}_A)\mathbb{I}_3 - \boldsymbol{\Sigma}_A \] Remarkably, both properties of the inertia matrix are satisfied simply by requiring this new matrix to be positive definite.
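A small numerical check of the map between \(\mathbf{I}_A\) and \(\boldsymbol{\Sigma}_A\) and its inverse, with an illustrative inertia matrix:

```python
import numpy as np

# Density-weighted covariance Sigma_A = (1/2) tr(I_A) I - I_A, and the
# inverse map I_A = tr(Sigma_A) I - Sigma_A; values are illustrative.
I_A = np.array([[0.30, 0.02, 0.01],
                [0.02, 0.25, 0.03],
                [0.01, 0.03, 0.40]])
Sigma_A = 0.5 * np.trace(I_A) * np.eye(3) - I_A
I_back = np.trace(Sigma_A) * np.eye(3) - Sigma_A    # recovers I_A exactly

# Positive definiteness of Sigma_A encodes both positive definiteness
# of I_A and the triangle inequality on its principal moments.
sigma_pd = bool(np.all(np.linalg.eigvalsh(Sigma_A) > 0))
```

The round trip works because \(\text{tr}(\boldsymbol{\Sigma}_A)=\frac{1}{2}\text{tr}(\mathbf{I}_A)\), so substituting back cancels the trace terms.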
In fact, from Equation 3, we can check that the density-weighted covariance matrix is simply: \[ \boldsymbol{\Sigma}_A = \sum_{i=1}^{N} m_i \mathbf{r}_i\mathbf{r}_i^{\text{T}} \]
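This identity can be verified numerically for a random cloud of point masses; the masses and positions below are illustrative data:

```python
import numpy as np

# Verify Sigma_A = sum_i m_i r_i r_i^T against (1/2) tr(I_A) I - I_A
# for a random cloud of point masses.
rng = np.random.default_rng(1)
N = 100
m = rng.uniform(0.1, 1.0, size=N)       # point masses
r = rng.normal(size=(N, 3))             # positions relative to point A

# Inertia about A: sum_i m_i (||r_i||^2 I - r_i r_i^T)
I_A = sum(mi * ((ri @ ri) * np.eye(3) - np.outer(ri, ri))
          for mi, ri in zip(m, r))
Sigma_direct = sum(mi * np.outer(ri, ri) for mi, ri in zip(m, r))
Sigma_from_I = 0.5 * np.trace(I_A) * np.eye(3) - I_A
```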
Note that the relation between \(\boldsymbol{\Sigma}_{A}\) and \(\mathbf{I}_{A}\) was also reported by Zhang et al. when they defined the co-stiffness matrix [8].
The terminology density-weighted covariance matrix makes sense when one considers the definition of the covariance matrix in statistics: \(\boldsymbol{\Sigma}_A\) plays the role of an (unnormalized) second moment of the mass distribution about point \(A\).
Moreover, letting \(\boldsymbol{\Sigma}_C\) denote the density-weighted covariance about the center of mass \(\mathbf{c}\) (measured from \(A\)) and \(m\) the total mass: \[ \boldsymbol{\Sigma}_A = \boldsymbol{\Sigma}_C + m\mathbf{cc^T} \] which is simply the parallel axis theorem for the inertia matrix, but without the skew-symmetric matrix operation.
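Assuming \(\boldsymbol{\Sigma}_C\) is the covariance taken about the center of mass, this parallel-axis form can be checked numerically on random point-mass data (illustrative values):

```python
import numpy as np

# Check Sigma_A = Sigma_C + m c c^T on random point-mass data;
# c is the center of mass measured from point A.
rng = np.random.default_rng(2)
masses = rng.uniform(0.1, 1.0, size=50)
r = rng.normal(size=(50, 3))                 # positions relative to A
m = masses.sum()                             # total mass
c = (masses[:, None] * r).sum(axis=0) / m    # center of mass

Sigma_A = sum(mi * np.outer(ri, ri) for mi, ri in zip(masses, r))
Sigma_C = sum(mi * np.outer(ri - c, ri - c) for mi, ri in zip(masses, r))
shifted = Sigma_C + m * np.outer(c, c)       # parallel-axis form
```

The cross terms vanish because \(\sum_i m_i(\mathbf{r}_i-\mathbf{c})=\mathbf{0}\) by definition of the center of mass.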
One should be mindful of the form of each inequality: one compares scalars, while the other denotes positive (negative) definiteness of matrices. ↩︎