Definition

First, we study linear time-invariant (LTI) systems, whose dynamics can be described by the following equations [1]: \[ \begin{align*} \begin{split} \mathbf{\dot{x}}(t) &= \mathbf{Ax}(t)+\mathbf{Bu}(t) \\ \mathbf{y}(t) &= \mathbf{Cx}(t)+\mathbf{Du}(t) \end{split} \tag{1} \end{align*} \] We call this set of equations the linear state-space representation of the LTI system.1 In these equations, \(\mathbf{x}(t)\in\mathbb{R}^{n}\) denotes the state vector, \(\mathbf{u}(t)\in\mathbb{R}^{m}\) denotes the input vector and \(\mathbf{y}(t)\in\mathbb{R}^{p}\) denotes the output vector.2 Without loss of generality, we set the initial time to zero. The initial condition of the system is \(\mathbf{x}(0)\).

\(\mathbf{A}\in\mathbb{R}^{n\times n}\) denotes the state matrix, \(\mathbf{B}\in\mathbb{R}^{n\times m}\) denotes the input matrix, \(\mathbf{C}\in\mathbb{R}^{p\times n}\) denotes the output matrix and \(\mathbf{D}\in\mathbb{R}^{p\times m}\) denotes the feedthrough matrix of the system. Systems whose dynamics can be modelled as Equation 1 are called linear time-invariant, since the matrices \(\mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D}\) are constant with respect to time \(t\).
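
To make the dimensions concrete, here is a minimal sketch in Python (using NumPy) of a hypothetical mass-spring-damper \(m\ddot{q} + c\dot{q} + kq = u\), with made-up parameters, written in the state-space form of Equation 1:

```python
import numpy as np

# Hypothetical example: mass-spring-damper  mass*q'' + c*q' + k*q = u,
# with state x = [q, q'] (n = 2), a single input (m = 1) and the
# position q as the single output (p = 1). Parameters are assumed.
mass, c, k = 1.0, 0.5, 2.0

A = np.array([[0.0, 1.0],
              [-k / mass, -c / mass]])  # state matrix,       n x n
B = np.array([[0.0],
              [1.0 / mass]])            # input matrix,       n x m
C = np.array([[1.0, 0.0]])              # output matrix,      p x n
D = np.array([[0.0]])                   # feedthrough matrix, p x m

print(A.shape, B.shape, C.shape, D.shape)   # (2, 2) (2, 1) (1, 2) (1, 1)
```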

We assume that the matrices \(\mathbf{A}, \mathbf{B}, \mathbf{C}, \mathbf{D}\) (i.e., the linear mathematical model of the system of interest) are known. Given the initial condition \(\mathbf{x}(0)\), our goal is to control \(\mathbf{x}(t)\) via a time-history input \(\mathbf{u}(t)\), which we can define arbitrarily.3



Solution – Informal Derivation

It is clear that once we derive \(\mathbf{x}(t)\), calculating \(\mathbf{y}(t)\) is straightforward. Hence, our primary focus is on the following differential equation: \[ \begin{equation} \mathbf{\dot{x}}(t) = \mathbf{Ax}(t)+\mathbf{Bu}(t) \tag{2} \end{equation} \]

Given the initial condition \(\mathbf{x}(0)\), the solution of Equation 2 is [3]: \[ \begin{equation} \mathbf{x}(t) = \exp({\mathbf{A}t})\mathbf{x}(0) + \int_{0}^{t} \exp\{\mathbf{A}(t-\tau)\} \mathbf{B} \mathbf{u}(\tau) d\tau \tag{3} \end{equation} \] where \(\exp(\cdot):\mathbb{R}^{n\times n} \rightarrow \mathbb{R}^{n\times n}\) is a matrix operation4 defined by: \[ \exp({\mathbf{A}t}) = \mathbf{I} + \mathbf{A}t + \frac{1}{2!}\mathbf{A^2}t^2 + \cdots + \frac{1}{n!}\mathbf{A^n}t^n + \cdots = \sum_{k=0}^{\infty}\frac{1}{k!}(\mathbf{A}t)^{k} \] Note that this power series of matrices is the direct analogue of the power series for the scalar exponential function.
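
As a sanity check, the truncated power series can be compared against a library implementation. The sketch below (assuming NumPy/SciPy and an arbitrary \(2\times 2\) matrix) is illustrative only; in practice one should call scipy.linalg.expm rather than truncating the series by hand:

```python
import numpy as np
from scipy.linalg import expm

def expm_series(A, t, terms=30):
    """Truncated power series: sum over k of (A t)^k / k!, for k < terms."""
    result = np.zeros_like(A, dtype=float)
    term = np.eye(A.shape[0])                # k = 0 term: identity
    for k in range(terms):
        result += term
        term = term @ (A * t) / (k + 1)      # next term: (A t)^(k+1) / (k+1)!
    return result

A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])                 # arbitrary example matrix
t = 1.3

print(np.allclose(expm_series(A, t), expm(A * t)))   # True
```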

A mathematically rigorous derivation of Equation 3 is given in the section below. However, we want to emphasize that the form of the solution is exactly analogous to that of the scalar case.

In detail, consider the following scalar first-order differential equation: \[ \dot{x}(t) = ax(t) + bu(t) \] where \(a\in\mathbb{R}\) is a constant. Given the initial condition \(x(0)\), the solution \(x(t)\) is: \[ x(t) = \exp(at)x(0) + \int_{0}^{t} \exp\{a(t-\tau)\} b u(\tau) d\tau \tag{4} \] The last term is the convolution of the two time-domain functions \(\exp(at)\) and \(bu(t)\): \[ \int_{0}^{t} \exp\{a(t-\tau)\} b u(\tau) d\tau = \exp(at) * bu(t) \] where \(*\) denotes the convolution operator. The solution for the scalar case generalizes immediately to the matrix case, Equation 3.
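
As a quick numerical check of Equation 4, the following sketch (with assumed values \(a=-1\), \(b=2\), \(u(t)=\sin t\) and \(x(0)=1\)) evaluates the closed-form expression with numerical quadrature and compares it against a direct numerical integration of the differential equation:

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

# Assumed toy problem: a = -1, b = 2, u(t) = sin(t), x(0) = 1.
a, b, x0 = -1.0, 2.0, 1.0
u = np.sin

def x_closed_form(t):
    # exp(a t) x(0) + integral_0^t exp(a (t - tau)) b u(tau) dtau
    integral, _ = quad(lambda tau: np.exp(a * (t - tau)) * b * u(tau), 0.0, t)
    return np.exp(a * t) * x0 + integral

# Reference: integrate x' = a x + b u(t) numerically.
sol = solve_ivp(lambda t, x: a * x + b * u(t), (0.0, 5.0), [x0],
                dense_output=True, rtol=1e-9, atol=1e-12)

t_check = 3.7
print(np.isclose(x_closed_form(t_check), sol.sol(t_check)[0]))   # True
```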

To summarize, the closed-form solutions for \(\mathbf{x}(t)\) and \(\mathbf{y}(t)\) in Equation 1 are: \[ \begin{align*} \mathbf{x}(t) &= \exp({\mathbf{A}t})\mathbf{x}(0) + \int_{0}^{t} \exp\{\mathbf{A}(t-\tau)\} \mathbf{B} \mathbf{u}(\tau) d\tau \\ \mathbf{y}(t) &= \mathbf{C}\exp({\mathbf{A}t})\mathbf{x}(0) + \int_{0}^{t} \mathbf{C}\exp\{\mathbf{A}(t-\tau)\} \mathbf{B} \mathbf{u}(\tau) d\tau + \mathbf{Du}(t) \end{align*} \]
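
The matrix-valued closed form can be checked the same way. The sketch below (for an assumed toy system, a unit-step input, and a simple trapezoidal approximation of the convolution integral) compares the closed-form state and output against a direct numerical integration of Equation 1:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp, trapezoid

# Assumed toy system, initial condition and input (unit step).
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
x0 = np.array([1.0, 0.0])
u = lambda t: np.array([1.0])

def x_closed_form(t, steps=2000):
    # exp(A t) x(0) + integral_0^t exp(A (t - tau)) B u(tau) dtau
    taus = np.linspace(0.0, t, steps)
    integrand = np.stack([expm(A * (t - tau)) @ B @ u(tau) for tau in taus])
    return expm(A * t) @ x0 + trapezoid(integrand, taus, axis=0)

# Reference: integrate the state equation of Equation 1 directly.
sol = solve_ivp(lambda t, x: A @ x + B @ u(t), (0.0, 4.0), x0,
                rtol=1e-9, atol=1e-12)

x_cf = x_closed_form(4.0)
y_cf = C @ x_cf + D @ u(4.0)
print(np.allclose(x_cf, sol.y[:, -1], atol=1e-4), y_cf)
```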



Solution – Formal Derivation

Note that this section goes into considerably more mathematical detail, so feel free to skip ahead to the next section.



References

[1] A. Isidori, Nonlinear control systems: An introduction. Springer, 1985, pp. 1–2.
[2] B. Friedland, Control system design: An introduction to state-space methods. Courier Corporation, 2012, pp. 14–16.
[3] B. Friedland, Control system design: An introduction to state-space methods. Courier Corporation, 2012, pp. 59–62.

  1. State-space methods were introduced to the United States during the late 1950s and early 1960s [2].↩︎

  2. Strictly speaking, a vector is an element of a vector space. Here, we only focus on n-dimensional real vector spaces (e.g., \(\mathbb{R}^{n}\)) with component-wise addition and scalar multiplication [1].↩︎

  3. Since \(\mathbf{D}\) and \(\mathbf{u}\) are known in advance, we often move the \(\mathbf{Du}\) term to the left-hand side and simply write \(\mathbf{y-Du\triangleq y'=Cx}\).↩︎

  4. We often write \(\exp(\mathbf{A}t)\) as \(e^{\mathbf{A}t}\), but strictly speaking the notation \(e^{\mathbf{A}t}\) can be a bit vague. Since \(e\) stands for the natural constant \(2.71828...\), the notation \(e^{\mathbf{A}t}\) raises the question "what does it mean to raise a scalar to the power of a matrix?". Therefore, people often use the more general and safer notation \(\exp(\mathbf{A}t)\) and interpret \(\exp(\cdot)\) as an "operation" that expands the matrix into the power series above. In fact, this discussion is related to a comment from Grant Sanderson of 3Blue1Brown about his least favorite piece of notation [Link].↩︎