Partial Differentiation

  • Introduction

So far, while studying calculus, we have dealt with functions of a single variable, i.e. {y=f(x)}. {\sin (x^2), \ \ln x, \ e^{\cos (\tan x)}} are a few examples. Irrespective of their complexity, the variable {y} always depended on the value of the single independent variable {x}. We also defined the derivatives and integrals of {f(x)} and studied a few of their applications.

More often than not, we encounter situations where a function {f} needs more than one independent variable for its definition. Such functions are known as functions of several variables. e.g. a function of 2 variables is

{f(x,y) = \sin (x) \, e^y \times xy^{3/2}}

Thus, without knowing values of both {x} and {y} simultaneously, we cannot get a unique value of {f(x,y)}.

One can define a function of as many variables as one wants. (Of course, it should make some sense.) In many of the problems in mechanical engineering, the functions are of at the most 4 independent variables; viz. 3 space variables, {(x,y,z)} and a time variable {t}.

The partial differentiation involves obtaining the derivatives of functions of several variables.

  • Definition and Rules

Let {z} be a function of 2 independent variables {x} and {y}. To differentiate {z} partially w.r.t {x}, we treat {y} as a constant and follow the usual process of differentiation. Thus,

{\dfrac {\partial z}{\partial x} = \lim \limits_{\delta x \to 0} \dfrac {f(x + \delta x, y) - f(x,y)}{\delta x}}


{\dfrac {\partial z}{\partial y} = \lim \limits_{\delta y \to 0} \dfrac {f(x, y+ \delta y) - f(x,y)}{\delta y}}

Thus, the definition is similar to that of ordinary differentiation. For the partial derivative to exist, the corresponding limit must exist.
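These limit definitions can be checked numerically. The sketch below uses an example function chosen here for illustration, {f(x,y) = x^2 y + y^3}, and approximates both partial derivatives by forward differences:

```python
# Numerical check of the limit definition of partial derivatives.
# Example function (an illustrative assumption): f(x, y) = x^2 * y + y^3.
def f(x, y):
    return x**2 * y + y**3

def partial_x(f, x, y, h=1e-6):
    # Forward difference: hold y constant, perturb x.
    return (f(x + h, y) - f(x, y)) / h

def partial_y(f, x, y, h=1e-6):
    # Forward difference: hold x constant, perturb y.
    return (f(x, y + h) - f(x, y)) / h

x, y = 2.0, 3.0
# Analytic values: f_x = 2xy = 12, f_y = x^2 + 3y^2 = 31.
print(partial_x(f, x, y))  # close to 12
print(partial_y(f, x, y))  # close to 31
```

Shrinking {h} brings the forward differences closer to the analytic partials, mirroring the limits {\delta x \to 0} and {\delta y \to 0}.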

Note that we use the letter {\partial} for partial derivatives and the letter {d} for ordinary derivatives.

The rules for differentiating sums, differences, products and quotients are the same as in ordinary differentiation.

  • Derivatives of Higher Order

Having obtained the first order derivatives {\dfrac {\partial z}{\partial x}} and {\dfrac {\partial z}{\partial y}}, we now define the second order derivatives, i.e.

{\frac {\partial}{ \partial x} \Big ( \frac {\partial z}{\partial x} \Big) , \frac {\partial}{ \partial y} \Big ( \frac {\partial z}{\partial x} \Big), \frac {\partial}{ \partial x} \Big ( \frac {\partial z}{\partial y} \Big), \frac {\partial}{ \partial y} \Big ( \frac {\partial z}{\partial y} \Big)}

For a function of 2 variables, four 2nd order derivatives are possible. These are sometimes written as

{\dfrac {\partial^2 z}{ \partial x^2} = z_{xx}, \ \dfrac {\partial^2 z}{\partial y \partial x} = z_{yx},\ \dfrac {\partial^2 z}{\partial x \partial y} = z_{xy}, \ \dfrac {\partial^2 z}{\partial y^2} = z_{yy}}

If the function and its partial derivatives are continuous, then the order of differentiation does not matter, i.e.

{\dfrac {\partial^2 z}{\partial y \partial x}= \dfrac {\partial^2 z}{\partial x \partial y}}
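The equality of mixed partial derivatives can be illustrated numerically. The sketch below uses a smooth example function chosen here, {z = \sin (x) e^y}, computes the mixed derivative in both orders by central differences, and compares with the analytic value {\cos (x) e^y}:

```python
from math import sin, cos, exp

# Numerical illustration of the equality of mixed partials for a smooth
# example function (an illustrative assumption): z = sin(x) * exp(y).
def z(x, y):
    return sin(x) * exp(y)

def z_yx(x, y, h=1e-4):
    # First differentiate w.r.t. x, then w.r.t. y.
    zx = lambda x, y: (z(x + h, y) - z(x - h, y)) / (2 * h)
    return (zx(x, y + h) - zx(x, y - h)) / (2 * h)

def z_xy(x, y, h=1e-4):
    # First differentiate w.r.t. y, then w.r.t. x.
    zy = lambda x, y: (z(x, y + h) - z(x, y - h)) / (2 * h)
    return (zy(x + h, y) - zy(x - h, y)) / (2 * h)

x, y = 1.0, 0.5
# Both orders agree with the common analytic value cos(x) * e^y.
print(z_yx(x, y), z_xy(x, y), cos(x) * exp(y))
```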

One can define derivatives of order {> 2} by following the same procedure.

  • Types of Problems (Crucial from exam point of view)

I) Based on the definition and the commutative property of partial differentiation

II) Based on the concept of composite functions (Mostly involve the relations between cartesian and polar coordinates)

  • Homogeneous Functions (Already encountered in M II, 1st unit)

When the sum of indices of the variables in a function is same for all terms, the function is said to be homogeneous of degree equal to the sum.

{6x^3y^2 + x^5 - xy^4}

is an example. (Degree {= 5})

Note that each term must be explicitly of the form {a x^m y^n}. Thus, {\sin (6x^3y^2 + x^5 - xy^4)} is NOT a homogeneous function.

  • Euler’s Theorem (by Leonhard Euler)

For a homogeneous function {z=f(x,y)} of degree {n},

{x \dfrac {\partial z}{\partial x} + y \dfrac {\partial z}{\partial y} = nz}

As a consequence of this,

{x^2 \dfrac {\partial^2 z}{ \partial x^2} + 2xy \dfrac {\partial^2 z}{\partial x \partial y} + y^2 \dfrac {\partial^2 z}{ \partial y^2} = n (n-1)z}

Similarly, if {u =f(x,y,z)} is a homogeneous function of 3 independent variables of degree {n}, then

{x \frac {\partial u}{\partial x} + y \frac {\partial u}{\partial y} + z \frac {\partial u}{\partial z}= nu}
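Euler's theorem can be verified numerically on the two-variable homogeneous example from earlier, {z = 6x^3y^2 + x^5 - xy^4} of degree {n = 5}. The sketch below uses hand-computed analytic partials:

```python
# Check of Euler's theorem on the homogeneous example from the text,
# z = 6 x^3 y^2 + x^5 - x y^4, which has degree n = 5.
def z(x, y):
    return 6 * x**3 * y**2 + x**5 - x * y**4

def z_x(x, y):  # analytic partial w.r.t. x
    return 18 * x**2 * y**2 + 5 * x**4 - y**4

def z_y(x, y):  # analytic partial w.r.t. y
    return 12 * x**3 * y - 4 * x * y**3

x, y, n = 1.5, -2.0, 5
lhs = x * z_x(x, y) + y * z_y(x, y)
print(lhs, n * z(x, y))  # the two sides agree
```

Any other point {(x,y)} gives the same agreement, since the identity holds for the whole function, not just at special points.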

  • Total Derivatives

Consider a function {z = f(x,y)}. If it so happens that {x} and {y} themselves are functions of another variable {t}, then the total derivative of {z} w.r.t. {t} is defined as

{\dfrac {dz}{dt} = \dfrac {\partial z}{\partial x} \times \dfrac {dx}{dt} + \dfrac {\partial z}{\partial y} \times \dfrac {dy}{dt}}

Thus, if we are given a function {z = g(t)}, we would differentiate it w.r.t. {t}, thus getting {\dfrac {dz}{dt}}. Instead, if {z} is expressed as {f(x,y)} with {x= \phi (t)} and {y = \psi (t)}, then obtaining the total derivative of {f(x,y)} is equivalent to computing {\frac {d}{dt} g(t)}.
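This equivalence can be checked numerically. The sketch below takes an example chosen here, {z = x^2 y} with {x = \cos t} and {y = t^2} (so {g(t) = \cos^2 (t) \, t^2}), and compares the chain-rule total derivative with a direct numerical derivative of {g}:

```python
from math import sin, cos

# Total-derivative check for an example (an illustrative assumption):
# z = x^2 * y with x = cos(t), y = t^2, so z = g(t) = cos(t)^2 * t^2.
def g(t):
    return cos(t)**2 * t**2

def dz_dt_chain(t):
    x, y = cos(t), t**2
    dz_dx, dz_dy = 2 * x * y, x**2   # partials of z = x^2 y
    dx_dt, dy_dt = -sin(t), 2 * t    # ordinary derivatives of x(t), y(t)
    return dz_dx * dx_dt + dz_dy * dy_dt

t, h = 0.7, 1e-6
numeric = (g(t + h) - g(t - h)) / (2 * h)  # central difference on g(t)
print(dz_dt_chain(t), numeric)             # the two values agree closely
```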

  • Applications

We will discuss the applications of partial differentiation in the next unit.


List of Topics (M 1)

Unit 1 i Rank of a Matrix

Unit 1 ii System of Linear Equations

Unit 1 iii Eigenvalues and Eigenvectors

Unit 2 i Complex Numbers

Unit 2 ii Hyperbolic Functions and Logs of Complex Numbers

Unit 3 i Infinite Series

Unit 3 ii Successive Differentiation

Unit 4 i Taylor and Maclaurin Series

Unit 4 ii Indeterminate Forms

Unit 5 Partial Differentiation

Unit 6 i Jacobians

Unit 6 ii Errors and Approximations

Unit 6 iii Maxima, Minima of Multivariable Functions

Unit 6 iv Lagrange’s Method of Undetermined Multipliers


Eigenvalues and Eigenvectors of Matrices

  • Mathematical Formulation

Consider a transformation matrix {A}, such that it transforms a vector {X} into {Y}. Thus, we can write,

{Y = AX}

Recall that 2 vectors directed along the same direction are simply scalar multiples of each other. e.g. {\vec P = 3 \hat i + 4 \hat j = (3,4)} and {\vec Q = 6 \hat i + 8 \hat j} (check their directions). We can write {\vec Q = 2 \vec P}.

Now, suppose there exists a scalar {\lambda}, such that

{Y = \lambda X}

Now this is interesting, because {Y} and {X} are now related via a matrix {A} as well as via a scalar (a number) {\lambda}, i.e. {Y = AX} as well as {Y = \lambda X}.


{Y = AX = \lambda X = \lambda IX}


{AX - \lambda IX = 0 \ or \ \ (A - \lambda I)X =0}

Since the RHS is a null vector, this forms a homogeneous system, which has non-trivial solutions when {|A - \lambda I| =0}. On expanding the determinant, we get {n} values of {\lambda} if {A} is of the order {n \times n}. These values are known as the eigenvalues. In German, eigen means 'own' or 'characteristic'.
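The procedure can be sketched in Python for a small example matrix (a {2 \times 2} chosen here for illustration): the roots of the characteristic polynomial coincide with the eigenvalues numpy computes directly.

```python
import numpy as np

# Eigenvalues as roots of |A - lambda I| = 0 for an example matrix.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Characteristic polynomial of a 2x2 matrix:
# lambda^2 - (trace A) * lambda + |A|.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.roots(coeffs)

print(sorted(roots))                  # roots 2 and 5
print(sorted(np.linalg.eigvals(A)))   # same values from numpy directly
```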

  • More About Eigenvalues

I) The determinant {|A- \lambda I|} is known as the characteristic determinant and the polynomial obtained on expanding the determinant is known as the characteristic polynomial. If matrix {A} is of the order {n}, the degree of polynomial is {n} and hence, the matrix has {n} eigenvalues, which may be distinct or identical.

The set of eigenvalues is known as the spectrum.

II) {\sum \limits_{i=1}^n \lambda_i = \sum \limits_{i=1}^n a_{ii} =} Trace

III) {\prod \limits_{i=1}^n \lambda_i = |A|}

This implies, if any one of the eigenvalues is {0}, then {|A|} is {0}.

IV) Eigenvalues of {A^{n}} are {\lambda_i^n}, where {n} is a non-negative integer.

V) Eigenvalues of {A} and {A^T} are same.

VI) Eigenvalues of {A- KI} are {\lambda_i - K}, where {K} is any number.

VII) If {A} is symmetric, then its eigenvalues are real.
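Several of these properties can be spot-checked numerically. The sketch below uses a symmetric example matrix (an illustrative assumption) and verifies properties II, III and VII:

```python
import numpy as np

# Spot-check of eigenvalue properties on a symmetric example matrix,
# whose eigenvalues are 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
w = np.linalg.eigvals(A)

print(np.isclose(w.sum(), np.trace(A)))        # II: sum of eigenvalues = trace
print(np.isclose(w.prod(), np.linalg.det(A)))  # III: product = |A|
print(np.all(np.isreal(w)))                    # VII: symmetric => real
```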

  • Eigenvectors

Corresponding to each of the eigenvalues {\lambda}, there will be a vector, known as eigenvector. This is obtained by solving the system {(A - \lambda I) X =0}. As stated earlier, {AX = \lambda X}.

This forms a system of homogeneous equations, and to get {X}, we solve the system.

  • Properties of Eigenvectors

These are the vectors, whose direction does not change under the transformation {AX}.

I) For an eigenvalue {\lambda}, if {X} is an eigenvector, then {KX, K \ne 0} is also an eigenvector.

II) If the eigenvalues are distinct, the eigenvectors are linearly independent.

III) If {A} is symmetric, then the eigenvectors corresponding to 2 distinct eigenvalues are orthogonal.

  • Use of Eigenvalues and Eigenvectors

There are many applications of eigenvalues and eigenvectors. Finding the natural frequencies of a system with multiple degrees of freedom is a typical mechanical engineering example. The equations of motion lead to a system of the kind {AX = \lambda X}. The eigenvalues of {A} are related to the natural frequencies, and the corresponding eigenvectors indicate the mode shapes.

  • Cayley Hamilton Theorem

It states that every square matrix satisfies its characteristic equation, i.e. if {f(\lambda) = |A - \lambda I| = 0} is the characteristic equation of {A}, then

{f(A) = 0}

This theorem can be used to find higher powers of a matrix as well as its inverse.
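For a {2 \times 2} matrix the characteristic polynomial is {\lambda^2 - (trace \ A) \lambda + |A|}, so the theorem is easy to verify numerically; the sketch below uses an example matrix chosen here and also recovers the inverse from the theorem:

```python
import numpy as np

# Cayley-Hamilton check for a 2x2 example matrix:
# f(lambda) = lambda^2 - (trace A) * lambda + |A|, and f(A) = 0.
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
t, d = np.trace(A), np.linalg.det(A)

f_of_A = A @ A - t * A + d * np.eye(2)
print(np.allclose(f_of_A, 0))             # A satisfies its own equation

# Rearranging f(A) = 0 gives the inverse: A^{-1} = (t I - A) / d.
A_inv = (t * np.eye(2) - A) / d
print(np.allclose(A_inv @ A, np.eye(2)))
```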

Matrices and Simultaneous Equations

Consider the following equations:

{2x+3y =8, \ x - y = -1}

This can be written using matrix form as

{\begin {bmatrix} 2 & 3 \\ 1 & -1 \end {bmatrix} \begin {bmatrix} x \\ y \end {bmatrix} = \begin {bmatrix} 8 \\ -1 \end {bmatrix}}

Let {\begin {bmatrix} 2 & 3 \\ 1 & -1 \end {bmatrix}} be {A}, the matrix of coefficients, {\begin {bmatrix} x \\ y \end {bmatrix}} be {X}, the matrix of variables and {\begin {bmatrix} 8 \\ -1 \end {bmatrix}} be {B}, the matrix of constant terms.

We can then write {AX =B}.

Now, consider a matrix, where the column vector {B} is joined to {A}, s.t. it is the last column of the newly formed matrix. i.e.

{\begin {bmatrix} 2 & 3 & 8\\ 1 & -1 & -1 \end {bmatrix}}

This matrix is known as the augmented matrix. Let’s denote it by {(A,B)}. The behavior of the system depends on the relation between the rank of augmented matrix {\rho (A,B)} and the rank of coefficient matrix {\rho (A)}.

The augmentation is usually done so that the same elementary row operations can be performed on both matrices at once.

Thus, if a system has {m} variables in {n} equations, it can be written in the form

{A_{n \times m} X_{m \times 1} = B_{n \times 1}}

The augmented matrix will be {(A,B)_{n \times {(m+1)}}}.

  • Homogeneous and non-Homogeneous Systems

When a system of equations has all constant terms to be {0}, i.e. {B} is a null vector, it is known as a homogeneous system. When {B} is not a null vector, it is known as a non-homogeneous system.

  • Consistent and Inconsistent Systems

When a system has 1 or more solutions, it is said to be consistent. When it does not have any solution, it is said to be inconsistent.

Thus, a homogeneous system is always consistent, because all-zero values always form a solution. (This solution is known as the trivial solution.)

  • Condition for Consistency of Non-homogeneous System of Equations

When the rank of augmented matrix is equal to the rank of matrix of coefficients, the system is consistent. i.e.

{\rho (A,B) = \rho (A)}

When it isn’t, the system is inconsistent.

  • Finitely Many and Infinitely Many Solutions of Non-homogeneous System of Equations

Once the condition for consistency is satisfied, we can look for the possibility of infinitely many solutions. When the rank {\rho} is less than the total number of unknowns, {m}, the system possesses an infinite number of solutions. When it is equal to the total number of unknowns, the system possesses a unique solution.

  • A sub-condition, provided that ranks are equal

When the number of unknowns is equal to the number of equations, the coefficient matrix {A} is a square matrix. In such cases, if {|A| \ne 0}, the system possesses a unique solution given by {X = A^{-1}B}. (Discussed in Matrices III)

When {|A|=0}, if {\rho < m}, it is a case of infinitely many solutions.
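These rank conditions can be checked numerically. The sketch below applies them to the system {2x+3y =8, \ x - y = -1} used earlier:

```python
import numpy as np

# Consistency test rho(A, B) == rho(A) for 2x + 3y = 8, x - y = -1.
A = np.array([[2.0, 3.0],
              [1.0, -1.0]])
B = np.array([[8.0], [-1.0]])

aug = np.hstack([A, B])                 # augmented matrix (A, B)
rank_A = np.linalg.matrix_rank(A)
rank_aug = np.linalg.matrix_rank(aug)

print(rank_A, rank_aug)                 # equal ranks -> consistent
# The common rank equals the number of unknowns (2): unique solution.
print(np.linalg.solve(A, B).ravel())    # x = 1, y = 2
```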

  • Condition for Only Unique (Trivial) Solution of Homogeneous System

The homogeneous system always possesses a trivial solution. If the ranks are equal and less than the number of unknowns, there will be infinitely many solutions.

For homogeneous systems, where number of equations is equal to number of unknowns, when {|A| \ne 0}, it possesses only trivial solution.

  • Linear Dependence of Vectors

Recall: A vector is either a row matrix or a column matrix.

A set of {n} vectors, {x_1, x_2, ... , \ x_n} is said to be linearly dependent, when there exist {n} scalars {c_1, c_2, ... ,\ c_n}, not all zero, such that

{\sum_{i=1}^{n} c_i x_i =0 \ \ \ \ (1)}

When this condition is not satisfied, the vectors are linearly independent. For example, consider {\hat i, \hat j} and {\hat k}. They can be written as

{\hat i = [1 \ 0 \ 0], \hat j = [0 \ 1 \ 0], \hat k = [0 \ 0 \ 1]}

There exist no scalars {c_1, c_2, c_3}, not all zero, s.t. {c_1 \hat i + c_2 \hat j + c_3 \hat k = 0}. Hence these vectors are linearly independent.

Equation {1} gives us a homogeneous system of equations with {X} being the matrix containing all {c_i}s.

  • Orthogonality

A square matrix {A} is orthogonal, when {A^T = A^{-1}}. Its determinant is always equal to {\pm 1}. Consider a system

{Y = AX}

such that

{Y = \begin {bmatrix} y_1 \\ y_2 \\ .. \\ .. \\ y_n \end {bmatrix} , X = \begin {bmatrix} x_1 \\ x_2 \\ .. \\ .. \\ x_n \end {bmatrix}, A = \begin {bmatrix} a_{11} & a_{12} & .. & .. & a_{1n}\\ a_{21} & a_{22} & .. & .. & a_{2n} \\ .. & .. & .. & .. & ..\\ .. & .. & .. & .. & .. \\ a_{n1} & a_{n2} & .. & .. & a_{nn}\end {bmatrix}}

{A} is orthogonal when {\sum \limits_i x_i^2 = \sum \limits_i y_i^2} for every choice of {X}, i.e. when the transformation preserves the length of vectors.

Rank of a Matrix

  • Rank of a Matrix

An important characteristic of any matrix is its rank. It tells us the number of linearly independent rows (or columns) of the matrix.

Note that every matrix has a rank, unlike an inverse, which only a non-singular square matrix has.

Definition: A matrix has rank {r} if

I) there is at least one minor of order {r} which is not equal to {0}, and

II) every minor of order {r+1} is {0}.

[A minor of a matrix {A} is the determinant of some smaller square matrix, formed by removing one or more rows or columns of {A}. ]

The elementary transformations of a matrix do not alter its rank. Any matrix {B} obtained by applying elementary transformations to a matrix {A} is known as an equivalent matrix. This is denoted by {A \sim B}.

  • Rank of a Matrix by reducing it to Echelon/ Canonical Form

A matrix is in echelon/ canonical form if

I) all nonzero rows are above any rows of all zeroes (if any), and

II) the leading coefficient (the pivot) of a nonzero row is always strictly to the right of the leading coefficient of the row above it. The pivot is preferably taken to be {1}.

The rank {r} is the number of non-zero rows of the matrix in this form.
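As an illustration, the row reduction can be carried out explicitly for a small example matrix (chosen here so that the third row is the sum of the first two):

```python
import numpy as np

# Rank by row reduction for an example matrix; row 3 = row 1 + row 2.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 7.0],
              [3.0, 6.0, 10.0]])

M = A.copy()
M[1] -= 2 * M[0]   # R2 -> R2 - 2 R1
M[2] -= 3 * M[0]   # R3 -> R3 - 3 R1
M[2] -= M[1]       # R3 -> R3 - R2  (a zero row appears)
print(M)           # echelon form with 2 non-zero rows

print(np.linalg.matrix_rank(A))  # 2, agreeing with the count above
```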

  • Rank of a Matrix by reducing it to Normal Form

A normal form of a non-zero matrix {A} is a matrix in either of the following forms:

{[I_r], \ \begin {bmatrix} I_r & 0 \end {bmatrix}, \ \begin {bmatrix} I_r \\ 0 \end{bmatrix}, \ \begin {bmatrix} I_r & 0 \\ 0 & 0 \end {bmatrix}}

{I_r} is identity matrix of order {r}.

We use both row and column transformations to reduce the matrix to normal form. The rank of the matrix is then equal to {r}.

  • Obtaining 2 matrices {P} and {Q}, such that {PAQ} is in normal form

Consider a matrix {A} with {m} rows and {n} columns. Take {P} as {I_m} and {Q} as {I_n}. We can then write

{A_{m \times n} = I_{m} \times A \times I_n = PA Q}

Reduce {A} on the LHS to a normal form. Perform respective row transformations on {I_m} and column transformations on {I_n}. Once {A} is reduced to normal form, we get the rank.

Note: {P} and {Q} are not unique, and are non-singular.

  • Inverse using {PAQ} form

If {A} is non-singular and {PAQ} is in normal form (i.e. {PAQ = I}), then {A^{-1} = QP}.

Errors and Approximations, Maxima and Minima, Lagrange’s Method

  • Errors and Approximations

If {f} is a function of {x,y,z}, then the error in {f} is

{df = \frac {\partial f}{\partial x} dx + \frac {\partial f}{\partial y} dy + \frac {\partial f}{\partial z} dz}

However, this is an approximation.

Note: If necessary, we take log on both sides.

  • Maxima and Minima (Functions of Several Variables)

We have studied the process of finding extreme values of {y=f(x)}. Now, we will see the same for {f = f(x,y)}.

First, we equate {\frac {\partial f}{\partial x}} and {\frac {\partial f}{\partial y}} to {0}. We get those pairs {(x,y)} which can be maxima, minima or saddle points.

We then find {f_{xx} = r}, {f_{xy} = s} and {f_{yy}=t} for these pairs.

I) {rt > s^2}, {r < 0} – Gives maximum value

II) {rt > s^2}, {r > 0} – Gives minimum value

III) {rt < s^2} – Saddle point

IV) {rt = s^2} – No conclusion possible
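The test above can be coded directly. The sketch below applies it to a classic example chosen here, {f(x,y) = x^3 + y^3 - 3xy}, whose stationary points are {(0,0)} and {(1,1)}:

```python
# The r, s, t test for f(x, y) = x^3 + y^3 - 3xy (an illustrative example).
# Stationary points (where f_x = f_y = 0): (0, 0) and (1, 1).
def classify(x, y):
    r = 6 * x    # f_xx
    s = -3.0     # f_xy
    t = 6 * y    # f_yy
    if r * t > s * s:
        return "maximum" if r < 0 else "minimum"
    if r * t < s * s:
        return "saddle point"
    return "no conclusion"

print(classify(1, 1))  # minimum:      rt = 36 > s^2 = 9, r > 0
print(classify(0, 0))  # saddle point: rt = 0 < 9
```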

  • Lagrange’s Method of Undetermined Multipliers

This method is used when the maximum and minimum of a function are to be obtained under certain constraints (as in optimization problems). We form a linear combination of the function and the constraint using a Lagrange multiplier and differentiate it partially w.r.t. the variables. We then eliminate the variables and the multiplier to get an equation. The roots of that equation are the candidate extreme values.


Jacobians

  • Introduction

To get an idea about Jacobians, one requires knowledge of matrices, determinants, functions and partial differentiation.

I) Matrices and Determinants :

A matrix is a rectangular arrangement of numbers in {m} rows and {n} columns. When {m =n}, the matrix is known as a square matrix. Determinants are defined for square matrices. The simplest square matrix is

{A = \begin {bmatrix} a & b \\ c & d \end {bmatrix}, |A| = ad -bc}

One can expand a {3 \times 3} determinant using cofactors.

Note that Jacobians are determinants whose elements are partial derivatives.

II) Functions of Several Variables :

This is often the case in real world applications. For example, the temperature at a given point may be a function of space {(x,y,z)} and time {t}. So {T} depends on {4} variables.

III) Partial Differentiation : 

See the unit on Partial Differentiation above.

  • Intuitive Idea

Let {f} be a function of {x}. Then, by the definition of differentiation,

{f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}}

By removing the limit, we can approximate the left hand side by the right hand side.

{f(x + \Delta x) \approx f(x) + f'(x) \Delta x}
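As a quick numerical illustration of this approximation (with an example chosen here: estimating {\sqrt{4.1}} from the known value {\sqrt{4} = 2}):

```python
from math import sqrt

# Linear approximation f(x + dx) ~ f(x) + f'(x) dx,
# used to estimate sqrt(4.1) from sqrt(4) = 2.
x, dx = 4.0, 0.1
f, fprime = sqrt(x), 1 / (2 * sqrt(x))  # f = sqrt(x), f' = 1/(2 sqrt(x))

approx = f + fprime * dx
print(approx, sqrt(4.1))  # 2.025 vs the true value, very close
```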

Now, instead of {x}, if we have a vector function {X}, then

{f(\underbrace{X}_{n \times 1} + \underbrace{\Delta X}_{n\times 1}) \approx \underbrace{f(X)}_{m \times 1} + \underbrace{f'(X)}_{?} \underbrace{\Delta X}_{n \times 1}}

The {?} term is the Jacobian. This means that while dealing with derivatives of functions of several variables, to get an approximation, one needs to differentiate each function w.r.t. each independent variable. This is what Jacobians are for.

  • Definition

Jacobians are also termed functional determinants. Let {u,v} be functions of independent variables {x,y}. Then the determinant

{\begin {vmatrix} \frac {\partial u}{\partial x} & \frac {\partial u}{\partial y} \\ \frac {\partial v}{\partial x} & \frac {\partial v}{\partial y}\end {vmatrix} = \begin {vmatrix} u_x & u_y \\ v_x & v_y \end {vmatrix}}

is the Jacobian, which is sometimes denoted by

{ J = \frac {\partial (u,v)}{\partial (x,y)}}

Similarly, for functions {u,v,w} of {x,y,z},

{\begin {vmatrix}u_x & u_y & u_z \\ v_x & v_y & v_z \\ w_x & w_y & w_z \end {vmatrix}}

While changing from one coordinate system to another, Jacobians are useful. Thus, for spherical coordinates,

{dx \, dy \, dz = r^2 \sin (\theta) \, dr \, d \theta \, d \phi}

And {r^2 \sin (\theta)} is

{J = \frac {\partial (x,y,z)}{\partial (r, \theta, \phi)}}
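This determinant can be verified numerically. The pure-Python sketch below differentiates {x = r \sin \theta \cos \phi, \ y = r \sin \theta \sin \phi, \ z = r \cos \theta} by finite differences and compares the resulting {3 \times 3} determinant with {r^2 \sin (\theta)}:

```python
from math import sin, cos, pi

# Numerical check that the spherical-coordinate Jacobian determinant
# equals r^2 sin(theta).
def cart(r, th, ph):
    return (r * sin(th) * cos(ph), r * sin(th) * sin(ph), r * cos(th))

def jacobian_det(r, th, ph, h=1e-6):
    cols = []
    for i in range(3):           # derivatives w.r.t. r, theta, phi in turn
        p = [r, th, ph]
        m = [r, th, ph]
        p[i] += h
        m[i] -= h
        fp, fm = cart(*p), cart(*m)
        cols.append([(fp[k] - fm[k]) / (2 * h) for k in range(3)])
    a, b, c = cols               # columns of the Jacobian matrix
    # 3x3 determinant by cofactor expansion along the first row
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))

r, th, ph = 2.0, pi / 3, pi / 4
print(jacobian_det(r, th, ph), r**2 * sin(th))  # the two values agree
```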

  • Chain Rule


If {x = f_1 (u,v), \ y = f_2 (u,v) \ and \ u = g_1 (s,t), \ v = g_2 (s,t)}, then

{\frac {\partial (x,y)}{\partial (u,v)} \times \frac {\partial (u,v)}{\partial (s,t)} = \frac {\partial (x,y)}{\partial (s,t)}}

A corollary to this is,

{\frac {\partial (x,y)}{\partial (u,v)} \times \frac {\partial (u,v)}{\partial (x,y)}=1}

  • Jacobians of Implicit Functions

Let {u_1,u_2,u_3} be implicit functions of {x_1,x_2,x_3} such that following equations are satisfied :

{f_1 (u_1, u_2, u_3, x_1, x_2, x_3) = 0 , \ f_2 (u_1, u_2, u_3, x_1, x_2, x_3) = 0 , \ f_3 (u_1, u_2, u_3, x_1, x_2, x_3) = 0 }

Then, for {j =1,2,3},

{\frac {\partial f_j}{\partial x_j} + \sum \limits_{i = 1}^{3} \frac {\partial f_j}{\partial u_i} \frac {\partial u_i}{\partial x_j} = 0}

Results useful for solving problems:

{\frac {\partial (u_1, u_2)}{\partial (x_1, x_2)} = (-1)^2 \times \frac {\frac {\partial (f_1, f_2)}{\partial (x_1, x_2)}} {\frac {\partial (f_1, f_2)}{\partial (u_1, u_2)}}}


{\frac {\partial (u_1, u_2, u_3)}{\partial (x_1, x_2, x_3)} = (-1)^3 \times \frac {\frac {\partial (f_1, f_2, f_3)}{\partial (x_1, x_2, x_3)}} {\frac {\partial (f_1, f_2, f_3)}{\partial (u_1, u_2, u_3)}}}

  • Partial Derivatives of Implicit Functions

Let {u_1,u_2} be implicit functions of {x_1,x_2} such that

{f_1 (u_1, u_2, x_1, x_2) = 0 , \ f_2 (u_1, u_2, x_1, x_2) = 0}


{\frac {\partial u_1}{\partial x_1} = (-1) \times \frac {\frac {\partial (f_1,f_2)}{\partial (x_1, u_2)}}{\frac {\partial (f_1,f_2)}{\partial (u_1, u_2)}}}

Similarly, one can get {\frac {\partial u_1}{\partial x_2}} and other derivatives.

  • Functional Dependence

For functional dependence of {f_1 (x_1, x_2)} and {f_2(x_1,x_2)},

{\frac {\partial (f_1, f_2)}{\partial (x_1, x_2)} = 0}

For dependence of 3 functions, one has to equate the {3 \times 3} Jacobian to {0}.