Integrating fault detection and classification in microgrids using supervised machine learning considering fault resistance uncertainty

ML and SVM
The rapid development of communication and information technology in modern society produces data at an astonishing pace. The techniques for uncovering knowledge from vast amounts of data are called data mining. At present, data mining complements the construction of analytical methods7.
As a supervised ML technique, the SVM algorithm is frequently used for classification problems. A significant advantage of SVM over other classification algorithms, such as the KNN algorithm, is the high degree of accuracy it offers.
Introduced in 1963, SVM is a mature ML algorithm with a strong capability to generalize to unobserved data45. SVMs are currently regarded as top-performing methods in various classification tasks, spanning from text to genomic data. In its basic form, the SVM is a linear classifier that can be considered an advanced supervised classification method for high-dimensional data. The classification boundary is constructed so as to minimize the error in the training process. The concept behind SVM is that data points that are not linearly separable in the original space may become linearly separable when mapped into a higher-dimensional space. The SVM can therefore identify nonlinear boundaries that effectively separate the classes of a given problem46. In other words, the SVM applies a transformation that makes the learning task simpler than it is in the original feature space.
The SVM uses a hyper-plane to separate the data into distinct classes. This hyper-plane is defined by support vectors, which are chosen so that the margin around the hyper-plane is as large as possible. Instead of insisting on a complete separation of the vectors into two classes, the SVM uses a trade-off: it allows some vectors to fall inside the margin or on the wrong side of the decision boundary.
Decision trees and Naive Bayes were used in47 to discriminate fault events from normal situations in MGs more effectively. An ensemble of classifiers was used in another protection scheme48. The ensemble-based approach simultaneously performs mode detection, fault detection, and classification. Moreover, a method that combines data mining and wavelet analysis was proposed in49 to provide an intelligent protection scheme for MGs.
Suppose that the training vectors \(x_j\in\mathbb{R}^n,\ j=1,2,3,\dots,k\) belong to two classes, and that \(y\in\{-1,1\}^k\) is the vector of class labels. The objective is to find \(\omega\in\mathbb{R}^n\) and \(b\in\mathbb{R}\) such that the prediction given by \(\operatorname{sign}\left(\omega^T\phi\left(x\right)+b\right)\) is correct for most samples. Accordingly, the optimization problem that must be solved by SVM can be written as follows:
$$\min_{\omega,b,\epsilon}\ \frac{1}{2}\omega^T\omega+C\sum_{j=1}^{k}\epsilon_j$$
(1)
Subject to:
$$y_j\left(\omega^T\phi\left(x_j\right)+b\right)\ge 1-\epsilon_j$$
(2)
$$\epsilon_j\ge 0,\quad j=1,2,3,\dots,k$$
(3)
By minimizing \(\left\|\omega\right\|^2=\omega^T\omega\), the margin is maximized. Note that a penalty is incurred when a sample lies within the margin boundary or is misclassified. Ideally, the value \(y_j\left(\omega^T\phi\left(x_j\right)+b\right)\) would be greater than or equal to one for all samples, meaning that the model classifies every sample correctly. However, problems are usually not perfectly separable with a hyper-plane, so some samples are allowed to lie at a distance \(\epsilon_j\) from their correct margin boundary. In addition, \(C\) is the penalty term, which acts as an inverse regularization parameter.
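As a minimal sketch of how the slack variables and the objective in Eq. (1) are evaluated, the following Python snippet computes, for a fixed candidate \(\omega\) and \(b\), the smallest non-negative slack satisfying the standard soft-margin constraint \(y_j\left(\omega^T\phi\left(x_j\right)+b\right)\ge 1-\epsilon_j\). It assumes a linear feature map \(\phi(x)=x\); the function name `slack_and_objective` and the toy data are illustrative assumptions and are not taken from the study.

```python
def slack_and_objective(w, b, X, y, C):
    """Return (slacks, objective) for a linear soft-margin SVM.

    Assumes phi(x) = x (linear feature map) and labels y_j in {-1, +1}.
    """
    slacks = []
    for x_j, y_j in zip(X, y):
        # Signed margin y_j * (w^T x_j + b)
        margin = y_j * (sum(w_i * x_i for w_i, x_i in zip(w, x_j)) + b)
        # Smallest epsilon_j >= 0 with y_j (w^T x_j + b) >= 1 - epsilon_j
        slacks.append(max(0.0, 1.0 - margin))
    # Objective of Eq. (1): 0.5 * w^T w + C * sum of slacks
    objective = 0.5 * sum(w_i * w_i for w_i in w) + C * sum(slacks)
    return slacks, objective


# Toy data (illustrative values only, not from the study)
X = [(2.0, 0.0), (0.5, 0.0), (-2.0, 0.0), (0.5, 1.0)]
y = [1, 1, -1, -1]
slacks, objective = slack_and_objective(w=(1.0, 0.0), b=0.0, X=X, y=y, C=1.0)
```

In this toy set, the second sample lies inside the margin (slack 0.5) and the fourth is misclassified (slack 1.5), so the objective evaluates to 0.5 + 1.0 × 2.0 = 2.5. A larger \(C\) would weight these violations more heavily relative to the margin term.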
The quadratic function in Eq. (1) must be optimized subject to the linear constraints in Eqs. (2) and (3). This is an instance of quadratic optimization, a well-known class of mathematical programming problems for which several (non-trivial) algorithms exist. To solve it, a dual problem is constructed in which a Lagrange multiplier \(\alpha_i\) is associated with every inequality constraint of the original problem. Therefore, \(\alpha_1,\dots,\alpha_k\) must be found that maximize Eq. (4) as follows:
$$Q\left(\alpha\right)=\sum_{i}\alpha_i-\frac{1}{2}\sum_{i}\sum_{j}\alpha_i\alpha_jy_iy_jx_i^Tx_j$$
(4)
Subject to:
$$\sum_{i}\alpha_iy_i=0$$
(5)
$$\alpha_i\ge 0\quad\text{for all } i$$
(6)
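The dual problem in Eqs. (4) to (6) can be checked by hand on a very small example. The sketch below, given only as an illustration, evaluates the dual objective for two one-sample classes and recovers the weight vector through the standard relation \(\omega=\sum_i\alpha_iy_ix_i\), which is a textbook result and is not stated in the text above. The function names and the data are assumptions made for this example.

```python
def dual_objective(alpha, X, y):
    """Evaluate Q(alpha) of Eq. (4) with the linear inner product x_i^T x_j."""
    k = len(alpha)
    quad = 0.0
    for i in range(k):
        for j in range(k):
            dot = sum(p * q for p, q in zip(X[i], X[j]))
            quad += alpha[i] * alpha[j] * y[i] * y[j] * dot
    return sum(alpha) - 0.5 * quad


def weights_from_alpha(alpha, X, y):
    """Recover w = sum_i alpha_i * y_i * x_i (standard SVM relation)."""
    n = len(X[0])
    return [sum(alpha[i] * y[i] * X[i][d] for i in range(len(alpha)))
            for d in range(n)]


# Two samples, one per class (illustrative values only)
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [1, -1]
alpha = [0.5, 0.5]  # maximizer of Q for this toy problem

equality_residual = sum(a_i * y_i for a_i, y_i in zip(alpha, y))  # Eq. (5)
q_value = dual_objective(alpha, X, y)
w = weights_from_alpha(alpha, X, y)
```

Here the equality constraint is satisfied, the dual objective equals 0.5, and the recovered weight vector is (1, 0), for which \(\frac{1}{2}\omega^T\omega\) is also 0.5. Both samples sit exactly on the margin, so both act as support vectors.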
Steady state symmetrical components
Most textbooks on power systems cover the concept of symmetrical components. Symmetrical components have been used in fault analysis, protection, and unbalance mitigation in power systems. Some studies found that the control strategies of the inverters affect the equivalent positive- and negative-sequence impedances of DGs, in both magnitude and angle. Accordingly, numerous methods for estimating symmetrical components have been used50,51,52,53. These techniques estimate symmetrical components either in the frequency domain, by DFT, or in the time domain.
Consider three-phase instantaneous currents as \(\:i_u\left(t\right)\), \(\:i_v\left(t\right)\) and \(\:i_w\left(t\right)\). Accordingly, the related instantaneous symmetrical components can be calculated by50:
$$\left[\begin{array}{c}i_u\left(t\right)\\ i_v\left(t\right)\\ i_w\left(t\right)\end{array}\right]=\left[\begin{array}{c}i_{u,0}\left(t\right)\\ i_{v,0}\left(t\right)\\ i_{w,0}\left(t\right)\end{array}\right]+\left[\begin{array}{c}i_{u,1}\left(t\right)\\ i_{v,1}\left(t\right)\\ i_{w,1}\left(t\right)\end{array}\right]+\left[\begin{array}{c}i_{u,2}\left(t\right)\\ i_{v,2}\left(t\right)\\ i_{w,2}\left(t\right)\end{array}\right]$$
(7)
Where all quantities are instantaneous values, and the zero-, positive-, and negative-sequence components are denoted by the subscripts 0, 1, and 2, respectively. Using RMS values (phasors, written in bold symbols), Eq. (7) can be reformulated as:
$$\left[\begin{array}{c}\boldsymbol{I}_u\\ \boldsymbol{I}_v\\ \boldsymbol{I}_w\end{array}\right]=\left[\begin{array}{c}\boldsymbol{I}_{u,0}\\ \boldsymbol{I}_{v,0}\\ \boldsymbol{I}_{w,0}\end{array}\right]+\left[\begin{array}{c}\boldsymbol{I}_{u,1}\\ \boldsymbol{I}_{v,1}\\ \boldsymbol{I}_{w,1}\end{array}\right]+\left[\begin{array}{c}\boldsymbol{I}_{u,2}\\ \boldsymbol{I}_{v,2}\\ \boldsymbol{I}_{w,2}\end{array}\right]$$
(8)
Accordingly, since each of these sets of components is balanced, the following relation holds:
$$\left[\begin{array}{c}\boldsymbol{I}_{u,0}\\ \boldsymbol{I}_{u,1}\\ \boldsymbol{I}_{u,2}\end{array}\right]=\frac{1}{3}\left[\begin{array}{ccc}1& 1& 1\\ 1& \boldsymbol{a}& \boldsymbol{a}^2\\ 1& \boldsymbol{a}^2& \boldsymbol{a}\end{array}\right]\left[\begin{array}{c}\boldsymbol{I}_u\\ \boldsymbol{I}_v\\ \boldsymbol{I}_w\end{array}\right]$$
(9)
Where \(\boldsymbol{a}=e^{j\frac{2\pi}{3}}\).
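The transformation in Eq. (9) can be sketched in a few lines of Python using only the standard library. The function name `sequence_components` and the phasor values are assumptions made for illustration; the idealised LG case simply sets the two healthy-phase currents to zero and does not represent any network from the study.

```python
import cmath
import math

# Rotation operator a = e^{j 2*pi/3}
A = cmath.exp(2j * math.pi / 3)


def sequence_components(i_u, i_v, i_w):
    """Return (zero, positive, negative) sequence phasors referred to phase u."""
    i0 = (i_u + i_v + i_w) / 3
    i1 = (i_u + A * i_v + A**2 * i_w) / 3
    i2 = (i_u + A**2 * i_v + A * i_w) / 3
    return i0, i1, i2


# Balanced positive-sequence set: 1 A RMS, phases displaced by 120 degrees
balanced = (
    cmath.rect(1.0, 0.0),
    cmath.rect(1.0, -2 * math.pi / 3),
    cmath.rect(1.0, 2 * math.pi / 3),
)
i0_bal, i1_bal, i2_bal = sequence_components(*balanced)

# Idealised LG fault on phase u: 3 A in the faulted phase, none in the others
i0_lg, i1_lg, i2_lg = sequence_components(3.0 + 0j, 0j, 0j)
```

For the balanced set, only the positive-sequence component is non-zero, whereas the idealised LG case yields equal zero-, positive-, and negative-sequence currents of one third of the phase current. This difference in sequence content is what makes these quantities useful as discriminating features.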
In54 it has been demonstrated that, in unbalanced faults, the rate of change of the symmetrical component currents is far faster than that of the phase currents. Bear in mind that symmetrical components are also present in the currents during normal conditions. Accordingly, in order to coordinate protective elements appropriately, it is crucial to know the content of the symmetrical component currents in both normal and emergency operation of MGs.
One or more symmetrical components occur for each type of fault, as presented in Table 2. Therefore, the symmetrical components suitable for locating the fault emerge when a fault occurs55.
Symmetrical components can be considered the best features for fault detection as well as fault localization and they help to overcome some difficulties related to just exploiting the magnitude of the fault current47.
Proposed method
The fault components principle is suggested in56 to build a protection scheme for MGs that copes with the complexity of operating conditions. The sequence directional element of the components is used together with the distance element of the local measuring units.
There is considerable uncertainty in a given MG as well as in the grid that serves it. The value of the fault current depends on the characteristics of the fault, such as its location, resistance, and inception angle. Accordingly, time-domain simulations, no matter how accurate, are expensive and cannot take all aspects into account. In our experience, steady-state analysis predicts the faulty situation reliably and is inexpensive. Moreover, implementing results based on steady-state parameters in IEDs is more straightforward50,57.
A suitable protection system in the MG must21,58:
- Be able to deal with a wide range of PFs. Some European grid codes mandate reactive current generation as much as the rated capacity in faults for renewable sources59.
- Cope with the intermittent nature of RESs.
- Take into account stochastic features of faults60.
To achieve a reliable relaying scheme, the post-fault currents are analyzed using SVM to produce attributes that discriminate between isolated and grid-connected modes regardless of healthy or faulty conditions. To that end, a series of scenarios is first formed and fed into PowerFactory DIgSILENT (hereafter referred to as DIgSILENT) according to ANSI/IEEE C37. Note that DIgSILENT can analyze the short-circuit values according to the following methods (standards): VDE 0102, IEC 60909, ANSI/IEEE C37, IEC 61363, IEC 61660 (DC), and the complete method (superposition that considers the pre-fault results of load-flow analyses). Then, the parameters used for fault identification, as shown in Table 1, are processed in Python. SVM is the method used to discriminate between faulty and non-faulty conditions.
In the following subsections, we provide an approach to take all these aspects into consideration when the proposed AI algorithm is trained.
Scenario construction
Each IED must be trained on its own dataset. A proper and reliable adaptive protection scheme must consider all the following situations10 so that only the smallest possible portion of the MG is isolated:
- The operation mode of the MG (Connected or isolated).
- The operation mode of RESs (ON or OFF).
- Disconnection in non-fault conditions.
- Proper functioning in hidden failures.
It is worth noting that random variation of loads is not considered as it seldom affects the short circuit level.
Fault types
Four kinds of faults as presented in Table 2 are used in the proposed method to make sure that the algorithm can discriminate the faults correctly. Accordingly, LG, LLG, LL, and LLL faults are taken into consideration.
RES status
Due to the inherent volatility of RESs, their operational status can fluctuate, meaning they may be considered either in service or out of service at any given time. This variability is influenced by factors such as weather conditions, availability of resources, and other environmental factors that affect their performance and reliability.
Fault location
The complex structure of MGs necessitates using the voltage and current meters which are installed at different points. As the proposed method uses local measurements in MGs, the database must include different faults in the system. Therefore, the proposed method usually employs:
MG status
The protection scheme must be capable of distinguishing between two distinct operational modes: grid-connected and isolated. In the grid-connected mode, the microgrid operates in conjunction with the main power grid, allowing for the exchange of power. Conversely, in the isolated mode, the microgrid functions independently, relying solely on its internal resources to maintain stability and supply power. This differentiation is crucial for ensuring the appropriate protective measures are applied in each scenario.
Fault resistance
The presence of RESs in MGs may seriously affect the accuracy of the technique for determining the fault location. Accordingly, the adverse effect of these resources on the fault location calculation must be quantified. To do so, the suggested technique takes the intrinsic uncertainty in the fault resistance into account. Reference35 suggests fault resistance values of 0, 20, and 40 Ω.
To quantify and express uncertainty, probability distributions are used. A probability distribution is a mathematical illustration of the relative likelihood of a variable having specific values61. Probability distributions can be graphically displayed in several ways. PDF is the simplest way to express the distribution function.
In engineering tasks, uncertainties often have to be considered explicitly even though little information is available to describe them. In the absence of exact information, a uniform distribution is an appropriate and simple way to describe errors. In many circumstances, the bounds can be fixed based on user experience62.
Uncertainties related to the magnitude of the load, the resistance of faults, the type of fault, and the faulted node are investigated in63. There, two kinds of measurements are considered for three-phase and LG faults, namely at the upstream substation and at the DG. The LG and LLL faults are considered in the proposed method.
Note that fault resistance greatly affects fault-location methods64. In MGs, fault-location methods are mainly based on AI, the impedance method, and the traveling-wave method65. Given the stochastic nature of faults, we propose using a uniform probability distribution (also called the rectangular distribution), which assigns equal probability to all fault resistances in the range of 0 to 20 Ω.
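Drawing fault resistances from the uniform distribution over 0 to 20 Ω can be sketched as follows. This is only an illustration using Python's standard `random` module; the function name, the number of samples, and the fixed seed are assumptions and do not reflect the settings used in the study.

```python
import random


def sample_fault_resistances(n_samples, r_min=0.0, r_max=20.0, seed=0):
    """Draw fault resistances (ohms) from a uniform distribution on [r_min, r_max].

    A dedicated, seeded generator keeps the scenario set reproducible.
    """
    rng = random.Random(seed)
    return [rng.uniform(r_min, r_max) for _ in range(n_samples)]


resistances = sample_fault_resistances(1000)
mean_resistance = sum(resistances) / len(resistances)
```

With enough samples, the sample mean approaches the midpoint of the range (10 Ω), which is a quick sanity check that the whole interval is covered evenly.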
Contingencies
Emergency conditions play a significant role in the proper operation of the protection relays. Accordingly, some crucial contingencies must be considered when coordinating the relays. Contingencies with a high probability of occurrence are of particular importance. A criterion in power system studies known as N-1 refers to conditions in which a single piece of equipment is out of service66. In this paper, the loss of equipment other than RESs that has a considerable effect on the fault current level at the IED under investigation is taken into account.
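The discrete scenario dimensions described in the preceding subsections (MG status, RES status, fault type, fault location, and contingency) can be combined into a scenario list by taking their Cartesian product. The sketch below is a hedged illustration only: the location labels `bus_1` and `bus_2`, the contingency label `line_1_out`, and the dictionary keys are hypothetical placeholders, and the actual scenario set of the study may differ.

```python
from itertools import product

MG_MODES = ("grid-connected", "isolated")
RES_STATUS = ("on", "off")
FAULT_TYPES = ("LG", "LLG", "LL", "LLL")
FAULT_LOCATIONS = ("bus_1", "bus_2")    # hypothetical location labels
CONTINGENCIES = ("none", "line_1_out")  # hypothetical N-1 outage label


def build_scenarios():
    """Return one dictionary per combination of the scenario dimensions."""
    keys = ("mg_mode", "res_status", "fault_type", "fault_location", "contingency")
    combos = product(MG_MODES, RES_STATUS, FAULT_TYPES, FAULT_LOCATIONS, CONTINGENCIES)
    return [dict(zip(keys, combo)) for combo in combos]


scenarios = build_scenarios()
```

With these placeholder values, the product yields 2 × 2 × 4 × 2 × 2 = 64 scenarios. Each of them could additionally be paired with one or more fault resistances drawn from the uniform distribution before being passed to the short-circuit calculation.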
Methodology
In the proposed procedure, the symmetrical RMS values of the short circuit are used. Table 3 shows the quantities utilized in the proposed method for fault identification.