Scholarly article on topic 'Test results judgment method based on BIT faults'

Test results judgment method based on BIT faults Academic research paper on "Electrical engineering, electronic engineering, information engineering"

CC BY-NC-ND
Keywords
{BIT / "Composite BIT program" / "Stress monitor" / "Test course" / "Test sequence control"}

Abstract of research paper on Electrical engineering, electronic engineering, information engineering, author of scientific article — Gang Wang, Jing Qiu, Guanjun Liu, Kehong Lyu

Abstract Built-in test (BIT) is responsible for equipment fault detection, so the correctness of its test data directly influences the diagnosis results. Equipment is subjected to many kinds of environmental stress, such as temperature, vibration and electromagnetic stress. As an embedded testing facility, BIT is exposed to the same stresses; the resulting interference and faults disturb the test course and produce unreliable results. It is therefore necessary to monitor the test data and to judge test failures. Stress monitoring and BIT self-diagnosis would improve BIT reliability, but existing anti-jamming research concentrates on protective design and signal processing. This paper focuses on monitoring test results and judging BIT equipment (BITE) failures, and proposes a series of improved approaches. First, the influence of stress on components is illustrated and its effects on diagnosis results are summarized. Second, a composite BIT program with information integration is proposed, together with a stress monitor program. Third, based on a detailed analysis of system faults and the forms of BIT results, a test sequence control method is proposed, which assists BITE failure judgment and reduces the error probability. Finally, validation cases show that these approaches enhance the credibility of test results.

Chinese Journal of Aeronautics, (2015), 28(6): 1650-1657

Chinese Society of Aeronautics and Astronautics & Beihang University

Chinese Journal of Aeronautics

cja@buaa.edu.cn www.sciencedirect.com

Test results judgment method based on BIT faults

Wang Gang a,b, Qiu Jing a,b,*, Liu Guanjun a,b, Lyu Kehong a,b

a Science and Technology on Integrated Logistics Support Laboratory, National University of Defense Technology, Changsha 410073, China

b College of Mechatronics and Automation, National University of Defense Technology, Changsha 410073, China

Received 25 November 2014; revised 12 October 2015; accepted 12 October 2015 Available online 19 October 2015

© 2015 Production and hosting by Elsevier Ltd. on behalf of CSAA & BUAA. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

KEYWORDS

Composite BIT program; Stress monitor; Test course; Test sequence control

1. Introduction

A built-in test (BIT) system is an auxiliary part of the equipment that provides health management and fault diagnosis functions.1,2 Its test course is made up of electronic components and connectors.3 The test data are sent over a transmission path (fiber, cable, bus, etc.) to processors and memories for processing and storage, so their correctness depends on the state of the BIT equipment (BITE). However, testing is disturbed by environmental stresses, which can even cause failures of the BITE itself. The direct consequence is wrong test results, which lead to BIT false alarms and non-detection.4-7 Data correctness must be guaranteed and BITE failures must be identified, so a stress monitor program and a test sequence control method are proposed to enhance the self-diagnosis capability of the BIT system.

* Corresponding author. Tel.: +86 731 84574311. E-mail addresses: wang2gang2@163.com (G. Wang), qiujing16@sina.com (J. Qiu), gjliu342@qq.com (G. Liu), fhrlkh@163.com (K. Lyu).

Peer review under responsibility of Editorial Committee of CJA.

Designers generally favor protective design, but methods for improving BIT itself have also been proposed. NASA has studied aviation BIT systems and indicated that BITE reliability is critical. Representative methods include continuous monitoring, voting, the chain method and overlapping BIT.8-10 Unsteady operating states and measurement errors cause test errors, so the test data temporarily exceed the threshold and raise wrong alarms.11 For effective maintenance action, false alarm analysis and reduction are necessary.12 It is acknowledged that most BIT errors derive from data gathering and processing, and many methods address this problem; for example, Rahman et al. propose a novel framework for analyzing sensor data.13 Allen indicates that BIT itself is a main cause of false alarms and that Bayesian theory is an effective solution.14 As equipment is applied more widely and its performance requirements rise, its working environment becomes harsher; Deng et al. consider both intermittent faults and environmental stress to be main causes of false alarms.15 As a direct consequence, BITE failures become common, so BIT self-diagnosis technology is essential.

http://dx.doi.org/10.1016/j.cja.2015.10.008

1000-9361 © 2015 Production and hosting by Elsevier Ltd. on behalf of CSAA & BUAA.

This paper first illustrates the consequences of environmental stress. The composite BIT program is then described, and a BITE failure judgment method is proposed to obtain correct diagnosis results. The composite program weakens the influence of stress; the stress monitor provides an auxiliary judgment of the test results; and the BIT test results are analyzed and classified as the basis of the sequence control method. These methods were applied in a radar testability improvement project, where their effectiveness was demonstrated.

2. Influences of environment stress

Different kinds of stress dominate under particular conditions: temperature is difficult to control at high altitude and in the deep sea; vibration increases with high acceleration and agility; and a high degree of integration makes electronic components more susceptible to electromagnetic interference. The subassemblies of the test course are affected, suffering interference or failure, and test data errors are generated. Fig. 1 shows how the stresses influence testing and lead to BITE failures. The BIT system then cannot run successfully and the fault diagnosis results become unreliable.

(1) Temperature stress

Equipment is subjected to temperature stress throughout its life. Typical failure modes include parameter drift, sealing failure, component aging and bad contact. For example, highly integrated circuits (ICs) generate heat while working, and heat accumulation readily causes these severe failure modes. Temperature stress causes 40% of time stress failures.16-19

(2) Vibration stress

The main forms of vibration are as follows: connectors are subjected to vibration of considerably higher amplitude; resonance occurs when frequencies are similar; and fatigue damage is caused by a large number of vibration cycles. Common failure modes include cracks, short circuits, looseness and open contacts. Vibration stress causes 27% of time stress failures.20

(3) Electromagnetic stress

This stress is acknowledged as a critical factor for electronic devices, although concrete statistical data are lacking. Typical failure modes include semiconductor breakdown, lap joint, false action and short circuit. There are two main consequences: either the system recovers with no physical damage, or permanent failures occur through direct damage. For example, the common radiation threshold for semiconductor damage is $10^{-5}$-$10^{-2}$ J/cm$^2$; for easily damaged devices it falls to 0.1-1.0 µJ/cm$^2$; and for instantaneous failures the threshold is lower by 2-3 orders of magnitude.

Fig. 2 shows the high frequency structure simulator (HFSS) simulation model of an electric circuit (a multi-chip module, MCM). The module sits in a cabinet made of 7 mm thick aluminum 5A06. Figs. 2(a) and (b) give the HFSS results for the overall and local models respectively. The stress concentration is not pronounced: the maximum and typical values differ by only about a factor of 2.21 The coupling current mainly enters through the pins and lines, and with coupled electric signals the MCM cannot work normally.

(4) Analysis of BIT system influences

These stresses seriously affect the reliability and performance of the BIT system, so uncertain after-effects occur and data credibility declines. Three points are known: stress influences BIT results; it causes both system and BITE failures; and the resulting errors can be weakened but not eliminated. BIT correctness is determined by the test course, the feature signals and the noise. When the noise grows, the output signal fluctuates and the test data change; when there are failures in the test course, the test data become uncertain.

[Figure: environment stresses (temperature, vibration, electromagnetism) act on the test course (sensor; processor and memorizer; fiber, cable, interface and bus) and cause interference and failures such as parameter drift, short circuit, open circuit, aging, looseness, bad contact, degeneration, malfunction, data loss, test invalidation and data falsity. The corresponding solutions are defence and separation, design improvements, the composite BIT system, stress monitor, test sequence control and BIT fault judge.]

Fig. 1 Illustration of environment influences and corresponding solutions.

Fig. 2 HFSS analysis of electromagnetic stress.

At first the test data change only temporarily; as the disturbance grows in duration and amplitude, the data become unreliable.

Let $x_r$ and $x_t$ be the real value and the test result of a fault feature respectively, and let $Th$ be the fault alarm threshold. The probability that the system is healthy is the probability that $x_r$ lies in the normal range, $P(\text{Normal}) = P(x_r < Th)$; the BIT pass probability is the probability that $x_t$ is less than $Th$, $P(\text{Pass}) = P(x_t < Th)$. Because of stress effects and measurement errors the two values differ, $x_r \neq x_t$. A BIT pass therefore does not fully indicate a normal state, $P(\text{Pass}) \neq P(\text{Normal})$, and a BIT no-pass does not equal a faulty state, $P(\text{NPass}) \neq P(\text{Fault})$. Many wrong BIT alarms may arise; most are eliminated by test data processing, but some remain as BIT false alarms, and neglecting valid alarms results in non-detection.

Eqs. (1) and (2) give the BIT false alarm rate (FAR) and non-detection rate (NDR), where $\bar{m}$ indicates a BIT pass, $m$ a BIT alarm, $S$ the stress and $D$ the error introduced by the test course. FAR is the probability that a healthy system is judged faulty, and NDR is the probability that a faulty system is judged healthy. Both are caused by stress effects and BITE failures.3 Test credibility is enhanced by proper diagnosis.

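$$\mathrm{FAR} = P(m \mid x_t, x_r < Th) \approx P(x_t > Th \mid x_r < Th) = P[x_t(S) > Th \mid x_r < Th] + P[(x_t + D) > Th \mid x_r < Th] \qquad (1)$$

$$\mathrm{NDR} = P(\bar{m} \mid x_t, x_r > Th) \approx P(x_t < Th \mid x_r > Th) = P[x_t(S) < Th \mid x_r > Th] + P[(x_t + D) < Th \mid x_r > Th] \qquad (2)$$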
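The two conditional probabilities in Eqs. (1) and (2) can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' procedure: it assumes that the real feature value is uniform on [0, 1) and that stress and test-course errors together add zero-mean Gaussian noise of standard deviation `sigma` to the test result. The function name, the distributions and the parameter values are illustrative assumptions.

```python
import random


def estimate_far_ndr(threshold, sigma, trials=100_000, seed=0):
    """Estimate P(x_t > Th | x_r < Th) and P(x_t < Th | x_r > Th) by simulation."""
    rng = random.Random(seed)
    healthy = faulty = false_alarms = missed = 0
    for _ in range(trials):
        x_r = rng.random()                 # real value (assumed uniform on [0, 1))
        x_t = x_r + rng.gauss(0.0, sigma)  # test result = real value + assumed noise
        if x_r < threshold:                # system is actually healthy
            healthy += 1
            if x_t > threshold:
                false_alarms += 1          # healthy system judged faulty
        else:                              # system is actually faulty
            faulty += 1
            if x_t < threshold:
                missed += 1                # faulty system judged healthy
    return false_alarms / healthy, missed / faulty


far, ndr = estimate_far_ndr(threshold=0.8, sigma=0.05)
print(f"FAR ~ {far:.4f}, NDR ~ {ndr:.4f}")
```

With `sigma = 0` the test result equals the real value and both rates are zero; any positive `sigma` produces false alarms and missed detections for feature values near the threshold.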

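For a single test of an independent feature, assume that $x_t$ follows the normal distribution $N(\mu, \sigma)$ with cumulative probability function $\Phi(Th, \lambda, \sigma)$, where $\mu$ is the current feature value, $\sigma$ is the BIT reliability, $\lambda$ is the health point and $Th'$ is the health critical line. If $Th$ is in $(0, Th']$, FAR has two parts: if $\lambda$ lies to the left of $Th$, $\lambda < Th$, the false alarm probability is $1 - \Phi(Th, \lambda, \sigma)$; otherwise it is $\Phi(\lambda - Th, 0, \sigma)$. When $Th$ is in $(Th', 1]$, FAR is decided entirely by the probability that $\lambda > Th$, $\Phi(\lambda - Th, 0, \sigma)$.

$\mathrm{FAR} = N_{fa}/(N_{fa} + N_d)$ becomes

$$\mathrm{FAR} = \frac{\int_0^{Th} \Phi(\lambda - Th, 0, \sigma)\,d\lambda}{\int_0^{Th} \Phi(\lambda - Th, 0, \sigma)\,d\lambda + \int_{Th}^{1} \Phi(\lambda - Th, 0, \sigma)\,d\lambda} \qquad (3)$$

Similarly, $\mathrm{NDR} = N_{nd}/(N_{nd} + N_D)$ becomes

$$\mathrm{NDR} = \frac{\int_{Th}^{1} \Phi(Th - \lambda, 0, \sigma)\,d\lambda}{\int_{Th}^{1} \Phi(Th - \lambda, 0, \sigma)\,d\lambda + \int_{Th}^{1} \Phi(\lambda - Th, 0, \sigma)\,d\lambda} \qquad (4)$$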

3. Central BIT management

Several useful testability designs were described in the introduction. In addition, central BIT management can reduce the influence of stress and improve the efficiency of test data. It comprises the composite BIT program and the stress monitor.

3.1. Composite BIT program

By integrating the existing test items and BITE, the composite BIT program is planned at system level. It handles and records the test data of all periodic BIT and some maintenance BIT, which together reflect the system health state over the whole life cycle. The program has two parts: the hardware, in which as many components as possible are integrated for efficiency and protection, and the software, which controls BIT and manages BIT information.

The BITE failure rate $F_{BIT}$ is much lower than the failure rate $F_{LRU}$ of the corresponding line-replaceable unit (LRU); the usual ratio is $10F_{BIT} \le F_{LRU}$. However, because BITE is embedded, both failure rates rise under stress at similar speeds, $\Delta F_{BIT} \approx \Delta F_{LRU}$, and the reliability of the BIT system cannot then be guaranteed. All fragile components should therefore be placed in a separate, protected part to prevent stress interference. The composite BIT centralizes the processing parts, leaving only data gathering facilities at the BIT test points. With effective protection, the rise in BITE failure rate is reduced, $\Delta F_{BIT} < \Delta F_{LRU}$. The integrated BIT design requires an overall design effort and costs more, but its anti-stress capability is better.

The program diagnoses using the historical records and LRU failure rates in the database, locating the fault within an ambiguity group of stated size. It contains modules for BIT control, data gathering, data processing, database management and display. An industrial computer gathers and processes the information, selects the required test items and sends test requests to the related LRUs. Fig. 3 depicts how the composite BIT program works. First, the BIT data from the subsystems are decoded and sent to the database. Second, false alarms are removed according to certain rules in the database. Third, the fault isolation module works with real-time and historical information. Finally, the results are presented to users. Fault isolation requires the diagnosis tree and the LRU failure rates. The tree is obtained with eXpress, the testability engineering and maintenance system (TEAMS), the testability analysis, design and evaluation system (TADES) or other software. The test and control module manages the test items and is responsible for communication with other control subsystems.
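The filtering and isolation steps of this flow can be sketched in a few lines. The paper does not state its false alarm rules or the form of its diagnosis tree, so everything below is a hypothetical stand-in: a persistence rule (an alarm counts only if the last `min_repeats` results all failed) replaces the database rules, a simple intersection of candidate LRU sets replaces the diagnosis tree, and all item names, LRU names and failure rates are invented for illustration.

```python
def filter_false_alarms(history, min_repeats=3):
    """Keep an item's alarm only if its last `min_repeats` results all failed.

    `history` maps a test item to a list of booleans (True = alarm).
    The persistence rule is an assumed example, not the paper's rule set.
    """
    return {
        item
        for item, results in history.items()
        if len(results) >= min_repeats and all(results[-min_repeats:])
    }


def isolate(alarms, candidate_lrus, lru_failure_rate, group_size=2):
    """Intersect the candidate LRUs of all confirmed alarms, then rank the
    survivors by failure rate and return an ambiguity group of fixed size."""
    candidates = None
    for item in alarms:
        group = set(candidate_lrus[item])
        candidates = group if candidates is None else candidates & group
    ranked = sorted(candidates or [], key=lambda lru: lru_failure_rate[lru], reverse=True)
    return ranked[:group_size]


# Hypothetical data for illustration only.
history = {"voltage": [False, True, True, True], "clock": [True, False, False, True]}
candidate_lrus = {"voltage": ["PSU", "ADC", "BUS"], "clock": ["OSC"]}
rates = {"PSU": 8e-6, "ADC": 2e-6, "BUS": 5e-6, "OSC": 1e-6}

confirmed = filter_false_alarms(history)
print(confirmed, isolate(confirmed, candidate_lrus, rates))
```

Ranking by failure rate reflects the statement that isolation uses the LRU failure rates; the intermittent "clock" alarm is dropped by the assumed persistence rule before isolation starts.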

Fig. 3 Flowchart of composite BIT program.

3.2. Stress monitor program

When the stress becomes considerable, the probability of BITE failure rises and the test data become unreliable. In Fig. 4, the x-axis denotes the stress and the y-axis denotes the failure rate and health. Their relation is as follows: in the rated area $(0, S_1]$ the failure rate remains low; with increased stress $(S_1, S_2]$ the system suffers interference; in the damage area $(S_2, S_3]$ the damage increases and intermittent faults occur; and beyond the endurance limit, $S > S_3$, permanent faults occur. Interference usually causes data fluctuation, and high stress causes damage.
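The four stress regions can be expressed as a simple lookup. This is a minimal sketch under the assumption that the boundaries $S_1 < S_2 < S_3$ are known for the component; the function name and the zone labels are illustrative, and the numeric boundaries in the example are hypothetical.

```python
def stress_zone(s, s1, s2, s3):
    """Map a measured stress level to the region described for Fig. 4."""
    if not s1 < s2 < s3:
        raise ValueError("boundaries must satisfy s1 < s2 < s3")
    if s <= s1:
        return "rated"          # failure rate stays low
    if s <= s2:
        return "interference"   # data fluctuation expected
    if s <= s3:
        return "damage"         # intermittent faults occur
    return "permanent fault"    # endurance limit exceeded


# Hypothetical boundaries, in arbitrary stress units.
for level in (50, 150, 250, 350):
    print(level, stress_zone(level, s1=100, s2=200, s3=300))
```

A monitor built this way would treat test data as fully trustworthy only in the rated region and flag data taken in the other regions for further judgment.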

A stress monitor is therefore required to judge the reliability of testing; its purpose is to provide a preliminary judgment. As Fig. 3 shows, it helps to cut false alarms.

(1) Because the defense capabilities differ, the BIT system failure rate is $F_{BIT} = \sum_{i=1}^{p} a_i F_i + \sum_{j=1}^{q} b F_j$, where $a_i$ and $F_i$ are the defense coefficient and failure rate of the $i$th BITE, and $b$ and $F_j$ are the defense coefficient and failure rate of the central BITE.

(2) The failure rates of the components change with environmental stress; $F_i(S)$ denotes the failure rate under stress $S$ and is obtained from reference data or reliability experiments. The failure rate of the whole BITE is $F_{BIT}(S) = \sum_{i=1}^{p} a_i F_i(S) + \sum_{j=1}^{q} b F_j(S)$.

(3) BIT test credibility is used to guarantee the BIT results; it is 10 times higher than that of the main system. If the BIT health agrees with the monitor program, the BIT system failure rate depends only on the test course, and the BIT credibility is

$$P(\mathrm{TEST}) = \mu P_{BIT}(S) = \mu\left[1 - \sum_{i=1}^{p} a_i F_i(S) - \sum_{j=1}^{q} b F_j(S)\right] \qquad (5)$$

where $P(\mathrm{TEST})$ is the BIT credibility, $P_{BIT}(S)$ the credibility of the test course and $\mu$ the credibility extent.

(4) If $P(\mathrm{TEST}) > P_{Th}$, the BITE failure probability is lower than the credibility requirement; the BIT result carries less risk and is believable.

Fig. 4 Relationship between health probability and stress.
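Steps (1) to (4) amount to a short calculation. The sketch below assumes that each BITE part is described by a pair (defense coefficient, failure rate under the current stress) and that the credibility extent and threshold are supplied by the designer; the function names and all numeric values are hypothetical.

```python
def bit_failure_rate(distributed, central):
    """Stress-dependent BITE failure rate: sum of coefficient * rate over
    the distributed BITE parts plus the central BITE parts."""
    return sum(a * f for a, f in distributed) + sum(b * f for b, f in central)


def bit_credibility(distributed, central, extent, p_threshold):
    """Return P(TEST) = extent * (1 - F_BIT(S)) and whether it exceeds the threshold."""
    p_test = extent * (1.0 - bit_failure_rate(distributed, central))
    return p_test, p_test > p_threshold


# Hypothetical values: (defense coefficient, failure rate at the measured stress).
distributed = [(1.0, 1e-3), (0.5, 2e-3)]
central = [(0.1, 1e-3)]

p_test, believable = bit_credibility(distributed, central, extent=1.0, p_threshold=0.99)
print(p_test, believable)
```

In a stress monitor the failure rates would be looked up from the measured stress before each judgment, so the same BIT result can be believable at low stress and rejected at high stress.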

4. BIT self-diagnosis

4.1. Analysis of test data and fault forms

The basic detection principle of a BIT system is to test a fault feature and judge whether it exceeds the alarm threshold. Because maintainers need only a binary classification of the system running state (normal or faulty), every non-conforming result is judged to be a fault. Fault diagnosis, however, is not so simple. Fault conditions may recur and the features take different forms, so the faults and the test data also appear in various forms, including no signal, continuous or intermittent threshold exceedance, error codes and unequal values. Both the system state and the BIT state can be diagnosed accurately if all of these test data conditions are taken into account.

In practice, the BIT results and the main system faults fall into three main forms: no-data, unequal and error code. The form of the BIT data indicates the kind of system or BITE failure, so analyzing these states together helps to judge whether the BIT results are correct. The main system health states are ["normal", "no-data", "unequal", "error code"], with corresponding symbols $[0, \varnothing, 1, R]$; the BIT result forms are ["normal", "no-data", "all normal", "all fault", "error code"], with corresponding symbols $[0, \varnothing, 0, 1, R]$. Let the equipment failure rate be $\lambda_E$ and the BITE failure rate be $\lambda_{BIT}$. In the order listed, the probabilities of the main system states are $[\alpha, \beta, \gamma, \delta]$ and those of the BIT result forms are $[a, b, c, d, e]$, with $\beta + \gamma + \delta = \lambda_E$, $b + c + d + e = \lambda_{BIT}$, $\alpha + \lambda_E = 1$, $a + \lambda_{BIT} = 1$, $\lambda_E \ge 10\lambda_{BIT}$, $\alpha \gg \lambda_E$ and $a \gg \lambda_{BIT}$.

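Table 1 lists all the test conditions, the corresponding probabilities and the actual system/BIT states. The basic principles are as follows: when the BIT system is normal, the results are believable; a normal main system with an abnormal BIT gives a false alarm; the probability that both are faulty is tiny, but many kinds of result are possible, such as a wrong result with non-detection or a coincidentally correct result; the most probable case is that both the system and the BIT are normal, $\alpha \times a$; and with BITE failures the consequences are usually wrong. Whether the BIT or the equipment is faulty nevertheless remains unknown. For example, the cases in which the main system is normal and the BIT is abnormal, $\alpha \times b$, $\alpha \times c$, $\alpha \times d$ and $\alpha \times e$, may cause BIT false alarms.

Table 1 Fault diagnosis probabilities by analyzing test results.

| Number | Main system | BIT system | Test result | Diagnosis | Probability | Actual condition |
|---|---|---|---|---|---|---|
| 1 | 0 (normal) | 0 (normal) | 0 | 0 | $\alpha a$ | Correct result |
| 2 | 0 (normal) | $\varnothing$ (no-data) | $\varnothing$ | 1 | $\alpha b$ | False alarm |
| 3 | 0 (normal) | 0 (all normal) | 0 | 0 | $\alpha c$ | Correct result, actual non-detection |
| 4 | 0 (normal) | 1 (all fault) | 1 | 1 | $\alpha d$ | False alarm |
| 5 | 0 (normal) | R (error code) | R | 1 | $\alpha e$ | False alarm |
| 6 | $\varnothing$ (no-data) | 0 (normal) | $\varnothing$ | 1 | $\beta a$ | Correct result |
| 7 | $\varnothing$ (no-data) | $\varnothing$ (no-data) | $\varnothing$ | 1 | $\beta b$ | Correct result, actual wrong |
| 8 | $\varnothing$ (no-data) | 0 (all normal) | 0 | 0 | $\beta c$ | Wrong result, actual non-detection |
| 9 | $\varnothing$ (no-data) | 1 (all fault) | 1 | 1 | $\beta d$ | Correct result, actual wrong |
| 10 | $\varnothing$ (no-data) | R (error code) | R | 1 | $\beta e$ | Correct result, actual wrong |
| 11 | 1 (unequal) | 0 (normal) | 1 | 1 | $\gamma a$ | Correct result |
| 12 | 1 (unequal) | $\varnothing$ (no-data) | $\varnothing$ | 1 | $\gamma b$ | Correct result, actual wrong |
| 13 | 1 (unequal) | 0 (all normal) | 0 | 0 | $\gamma c$ | Wrong result, actual non-detection |
| 14 | 1 (unequal) | 1 (all fault) | 1 | 1 | $\gamma d$ | Correct result, actual wrong |
| 15 | 1 (unequal) | R (error code) | R | 1 | $\gamma e$ | Correct result, actual wrong |
| 16 | R (error code) | 0 (normal) | R | 1 | $\delta a$ | Correct result |
| 17 | R (error code) | $\varnothing$ (no-data) | $\varnothing$ | 1 | $\delta b$ | Correct result, actual wrong |
| 18 | R (error code) | 0 (all normal) | 0 | 0 | $\delta c$ | Wrong result, actual non-detection |
| 19 | R (error code) | 1 (all fault) | 1 | 1 | $\delta d$ | Correct result, actual wrong |
| 20 | R (error code) | R (error code) | R | 1 | $\delta e$ | Correct result, actual wrong |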

If feature fluctuation and test errors are ignored, false alarms and non-detection are caused only by BITE failures. According to Table 1, Eq. (6) gives the probability of correct diagnosis results from a normal BIT; Eq. (7) indicates that BITE failures cause two kinds of false alarm; Eq. (8) is the false alarm due to wrong location; Eq. (9) is the false alarm due to wrong detection; and Eq. (10) is the non-detection.

The validity of diagnosis relies on the raw data, and existing methods tend to work by supplying more information. The direct way to guarantee correct results is for the BIT to raise an alarm whenever it is itself faulty. This eliminates the non-detection caused by a BIT stuck at "all normal", but whether the equipment or the BIT is faulty remains unknown, so the test sequence control method is proposed.

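$$\mathrm{Corr} = P(\text{Normal BIT}) = (\alpha + \beta + \gamma + \delta)a = a \qquad (6)$$

$$\mathrm{FAR} \approx P(x_t > Th \mid x_r < Th) = \mathrm{FAR}_1 + \mathrm{FAR}_2 \qquad (7)$$

$$\mathrm{FAR}_1 = P(\text{Faulty BIT} \mid \text{Faulty system}) = (\beta + \gamma + \delta)(b + d + e) \qquad (8)$$

$$\mathrm{FAR}_2 = P(\text{Faulty BIT} \mid \text{Normal system}) = \alpha(b + d + e) \qquad (9)$$

$$\mathrm{NDR} \approx P(x_t < Th \mid x_r > Th) = P(\text{Faulty BIT} \mid \text{Faulty system}) \approx (\alpha + \beta + \gamma + \delta)c \qquad (10)$$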
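The bookkeeping behind Eqs. (6) to (10) can be checked by enumerating the 20 combinations of Table 1. The sketch below assumes that the main system and BIT states are independent, so each combination has the product probability, and it groups the combinations by BIT state only: a normal BIT contributes to the correct results, a BIT stuck at "all normal" to non-detection, and every other BIT failure form to false alarms. The function name and the example probabilities are hypothetical.

```python
from itertools import product

SYSTEM_STATES = ["normal", "no-data", "unequal", "error code"]
BIT_STATES = ["normal", "no-data", "all normal", "all fault", "error code"]


def table1_rates(system_probs, bit_probs):
    """Sum the joint probabilities of Table 1 into (Corr, FAR, NDR)."""
    corr = far = ndr = 0.0
    pairs = product(zip(SYSTEM_STATES, system_probs), zip(BIT_STATES, bit_probs))
    for (_, p_sys), (bit_state, p_bit) in pairs:
        p = p_sys * p_bit
        if bit_state == "normal":
            corr += p   # Eq. (6): result is believable
        elif bit_state == "all normal":
            ndr += p    # Eq. (10): faulty BIT that always passes
        else:
            far += p    # Eqs. (7)-(9): faulty BIT that raises an alarm
    return corr, far, ndr


# Hypothetical probabilities; each list sums to 1.
system_probs = [0.97, 0.01, 0.01, 0.01]
bit_probs = [0.9988, 0.0003, 0.0003, 0.0003, 0.0003]

corr, far, ndr = table1_rates(system_probs, bit_probs)
print(corr, far, ndr)
```

Because the main system probabilities sum to 1, the three totals reduce to the probability of a normal BIT, the summed probability of the alarming BIT failure forms, and the probability of the "all normal" form, which is the simplification used in Eqs. (6) and (10).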

4.2. Test sequence control method

4.2.1. Testing principle

Interference can be treated as a temporary failure, so the test results depend on the test course. If environmental stress interference is excluded, only BITE failures influence the diagnosis. BIT self-diagnosis can then be achieved by comparing the test data with the system/BIT fault forms: if the expected BIT results are defined in advance, failures of the test course can be detected by comparing the actual test results with these predetermined results. The test sequence control method makes this feasible.

The method has three parts: a series of designated data is produced; BIT results are obtained by testing the feature signals; and the actual condition is diagnosed by comparing these results with the correct ones. The designated results should cover all the system/BITE failure forms. Concretely, the test period $t$ is divided into four parts, $[t_0, t_1]$, $[t_1, t_2]$, $[t_2, t_3]$ and $[t_3, t_4]$. In $[t_0, t_1]$ there is no input, so the designated result is no-data, written $\varnothing$; in $[t_1, t_2]$ the input signal is normal and the designated result is "0"; in $[t_2, t_3]$ the input signal is abnormal and the designated result is "1"; in $[t_3, t_4]$ the input is the actual system signal and the test result reflects the existing state. The test results are then $[\varnothing, 0, 1, V_r]$, where $V_r$ is the actual BIT test value. If the first three parts give the correct results, the BIT result for the main system state is believable; otherwise the BIT system itself has failed.

As Table 2 shows, the test data and the results correspond to each other, and the BIT results are obtained from the test data. In principle the test results could take $4^4 = 256$ permutations and combinations, but a great number of them cannot occur in practice. One example is $[\varnothing, 0, 0, 1]$ (writing $\varnothing$ for no-data), which means that the BIT result is "no-data" in $[t_0, t_1]$, "normal" in $[t_1, t_2]$ and $[t_2, t_3]$, and "faulty" in $[t_3, t_4]$. A BIT that reports "normal" for the abnormal input in $[t_2, t_3]$ has failed in a way that makes it give "all normal" persistently, so it should not report "faulty" in $[t_3, t_4]$; the "1" result is contradictory. The main possible test results are shown in Table 2. BITE failures can be identified by comparing the actual results with them.
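The comparison rule of the four-phase sequence is small enough to sketch directly. The encoding below is an assumption made for illustration: `None` stands for the no-data result, `0` and `1` for normal and faulty, and the string `"R"` for an error code. The function returns whether the BIT can be trusted and, if so, the reading for the main system.

```python
NO_DATA = None  # assumed encoding of the "no-data" result


def judge_sequence(results, expected=(NO_DATA, 0, 1)):
    """Judge a four-phase test sequence.

    `results` holds the BIT outputs for the no-input phase, the designated
    normal input, the designated abnormal input and the real system signal.
    The BIT is trusted only if the first three outputs match the designated
    results; otherwise the system reading is withheld.
    """
    if len(results) != 4:
        raise ValueError("expected results for four test phases")
    bit_ok = tuple(results[:3]) == tuple(expected)
    return bit_ok, (results[3] if bit_ok else None)


print(judge_sequence([NO_DATA, 0, 1, 0]))      # BIT trusted, system reads normal
print(judge_sequence([NO_DATA, 0, 1, 1]))      # BIT trusted, system reads faulty
print(judge_sequence([NO_DATA, 0, 0, 0]))      # BIT stuck at "all normal"
print(judge_sequence(["R", "R", "R", "R"]))    # error code in every phase
```

When the BIT is not trusted, the sketch withholds the system reading rather than guessing, which corresponds to sending the unit for an offline test.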

4.2.2. Testing circuit design

This method most resembles a combination of active BIT and improved drive signals. It is a form of designated testing in which a drive signal is injected to test the designated components. Because the drive signal is set beforehand, the results are not influenced by other components, and the effect of environmental interference on the input signal is weakened. Fig. 5 illustrates the principle of active BIT: the system/BIT issues the drive signal; the unit under test (UUT) processes these data; the results are tested; the test data are compared with the predetermined outcome; and the diagnosis conclusion is passed to the composite BIT program.

Table 2 Comparison of test data and BIT results.

| Test data | Test result | Problem | Resolution |
|---|---|---|---|
| $[\varnothing,0,1,0]$, $[\varnothing,0,1,1]$ | Normal BIT, normal system | | |
| $[\varnothing,0,0,0]$, $[\varnothing,1,1,1]$, $[R,R,R,R]$, $[\varnothing,\varnothing,\varnothing,\varnothing]$ | Faulty BIT | Unknown system health state | Offline test |
| $[\varnothing,R,R,R]$ | Faulty BIT/system | Whether the system/BIT has faults | Offline test |
| $[\varnothing,0,0,1]$ | Inexistent test results | | |

Note: $\varnothing$ denotes no-data and $R$ an error code.

Unlike active BIT, the test sequence method requires additional BIT capability. As Fig. 6 shows, sequence control and signal adjustment circuits must be added ahead of the test. The sequence control part allocates the test time and thereby sets the output results and the test order. Signal adjustment consists of signal multiplication/demultiplication and noise superimposition, which turn the output into the required signal. With the signal adjustment and sequence control modules added, the required series of results is obtained.

There are several ways to define the normal and fault signals, and designers must select an appropriate one.

(1) The comparison part is based on the tested fault feature value. Even if the test item must lie within a specific range, the range can be expressed as an upper limit and a lower limit, which makes the designated signals easy to obtain.

(2) A multiplication circuit is needed. For example, the normal signal is set to 1/2 or 1/3 of the real value. If the signal still cannot meet the normal requirement, the main system fault is severe.

(3) The fault value can be produced by noise superimposition, giving the BITE failure form "all 1".

Fig. 5 Principle of active BIT.

5. Validations

A radar testability improvement project is taken as an example to validate the effectiveness of the methods.

(1) Composite BIT program

Fig. 7 shows the testability design, with the improvements drawn in broken lines. Integrating all possible BIT components and strengthening their protection reduces the influence of stress. Optimizing the existing test items and handling all the BIT information together forms the composite BIT program. The integrated BIT design protects the hardware better, and the composite BIT program makes fuller use of the test data, so false alarms are reduced. According to the theoretical analysis, the design can guarantee 90% of the BITE reliability indexes and reduce false alarms by 30%. The results in practical application remain unknown, however, because user feedback data are not yet available.

(2) Stress monitor program

Take the effect of electrostatic discharge (ESD) on a certain component as an example. Its fault rate is $\lambda_E = 8 \times 10^{-6}$ and the BITE fault rate is $\lambda_{BIT} = 7.3 \times 10^{-7}$. Fig. 8 shows the relationship between the fault feature and the electric stress; the ordinate is the current amplification factor of a transistor and the abscissa is the ESD voltage. The stress threshold is then calculated. Fig. 9 shows the boundary obtained with a neural network, $S = 700$ V. At this boundary, however, the BITE failure rate is nearly half that of the main system, $0.5\lambda_E = 4 \times 10^{-6}$. In fact the BITE failure rate already becomes high once the voltage exceeds 200 V, $\lambda_{BIT} > 8 \times 10^{-7}$, and the BIT results are no longer believable, so the stress alarm condition is set directly to $S > 200$ V.

Fig. 6 Circuit diagram design for test sequence design.

Fig. 7 Radar BIT design improvement.

(3) BIT self-diagnosis

Take the power supply voltage test as an example. Because each test result is uncertain, the BIT and system failure rates are divided equally among the failure forms, $b = c = d = e = 0.25\lambda_{BIT}$ and $\beta = \gamma = \delta = 0.33\lambda_E$, with $\lambda_E \ge 10\lambda_{BIT}$; here $b, c, d, e$ are the probabilities of the BIT failure forms and $\beta, \gamma, \delta$ those of the main system failure forms. The steps of the test sequence control method are as follows:

(a) The value is required to lie in the range $[Th_l, Th_h]$: it should be neither larger than the upper threshold $Th_h$ nor smaller than the lower threshold $Th_l$.

Fig. 8 Relationship curve of feature value and ESD voltage.

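Table 3 Comparison of test data and practical condition.

| Failure mode | Test data | Practical condition |
|---|---|---|
| Voltage fluctuation | $[\varnothing,0,1,0]$ | Normal system, normal BIT |
| | $[\varnothing,0,1,1]$ | Faulty system, normal BIT |
| | $[\varnothing,0,1,1]$, $[\varnothing,R,R,R]$ | Faulty system |
| | $[\varnothing,\varnothing,\varnothing,\varnothing]$, $[R,R,R,R]$ | Faulty BIT |
| | $[\varnothing,0,0,0]$, $[R,R,R,R]$ | Normal system |
| | $[\varnothing,0,0,0]$, $[\varnothing,1,1,1]$ | Faulty BIT |
| | $[\varnothing,R,R,R]$ | Faulty BIT/system |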

Fig. 9 Test result credibility boundary.

(b) Design the test sequence circuit with the methods in Section 4.2.2, including sequence control and signal adjustment.

(c) Use the hardware integration method: the comparison circuit is integrated in the main system and the sensors are placed at the signal output location.

(d) Apply the test sequence method, which is based on comparing the practical conditions with the forms of the test data.

Table 3 compares the test data forms with the practical conditions, from which maintainers can obtain the fault diagnosis results. Here $[\varnothing, R, R, R]$ (with $\varnothing$ denoting no-data) means that the main system state is "error code", so the multiplied and practical signals are all disordered. $[\varnothing,\varnothing,\varnothing,\varnothing]$, $[R, R, R, R]$, $[\varnothing, 0, 0, 0]$ and $[\varnothing, 1, 1, 1]$ represent BITE failures; an offline test is required in these cases because the main system health is still unknown.

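If all BIT test results are made correct in this way, the non-detection rate and the false alarm rate are both reduced. The reduction in NDR is $(\alpha + \beta + \gamma + \delta)c = 0.25\lambda_{BIT}$, and the reduction in FAR is $(\alpha + \beta + \gamma + \delta)(b + d + e) = 0.75\lambda_{BIT}$, where $\alpha, \beta, \gamma, \delta$ are the main system state probabilities, which sum to 1.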
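The two reductions follow directly from the equal split of the BITE failure rate. Writing the main system state probabilities as $\alpha, \beta, \gamma, \delta$ (summing to 1) and the BIT failure form probabilities as $b = c = d = e = 0.25\lambda_{BIT}$, the arithmetic is as follows. The numerical line is illustration only: it borrows $\lambda_{BIT} = 7.3 \times 10^{-7}$ from the ESD example, which concerns a different component, so the resulting figures are an assumption and not a result reported for the voltage test.

```latex
\begin{align*}
\Delta\mathrm{NDR} &= (\alpha+\beta+\gamma+\delta)\,c = 1 \cdot 0.25\lambda_{BIT} = 0.25\lambda_{BIT},\\
\Delta\mathrm{FAR} &= (\alpha+\beta+\gamma+\delta)\,(b+d+e) = 1 \cdot 3 \cdot 0.25\lambda_{BIT} = 0.75\lambda_{BIT},\\
\Delta\mathrm{NDR} + \Delta\mathrm{FAR} &= \lambda_{BIT},\\
\text{with } \lambda_{BIT} = 7.3\times10^{-7}:\quad
\Delta\mathrm{NDR} &= 1.825\times10^{-7},\qquad
\Delta\mathrm{FAR} = 5.475\times10^{-7}.
\end{align*}
```

The sum of the two reductions equals the whole BITE failure rate, which matches the premise that every BITE failure form is detected by the sequence test.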

(4) Validation results

The methods in this paper were applied in the radar testability design with favorable results. The improvements avoid wrong test data, and BITE failures are detected to some extent. The composite BIT design protects the BIT hardware better and makes the BIT information more effective. The stress monitor program manages the sensors as a whole and gives a simple judgment of the test data. The test sequence control method manages the BIT information comprehensively and gives credible diagnosis results. Table 4 lists several testability improvements.

Table 4 Testability improvements in radar project.

| Subsystem | Testing item | Method | Before | After |
|---|---|---|---|---|
| Frequency resource | Clock signal spectrum test | Active test | Non-detection | Detectable |
| Transmission/reception | Test of packed received data | BIT hardware integration | Test is unreliable with stress | The defense is effective |
| Transmission/reception | Test of power temperature | Stress monitor program | The stress is undetectable | Alarm for unreliable data |
| Signal process | FPGA temperature test | Stress monitor program | The stress is undetectable | Alarm for unreliable data |
| Array process | Process function test | Test sequence control | BITE failure is undetectable | Alarm for BITE failure |
| Power | Voltage test | Test sequence control | BITE failure is undetectable | Alarm for BITE failure |

Note: FPGA - Field-programmable gate array.

6. Conclusions

The analysis of stress effects on electrical components shows that hardware failures and signal interference are common during the test course. Current protection methods cannot guarantee BIT reliability, so structural improvement and BIT self-diagnosis technologies are required. Central BIT management establishes an integrated processing station, which improves the efficiency of BIT data. The stress monitor provides a preliminary condition for judging data and is an important supplement to the composite BIT program. Based on the various equipment/BIT failure forms, the test sequence control method is proposed; its main value for test anti-jamming is in ensuring correct test data. The radar BIT improvements show that the methods can increase BIT diagnosis capability.

Acknowledgement

This study was supported by the Ministry Level Project of China.

References

1. China Aviation Industry Development and Research Center. GJB 2547A Equipment testability currency criterion. Beijing: China Aviation Industry Development and Research Center; 2011 [Chinese].

2. Department of Defense. MIL-STD-2165A Military standard testability program for systems and equipments. Washington, D.C.: Department of Defense; 1993.

3. Qiu J, Liu GJ, Yang P, Lv KH, Su YD, Chen XX. Equipment testability modeling and design technology. Beijing: Science Press; 2012, p. 1, 136-67 [Chinese].

4. Yang G. Research on false-alarm-rate decreasing at sensor-level of built-in test systems in mechatronic products [Dissertation]. Changsha: National University of Defense Technology; 2003 [Chinese].

5. Davenport MA, Baraniuk RG, Scott CD. Controlling false alarms with support vector machines. ICASSP 2006; 2006 May 14-19; Toulouse, France. Piscataway, NJ: IEEE Press; 2006. p. 589-92.

6. Byington CS, Watson MJ, Amin S, Begin M. False alarm mitigation of vibration diagnostic systems. IEEE Aerospace Conference Proceedings; 2008 March 1-8; Big Sky, Montana. Piscataway, NJ: IEEE Press; 2008. p. 1-11.

7. Wu CH, He YH, Mi K. A low false alarm rates oriented design scheme of multivariate control chart. 9th International Conference on Reliability, Maintainability and Safety; 2011 June 12-15; Guiyang, China. Piscataway, NJ: IEEE Press; 2011. p. 1107-11.

8. Johnson Space Center. False alarm mitigation techniques. Washington, D.C.: NASA; 1990. Report No.: Technique DFE-2.

9. Charles C, Kenneth AH, Victor GZ, Paul S, White G. Smart BIT/TSMD integration. New York: Grumman Aerospace Corp; 1991. Report No.: RL-TR-91-353.

10. Aylstock F, Elerin L, Hintz J, Learoyd C, Press R. Neural network false alarm filter. New York: Rome Laboratory, Air Force Materiel Command, Griffiss Air Force Base; 1995. Report No.: RL-TR-94-216.

11. Steadman B, Berghout F, Olsen N. Intermittent fault detection and isolation system. AUTOTESTCON 2008; 2008 September 8-11; Salt Lake City, UT. Piscataway, NJ: IEEE Press; 2008. p. 37-40.

12. Westervelt K. Root cause analysis of BIT false alarms. 2004 IEEE Aerospace Conference Proceedings; 2004 March 6-13; Big Sky, Montana. Piscataway, NJ: IEEE Press; 2004. p. 3782-90.

13. Rahman A, McCulloch J, Mamun Q. Prediction with uncertainty: a novel framework for analyzing sensor data streams. IEEE Sens J 2015;15(1):382-6.

14. Allen D. Probabilities associated with a built-in-test system, focus on false. AUTOTESTCON Conference, Systems Readiness Technology; 2003 September 22-25; Anaheim, CA. Piscataway, NJ: IEEE Press; 2003. p. 643-5.

15. Deng GQ, Qiu J, Liu GJ, Lv KH. Environmental stress level evaluation approach based on physical model and interval grey association degree. Chin J Aeronaut 2013;26(2):456-62.

16. Correcher A, Garcia E, Morant F. Intermittent failure diagnosis in industrial processes. IEEE International Symposium on Industrial Electronics; 2003 June 9-11; Rio de Janeiro, Brazil. Piscataway, NJ: IEEE Press; 2003. p. 723-8.

17. Harvey G, Louis S, Buska S. Micro-time stress measurement device development. Minneapolis: Honeywell, Inc.; 1994. Report No.: RL-TR-94-196.

18. Deng GQ. Research on key technology of intermittent faults diagnosis in extreme temperature environment [Dissertation]. Changsha: National University of Defense Technology; 2013 [Chinese].

19. Ma ZH, Li JG. Study on mechanism of the effect on products in the temperature's environment. Environ Technol 2006(3):15-20 [Chinese].

20. Pan J. Electrical connector vibration reliability modelling and evaluation [Dissertation]. Hangzhou: Zhejiang University; 2002 [Chinese].

21. Wang G, Qiu J, Liu GJ, Lv KH. Research on intermittent faults diagnosis with electromagnetic interference. 27th International Congress of Condition Monitoring and Diagnostic Engineering; 2014 September 16-18; Brisbane, Australia. London: Informa; 2014.

Wang Gang received his B.S. and M.S. degrees from the National University of Defense Technology (NUDT) in 2008 and 2010 respectively, both in mechanics engineering. He is now pursuing a Ph.D. degree in mechanics engineering at NUDT. His major research interests include design for testability (DFT), fault diagnosis, and false alarm reduction.

Qiu Jing received his B.S. degree in solid mechanics from Beihang University in 1985, and his M.S. degree in solid mechanics and Ph.D. degree in mechanics engineering from NUDT in 1988 and 1998 respectively. Since 1988, he has been carrying out teaching and research at NUDT. He became a professor in 1998. His research interests include design for testability, testability demonstration, fault diagnosis, and fault prognostics.

Liu Guanjun received his B.S. and Ph.D. degrees from NUDT in 1994 and 2000 respectively, both in mechanics engineering. Since 2000, he has been carrying out teaching and research at NUDT. He became a professor in 2011. His research interests include design for testability, testability demonstration, fault diagnosis, and fault prognostics.

Lyu Kehong received his B.E. degree from Xi'an Jiao Tong University in 2001 and his M.S. and Ph.D. degrees in mechatronic engineering from NUDT in 2003 and 2008 respectively. He worked as an associate professor at NUDT. His current research interests include prognostics and health management (PHM) technology, design for testability (DFT), mechanical signal processing, condition monitoring, and fault diagnosis.