

# **Design for Reliability**

**DfR Solutions Webinar** 



9000 Virginia Manor Rd Ste 290, Beltsville MD 20705 | 301-474-0607 | www.dfrsolutions.com

# What is Design for Reliability (DfR)?

- <u>DfR</u>: A process for ensuring the reliability of a product or system during the design stage, <u>before</u> a physical prototype is built
- Reliability: The measure of a product's ability to
  - ...perform the specified function
  - ...at the customer (in their use environment)
  - ...over the desired lifetime



# Why DfR? Get it Right the First Time!



P. Smith and D. Reinertsen, Developing Products in Half the Time (New York: Van Nostrand Reinhold, 1991), 4.


# Why DfR? Lower Cost of Quality!




Statement from a Major OEM of Server Architecture

Margins are so tight in this business that reliability plays a major role in our financial performance.

If a unit requires more than one service call during its lifetime, our profit margin has been effectively <u>eliminated</u>.

# **Reality of Design for Reliability (DfR)**

- Ensuring reliability of electronic designs is becoming increasingly difficult
  - Increasing complexity of electronic circuits
  - Increasing power requirements
  - Introduction of new component and material technologies
  - Introduction of less robust components
- Results in multiple potential drivers for failure





# **Reality (cont.)**

- Predicting reliability is becoming problematic
  - Standard MTBF calculations tend to be inaccurate
  - A physics-of-failure (PoF) approach can be time-intensive and is not always definitive (limited insight into performance during operating life)





#### **Issues with DfR**

- The process of Design for Reliability (DfR) is achieving a high profile in the electronics industry
  - DfR is even on Wikipedia!
  - Part of an overall Design for Excellence (DfX)
- Numerous organizations now offer DfR training tools (courses, books, etc.)
  - Response to market demand





### **Issues with Standard DfR Tools**

- Too broad in focus (not electronics focused)
- Too much emphasis on techniques (e.g., FMEA and FTA) and not answers
  - FMEA/FTA rarely identify DfR issues because of limited focus on the failure mechanism
- Incorporation of HALT and failure analysis (HALT is test, not DfR; failure analysis is too late)
  - Frustration with 'test-in reliability', even HALT, has been part of the recent focus on DfR

# **Issues with DfR Tools (e.g., HALT)**

- Highly Accelerated Life Testing (HALT)
- Based on sound fundamental engineering
  - 1. Know your margins!
  - 2. Not everything can be predicted (how to predict the time required for a screw to come loose?)

#### • However...

- There are no failure modes in HALT that cannot be addressed through strong DfR activities
- Best in Class organizations will actually predict HALT failure modes <u>before</u> HALT

### How Does DfR (Solutions) do DfR?

- Start at Concept / Block-Diagram Stage
  - Specifications are key
- Focus on electrical and software reliability
- Part selection
  - Derating and uprating
- DfM cannot be separated from DfR
- Wearout mechanisms and physics of failure
  - Predicting degradation in today's electronics



# **Concept / Block Diagram**

- Failure to capture and understand product specifications at this stage lays the groundwork for mistakes at schematic and layout
  - Reliability expectations
  - Use environment
  - Dimensional constraints



# **Reliability Goals**

- Typical reliability metrics: Desired Lifetime / Product Performance
- Desired lifetime
  - Defined as when the customer will be satisfied
  - Should be actively used in development of part and product qualification

- Product performance
  - Returns during the warranty period
  - Survivability over lifetime at a set confidence level
  - Try to avoid MTBF or MTTF

## Why is Desired Lifetime Important?



### **Desired Lifetime (IC Wearout)**




#### **Desired Lifetime (Solder Wearout)**

- More silicon, less plastic (CSP, Stacked Die, etc.)
- Elimination of leads (DFN, QFN, BTC, etc.)




Ahmer Syed and WonJoon Kang, "Board Level Assembly and Reliability Considerations for QFN Type Packages," Amkor Technology.

# **Product Performance: Warranty Returns**

#### Consumer Electronics

- See the repair-rate table below

#### Low Volume, Non Hi-Rel

- 1 to 2%

#### Industrial Controls

- 500 to 2000 ppm (1<sup>st</sup> Year)
- Depends on complexity, production volumes, and risk sensitivity

#### Automotive

- 1 to 5% (Electrical, 1<sup>st</sup> Year)
- Can also be reported as problems per 100 vehicles
- Can also be reported as percent of revenue (1.2 to 4.5%)

| Product                                                  | Repair rate (%)<br>[First 3 Yrs] |
|----------------------------------------------------------|----------------------------------|
| Desktop PC                                               | 37                               |
| Laptop PC                                                | 33                               |
| Refrigerator: side-by-side (with icemaker and dispenser) | 28                               |
| Washing machine                                          | 22                               |
| Refrigerator: top- and bottom-freezer (with icemaker)    | 17                               |
| Projection TV                                            | 16                               |
| Vacuum cleaner (excluding belt replacement)              | 13                               |
| Dishwasher                                               | 13                               |
| Clothes dryer                                            | 13                               |
| Microwave oven (over-the-range)                          | 12                               |
| Electric range                                           | 11                               |
| Camcorder                                                | 8                                |
| Digital camera                                           | 8                                |
| Refrigerator: top- and bottom-freezer (without icemaker) | 8                                |
| TV: 30- to 36-inch                                       | 7                                |
| TV: 25- to 27-inch                                       | 5                                |

Source: Consumer Reports, 2006



# **Product Performance: Disk Drives**

- Your product cannot have a field performance better than its parts!
- Disk drives, microprocessors, fans, memory, power supplies are challenging

http://blog.backblaze.com/2013/11/12/how-long-do-disk-drives-last/



# **Product Performance: Survivability**

- Some companies set reliability goals based on survivability
  - Often bounded by confidence levels
  - Example: 95% reliability with 90% confidence over 15 years

#### Advantages

- Helps set bounds on test time and sample size
- Does not assume a failure rate behavior (decreasing, increasing, steady-state)
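
As a rough illustration of how a survivability goal bounds sample size, the sketch below uses the zero-failure (success-run) binomial relation n = ln(1 - C) / ln(R). This relation is an assumption introduced here, not a method named in the slides, and it presumes each unit is tested for the equivalent of one full lifetime with no failures allowed. The function name is hypothetical.

```python
import math


def success_run_sample_size(reliability: float, confidence: float) -> int:
    """Units that must survive a one-lifetime test with zero failures.

    Assumes the zero-failure binomial (success-run) relation:
        n = ln(1 - confidence) / ln(reliability)
    """
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Goal from the slide: 95% reliability with 90% confidence
n = success_run_sample_size(0.95, 0.90)
print(n)  # 45
```

Under these assumptions, tightening the goal to 99% reliability at the same confidence raises the requirement to 230 units, which is why the goal directly drives test cost.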

#### Disadvantages

- Can be re-interpreted through mean time to failure (MTTF) or mean time between failures (MTBF)

# **Limitations of MTTF/MTBF**

- MTBF/MTTF calculations tend to assume that failures are random in nature
  - Provides no motivation for failure avoidance
- Easy to manipulate numbers
  - Tweaks are made to reach desired MTBF
  - E.g., quality factors for each component are modified
- Often misinterpreted
  - 50K hour MTBF <u>does not mean</u> no failures in 50K hours
- Better fit towards logistics and procurement, not failure avoidance
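
To make the misinterpretation concrete, the sketch below evaluates survival probability under the constant-failure-rate (exponential) model that MTBF calculations typically assume. The exponential model is the assumption here, and the function name is hypothetical.

```python
import math


def survival_probability(hours: float, mtbf_hours: float) -> float:
    """Probability a unit is still working at `hours`.

    Assumes a constant failure rate, so R(t) = exp(-t / MTBF).
    """
    if hours < 0 or mtbf_hours <= 0:
        raise ValueError("hours must be >= 0 and mtbf_hours must be > 0")
    return math.exp(-hours / mtbf_hours)


# With a 50K hour MTBF, only about 37% of units survive to 50K hours
print(round(survival_probability(50_000, 50_000), 3))  # 0.368
```

In other words, under this model roughly 63% of the population has failed by the time the MTBF is reached.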

# **Identify Field Environment**

- Approach 1: Use of specifications
  - MIL-STD-810, MIL-HDBK-310, IPC-SM-785, Telcordia GR-3108, IEC 60721-3, etc.
  - Low cost and can be very comprehensive
  - Agreement throughout the industry
  - Major disadvantage: the specified environment is always milder or harsher than the actual one (by how much is unknown)
- Approach 2: Based on actual measurements
  - Determine average and realistic worst-case
  - Identify all failure-inducing loads
  - Include all environments



| Use category | Use T<sub>min</sub> (°C) | Use T<sub>max</sub> (°C) | Use ΔT<sup>(1)</sup> (°C) | Use t<sub>D</sub> (hrs) | Cycles/year | Typical years of service | Approx. acceptable failure risk (%) | Test T<sub>min</sub> (°C) | Test T<sub>max</sub> (°C) | Test ΔT<sup>(2)</sup> (°C) | Test t<sub>D</sub> (min) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1) Consumer | 0 | +60 | 35 | 12 | 365 | 1-3 | 1 | +25 | +100 | 75 | 15 |
| 2) Computers | +15 | +60 | 20 | 2 | 1460 | 5 | 0.1 | +25 | +100 | 75 | 15 |
| 3) Telecom | -40 | +85 | 35 | 12 | 365 | 7-20 | 0.01 | 0 | +100 | 100 | 15 |
| 4) Commercial aircraft | -55 | +95 | 20 | 12 | 365 | 20 | 0.001 | 0 | +100 | 100 | 15 |
| 5) Industrial & automotive passenger compartment | -55 | +95 | 20<br>& 40<br>& 60<br>& 80 | 12<br>12<br>12<br>12 | 185<br>100<br>60<br>20 | 10 | 0.1 | 0 | +100 | 100<br>& COLD<sup>(3)</sup> | 15 |
| 6) Military ground & ship | -55 | +95 | 40<br>& 60 | 12<br>12 | 100<br>265 | 10 | 0.1 | 0 | +100 | 100<br>& COLD<sup>(3)</sup> | 15 |
| 7) Space (LEO / GEO) | -55 | +95 | 3<br>to 100 | 1<br>12 | 8760<br>365 | 5-30 | 0.001 | 0 | +100 | 100<br>& COLD<sup>(3)</sup> | 15 |
| 8) Military avionics (a / b / c) | -55 | +95 | 40<br>60<br>80<br>& 20 | 2<br>2<br>2<br>1 | 365<br>365<br>365<br>365 | 10 | 0.01 | 0 | +100 | 100 | 15 |
| 9) Automotive under hood | -55 | +125 | 60<br>& 100<br>& 140 | 1<br>1<br>2 | 1000<br>300<br>40 | 5 | 0.1 | 0 | | & COLD<sup>(3)</sup> & LARGE ΔT<sup>(4)</sup> | |

Source: IPC-SM-785
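
One way to relate the use and test columns of a table like this is a simplified Coffin-Manson acceleration factor, AF = (ΔT_test / ΔT_field)^n. The sketch below is illustrative only: the exponent n = 2 is an assumed value (it depends on the solder alloy and failure mechanism), the dwell-time and peak-temperature terms of fuller models are ignored, and the function names are hypothetical.

```python
def coffin_manson_af(delta_t_test: float, delta_t_field: float,
                     exponent: float = 2.0) -> float:
    """Simplified thermal-cycling acceleration factor.

    AF = (dT_test / dT_field) ** exponent. The default exponent of 2.0
    is an assumption, not a value taken from the slides.
    """
    if delta_t_test <= 0 or delta_t_field <= 0:
        raise ValueError("temperature swings must be positive")
    return (delta_t_test / delta_t_field) ** exponent


def equivalent_test_cycles(field_cycles: float, af: float) -> float:
    """Test cycles that represent the given number of field cycles."""
    return field_cycles / af


# Consumer row: field swing 35 C, test swing 75 C, 365 cycles/year for 3 years
af = coffin_manson_af(75.0, 35.0)
print(round(af, 2))  # 4.59
print(round(equivalent_test_cycles(365 * 3, af)))
```

Because the result is very sensitive to the assumed exponent, an estimate like this is best treated as a planning number rather than a prediction.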

# **Drop / Mechanical Shock**




### Vibration





# **Failure Inducing Loads**

- Temperature Cycling
  - Tmax, Tmin, dwell, ramp times
- Sustained Temperature
  - T and exposure time
- Humidity
  - Controlled, condensation
- Corrosion
  - Salt, corrosive gases (Cl<sub>2</sub>, etc.)
- Power cycling
  - Duty cycles, power dissipation
- Electrical Loads
  - Voltage, current, current density
  - Static and transient
  - Electrical Noise
- Mechanical Bending (Static and Cyclic)
  - Board-level strain
- Random Vibration
  - PSD, exposure time, kurtosis
- Harmonic Vibration
  - G and frequency
- Mechanical shock
  - G, wave form, # of events



### **Field Environment: Temperatures in USA**

| Temperature     | Avg. U.S.<br>CLIM Data | Avg. U.S.<br>Weighted by Registration<br>(Source: Confidential) |             |             |
|-----------------|------------------------|-----------------------------------------------------------------|-------------|-------------|
| 95°F (35°C)     | 0.375%                 | 0.650%                                                          | 11% (948)   | 13% (1,140) |
| 105°F (40.56°C) | 0.087%                 | 0.050%                                                          | 2.3% (198)  | 3.8% (331)  |
| 115°F (46.11°C) | 0.008%                 | 0.001%                                                          | 0.02% (1.4) | 0.1% (9)    |

| Month            | Cycles/Year | Ramp  | Dwell | Max. Temp (°C) | Min. Temp. (°C) |
|------------------|-------------|-------|-------|----------------|-----------------|
| Jan.+Feb.+Dec.   | 90          | 6 hrs | 6 hrs | 20             | 5               |
| March+November   | 60          | 6 hrs | 6 hrs | 25             | 10              |
| April+October    | 60          | 6 hrs | 6 hrs | 30             | 15              |
| May+September    | 60          | 6 hrs | 6 hrs | 35             | 20              |
| June+July+August | 90          | 6 hrs | 6 hrs | 40             | 25              |



# **Field Environment: Closed Container Temp**



### **Field Environment: Electrical**

- Often very well defined in developed countries, but new markets can introduce surprises
  - China: Can have issues with grounding (connected to rebar?)
  - India: Numerous brownouts (several a day)
  - Mexico: Voltage surges



#### **Dimensions**

- Keep dimensions loose at this stage
  - Large number of hardware mistakes driven by arbitrary size constraints
  - Examples include poor interconnect strategies and poor choices in component selection
- Case study: Use of 0201 chip components
  - Tight dimensional requirements push designer towards wholesale placement of 0201 components
  - 0201 is not yet an appropriate technology for systems requiring reliability

- Result: Major issues at customers
- Use the Toyota approach

### What is the Toyota Approach?




#### Anti-V Model Product Development


# Toyota Approach

- Toyota's development engineers are 4X as productive as U.S. counterparts.
- Why?
  - Focus on learning as much as possible



#### • Western engineers

- Define several product concepts
- Select the one that has the most promise
- Draw up specifications and divide them into subsystems
- Subsystems are designed, built, and rolled up for system testing
- Failures? Rework the specs and the designs accordingly (non-optimized and confusing endeavor)



#### • Toyota engineers

- Efforts concentrated at lowest possible design level
- Thorough understanding of the technology of a subsystem so it can be used appropriately in future designs



### **Toyota Example: Radiators**

- <u>Traditional approach</u>: Design radiator for a specific vehicle based on mechanical specifications written for that vehicle
- Toyota considers a range of radiator solutions based on cooling capacities and the cooling demands of various engines that might be used.
  - How the radiator actually fits into a vehicle would be kept loose so that Toyota's knowledge of radiator technology could be used to create the optimum design
- Toyota's system is "test & design" rather than the traditional "design & test."
  - Toyota engineers test at the fundamental knowledge level so they don't have to test at the later, more expensive stages of design and prototyping

# **Electrical and Software Reliability**

- Electrical reliability is often overlooked by many reliability engineers because
  - They don't have electrical backgrounds
  - Political concerns (don't step on engineering toes)
- However, electrical design does not always assess electrical issues from a reliability standpoint
  - Classic electrical engineering, especially today, lives in a virtual world



# **Electrical and Software Reliability (Key Areas)**

- Power Stability
- Design for EMI/EMC
  - Don't worry about compliance; worry if it will work!
  - Before testing
- Design for EOS/ESD
  - Parts are increasingly ESD sensitive
  - Portability/mobility increases the potential for exposure to EOS/ESD events

- Buffering

### **ESD Sensitivity and Location**

- Use ESD Protection on all susceptible parts
  - Box or System I/O
    - ESD Rating < Class 2 HBM IEC (4000 V, 150 pF, 330 Ohm) MANDATORY
  - Internal Components (not exposed to outside connectors)
    - ESD Rating <= Class 1 ANSI (0-999 V) MANDATORY
    - ESD Rating < Class 2 ANSI (2000 V) WHEREVER POSSIBLE
- High Speed, RF and GaAs parts will be sensitive to ESD [Class 0 (<250 V) or Class 1A (<500 V)]
  - Place ESD sensitive components and traces to avoid locations where the board may be handled
  - Avoid coupled ESD events: do not route traces to ESD sensitive parts near lines connected to the outside world

- Install protective devices before ESD sensitive parts (Class 1 or lower)

# **Evaluate Potential ESD**

- If ESD sensitive parts are used in design, the circuitry connected to device pins should be evaluated
  - Ensure that it provides "attenuation" to prevent voltage in excess of the part's ESD rating from developing if the pin or connected traces are contacted during board handling or system assembly.
- Often the recommended circuit components for operation of the part will provide adequate ESD protection.
  - This should be verified by analysis or simulation and extra protection added as required to limit the voltage seen at the part.

- Assumptions for analysis/simulation
  - 2000 V, 1.5 kOhm, 100 pF for internal circuits
  - 4000 V, 330 Ohm, 150 pF for I/Os
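
A first-pass hand analysis of these two discharge models might look like the sketch below. It treats the source as an ideal charged capacitor behind a series resistor and estimates the voltage on a pin by simple charge sharing with the capacitance already on that node. Both simplifications are assumptions (no clamping devices, inductance, or rise-time effects), and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EsdModel:
    """Ideal RC discharge source: charged capacitor behind a series resistor."""
    voltage_v: float
    resistance_ohm: float
    capacitance_f: float

    def peak_current_a(self) -> float:
        """Peak current into a short circuit, I = V / R."""
        return self.voltage_v / self.resistance_ohm

    def stored_energy_j(self) -> float:
        """Energy stored in the source capacitor, E = C * V^2 / 2."""
        return 0.5 * self.capacitance_f * self.voltage_v ** 2

    def shared_voltage_v(self, node_capacitance_f: float) -> float:
        """Final node voltage after charge sharing with an uncharged node."""
        total = self.capacitance_f + node_capacitance_f
        return self.voltage_v * self.capacitance_f / total


# Analysis assumptions listed on the slide
INTERNAL = EsdModel(voltage_v=2000.0, resistance_ohm=1500.0, capacitance_f=100e-12)
IO = EsdModel(voltage_v=4000.0, resistance_ohm=330.0, capacitance_f=150e-12)

# Example: a pin with a 10 nF capacitor to ground (hypothetical value)
print(round(INTERNAL.peak_current_a(), 2))          # 1.33
print(round(INTERNAL.shared_voltage_v(10e-9), 1))   # 19.8
```

If the charge-sharing estimate lands well below the part's ESD rating, the existing circuit components may be adequate; otherwise a simulation or added protection is warranted.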



# **Component Selection**

- KIS: Keep it Simple
  - New component technology can be very attractive
  - Not always appropriate for high reliability embedded systems
  - Be conservative
  - BUT, don't be too conservative (old technology quickly becomes private labeled; i.e., 'lawful counterfeiting')
- Reality: Marketing hype FAR exceeds actual implementation
  - Component manufacturers typically use portable sales to boost numbers
  - <u>Claim</u>: We have built 100's of millions of these components without a single return!
  - <u>Actuality</u>: All sales were to two cell phone customers with lifetimes of 18 months

## **Component Selection (cont.)**

- Even when used by hi-rel companies, some modifications may have been made
  - <u>Example</u>: State-of-the-art crystal oscillator required specialized assembly to avoid failures one to three years later in the field
- Prior examples of where care should have been taken
  - New technologies: X5R dielectric, SiC diodes, etc.
  - New packaging: Quad flat pack no lead (QFN), 0201, etc.

#### **Derating: Component Ratings**

- Definition
  - A specification provided by component manufacturers that guides the user as to the appropriate range of stresses over which the component is guaranteed to function

#### Typical parameters

- Voltage
- Current
- Power
- Temperature

#### MSP430FG43x MIXED SIGNAL MICROCONTROLLER

SLAS380B - APRIL 2004 - REVISED JUNE 2007

absolute maximum ratings over operating free-air temperature (unless otherwise noted)<sup>†</sup>

| Parameter                                                   | Rating                            |
|-------------------------------------------------------------|-----------------------------------|
| Voltage applied at V<sub>CC</sub> to V<sub>SS</sub>         | –0.3 V to 4.1 V                   |
| Voltage applied to any pin (see Note)                       | –0.3 V to V<sub>CC</sub> + 0.3 V  |
| Diode current at any device terminal                        | ±2 mA                             |
| Storage temperature, T<sub>stg</sub> (unprogrammed device)  | –55°C to 150°C                    |
| Storage temperature, T<sub>stg</sub> (programmed device)    | –40°C to 85°C                     |

<sup>†</sup> Stresses beyond those listed under "absolute maximum ratings" may cause permanent damage to the device. These are stress ratings only, and functional operation of the device at these or any other conditions beyond those indicated under "recommended operating conditions" is not implied. Exposure to absolute-maximum-rated conditions for extended periods may affect device reliability.

NOTE: All voltages referenced to V<sub>SS</sub>. The JTAG fuse-blow voltage, V<sub>FB</sub> is allowed to exceed the absolute maximum rating. The voltage is applied to the TDI/TCLK pin when blowing the JTAG fuse.

#### recommended operating conditions

| Parameter | MIN | NOM | MAX | UNITS |
|-----------|-----|-----|-----|-------|
| Supply voltage during program execution (see Note 1), V<sub>CC</sub> (AV<sub>CC</sub> = DV<sub>CC1/2</sub> = V<sub>CC</sub>) | 1.8 |     | 3.6 | V     |
| Supply voltage during program execution, SVS enabled, PORON=1 (see Note 1 and Note 2), V<sub>CC</sub> (AV<sub>CC</sub> = DV<sub>CC1/2</sub> = V<sub>CC</sub>) | 2   |     | 3.6 | V     |
| Supply voltage during flash memory programming (see Note 1), V<sub>CC</sub> (AV<sub>CC</sub> = DV<sub>CC1/2</sub> = V<sub>CC</sub>) | 2.7 |     | 3.6 | V     |
| Supply voltage, V<sub>SS</sub> (AV<sub>SS</sub> = DV<sub>SS1/2</sub> = V<sub>SS</sub>) | 0   |     | 0   | V     |
| Operating free-air temperature range, T<sub>A</sub> | -40 |     | 85  | °C    |

#### IGBT MODULE (U series) 600V / 100A / PIM



Features

- Low V<sub>CE(sat)</sub>
- Compact Package
- P.C. Board Mount Module
- Converter Diode Bridge, Dynamic Brake Circuit

#### Maximum ratings and characteristics

| Section   | Item                                                   | Symbol               | Condition           | Rating      | Unit             |
|-----------|--------------------------------------------------------|----------------------|---------------------|-------------|------------------|
| Inverter  | Collector-Emitter voltage                              | V<sub>CES</sub>      |                     | 600         | V                |
| Inverter  | Gate-Emitter voltage                                   | V<sub>GES</sub>      |                     | ±20         | V                |
| Inverter  | Collector current                                      | I<sub>C</sub>        | Continuous          | 100         | A                |
| Inverter  | Collector current                                      | I<sub>CP</sub>       | 1 ms                | 200         | A                |
| Inverter  | Collector current                                      | -I<sub>C</sub>       |                     | 100         | A                |
| Inverter  | Collector current                                      | -I<sub>C</sub> pulse | 1 ms                | 200         | A                |
| Inverter  | Collector power dissipation                            | P<sub>C</sub>        | 1 device            | 378         | W                |
| Brake     | Collector-Emitter voltage                              | V<sub>CES</sub>      |                     | 600         | V                |
| Brake     | Gate-Emitter voltage                                   | V<sub>GES</sub>      |                     | ±20         | V                |
| Brake     | Collector current                                      | I<sub>C</sub>        | Continuous          | 50          | A                |
| Brake     | Collector current                                      | I<sub>CP</sub>       | 1 ms                | 100         | A                |
| Brake     | Collector power dissipation                            | P<sub>C</sub>        | 1 device            | 187         | W                |
| Brake     | Repetitive peak reverse voltage                        | V<sub>RRM</sub>      |                     | 600         | V                |
| Converter | Repetitive peak reverse voltage                        | V<sub>RRM</sub>      |                     | 800         | V                |
| Converter | Average output current                                 | I<sub>O</sub>        | 50Hz/60Hz sine wave | 100         | A                |
| Converter | Surge current (Non-Repetitive)                         | I<sub>FSM</sub>      | Tj=150°C, 10ms      | 700         | A                |
| Converter | I<sup>2</sup>t (Non-Repetitive)                        | I<sup>2</sup>t       | half sine wave      | 2450        | A<sup>2</sup>s   |
|           | Operating junction temperature                         | T<sub>j</sub>        |                     | +150        | °C               |
|           | Storage temperature                                    | T<sub>stg</sub>      |                     | -40 to +125 | °C               |
|           | Isolation voltage between terminal and copper base *2  | V<sub>iso</sub>      | AC: 1 minute        | AC 2500     | V                |
|           | Isolation voltage between thermistor and others *3     |                      |                     | AC 2500     | V                |
|           | Mounting screw torque                                  |                      |                     | 3.5 *1      | N·m              |

## Derating

- Derating is the practice of limiting stress on electronic parts to levels below the manufacturer's specified ratings

- Guidelines can vary based upon environment ("severe, protected, normal" or "space, aircraft, ground")
- One of the most common design for reliability (DfR) methods
- Goals of derating
  - Maintain critical parameters during operation (i.e., functionality)
  - Provide a margin of safety from deviant lots
  - Achieve desired operating life (i.e., reliability)
- Sources of derating guidelines
  - Governmental organizations and 3<sup>rd</sup> parties
  - OEMs
  - Component manufacturers
- Derating is assessed through component stress analysis



## **Derating Guidelines (Examples)**

| Part Type                   | Derating parameters            | Severe                  | Benign                  |
|-----------------------------|--------------------------------|-------------------------|-------------------------|
| Aluminium electrolytic caps | Voltage (% max rated)          | 70%                     | 80%                     |
|                             | Temperature (°C)               | T <sub>max</sub> - 20°C | T <sub>max</sub> - 20°C |
| Ceramic capacitors          | Voltage (% max rated)          | 60%                     | 70%                     |
|                             | Temperature (°C)               | T <sub>max</sub> - 10°C | T <sub>max</sub> - 10°C |
| Solid tantalum capacitors   | Voltage (% max rated)          | 70%                     | 80%                     |
|                             | Temperature (°C)               | T <sub>max</sub> - 20°C | T <sub>max</sub> - 20°C |
|                             | Reverse voltage (% max fwd)    | 2%                      | 2%                      |
| Signal diodes               | Forward current (% max rated)  | 90%                     | <100%                   |
|                             | Reverse voltage (% max rated)  | 70%                     | 80%                     |
|                             | Max. junction temperature      | 95°C                    | 115°C                   |
| Chip resistors              | Power dissipation (% max rated)| 50%                     | 70%                     |
| Digital MOS and bipolar ICs | Fanout (% max rated)           | 90%                     | <100%                   |
|                             | Frequency (% max rated)        | 90%                     | <100%                   |
|                             | Output current (% max rated)   | 90%                     | <100%                   |
|                             | Max. junction temperature      | 95°C                    | 115°C                   |
| Linear MOS and bipolar ICs  | Frequency (% max rated)        | 90%                     | <100%                   |
|                             | Output current (% max rated)   | 90%                     | <100%                   |
|                             | Max. junction temperature      | 95°C                    | 115°C                   |

## **Criticality of Component Stress Analysis**

- Failure to perform component stress analysis can result in higher warranty costs, potential recalls
  - Eventual costs can be in the millions of dollars
- Perspective from Chief Technologist at major Original Design Manufacturer (ODM)

"...based on our experience, we believe a significant number of field returns, and the majority of no-trouble-founds (NTFs), are related to overstressed components."

### **Derating Failures**

- Where are the derating mistakes?
- Problem #1: Designers do not derate
  - Failure to perform component stress analysis
- Problem #2: Derating does not have a practical or scientific foundation
  - Extraordinary measures are taken when inappropriate
  - Derating is excessive: 'The more, the better' rule



#### **Failure to Derate: Common Examples**

- Analog / Power Designs
  - Derating is typically overlooked during transient events
  - Especially turn-on, turn-off
- Digital
  - Excessive number of components and connections tends to limit attempts to perform component stress analysis



## **The Foundation of Derating**

- To be effective, derating must have a practical and scientific foundation
  - Problem: Manufacturer's ratings are not always based on a practical and scientific foundation
- Manufacturers' viewpoint
  - Ratings are based on specific design rules based on materials, process, and reliability testing
- The reality
  - Ratings can be driven by tradition and market forces as much as science
- Best practice
  - Based on data from field returns
  - Based on test to failure qualification (especially for new suppliers)



## **Manufacturer's Derating (example)**

- Tantalum capacitor
  - MnO<sub>2</sub> cathode
- Derating based on desired failure rate
  - 10 ppm at startup
- Why not 10 ppm failure rate at rated voltage?
- Was 0.3% failure rate acceptable?
  - 50% derating is a legacy

| Metric                                                | MnO<sub>2</sub> (27 batches) |
|-------------------------------------------------------|------------------------------|
| % V<sub>Rated</sub> giving a 100 ppm failure rate     | 68%                          |
| Failure rate at 50% V<sub>Rated</sub> (ppm)           | 9                            |
| Failure rate at 80% V<sub>Rated</sub> (ppm)           | 458                          |
| Failure rate at 90% V<sub>Rated</sub> (ppm)           | 1,700                        |
| Failure rate at 100% V<sub>Rated</sub> (ppm)          | 2,943                        |

Courtesy of Kemet
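The table can be used to estimate the voltage derating that corresponds to a target failure rate. The sketch below is only an illustration: it assumes the failure rate varies log-linearly with applied voltage between tabulated points, which is an interpolation choice made here and not something stated by Kemet. The function name `voltage_for_target_ppm` is hypothetical.

```python
import math

# Failure rate (PPM) versus applied voltage (% of rated), copied from the table.
KEMET_MNO2_PPM = {50: 9, 80: 458, 90: 1700, 100: 2943}


def voltage_for_target_ppm(target_ppm, data=KEMET_MNO2_PPM):
    """Estimate the % of rated voltage that yields target_ppm.

    Assumption: log10(failure rate) is linear in voltage between
    neighbouring table points.
    """
    points = sorted(data.items())
    for (v_lo, fr_lo), (v_hi, fr_hi) in zip(points, points[1:]):
        if fr_lo <= target_ppm <= fr_hi:
            frac = (math.log10(target_ppm) - math.log10(fr_lo)) / (
                math.log10(fr_hi) - math.log10(fr_lo)
            )
            return v_lo + frac * (v_hi - v_lo)
    raise ValueError("target failure rate is outside the tabulated range")


v_100ppm = voltage_for_target_ppm(100)
print(f"100 PPM is reached near {v_100ppm:.0f}% of rated voltage")
```

Under this interpolation assumption the 100 PPM point lands at roughly 68% of rated voltage, which agrees with the first row of the table.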



#### **Derating Decision Tree**

- <u>Step 1</u>: Derating guidelines should be based on component performance, not ratings
  - Test to failure approach (i.e., HALT of components)
  - Quantifies life cycle cost tradeoffs
  - For smaller OEMs, limit this practice to critical components

#### **Derating Decision Tree (cont.)**

- <u>Step 2</u>: Derating guidelines should be based on recommendations from the component manufacturer
  - They built it; they should know it
  - Don't trust the manufacturer? Use someone else
- <u>Step 3</u>: Derating guidelines should be based on customer requirements
- <u>Step 4</u>: Derating guidelines should be based on an industry-accepted specification/standard

# Be flexible, not absolute

## Layout / Mechanicals

- The biggest mistake at this stage of the design?
  - Manufacturability
- Problem is getting better, but suppliers will always try to build what you send them
  - If it doesn't work, rework!
  - Even some design for manufacturability (DfM) is limited; major problems are not always addressed



#### DfM

#### • Definition

• The process of ensuring a design can be consistently manufactured by the designated supply chain with a minimum number of defects

#### • Requirements

- An understanding of best practices (what fails during manufacturing?)
- An understanding of the limitations of the supply chain (you can't make a silk purse out of a sow's ear)



## **DfM Failures**

• DfM is often overlooked in the design process

#### • Reasons

- Design team often has poor insight into supply chain (reverse auction, anyone?)
- OEM requests no feedback on DfM from supply chain
- DfM feedback consists of standard rule checks (no insight)
- DfM activities at the OEM are not standardized or distributed

## **DfM Checklist**

#### • Baseline

- Your design matches their capabilities (75% 'sweet spot')
- Design is transferable
- Bare Board
  - Trace width and spacings
  - Laminate material
  - Symmetry of stackup
  - Complexity of via connections
  - Incorporation of new materials (embedded passives)
  - Single-sided vs. double-sided
- System
  - Blind connections
  - Z dimension limitations

#### • Assembly

- Elimination of hand soldering or wave soldering when possible
- Proximity of components to flex points
- Component spacing
- Size of components and complexity of packaging
- Orientation of components to wave solder
- Shadowing during wave solder
- Appropriate dimensions and spacings for PTHs and bond pads
- Attachment methods
- Moisture sensitivity level (MSL)



#### **Designing for Defects**

| Solder Process | Standard (DPMO) | Best in Class (DPMO) |
|----------------|-----------------|----------------------|
| Hand           | 5000            | N/A                  |
| Wave           | 500             | 20 - 100             |
| Reflow         | 50              | <10                  |

DPMO: defects per million opportunities

 Designs that avoid manual soldering operations reduce defects
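To see how the DPMO figures translate into board-level defects, the sketch below multiplies joint counts by the "Standard" DPMO values from the table. The joint counts are hypothetical, and the first-pass yield uses a Poisson approximation (independent defects), which is an assumption made here for illustration.

```python
import math

# "Standard" defects per million opportunities, from the table above.
STANDARD_DPMO = {"hand": 5000, "wave": 500, "reflow": 50}


def expected_defects(joint_counts, dpmo=STANDARD_DPMO):
    """Expected solder defects per board for {process: number_of_joints}."""
    return sum(n * dpmo[process] / 1e6 for process, n in joint_counts.items())


def first_pass_yield(joint_counts, dpmo=STANDARD_DPMO):
    """Probability of zero defects, assuming independent (Poisson) defects."""
    return math.exp(-expected_defects(joint_counts, dpmo))


# Hypothetical board: 2000 reflowed joints plus 40 hand-soldered joints,
# compared with the same 2040 joints all reflowed.
mixed = {"reflow": 2000, "hand": 40}
all_reflow = {"reflow": 2040}

print(f"mixed:      {expected_defects(mixed):.3f} defects/board")
print(f"all reflow: {expected_defects(all_reflow):.3f} defects/board")
```

In this hypothetical case the 40 hand-soldered joints contribute twice as many expected defects as the 2000 reflowed joints.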


## **DfM Example: Flex Cracking of Ceramic Caps**

- Due to excessive flexure of the board
- Occurrence
  - Depaneling
  - Handling (i.e., placement into a test jig)
  - Insertion (i.e., mounting insertion-mount connectors or daughter cards)
  - Attachment of board to other structures (plates, covers, heatsinks, etc.)







#### Flex Cracking (Case Studies)









## Flex Cracking (cont.)

- Drivers
  - Distance from flex point
  - Orientation
  - Length (most common at 1206 and above; observed in 0603)

- Solutions
  - Avoid case sizes greater than 1206
  - Maintain 30-60 mil spacing from flex point
  - Reorient parallel to flex point
  - Replace with Flexicap (Syfer) or Soft Termination (AVX)
  - Reduce bond pad width to 80 to 100% of capacitor width
  - Transition to smaller case size
  - Measure board-level strain (maintain below 750 microstrain)
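Several of these guidelines are numeric and can be screened automatically. The following is a minimal sketch, not a Sherlock feature: the `CeramicCap` record and `flex_crack_warnings` function are hypothetical, and only three of the rules (case size, pad width ratio, board strain) are encoded. A 1206 case is taken as 120 mil long.

```python
from dataclasses import dataclass

MAX_STRAIN_MICROSTRAIN = 750    # "maintain below 750 microstrain"
MAX_CASE_LENGTH_MIL = 120       # 1206 case size is 120 mil x 60 mil
PAD_WIDTH_RATIO = (0.80, 1.00)  # bond pad width as a fraction of cap width


@dataclass
class CeramicCap:
    """Hypothetical record for one ceramic capacitor placement."""
    ref: str
    length_mil: float
    width_mil: float
    pad_width_mil: float
    board_strain_microstrain: float


def flex_crack_warnings(cap):
    """Return a list of guideline violations for one capacitor."""
    warnings = []
    if cap.length_mil > MAX_CASE_LENGTH_MIL:
        warnings.append(f"{cap.ref}: case size larger than 1206")
    ratio = cap.pad_width_mil / cap.width_mil
    if not PAD_WIDTH_RATIO[0] <= ratio <= PAD_WIDTH_RATIO[1]:
        warnings.append(f"{cap.ref}: pad width is {ratio:.0%} of cap width")
    if cap.board_strain_microstrain >= MAX_STRAIN_MICROSTRAIN:
        warnings.append(f"{cap.ref}: board strain at or above 750 microstrain")
    return warnings


risky = CeramicCap("C12", 180, 120, 130, 900)  # 1812 case near a flex point
safe = CeramicCap("C3", 80, 50, 45, 400)       # 0805 case, low strain

for cap in (risky, safe):
    print(cap.ref, flex_crack_warnings(cap) or "no warnings")
```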



#### **Avoiding Excessive Flexure**

- Every design should be evaluated through every post-assembly process for excessive flexure
- This goal was until recently unobtainable because of cost and time constraints
- With Sherlock's ICT module this analysis can now take minutes (no more excuses)



## DfM Example (Plated Through Hole vs. Microvia)

- What should be the minimum diameter of a PTH in your design?
- What should be the maximum aspect ratio (PCB Thickness / PTH Diameter)?
- When should you switch to microvias?
- Answer: Depends!
  - Supplier
  - Reliability needs




#### **PTH Diameter**

- Data from 26 board shops
  - Medium to high complexity
  - 62 to 125 mil thick
  - 6 to 24 layer

- Results
  - Yield loss after worst-case assembly
  - Six simulated Pb-free reflows

| Process Attribute                                          | Hole/land<br>(mils) | Count | Min  | Q1   | Median | Q3   | Max   |
|------------------------------------------------------------|---------------------|-------|------|------|--------|------|-------|
| Yield Loss from Assembly Simulation (%)<br>Threshold: Open | 8 / 18              | 6     | 0.00 | 0.00 | 0.31   | 3.24 | 17.16 |
|                                                            | 10 / 20             | 15    | 0.00 | 0.00 | 0.00   | 1.13 | 4.60  |
|                                                            | 12 / 22             | 26    | 0.00 | 0.00 | 0.00   | 0.00 | 5.23  |
|                                                            | 13.5 / 23.5         | 26    | 0.00 | 0.00 | 0.00   | 0.00 | 4.09  |
|                                                            | 14.5 / 24.5         | 19    | 0.00 | 0.00 | 0.00   | 0.00 | 0.00  |
|                                                            | 16 / 26             | 11    | 0.00 | 0.00 | 0.00   | 0.00 | 0.00  |

#### Yield loss can result in escapes to the customer!

## Are Microvias more reliable than PTHs?

- Depends!!
- Quality
  - Some fabricators have no problems
  - Some have more problems with microvias
  - Some have more problems with PTHs
  - Some have problems with both
- Reliability
  - A well-built microvia is more robust than a well-built PTH


## PTH vs. Microvia

#### **PTH Quality**

| Process Attribute                         | Hole/land<br>(mils) | Count | Min | Q1 | Median | Q3  | Max  |
|-------------------------------------------|---------------------|-------|-----|----|--------|-----|------|
| Defect Density (Defects per Million Vias) | 8 / 18              | 6     | 25  | 60 | 177    | 380 | 737  |
|                                           | 10 / 20             | 15    | 0   | 15 | 44     | 178 | 2947 |
|                                           | 12 / 22             | 26    | 0   | 0  | 6      | 30  | 1013 |
|                                           | 13.5 / 23.5         | 26    | 0   | 0  | 0      | 27  | 512  |
|                                           | 14.5 / 24.5         | 19    | 0   | 0  | 0      | 17  | 173  |
|                                           | 16 / 26             | 11    | 0   | 0  | 0      | 0   | 44   |

#### **Microvia Quality**

| Process<br>Attribute                         | Annular Ring<br>(mils)  | A    | B    | C     | D  | E  | F   | G  |
|----------------------------------------------|-------------------------|------|------|-------|----|----|-----|----|
| Defect Density<br>(Defects per Million Vias) | 2/8                     | 7384 | 8007 | 26598 | 68 | 61 | 598 | 81 |
|                                              | 3/9                     | 5527 | 2558 | 1735  | 38 | 8  | 76  | 24 |
|                                              | 4/10                    | 2370 | 1187 | 17    | 23 | 0  | 53  | 0  |
|                                              | 5/11                    | 2092 | 372  | 0     | 15 | 0  | 91  | 32 |



## **Summary (PTH and Microvias)**

- The capability of the PCB industry with regard to hole diameter tends to segment
  - Very high yield (>13.5 mil)
  - High yield (10 to 13.5 mil)
  - Lower yield (<10 mil)
- If 8 mil drill diameter is required
  - Consider using PCQR<sup>2</sup> to identify a capable supplier
  - Consider using interconnect stress test (IST) coupons to ensure quality for each build
  - Consider transitioning to microvias (6 mil diameter)
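The yield segments in this summary, together with the aspect ratio definition given earlier (PCB thickness / PTH diameter), can be captured in a small helper for early design checks. This is a sketch only: the function names are hypothetical, and treating exactly 13.5 mil as "high yield" is a boundary choice made here.

```python
def yield_segment(drill_diameter_mil):
    """Map a PTH drill diameter (mils) to the yield segment in the summary."""
    if drill_diameter_mil > 13.5:
        return "very high yield"
    if drill_diameter_mil >= 10:
        return "high yield"
    return "lower yield"


def aspect_ratio(board_thickness_mil, drill_diameter_mil):
    """Aspect ratio as defined on the slide: PCB thickness / PTH diameter."""
    if drill_diameter_mil <= 0:
        raise ValueError("drill diameter must be positive")
    return board_thickness_mil / drill_diameter_mil


# An 8 mil drill in the thinnest and thickest boards from the study.
for thickness in (62, 125):
    print(thickness, "mil board:", aspect_ratio(thickness, 8),
          yield_segment(8))
```

An 8 mil hole therefore spans aspect ratios from 7.75 (62 mil board) to about 15.6 (125 mil board), which helps explain why the supplier and board thickness both matter.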

## **Physics of Failure**

- Increasingly, companies need powerful algorithms and tools to accurately predict the probability of failure over the lifetime of the product
- Even cell phones, with power amplifiers with high power cycles and dissipation, can experience wearout in three years



#### **Drivers for Thermo-Mechanical Failures**

 Knowing the critical drivers for solder joint fatigue, we can develop predictive models and design rules




#### **Predictive Models – Physics of Failure (PoF)**

- Modified Engelmaier for Pb-free Solder (SAC305)
  - Semi-empirical analytical approach
  - Energy based fatigue
- Determine the strain range ( $\Delta\gamma$ )

$$\Delta \gamma = C \frac{L_D}{h_s} \Delta \alpha \Delta T$$

- C is a correction factor that is a function of dwell time and temperature, $L_D$ is <u>diagonal distance</u>, $\Delta\alpha$ is the difference in coefficient of thermal expansion (<u>CTE</u>) between component and board, $\Delta T$ is the temperature cycle range, and $h_s$ is <u>solder joint height</u>
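As a worked illustration of the strain range equation, the sketch below evaluates it for one set of inputs. All numeric values are hypothetical placeholders, including the correction factor C, whose dependence on dwell time and temperature is not given on the slide.

```python
def strain_range(c, l_d, h_s, delta_alpha, delta_t):
    """Strain range: C * (L_D / h_s) * delta_alpha * delta_T.

    l_d and h_s must share a length unit; delta_alpha is in 1/degC and
    delta_t in degC, so the result is dimensionless.
    """
    if h_s <= 0:
        raise ValueError("solder joint height must be positive")
    return c * (l_d / h_s) * delta_alpha * delta_t


# Hypothetical inputs (not from the source):
delta_gamma = strain_range(
    c=0.5,               # placeholder correction factor
    l_d=10.0,            # diagonal distance, mm
    h_s=0.1,             # solder joint height, mm
    delta_alpha=10e-6,   # CTE mismatch, 1/degC
    delta_t=100.0,       # temperature cycle range, degC
)
print(f"strain range = {delta_gamma:.3f}")
```

The equation makes the design levers explicit: strain grows with component size and CTE mismatch, and falls as the solder joint gets taller.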


#### **Predictive Models – Physics of Failure (PoF)(cont.)**

• Determine the shear force applied to the solder joint

$$\left(\alpha_2 - \alpha_1\right) \cdot \Delta T \cdot L = F \cdot \left(\frac{L}{E_1 A_1} + \frac{L}{E_2 A_2} + \frac{h_s}{A_s G_s} + \frac{h_c}{A_c G_c} + \left(\frac{2 - \nu}{9 \cdot G_b a}\right)\right)$$

- F is shear force, L is **length**, E is **elastic modulus**, A is the area, h is thickness, G is shear modulus, and a is edge length of bond pad
- Subscripts: 1 is <u>component</u>, 2 is <u>board</u>, s is solder joint, c is bond pad, and b is board

- Takes into consideration foundation stiffness and both shear and axial loads
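Because the shear force equation is linear in F, it can be rearranged as the free thermal displacement divided by the sum of the compliance terms. The sketch below does exactly that. The symbol ν is not defined on the slide; it is assumed here to be Poisson's ratio of the board. All input values are hypothetical, in N, mm, and MPa.

```python
def shear_force(alpha_1, alpha_2, delta_t, length,
                e_1, a_1, e_2, a_2,
                h_s, a_s, g_s,
                h_c, a_c, g_c,
                nu, g_b, a):
    """Solve the slide's equation for F.

    Subscripts follow the slide: 1 component, 2 board, s solder joint,
    c bond pad, b board. nu is assumed to be the board's Poisson's ratio.
    """
    compliance = (
        length / (e_1 * a_1)           # component axial compliance
        + length / (e_2 * a_2)         # board axial compliance
        + h_s / (a_s * g_s)            # solder joint shear compliance
        + h_c / (a_c * g_c)            # bond pad shear compliance
        + (2 - nu) / (9 * g_b * a)     # board foundation term
    )
    return (alpha_2 - alpha_1) * delta_t * length / compliance


# Hypothetical inputs, for illustration only.
force = shear_force(
    alpha_1=7e-6, alpha_2=17e-6, delta_t=100.0, length=10.0,
    e_1=25000.0, a_1=20.0, e_2=20000.0, a_2=30.0,
    h_s=0.1, a_s=0.2, g_s=15000.0,
    h_c=0.035, a_c=0.2, g_c=45000.0,
    nu=0.2, g_b=8000.0, a=0.5,
)
print(f"shear force = {force:.1f} N")
```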

#### **Predictive Models – Physics of Failure (PoF)(cont.)**

Determine the strain energy dissipated by the solder joint

$$\Delta W = 0.5 \cdot \Delta \gamma \cdot \frac{F}{A_s}$$

Calculate cycles-to-failure (N<sub>50</sub>, written as $N_f$ in the equation) using energy-based fatigue models

$$N_f = (0.0019 \cdot \Delta W)^{-1}$$
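The last two equations chain together directly. The sketch below is a minimal illustration with hypothetical inputs; the slide does not state the units in which the 0.0019 coefficient was fitted, so ΔW must be expressed in whatever units that fit assumes.

```python
def strain_energy(delta_gamma, shear_force, solder_area):
    """Strain energy: 0.5 * strain range * (F / A_s)."""
    if solder_area <= 0:
        raise ValueError("solder area must be positive")
    return 0.5 * delta_gamma * shear_force / solder_area


def cycles_to_failure(delta_w):
    """Cycles to failure from the slide: N_f = (0.0019 * delta_W) ** -1."""
    if delta_w <= 0:
        raise ValueError("strain energy must be positive")
    return 1.0 / (0.0019 * delta_w)


# Hypothetical inputs; units must match the (unstated) fit of 0.0019.
delta_w = strain_energy(delta_gamma=0.02, shear_force=10.0, solder_area=0.5)
n_f = cycles_to_failure(delta_w)
print(f"delta_W = {delta_w:.3f}, N_f = {n_f:.0f} cycles")
```

Since life is inversely proportional to ΔW, halving the dissipated strain energy doubles the predicted cycles to failure.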


#### And It Works!

**BGA** Validation Graph



## **Thermo-Mechanical Design Rules Through Prediction**

More specific design rules require performing a higher level of analysis (especially for power cycling)



Thermal Analysis Results



#### **Design Rules Through Prediction (cont.)**



## **Benchmarking Different Materials**

#### **SnPb** Assembly

#### SAC305 Assembly




- Demonstrated to an avionics customer that transition to Pb-free would have a detrimental impact on product performance
  - Driven by severe use environment

## **Developing Accurate Accelerated Life Tests (ALT)**



- Lighting products customer was attempting to develop a product qualification plan
- Sherlock identified appropriate test time and test condition based on field environment and likely failure mechanism


# Mechanical


## **Predicting Mechanical Shock Failures**

- Currently no methodology for predicting number of shocks/drops to failure
  - Assessment is go/no-go
- Based on a critical board level strain
  - Varies based on component type and strain rate



Initially developed by Steinberg



## **Calculating Board Level Strain**

- Except for really simple structures, you need finite element analysis (FEA)
  - There are techniques that use a simple spring/mass approximation to predict the board deflection during a shock event
  - Spring/mass models assume masses connected by ideal weightless springs

- FEA simulations are usually transient dynamic
  - DfR (Sherlock) utilizes an implicit transient dynamic simulation (useful when solving linear/elastic)

## **Calculating Board Level Strain (cont.)**

- Shock pulse is transmitted through the mounting points into the board
- The resulting board strains are extracted from the FEA results and used to predict robustness under shock conditions





50G shock pulse results in 12 mm deflection (severe)

## **CPU Card with DC/DC Converter**









#### **Shock Failure Predictions**








## **Model Modification Shock**

- Two additional mounting points added mid-span
- Deflection drops from 12 mm to 1.65 mm









## **Model Modification Shock (cont.)**

#### Board mounted to a chassis plate



#### How to Mitigate Shock/Drop?

- Option One: Stop the board from bending!
  - Mount points, standoffs, epoxy bonding, thicker board, etc.
- Option Two: Give your part flexibility
  - Flexible terminations on ceramic capacitors
- Option Three: Strengthen your part (BGA / CSP)
  - Corner Staking
  - Edge Bonding
  - Underfill

## **Shock/Drop and Corner Staking**



## Shock/Drop and Underfill



#### When Does Vibration Occur?

- Primarily affiliated with transportation
  - Shipping (very short part of the life cycle)
  - Automotive, trains, avionics, etc.
- Also a concern with rotating machinery (motors)
  - Transportation, appliances, HVAC, pipelines
- The two environments produce two very different forms of vibration
  - Harmonic (sinusoidal) and Random

## **In-Plane Component Vibration**






#### **Predicting Vibration Failures (Steinberg)**

- The board displacement is modeled as a single degree of freedom system (spring, mass) using an estimated (or measured) natural frequency
  - Allows for calculation of maximum deflection (Z<sub>0</sub>)

$$Z_0 = \frac{9.8 \times 3\sqrt{\frac{\pi}{2} \cdot \text{PSD} \cdot f_n \cdot Q}}{f_n^2} \quad \text{(random vibration)}$$

- Variables
  - PSD is the power spectral density (g<sup>2</sup>/Hz)
  - f<sub>n</sub> is the natural frequency of the CCA
  - G<sub>in</sub> is the acceleration in g
  - Q is transmissibility (assumed to be square root of natural frequency)
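The random-vibration deflection can be evaluated in a few lines. The sketch below follows the equation as written, with Q defaulting to the square root of the natural frequency. Two points are assumptions rather than slide content: the result is taken to be in inches (as in Steinberg's text, where the 9.8 constant arises), and the harmonic form `9.8 * G_in * Q / f_n**2`, which would use the G<sub>in</sub> variable, is Steinberg's textbook expression and does not appear on the slide.

```python
import math


def transmissibility(f_n):
    """Q approximated as the square root of the natural frequency (Hz)."""
    return math.sqrt(f_n)


def max_deflection_random(psd, f_n, q=None):
    """Z_0 for random vibration (3-sigma), per the slide's equation.

    psd in g^2/Hz, f_n in Hz. Result assumed to be in inches.
    """
    q = transmissibility(f_n) if q is None else q
    g_rms = math.sqrt(math.pi / 2 * psd * f_n * q)
    return 9.8 * 3 * g_rms / f_n ** 2


def max_deflection_harmonic(g_in, f_n, q=None):
    """Z_0 for sinusoidal input: assumed textbook form, not on the slide."""
    q = transmissibility(f_n) if q is None else q
    return 9.8 * g_in * q / f_n ** 2


# Hypothetical CCA: 100 Hz natural frequency, 0.04 g^2/Hz input PSD.
z0_random = max_deflection_random(psd=0.04, f_n=100.0)
z0_harmonic = max_deflection_harmonic(g_in=2.0, f_n=100.0)
print(f"random: {z0_random:.4f}, harmonic: {z0_harmonic:.4f}")
```

Because deflection scales with the inverse square of natural frequency, stiffening the board (raising f<sub>n</sub>) is usually the most effective lever.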




Steinberg D.S. Vibration analysis for electronic equipment. John Wiley & Sons, 2000.

#### **Predicting Vibration Failures (cont.)**

- Calculate critical displacement
  - This is the displacement value at which the component can survive 10 million (harmonic) or 20 million (random) cycles

$$Z_c = \frac{0.00022\,B}{c\,h\,r\sqrt{L}}$$

- Variables
  - B is length of PCB parallel to component
  - c is a component packaging constant
    - 1 to 2.25
  - h is PCB thickness
  - r is a relative position factor
    - 1.0 when component at center of PCB
  - L is component length
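A short sketch of the critical displacement calculation follows. The dimensions are hypothetical, and inches are assumed as the unit for B, h, L, and the result, consistent with the 0.00022 constant in Steinberg's text; the slide itself does not state units.

```python
import math


def critical_displacement(b, c, h, r, l):
    """Z_c = 0.00022 * B / (c * h * r * sqrt(L)).

    b: PCB length parallel to the component
    c: component packaging constant (1 to 2.25)
    h: PCB thickness
    r: relative position factor (1.0 at the center of the PCB)
    l: component length
    Lengths are assumed to be in inches.
    """
    if min(b, c, h, r, l) <= 0:
        raise ValueError("all inputs must be positive")
    return 0.00022 * b / (c * h * r * math.sqrt(l))


# Hypothetical case: 8 in board, 62 mil thick, 1 in component at center.
z_c = critical_displacement(b=8.0, c=1.0, h=0.062, r=1.0, l=1.0)
print(f"Z_c = {z_c:.4f}")
```

Larger packaging constants and longer components both reduce the allowable displacement.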


#### **Predicting Vibration Failures (cont.)**

- Life calculation
  - N<sub>c</sub> is 10 million (harmonic) or 20 million (random) cycles

$$N_0 = N_c \left(\frac{Z_c}{Z_0}\right)^{6.4}$$
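The life equation is highly sensitive to the displacement ratio because of the 6.4 exponent. The sketch below, with a hypothetical function name and inputs, shows the effect of doubling the actual deflection relative to the critical value.

```python
def vibration_life_cycles(z_c, z_0, n_c):
    """N_0 = N_c * (Z_c / Z_0) ** 6.4, as given on the slide.

    z_c: critical displacement, z_0: actual maximum displacement
    (same units), n_c: reference cycle count (10e6 or 20e6).
    """
    if z_c <= 0 or z_0 <= 0:
        raise ValueError("displacements must be positive")
    return n_c * (z_c / z_0) ** 6.4


N_C_RANDOM = 20e6  # reference cycles for random vibration

at_limit = vibration_life_cycles(z_c=0.02, z_0=0.02, n_c=N_C_RANDOM)
doubled = vibration_life_cycles(z_c=0.02, z_0=0.04, n_c=N_C_RANDOM)
print(f"at limit: {at_limit:.3g} cycles, 2x deflection: {doubled:.3g} cycles")
```

Doubling the deflection cuts predicted life by a factor of 2<sup>6.4</sup>, roughly 84.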

- Several assumptions
  - CCA is simply supported on all four edges
    - More realistic support conditions, such as standoffs or wedge locks, can result in lower or higher displacements
  - Chassis natural frequency differs from the CCA natural frequency by at least a factor of two (one octave)
    - Prevents coupling
  - Does not consider printed circuit board bending (components can have zero deflection but still be subjected to large amounts of bending)


#### **FEA Based Vibration Predictions**

 Finite Element Analysis can be used to capture more complex geometries, loadings and boundary conditions



## **FEA Based Vibration Predictions (cont.)**



#### **FEA Modeling Loads**



- Loading can be applied to the model directly from the specification
- Vibration is applied to the structure through the standoffs/mount points



#### **FEA Vibration Simulation**

- Determining the response of the structure to a vibration load is commonly done using a modal dynamic analysis
  - A modal analysis must be performed before conducting this analysis
    - Calculates the stiffness and mass matrices
    - Determines the eigenvalues (which give the natural frequencies) and eigenmodes (mode shapes)
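To make the modal analysis step concrete, the sketch below solves the eigenvalue problem det(K − λM) = 0 for a two-degree-of-freedom spring/mass chain, where λ = ω². This is a toy stand-in for what an FEA solver does on far larger matrices; the chain layout (ground, k1, m1, k2, m2) and the function names are illustrative assumptions.

```python
import math


def two_dof_eigenvalues(m1, m2, k1, k2):
    """Eigenvalues (omega^2) for the chain ground-k1-m1-k2-m2.

    M = diag(m1, m2), K = [[k1 + k2, -k2], [-k2, k2]].
    det(K - lam * M) = 0 expands to a quadratic in lam:
    m1*m2*lam^2 - (m1*k2 + m2*(k1 + k2))*lam + k1*k2 = 0
    """
    a = m1 * m2
    b = -(m1 * k2 + m2 * (k1 + k2))
    c = k1 * k2
    root = math.sqrt(b * b - 4 * a * c)
    return [(-b - root) / (2 * a), (-b + root) / (2 * a)]


def natural_frequencies_hz(m1, m2, k1, k2):
    """Natural frequencies in Hz: f = sqrt(lam) / (2 * pi)."""
    return [math.sqrt(lam) / (2 * math.pi)
            for lam in two_dof_eigenvalues(m1, m2, k1, k2)]


# Unit masses and stiffnesses give the classic (3 -/+ sqrt(5)) / 2 pair.
lams = two_dof_eigenvalues(1.0, 1.0, 1.0, 1.0)
print("eigenvalues:", [round(lam, 4) for lam in lams])
print("frequencies (Hz):", natural_frequencies_hz(1.0, 1.0, 1.0, 1.0))
```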









#### Summary of Best in Class DfR

- Step 1: Don't paint yourself into a corner too early in the design process
- Step 2: Be aware of ALL requirements
- Step 3: Try to perform concurrent engineering
- Step 4: Use a design checklist (don't rely on tests to develop a robust design)
  - Part selection
  - Derating
  - Power Stability
  - ESD
  - EMI / EMC
  - Design for Manufacturability / Testability / Environment
  - Components that wear out