When it comes to digital signals, I always feel dizzy examining their timing sequences. It took me a while to understand how the internal asynchronous clock is generated. In this post, I try to write down the core idea behind the pulse generator of an asynchronous SAR ADC.

First of all, the pulse generator (Fig.1) needs a dynamic latch comparator! In the reset phase, the differential outputs are both set low; in the comparison phase, once the regeneration is complete, one of the differential outputs goes high while the other stays low. This property is exploited by a succeeding NAND, which outputs a *valid* signal indicating the completion of the comparison.

Since the comparison starts at the negative edge of *clk*, the *sample* signal is used to trigger the first comparison of the latch. Once the comparison is finished, the *valid* signal generates a positive edge. With an OR function of *sample* and *valid*, followed by a deliberate delay, the clock of the latch can be generated.

Figure 7‑2 depicts the timing sequences of three critical signals (*sample*, *clk*, and *valid*). Four sources of delay keep the asynchronous machine running:

- td1 is composed of the delay of the OR gate, the charging path of VDC, and the succeeding inverter.
- td2 is composed of the regeneration time of the latch, the delay of the following inverter and the NAND gate.
- td3 is composed of the delay of the OR gate, the discharging path of VDC, and the succeeding inverter.
- td4 is composed of the reset time of the latch, the delay of the following inverter and the NAND gate.

The low level of the generated *clk* is the sum of td2 and td3, which varies as the input of the comparator changes. The high level of *clk* is the sum of td1 and td4, which is input-independent. One must pay attention to the DAC settling, which happens when *clk* goes high. The VDC can then be programmed/configured to ensure accurate DAC settling.
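To make the timing budget concrete, here is a minimal Python sketch that simply adds up the four delays. The delay values, the 10-bit resolution, the function name and the "one *clk* period per bit" simplification are all assumptions made for illustration, not numbers from a real design.

```python
def async_clk_timing(td1, td2, td3, td4, n_bits):
    """Return (t_low, t_high, t_conv) in seconds for the self-timed clk."""
    t_low = td2 + td3    # comparison phase; td2 depends on the comparator input
    t_high = td1 + td4   # reset phase; the DAC has to settle within this time
    t_conv = n_bits * (t_low + t_high)  # assumes one clk period per bit
    return t_low, t_high, t_conv


# Hypothetical delays, chosen only for illustration
t_low, t_high, t_conv = async_clk_timing(
    td1=150e-12, td2=120e-12, td3=150e-12, td4=80e-12, n_bits=10)
print(f"clk low  = {t_low * 1e12:.0f} ps")
print(f"clk high = {t_high * 1e12:.0f} ps")
print(f"10-bit conversion = {t_conv * 1e9:.2f} ns")
```

In a real converter td2 changes from bit to bit, so a worst-case or per-bit value would have to replace the single number used here.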

In summary, in order to increase the sample rate, the low level of *clk* should be as short as possible by speeding up the comparator, while the high level of *clk* should be long enough for the DAC to be fully settled. It is worth mentioning that, although the delay of the digital control logic is not included in this discussion, it also becomes rather critical in high-speed design. This post only scratches the surface of high-speed SAR ADC design. Salute to those who race in this challenging field!

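I will start from the frequency domain and play with the equations we are familiar with. To simplify the problem, I assume it’s a unity negative-feedback second-order system. Denoting the DC gain by A, the two poles by f1 and f2, and the unity-gain bandwidth by GBW (= A·f1), the open-loop gain can be written as

$$A_{OL}(s) = \frac{A}{\left(1+\dfrac{s}{\omega_1}\right)\left(1+\dfrac{s}{\omega_2}\right)}, \qquad \omega_{1,2} = 2\pi f_{1,2}.$$

The closed-loop gain can thereafter be written as

$$A_{CL}(s) = \frac{A_{OL}(s)}{1+A_{OL}(s)} = \frac{A\,\omega_1\omega_2}{s^2 + (\omega_1+\omega_2)\,s + (1+A)\,\omega_1\omega_2}.$$

Express the denominator in the familiar control theory form, $s^2 + 2\xi\omega_n s + \omega_n^2$ with $\omega_n = 2\pi f_n$, where ξ is the damping factor and f_{n} the natural frequency. For A ≫ 1 and f2 ≫ f1,

$$f_n = \sqrt{(1+A)\,f_1 f_2} \approx \sqrt{GBW \cdot f_2}$$

and

$$\xi = \frac{f_1+f_2}{2 f_n} \approx \frac{1}{2}\sqrt{\frac{f_2}{GBW}}.$$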

If we take a close look at the damping factor, which tells the relative position between the second pole and the unity loop gain bandwidth, we shall know the phase margin as has been discussed in one of my old posts.

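The peaking in the amplitude response, which exists only for $\xi < 1/\sqrt{2}$, can be found at $f = f_n\sqrt{1-2\xi^2}$ with a value (relative to the DC gain) equal to

$$\frac{1}{2\xi\sqrt{1-\xi^2}}.$$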

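Now we have derived the relationship between phase margin and peaking in the amplitude response. Let’s continue to look at the time domain. Since we normally care about the underdamped case (ξ<1), its unit step response, normalized to the final value and with $\omega_n = 2\pi f_n$, can be written as (for a detailed explanation and derivation, refer to this link)

$$y(t) = 1 - \frac{e^{-\xi\omega_n t}}{\sqrt{1-\xi^2}}\,\sin\!\left(\omega_n\sqrt{1-\xi^2}\;t + \arccos\xi\right).$$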

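The peaking in the step response can be found at $t_p = \dfrac{\pi}{\omega_n\sqrt{1-\xi^2}}$ (with $\omega_n = 2\pi f_n$) with a value equal to

$$1 + e^{-\pi\xi/\sqrt{1-\xi^2}}.$$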

The exact number of the phase margin and peakings can now be easily calculated. Poor phase margin corresponds to peaking in the frequency domain and ringing in the time domain.
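As a quick numerical check, the following Python sketch turns the ratio f2/GBW into the damping factor, the phase margin, the amplitude peaking and the step overshoot. It assumes the usual approximations A ≫ 1 and f2 ≫ f1, i.e. a loop gain of the form GBW/(s(1+s/ω2)); the function name and the chosen ratios are mine.

```python
import math


def second_order_metrics(k):
    """k = f2/GBW. Returns (xi, phase margin in deg, peaking in dB, overshoot)."""
    xi = 0.5 * math.sqrt(k)  # damping factor, xi ~ 0.5*sqrt(f2/GBW)
    # Phase margin of the loop gain GBW / (s * (1 + s/w2))
    pm = math.degrees(
        math.atan(2 * xi / math.sqrt(math.sqrt(1 + 4 * xi**4) - 2 * xi**2)))
    # Amplitude peaking exists only for xi < 1/sqrt(2)
    if xi < 1 / math.sqrt(2):
        peak_db = 20 * math.log10(1 / (2 * xi * math.sqrt(1 - xi**2)))
    else:
        peak_db = 0.0
    # Step-response overshoot exists only for xi < 1
    overshoot = math.exp(-math.pi * xi / math.sqrt(1 - xi**2)) if xi < 1 else 0.0
    return xi, pm, peak_db, overshoot


for k in (1, 2, 3, 4):
    xi, pm, peak_db, overshoot = second_order_metrics(k)
    print(f"f2/GBW={k}: xi={xi:.3f}, PM={pm:.1f} deg, "
          f"peaking={peak_db:.2f} dB, overshoot={100 * overshoot:.1f} %")
```

Placing the second pole right at the GBW (k = 1) already shows both a visible bump in the frequency response and noticeable ringing in the step response.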

And now, finally, I have decided to write down my very limited understanding of FFT with windowing, focusing on characterizing the dynamic performance of an ADC in Matlab. Let’s get straight to it by first looking at the Matlab script I use to calculate the SNR/SNDR of an ADC.

```matlab
% data: vector of ADC output samples, assumed to exist in the workspace
fs     = 1e6;                                    % Sample rate
nfft   = 1024;                                   % Number of FFT points
cycles = 113;                                    % Number of input periods
fin    = cycles/nfft*fs;                         % Input frequency
data   = data(1:nfft);                           % Truncate the data to nfft samples
w      = 0.5*(1 - cos(2*pi*(0:nfft-1)/nfft));    % Hann window
cg     = sum(w)/nfft;                            % Normalized coherent gain
enbw   = sum(w.*w)/(sum(w)^2)*nfft;              % Normalized equivalent noise bandwidth
nb     = 3;                                      % Signal bins
dcbin  = (nb+1)/2;                               % Number of DC bins
if (size(data,1) ~= size(w,1))                   % Check dimension
    w = w';
end

ss = abs(fft(data.*w));                          % FFT with windowing
ss = ss/nfft/cg;                                 % Compensate for window attenuation
ss = ss(1:nfft/2).*2;                            % Drop the redundant half but keep total power the same

signal_bin  = cycles+1;                          % Signal bin, Matlab array starts from 1
dc_bins     = 1:dcbin;                           % DC bins
all_bins    = setdiff(1:nfft/2, dc_bins);        % Disregard DC bins
signal_bins = signal_bin + (-(nb-1)/2:(nb-1)/2); % Signal leakage bins
other_bins  = setdiff(all_bins, signal_bins);    % Further discard signal bins

fh = (2:10)*fin/fs;                              % Harmonic tone: (-/+m)fin + (-/+k)fs
while max(fh) > 1/2
    fh = abs(fh - (fh > 1/2));                   % If harmonic tone fh>fs/2, it is equal to fs-fh
end
harm_bins  = round(nfft * fh) + 1;               % Harmonic bins (2nd - 10th)
harm_binsl = zeros(length(harm_bins),nb);        % Find harmonic leakage bins
for i = 1:length(harm_bins)
    harm_binsl(i,:) = harm_bins(i) + (-(nb-1)/2 : (nb-1)/2);
end
harm_binsl = reshape(harm_binsl',length(harm_bins)*nb,1); % Convert matrix to array
harm_binsl = unique(harm_binsl);                 % Discard the repetitive harmonic bins
noise_bins = setdiff(other_bins, harm_binsl);    % Further discard the harmonic bins

Psignal = sum(ss(signal_bin).^2);                % Signal power
PnoiseD = sum(ss(noise_bins).^2)/enbw/length(noise_bins); % Noise PSD
Pnoise  = PnoiseD*length(all_bins);              % Total noise power
Pharm   = sum(ss(harm_bins).^2);                 % Power of harmonics

snr  = 10*log10(Psignal/Pnoise);                 % Calculate SNR
sndr = 10*log10(Psignal/(Pnoise+Pharm));         % Calculate SNDR
enob = (sndr - 1.76)/6.02;                       % Calculate ENOB
thd  = -10*log10(Psignal/Pharm);                 % Calculate THD
Pharm_max  = max(ss(harm_bins)).^2;
Pnoise_max = max(ss(noise_bins)).^2;
sfdr = 10*log10(Psignal/max(Pharm_max,Pnoise_max)); % Calculate SFDR

sdb = 10*log10(ss.^2);
f   = (0:length(ss)-1)/nfft;                     % Frequency vector normalized to fs (bin 1 is DC)
plot(f, sdb, 'k-','linewidth',1.5);
xlabel('Frequency [ f / f_s ]','FontSize',10);
ylabel('Power Spectrum [ dB ]','FontSize',10);
grid on;
text(0.3, -40, ...
     sprintf('SNDR = %.1fdB \n SNR = %.1fdB \n THD = %.1fdB \n SFDR = %.1fdB \n ENOB = %.1fbits', ...
             sndr, snr, thd, sfdr, enob), ...
     'FontSize',10);
```

An example of FFT plot will look like this:

A default FFT using a rectangular (boxcar) window has a normalized coherent gain of 1, a normalized equivalent noise bandwidth of 1, and only 1 signal bin. Hence, if another type of window is used, the corresponding window properties need to be specified. It’s all about two values: 1) the sum of the window terms; and 2) the sum of the squares of the window terms. Equations from the seminal paper [1] tell us why the two sums matter.

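Let the input sampled sequence be defined by

$$f(nT) = A\,e^{+j\omega_k nT} + q(nT),$$

where q(nT) is a white-noise sequence with variance $\sigma_q^2$. Then the signal component of the windowed spectrum is given by

$$F(\omega_k)\big|_{signal} = \sum_n w(nT)\,A\,e^{+j\omega_k nT}\,e^{-j\omega_k nT} = A\sum_n w(nT).$$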

Hence, the output amplitude of the noiseless signal is the input amplitude multiplied by a term which is **the sum of the window terms (S1)**. This term is called the processing gain (sometimes called coherent gain) of the window. The rectangular window has the largest gain compared to other windows. In the Matlab script, the normalized coherent gain (normalized by S1 of the rectangular window) is specified.

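The noise component of the windowed spectrum is given by

$$F(\omega_k)\big|_{noise} = \sum_n w(nT)\,q(nT)\,e^{-j\omega_k nT}.$$

The noise power is calculated using the expectation operator

$$\sigma_{noise}^2 = E\left\{\left|F(\omega_k)\big|_{noise}\right|^2\right\} = \sum_n\sum_m w(nT)\,w(mT)\,E\{q(nT)\,q^*(mT)\}\,e^{-j\omega_k nT}e^{+j\omega_k mT} = \sigma_q^2\sum_n w^2(nT).$$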

As additive noise is assumed to be white, the above value represents the noise floor level (or the noise power spectral density), which is also constant. Notice the power gain of noise is **the sum of the squares of the window terms (S2)**.

The noise bandwidth is calculated as the total noise power ($\sigma_q^2 S_2 f_s$) divided by the peak power gain of the window ($S_1^2$). Here we only focus on the term multiplying the input noise ($S_2 f_s/S_1^2$). Further introducing a parameter, fres, the width of one frequency bin (fs/N), the normalized equivalent noise bandwidth (ENBW) is given by

$$ENBW = \frac{S_2 f_s / S_1^2}{f_{res}} = N\,\frac{S_2}{S_1^2}.$$

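Therefore, to obtain correct power levels of the signal and the noise, the normalized coherent gain (in dB) is subtracted from the spectrum and the normalized ENBW (in dB) is subtracted from the calculated total noise power, respectively. In linear units both corrections are divisions, which is what the script does with `cg` and `enbw`.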

Nerd Rage explains the ENBW from the window energy point of view, which is quite insightful. Allow me to copy one of his pictures and some remarks here:

“For the Hann window the main lobe height is -6.02dB and therefore the height of any single spectral line will be 6.02dB below its real value. To obtain correct energy levels of the spectral peaks (in the absence of scalloping), the main lobe height (in dB units) is usually subtracted from the PSD. However, this overcompensates the window-specific reduction of the noise floor – for the Hann window, peak compensation is 6.02dB while the noise floor is only 4.26dB below its true value. This decreases the observed SNR by 1.76dB or a factor of 1.5.”

Note: for Hann window, 20log(s1/N) = -6.02dB; 10log(s2/N)=-4.26dB; N*s2/s1^2 = 1.5.
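The three numbers in the note can be reproduced from S1 and S2 alone. Below is a minimal Python sketch using only the standard library; the helper name `window_figures` is made up for this example, and the Hann window is generated with the same formula as in the Matlab script.

```python
import math


def window_figures(w):
    """Return (coherent gain in dB, noise-floor change in dB, normalized ENBW)."""
    n = len(w)
    s1 = sum(w)                      # sum of the window terms
    s2 = sum(x * x for x in w)       # sum of the squares of the window terms
    cg_db = 20 * math.log10(s1 / n)
    noise_db = 10 * math.log10(s2 / n)
    enbw = n * s2 / s1**2
    return cg_db, noise_db, enbw


nfft = 1024
hann = [0.5 * (1 - math.cos(2 * math.pi * k / nfft)) for k in range(nfft)]
rect = [1.0] * nfft

for name, w in (("Rectangular", rect), ("Hann", hann)):
    cg_db, noise_db, enbw = window_figures(w)
    print(f"{name:12s} CG = {cg_db:6.2f} dB, noise = {noise_db:6.2f} dB, "
          f"ENBW = {enbw:.2f} bins")
```

The difference between the two dB figures of the Hann window is 10log(1.5), which is exactly the SNR error quoted above when the ENBW correction is forgotten.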

Regarding the number of spread bins for a certain window function, a simple Matlab script performing FFT with windowing can do the job:

```matlab
% Here you create an input
fs = 1024;
nfft = 1024;
fin = 16;
cycles = 16;
vin = sin(2.*pi.*fin.*(0:1:nfft-1)./fs);

% Here you perform FFT with and without windowing
ss = abs(fft(vin(1:nfft)))/(nfft/2);
w = 0.5*(1 - cos(2*pi*(0:nfft-1)/nfft));
cg = sum(w)/nfft;
sshann = abs(fft(vin(1:nfft).*w))/(nfft/2)/cg;

% Here you do the plot
figure(1)
subplot(1,2,1)
stem(ss); title('Rectangular'); xlim([1 25]); grid on;
subplot(1,2,2)
stem(sshann); xlim([1 25]); title('Hann'); grid on;
```

Here comes the plot:
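For readers without Matlab at hand, the same experiment can be sketched in plain Python by evaluating the DFT only at the few bins around the tone. This is an illustrative re-implementation, not the script above; `dft_bin` is a helper written for this example, and the bin index k is zero-based here (k = 16 corresponds to Matlab index 17).

```python
import cmath
import math


def dft_bin(x, k):
    """Single DFT bin k (zero-based) of the real sequence x."""
    n = len(x)
    return sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))


nfft, cycles = 1024, 16
vin = [math.sin(2 * math.pi * cycles * i / nfft) for i in range(nfft)]
w = [0.5 * (1 - math.cos(2 * math.pi * i / nfft)) for i in range(nfft)]
cg = sum(w) / nfft                      # normalized coherent gain of Hann
vin_w = [v * wi for v, wi in zip(vin, w)]

bins = range(cycles - 2, cycles + 3)    # two bins on each side of the tone
rect = {k: abs(dft_bin(vin, k)) / (nfft / 2) for k in bins}
hann = {k: abs(dft_bin(vin_w, k)) / (nfft / 2) / cg for k in bins}

for k in bins:
    print(f"bin {k}: rectangular = {rect[k]:.3f}, Hann = {hann[k]:.3f}")
```

With coherent sampling the rectangular window keeps the tone in a single bin, while the Hann window spreads it over the centre bin and its two neighbours, which is why `nb = 3` is used in the SNR script.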

Sorry for the boring and not well-written post (though I’ve tried my best). Fig.4 is just for a smile when one finishes reading all the above dull words.

Last but not least, may Mr. Fourier’s wisdom be always with us!

[1] Frederick J. Harris, “On the Use of Windows for Harmonic Analysis with the Discrete Fourier Transform,” Proceedings of the IEEE, vol 66, pp. 51-83, January 1978.

[2] G. Heinzel, A.Rudiger, and R. Schilling, “Spectrum and spectral density estimation by the Discrete Fourier transform (DFT), including a comprehensive list of window functions and some new flat-top windows,” February 15, 2002.

First, let’s look at Fig.1 and ask ourselves the following question: between (a) and (b), which one is an integrator and which one is a gain stage?

The top is an integrator and the bottom a gain stage. The two working modes of the integrator and the gain stage are depicted in Fig.2 and Fig.3, respectively.

Now it’s time to take a close look at the switched-capacitor part of the integrator. Why does it act like a resistor?

First, let’s simplify the parasitic-insensitive version of Fig.4(a) to a straightforward implementation of Fig.4(b). In the first half of the period, S1 is closed and S2 is open. The capacitor is connected to V1, acquiring a charge of Q1=C*V1. In the second half, S1 is open and S2 is closed. The capacitor is connected to V2, storing a charge of Q2=C*V2. During this period, the total charge transferred from V1 to V2 is C*(V1-V2). In the next cycle, the capacitor is again connected to V1, replenishing its charge back to C*V1. Then it transfers a charge of C*(V1-V2) to V2. Accordingly, the average current flowing from source V1 to source V2 is equal to the charge moved in one period divided by the period T:

$$I_{avg} = \frac{C\,(V_1 - V_2)}{T}.$$

We can therefore view the discrete-time circuit as a resistor equal to

$$R_{eq} = \frac{V_1 - V_2}{I_{avg}} = \frac{T}{C}.$$

There are several considerations we shall pay attention to:

- V1 and V2 should vary much more slowly than the switching rate.
- T/2 should be long enough for the capacitor to fully charge/discharge to the intended levels.
- Unlike a resistor, only the average current is the same; the instantaneous current is not.
- … (to be discovered by oneself during real implementation)
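The charge bookkeeping above can be replayed in a few lines of Python. This is only a sketch under ideal assumptions (complete settling in each half period, constant V1 and V2); the 1 pF / 1 MHz values and the function names are hypothetical.

```python
def sc_equivalent_resistance(c, f_clk):
    """Equivalent resistance T/C = 1/(f_clk*C) of the switched capacitor."""
    return 1.0 / (c * f_clk)


def sc_average_current(c, f_clk, v1, v2, n_cycles=1000):
    """Average current from V1 to V2, obtained by counting charge packets."""
    charge = 0.0
    for _ in range(n_cycles):
        q1 = c * v1          # first half: capacitor charged to V1
        q2 = c * v2          # second half: capacitor settles to V2
        charge += q1 - q2    # packet delivered to V2 in this period
    return charge / (n_cycles / f_clk)


c, f_clk = 1e-12, 1e6        # hypothetical: 1 pF switched at 1 MHz
r_eq = sc_equivalent_resistance(c, f_clk)
i_avg = sc_average_current(c, f_clk, v1=1.0, v2=0.4)
print(f"R_eq  = {r_eq / 1e6:.2f} MOhm")
print(f"I_avg = {i_avg * 1e6:.2f} uA, (V1-V2)/R_eq = {(1.0 - 0.4) / r_eq * 1e6:.2f} uA")
```

A small capacitor switched at a moderate clock rate already emulates a megaohm-range resistor, which would be costly in area as a physical resistor.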

Knowing the switched-capacitor version of a resistor, Fig.5 gives two types of integrator: the continuous and the discrete. A comparison between the two reveals one of the advantages of the latter: while the product of R and C may have as much as a 10% process variation, the ratio of two capacitances has only a variation within 0.1%. OK. Let’s also be fair to name one disadvantage of the discrete equivalent: it is jitter-sensitive.

Finally, allow me to copy part of Prof.Ali’s summary in the article – “If we view a resistor as an element that transfers charge from one terminal to another at a constant rate, we can implement it using a capacitor and two switches.”

This topology, however, must be modified because the output of PD consists of a dc component (desirable) and high-frequency components (undesirable). The control voltage of VCO must remain quiet in the steady state, which means the PD output should be filtered. Therefore, a 1st-order low-pass RC filter is interposed between the PD and the VCO, as is shown in Fig.2.

PLLs are best analyzed in the phase domain (Fig.3). It is instructive to calculate the phase transfer function from the input to the output. The ideal PD can be modeled as the cascade of a summing node and a gain stage, because the dc value of the PD output is proportional to the phase difference of the input and output. The VCO output frequency is proportional to the control voltage. Since phase is the integral of the frequency, the VCO acts as an ideal integrator which receives a voltage and outputs a phase signal.

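With Kp the PD gain and Kv the VCO gain, the closed-loop phase transfer function of the 2nd-order PLL shown in Fig.2 can be written as

$$H(s) = \frac{\phi_{out}(s)}{\phi_{in}(s)} = \frac{K_pK_v}{\dfrac{s^2}{\omega_{LPF}} + s + K_pK_v},$$

where $\omega_{LPF} = 1/(RC)$. The phase error has the following transfer function

$$\frac{\phi_e(s)}{\phi_{in}(s)} = 1 - H(s) = \frac{\dfrac{s^2}{\omega_{LPF}} + s}{\dfrac{s^2}{\omega_{LPF}} + s + K_pK_v}.$$

If the input is a sinusoid of constant angular frequency ωi, the phase ramps linearly with time at a rate of ωi. Thus, the Laplace-domain representation of the input signal is $\phi_{in}(s) = \omega_i/s^2$. From the final value theorem, the steady-state phase error is

$$\phi_e(t\to\infty) = \lim_{s\to 0}\; s\cdot\frac{\omega_i}{s^2}\cdot\frac{\phi_e(s)}{\phi_{in}(s)} = \frac{\omega_i}{K_pK_v}.$$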

As can be seen, to lower the phase error, KpKv must be increased. Moreover, as the input frequency of the PLL varies, so does the phase error. To eliminate the phase error, a pole at the origin can be introduced, i.e. the RC loop filter is replaced by an integrator. This leads to the popular architecture – the charge-pump PLL (Fig.4), which comprises a phase/frequency detector, a charge pump, and a VCO.

As long as the loop dynamics are much slower than the signal, the charge pump can be treated as a continuous time integrator. The phase model of CPPLL is now shown in Fig.5. Writing the transfer function and doing some calculation, the phase error is finally confirmed to be eliminated. However, one must remember that two integrators are now sitting in the forward path, each contributing a constant phase shift of 90°. It will be frightening to see the phase curve is a straight line at -180° for a negative feedback system.

In order to stabilize the system, a zero is introduced by adding a resistor in series with the charge pump capacitor (Fig.6). Placing the zero before the gain crossover frequency helps to lift the phase curve up.

The compensated PLL suffers from a critical drawback. Each time a current is injected into the RC branch, the control voltage to the VCO will experience a large jump. Even in the locked conditions, the mismatches between charge and discharge current introduce voltage jumps in the control voltage. The resulting ripple disturbs the VCO. To relax this issue, a second capacitor is commonly tied between the control line and ground (Fig.7).

Finally, the PLL becomes a 3rd-order system. Don’t worry about the phase margin too much, as long as the zero, the unity-gain frequency, and the 3rd pole are positioned well (Fig.8).
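To see how the placement of the zero and the 3rd pole sets the phase margin, here is a small Python sketch. It assumes the common continuous-time approximation of the 3rd-order loop gain, L(s) = K(1+s/ωz)/(s²(1+s/ωp3)), where K lumps the charge-pump current, the VCO gain and the total capacitance. The symbols K, ωz, ωp3 and all numeric values are my own assumptions, not values taken from the figures.

```python
import math


def loop_gain(w, k, wz, wp3):
    """Loop gain K*(1+s/wz) / (s^2*(1+s/wp3)) evaluated at s = jw."""
    s = 1j * w
    return k * (1 + s / wz) / (s * s * (1 + s / wp3))


def phase_margin(k, wz, wp3):
    """Return (crossover in rad/s, phase margin in degrees)."""
    # |L| falls monotonically with frequency, so bisect on a log scale.
    # The bracket assumes the crossover lies between wz/1000 and 1000*wp3.
    lo, hi = 1e-3 * wz, 1e3 * wp3
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if abs(loop_gain(mid, k, wz, wp3)) > 1:
            lo = mid
        else:
            hi = mid
    wc = math.sqrt(lo * hi)
    # The two integrators give -180 deg; the zero adds and the 3rd pole removes phase
    pm = math.degrees(math.atan(wc / wz) - math.atan(wc / wp3))
    return wc, pm


# Hypothetical design: zero 4x below and 3rd pole 4x above the crossover
wc_target = 2 * math.pi * 100e3
wz, wp3 = wc_target / 4, 4 * wc_target
k = wc_target**2 * math.hypot(1, wc_target / wp3) / math.hypot(1, wc_target / wz)
wc, pm = phase_margin(k, wz, wp3)
print(f"crossover = {wc / (2 * math.pi) / 1e3:.1f} kHz, phase margin = {pm:.1f} deg")
```

Moving the zero closer to the crossover or the 3rd pole closer to it from above both eat into the margin, which is the positioning trade-off mentioned above.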

The author refers to two books for writing this post: 1) Behzad Razavi, Design of analog CMOS Integrated Circuits; 2) Ali Hajimiri and Thomas H. Lee, The design of low noise oscillators.

1. **Usually the most difficult condition is unity-gain feedback.** As Fig.1 shows, the closed-loop bandwidth is normally smaller than or equal to the unity-gain bandwidth. This means the loop sees the maximum phase drop at the unity-gain point, making unity-gain feedback the most difficult case for stability.

2. **Adding a Miller capacitor between the two cascaded stages is a common technique.** As Fig.2 shows, K is the ratio between the second pole and the GBW, which determines the phase margin (referring to this post). In addition, to push the right-half-plane (RHP) zero far beyond the second pole, it’s better to keep Cm much smaller than C2. This further puts a demand on the ratio between gm1 and gm2. Finally, the trade-off between noise/speed (small gm1) and current consumption (large gm2) lands on the desk (as expected).

3. **Introducing a nulling resistor is the most popular approach to mitigate the positive zero.** Compared to Fig.2, both the poles (neglecting the third non-dominant pole) and the GBW won’t change, except for the positive zero. How to calculate the new zero? Fig.3 demonstrates a simple way which was introduced by Prof. Razavi in his analog design book. One can either use the zero to (try to) cancel the second pole or simply push the zero to infinity.

4. **Ahuja compensation [1] is another way to abolish the positive zero.** The cause of the positive zero is the feedforward current through Cm. To abolish this zero, we have to cut the feedforward path and create a unidirectional feedback through Cm. Adding a resistor such as in Fig.3 is one way to mitigate the effect of the feedforward current. Another approach uses a current buffer cascode to pass the small-signal feedback current but cut the feedforward current, as is depicted in Fig.4. People name this approach after the author Ahuja.

5. **A good example of using Ahuja compensation is to compensate a 2-stage folded-cascode amplifier.** As is shown in Fig.5, by utilizing the already existing cascode stage, the Ahuja compensation can be implemented without any additional biasing current. There are two ways to place the Miller capacitor, which normally provide the same poles but different zeros. In REF[2], the poles and zeros of the two approaches are calculated based on reasonable assumptions. Out of curiosity, I also drew the small-signal model and derived the transfer function. With some patience I finally reached the same result as given in REF[2].

6. **The bloody equations of poles and zeros of the two Ahuja approaches are shared in Fig.6.** Though these equations look very dry at the moment, one will appreciate them during the actual design. They did help me stabilize an amplifier with a varying capacitive load. One thing worth looking at is the ratio between the natural frequency of the two complex non-dominant poles and the GBW. Considering that Cm and C2 are normally of the same order and C1 is much smaller, the ratio ends up with a relatively large value and the phase margin can be guaranteed.
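For points 2 and 3, the trade-off can be explored numerically with the textbook two-stage approximations: GBW ≈ gm1/Cm, p2 ≈ gm2·Cm/(C1·C2 + Cm·(C1 + C2)) and a zero at 1/(Cm·(1/gm2 − Rz)). These expressions may differ in detail from the ones drawn in Fig.2 and Fig.3, and the component values below are hypothetical. The Ahuja expressions of Fig.6 are not covered by this sketch.

```python
import math


def miller_two_stage(gm1, gm2, c1, c2, cm, rz=0.0):
    """Return (GBW, p2, zero, PM in deg); frequencies in rad/s.

    A positive zero lies in the right-half plane, a negative one in the left.
    """
    gbw = gm1 / cm
    p2 = gm2 * cm / (c1 * c2 + cm * (c1 + c2))
    denom = cm * (1.0 / gm2 - rz)
    zero = math.inf if denom == 0 else 1.0 / denom
    # Approximate the unity-gain frequency by the GBW
    pm = 90.0 - math.degrees(math.atan(gbw / p2)) - math.degrees(math.atan(gbw / zero))
    return gbw, p2, zero, pm


# Hypothetical values: gm in S, capacitances in F
params = dict(gm1=0.2e-3, gm2=2e-3, c1=50e-15, c2=2e-12, cm=0.5e-12)
gbw, p2, zero, pm_plain = miller_two_stage(**params)
_, _, zero_nulled, pm_nulled = miller_two_stage(**params, rz=1.0 / params["gm2"])

print(f"K = p2/GBW = {p2 / gbw:.2f}, zero/GBW = {zero / gbw:.1f}")
print(f"PM with plain Miller capacitor: {pm_plain:.1f} deg")
print(f"PM with Rz = 1/gm2 (zero at infinity): {pm_nulled:.1f} deg")
```

In this approximation the RHP zero sits at gm2/gm1 times the GBW, so a larger gm2/gm1 ratio or a nulling resistor recovers the phase lost to the feedforward path.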

Oh…finally, it took me quite some time to reach here. The END.

Reference:

[1] B.K.Ahuja, “An improved frequency compensation technique for CMOS operational amplifiers”, JSSC, 1983.

[2] U. Dasgupta, “Issues in “Ahuja” frequency compensation technique”, IEEE International Symposium on Radio-Frequency Integration Technology, 2009.

On the other hand, the annotation of the DC operating point provided by Cadence is really helpful. Now we can even have gm/ID annotated beside the transistor (it is called ‘gmoverid’ in the simulator). Hence, a curve showing the gm/ID-IC relationship will be informative, and this Mr.Sansen has [1]! It is plotted in Fig.1.

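In order to derive the relationship, we first need to recall the following equations:

$$I_D = I_S\cdot IC, \qquad IC = \ln^2\!\left(1+e^{v}\right), \qquad v = \frac{V_{GS}-V_T}{2nU_T},$$

where $I_S$ is a bias-independent normalization current. Based on the above equations, the gm/ID can be derived:

$$\frac{g_m}{I_D} = \frac{1}{I_D}\frac{\partial I_D}{\partial V_{GS}} = \frac{1}{nU_T}\cdot\frac{1-e^{-\sqrt{IC}}}{\sqrt{IC}}.$$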

Now we may have a rough idea of IC based on the annotated gm/ID (assuming nUT is about 35 mV).

| gm/ID (V⁻¹) | 25 | 18 | 9 |
| --- | --- | --- | --- |
| IC | 0.1 | 1 | 10 |
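The rough numbers in the table can be checked with a short Python sketch, and the relation can also be inverted to read IC from an annotated gm/ID. It assumes the expression gm/ID = (1 − e^(−√IC))/(nUT·√IC) and nUT = 35 mV; the function names and the bisection range are my own choices.

```python
import math


def gm_over_id(ic, n_ut=0.035):
    """gm/ID in 1/V for a given inversion coefficient IC."""
    root = math.sqrt(ic)
    return -math.expm1(-root) / (n_ut * root)   # (1 - exp(-sqrt(IC))) / (nUT*sqrt(IC))


def ic_from_gm_over_id(target, n_ut=0.035):
    """Invert gm_over_id by bisection on a log scale (IC in 1e-4 .. 1e4)."""
    lo, hi = 1e-4, 1e4
    for _ in range(100):
        mid = math.sqrt(lo * hi)
        if gm_over_id(mid, n_ut) > target:
            lo = mid      # gm/ID still too high, so IC must be larger
        else:
            hi = mid
    return math.sqrt(lo * hi)


for ic in (0.1, 1, 10):
    print(f"IC = {ic:>4}: gm/ID = {gm_over_id(ic):.1f} 1/V")
print(f"gm/ID = 15 1/V corresponds to IC = {ic_from_gm_over_id(15):.2f}")
```

The bisection works because gm/ID decreases monotonically as IC increases; a target outside the covered range simply saturates at one of the bounds.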

**Reference**

[1] W. Sansen, “Minimum power in analog amplifying blocks – presenting a design procedure ”, *IEEE Solid-State Circuits Magazine*, fall 2015.

With the help of the Gm/Id design kit, I can easily visualize the transistor performance as a function of its gate-source voltage (see Fig.1). As VGS increases, the transistor goes through weak, moderate, and strong inversion. For high gain, we go left; for high speed, we go right. Being far-left, the gain is not increasing but the speed drops extremely low; being far-right, the speed is not increasing but the drain current is still climbing! For a decent figure-of-merit (speed*gain), go to the middle, go moderate!

As CMOSers, we love the square-law equation, we sometimes hate and sometimes embrace the exponential subthreshold current equation. But with regard to the current flowing between the strong and the weak, do we have one equation for it? No, but yes…by doing some math, the EKV model combines all the three. Referring to [1], the IC-V related equations are copied as follows:

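$$IC = \ln^2\!\left(1+e^{v}\right), \qquad v = \frac{V_{GS}-V_T}{2nU_T},$$

$$IC \approx e^{2v} \ \ (\text{weak inversion},\ v \ll 0), \qquad IC \approx v^2 \ \ (\text{strong inversion},\ v \gg 0),$$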

where n is subthreshold slope factor and UT is thermal voltage. At room temperature, 2nUT is about 70mV [1]. As Fig.2 shows, the IC-V curve matches well with the weak for IC < 0.1 or the strong for IC > 10; the moderate locates where IC is between 0.1 and 10.
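How well the single expression merges into the two limits can be checked with a few lines of Python. The sketch assumes IC = ln²(1 + e^v) with v = (VGS − VT)/(2nUT) and 2nUT = 70 mV; the function names and the sample overdrive voltages are hypothetical.

```python
import math


def inversion_coefficient(v_ov, two_n_ut=0.070):
    """IC for an overdrive v_ov = VGS - VT (in volts)."""
    v = v_ov / two_n_ut
    return math.log1p(math.exp(v)) ** 2


def weak_inversion_ic(v_ov, two_n_ut=0.070):
    """Exponential limit, valid for v << 0."""
    return math.exp(2 * v_ov / two_n_ut)


def strong_inversion_ic(v_ov, two_n_ut=0.070):
    """Square-law limit, valid for v >> 0."""
    return (v_ov / two_n_ut) ** 2


for v_ov in (-0.21, -0.07, 0.0, 0.07, 0.35):
    print(f"VGS-VT = {v_ov * 1e3:6.0f} mV: IC = {inversion_coefficient(v_ov):8.4f}, "
          f"weak = {weak_inversion_ic(v_ov):8.4f}, "
          f"strong = {strong_inversion_ic(v_ov):8.4f}")
```

Around VGS = VT neither limit is close to the full expression, which is the moderate-inversion region between IC = 0.1 and IC = 10.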

**Reference**

[1] W. Sansen, “Minimum power in analog amplifying blocks – presenting a design procedure ”, *IEEE Solid-State Circuits Magazine*, fall 2015.

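The amplifying system may include multiple poles:

$$\beta A(s) = \frac{\beta A_0}{\left(1+\dfrac{s}{\omega_{p1}}\right)\left(1+\dfrac{s}{\omega_{p2}}\right)\cdots\left(1+\dfrac{s}{\omega_{pn}}\right)}.$$

Neglecting higher-order terms, it can be simplified to a two-pole equation: one dominant pole and one equivalent non-dominant pole, which is approximately

$$\frac{1}{\omega_{eq}} \approx \sum_{i=2}^{n}\frac{1}{\omega_{pi}}.$$

The frequency of interest is where the loop gain magnitude is close to unity, denoted as ωt. Normally ωt is much larger than the dominant pole. Hence, βA(s) around ωt can be further simplified to:

$$\beta A(s) \approx \frac{\beta A_0\,\omega_{p1}}{s\left(1+\dfrac{s}{\omega_{eq}}\right)}.$$

Considering that the first pole introduces a -90° phase shift, the phase of the loop gain at ωt is:

$$\angle\,\beta A(j\omega_t) = -90^\circ - \arctan\frac{\omega_t}{\omega_{eq}}.$$

Consequently, the phase margin (PM) is calculated by adding 180° to the phase of the loop gain and it is written as:

$$PM = 90^\circ - \arctan\frac{\omega_t}{\omega_{eq}} = \arctan\frac{\omega_{eq}}{\omega_t}.$$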

It can be seen that the phase margin is determined by the relative position between the equivalent non-dominant pole and the unity loop gain bandwidth.

| ωeq/ωt | 0.5 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| PM | 26.6° | 45° | 63.4° | 71.6° | 76° |
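The table follows directly from PM = 90° − arctan(ωt/ωeq). A minimal Python sketch to regenerate it, or to try other ratios (the function name is made up for this example):

```python
import math


def phase_margin_deg(ratio):
    """Phase margin in degrees for ratio = w_eq / w_t (two-pole approximation)."""
    return 90.0 - math.degrees(math.atan(1.0 / ratio))


for ratio in (0.5, 1, 2, 3, 4):
    print(f"w_eq/w_t = {ratio}: PM = {phase_margin_deg(ratio):.1f} deg")
```

Reading the table the other way round, a 60° target asks for an equivalent non-dominant pole a bit less than twice the unity loop gain bandwidth.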

Earlier in 2012, I wrote an introductory post about the EKV model and later extended the related topic a little bit in another post – Stay Simple – Square-Law Equation Related. Since then I have kept following the information about the EKV model and the inversion-coefficient-based analog design methodology.

One of the major contributors to this design methodology is Prof. Willy Sansen. He has given a short tutorial named *Impact of Scaling on Analog Design*. The tutorial was organized by ISSCC through edX (free access after registration). Most recently he also published an article [1] to summarize his idea in the IEEE Solid-State Circuits Magazine.

The journey starts with a beautiful equation which nicely links the weak and the strong inversion (see the curve in Fig.1).

Fascinated by Prof.Sansen’s design procedure, I tried to apply it to my daily design work. Theoretically, it does give me a broader view and some insight on the low-power design. However, practically I find it difficult to make full use of it. Especially nowadays most of the design enters into the deep submicron region, and the model parameters are so complicated to interpret.

Then there comes another big guy – Prof. Boris Murmann. Yes, the professor provides the famous ADC performance survey! Now the professor also launches his gm/Id starter kit. The kit provides scripts that can co-simulate between SPICE simulator and Matlab and store transistor DC parameters into Matlab files. The data stored can then be used for systematic circuit design in Matlab. It looks brute-force but yet smart and efficient!

It’s free. Enjoy!

**Reference**

[1] W. Sansen, “Minimum power in analog amplifying blocks – presenting a design procedure ”, *IEEE Solid-State Circuits Magazine*, fall 2015.