One-way delay measurement using NTP synchronization
Vladimir Smotlacha (vs@cesnet.cz)
Introduction
This document describes a new method of one-way delay (OWD) measurement in a WAN with an estimated error of about 1 millisecond. The measurement sites are synchronized by NTP servers. No other source of exact time is needed for the measurement; however, we used the PPS (Pulse Per Second) signal from GPS to evaluate the accuracy of the described method and to compare different setups.
All tests were done between boxes located in Cesnet (the NREN of the Czech Republic) and Heanet (the NREN of Ireland).
The basic idea and motivation are described here, and different setups of the measurement system are proposed here.
Method of Measurement
Raw OWD was measured with the RUDE/CRUDE tool. RUDE transmits UDP packets with a defined traffic shape. Each packet contains a sequence number and a timestamp of its transmission time (given by the sender's clock). CRUDE collects this traffic, adds timestamps of reception (given by the receiver's clock) and generates a log. Many QoS parameters can be evaluated from this log, e.g. throughput, loss ratio, one-way delay and one-way delay variation.
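As an illustration of how such parameters follow from the log, here is a minimal sketch in Python. It assumes the log has already been parsed into records holding the sequence number and the two timestamps; the Record type and the function names are hypothetical and are not part of RUDE/CRUDE.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    """One received packet (hypothetical structure, not the CRUDE log format)."""
    seq: int    # sequence number written by the sender
    ts: float   # transmission timestamp, sender clock, seconds
    tr: float   # reception timestamp, receiver clock, seconds


def loss_ratio(records: List[Record], sent: int) -> float:
    """Fraction of the `sent` packets that never appear in the log."""
    received = len({r.seq for r in records})
    return (sent - received) / sent


def raw_owd(records: List[Record]) -> List[float]:
    """Raw one-way delay (tr - ts) per packet, ordered by sequence number."""
    ordered = sorted(records, key=lambda r: r.seq)
    return [r.tr - r.ts for r in ordered]


def owd_variation(records: List[Record]) -> List[float]:
    """Delay difference between consecutive received packets."""
    delays = raw_owd(records)
    return [later - earlier for earlier, later in zip(delays, delays[1:])]
```

The raw delay computed this way still contains the clock offsets of both boxes, which is why the corrections described below are needed.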
Each measurement site runs an NTP daemon, which synchronizes the local clock to an NTP server. We used the parameter 'maxpoll = 6', which forces the NTP process to contact the NTP server with a period not exceeding 64 s; without this parameter the poll period can reach 1024 s, which degrades the quality of clock synchronization. We illustrate later how important this parameter is.
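The poll parameters of NTP are exponents of two, so a value of 6 corresponds to 64 s and the usual upper limit of 10 to 1024 s. The short Python sketch below shows this arithmetic and, as a rough illustration only, how far a clock could drift between two polls. The residual frequency error of 1 ppm used in the usage note is a hypothetical value, not a measured one, and the function names are ours.

```python
def poll_interval_seconds(poll_exponent: int) -> int:
    """NTP poll values are log2 of the interval in seconds."""
    return 2 ** poll_exponent


def accumulated_offset_us(interval_s: float, freq_error_ppm: float) -> float:
    """Offset built up over one interval by an uncorrected frequency error.

    1 ppm corresponds to 1 microsecond of drift per second.
    """
    return interval_s * freq_error_ppm
```

With the assumed 1 ppm error, accumulated_offset_us(poll_interval_seconds(6), 1.0) gives 64 microseconds, while the same call with exponent 10 gives 1024 microseconds, which is already comparable to the 1 millisecond error target of the method.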
Because the NTP process is based on a closed feedback loop, the current time offset of the local clock is known. The offset is reported by the command 'ntpq -c rl' or returned by the function ntp_adjtime().
In order to know the exact offset of the local clock (independently of the NTP daemon), we supply each box with the PPS signal. Thus we obtain the offset with an absolute accuracy of about 10 microseconds. Note that the PPS signal was not used for synchronization but only for exact offset evaluation.
Let us denote:
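One possible way to collect the reported offset is to extract the 'offset=' variable from the text printed by 'ntpq -c rl'. The Python sketch below only parses a string; the sample text is made up for illustration and does not come from our boxes, and the value is assumed to be in milliseconds, as ntpq normally prints it.

```python
import re
from typing import Optional

# Matches "offset=<number>" but not names such as "sys_offset=".
_OFFSET_RE = re.compile(r"\boffset=([-+]?\d+(?:\.\d+)?)")

# Hypothetical sample of ntpq output, for illustration only.
SAMPLE_OUTPUT = (
    "status=0644 leap_none, sync_ntp,\n"
    "stratum=2, precision=-18, rootdelay=1.234,\n"
    "offset=-0.187, frequency=12.345, jitter=0.056"
)


def parse_offset_ms(ntpq_output: str) -> Optional[float]:
    """Return the reported clock offset, or None if the field is missing."""
    match = _OFFSET_RE.search(ntpq_output)
    return float(match.group(1)) if match else None
```

In practice the string would be obtained by running the command periodically, e.g. through subprocess.run, and each value would be stored together with the local time at which it was read.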
- Ts - timestamp of packet sending (from CRUDE log)
- Tr - timestamp of packet receiving (from CRUDE log)
- Os - offset of sender clock (reported by NTP)
- Or - offset of receiver clock (reported by NTP)
- Ps - exact offset of sender clock (from PPS capture log)
- Pr - exact offset of receiver clock (from PPS capture log)
- Raw one-way delay obtained from CRUDE log
OWD_r = Tr - Ts
- One-way delay corrected by NTP offsets
OWD_n = Tr - Or - (Ts - Os)
- Exact one-way delay calculated from GPS time
OWD_e = Tr - Pr - (Ts - Ps)
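The three delay values can be computed per packet as in the following Python sketch. It follows the formulas of this section literally, i.e. each offset is subtracted from the timestamp of the same clock; all quantities are assumed to be in seconds, and the function names are ours.

```python
def owd_raw(ts: float, tr: float) -> float:
    """OWD_r: raw delay straight from the CRUDE timestamps."""
    return tr - ts


def owd_ntp(ts: float, tr: float, os_: float, or_: float) -> float:
    """OWD_n: delay corrected by the offsets reported by NTP."""
    return (tr - or_) - (ts - os_)


def owd_exact(ts: float, tr: float, ps: float, pr: float) -> float:
    """OWD_e: delay corrected by the exact offsets from the PPS capture."""
    return (tr - pr) - (ts - ps)
```

Since the offsets are sampled less often than packets arrive, an implementation would also have to pick or interpolate the offset valid at the time of each packet; that step is left out of the sketch.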
Results of measurement
We did the measurement in two setups with different locations of the NTP servers. In Setup I we used two NTP servers (tik.cesnet.cz and Karlovy.hea.net alias tt35.ripe.net) located in the PoPs of measurement, with a round-trip delay of less than one millisecond from each box. In Setup II we used one common NTP server (worf.ijs.si) located in Slovenia, i.e. in the GEANT network but outside both PoPs.
The Cesnet measurement box is a DELL 1400 with a Pentium III/860MHz and the Heanet box is a standard PC with a Pentium III/450MHz. Both boxes run Linux, kernel 2.4.16 with the nanokernel patch. We use NTP version 4.1.71 and RUDE version 0.50.
Setup I
Direction Cesnet -> Heanet
The first graph displays the measured OWD_r, i.e. the raw OWD (red), and OWD_e, the exact OWD (green).
Figure 1.1: Measured and exact OWD
The second graph displays OWD_n, i.e. the OWD corrected by the offset estimated by the NTP process.
Figure 1.2: Corrected OWD
Direction Heanet -> Cesnet
Here are the same graphs for the opposite direction.
Figure 1.3: Measured and exact OWD
Figure 1.4: Corrected OWD
Time offset of sites of measurement
The graph displays the time offset of the Cesnet box reported by the NTP process.
Figure 1.5: NTP reported offset
This graph displays the error of the NTP reported offset, i.e. Ps - Os, for the Cesnet box as a sender.
Figure 1.6: Error of NTP reported offset
The last graph is very important, as it represents the accuracy of our method of OWD measurement. It combines all external influences: the quality of the NTP server, the characteristics of the network between the NTP server and the measurement site, and the quality of the clock oscillator.
The same graphs, now for the Heanet box.
Figure 1.7: NTP reported offset
Figure 1.8: Error of NTP reported offset
Conclusions of Setup I
The following graph displays the absolute error of our OWD measurement, i.e.
OWD_n - OWD_e
Figure 1.9: Error of OWD measurement
We see that the error lies within the interval +- 500 microseconds.
Another important result is the different characteristic of the offset error at the two measurement sites, although their configuration is identical. The reason remains to be analyzed, but we assume that it depends on the quality of the NTP servers.
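Substituting the definitions of OWD_n and OWD_e shows that the timestamps Tr and Ts cancel, so that OWD_n - OWD_e = (Pr - Or) - (Ps - Os). In other words, the error of the measurement is the difference between the offset errors of the receiver and of the sender. The Python sketch below expresses this and checks how many samples fall within a chosen bound; the function names are ours and the numbers in the usage note are invented for illustration.

```python
from typing import Sequence


def owd_error(os_: float, or_: float, ps: float, pr: float) -> float:
    """OWD_n - OWD_e expressed through the two offset errors only."""
    sender_error = ps - os_      # error of the NTP reported offset, sender
    receiver_error = pr - or_    # error of the NTP reported offset, receiver
    return receiver_error - sender_error


def fraction_within(errors: Sequence[float], bound: float) -> float:
    """Share of samples whose magnitude does not exceed `bound`."""
    inside = sum(1 for e in errors if abs(e) <= bound)
    return inside / len(errors)
```

For example, fraction_within(errors, 500e-6) applied to a series of per-packet errors in seconds tells what share of the samples stays inside +- 500 microseconds.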
Setup II
In Setup II, one common NTP server located in a third network was used: the box worf.ijs.si in Slovenia. The average round-trip delay is 30 ms between the NTP server and Cesnet and 54 ms between the NTP server and Heanet. Therefore, conditions for good time synchronization are much worse than in Setup I.
We present here a set of graphs analogous to those of Setup I. All graphs show the behavior of the measurement system when the NTP service was unavailable for about 12 hours.
Direction Cesnet -> Heanet
Figure 2.1: Measured and exact OWD
Figure 2.2: Corrected OWD
Direction Heanet -> Cesnet
Figure 2.3: Measured and exact OWD
Figure 2.4: Corrected OWD
Time offset of sites of measurement
Figure 2.5: NTP reported offset
Figure 2.6: Error of NTP reported offset
Figure 2.7: NTP reported offset
Figure 2.8: Error of NTP reported offset
Conclusions of Setup II
The following graph displays the absolute error of our OWD measurement, i.e.
OWD_n - OWD_e
Figure 2.9: Error of OWD measurement
We expected much worse results than in Setup I, but the OWD error is still far below +- 1 millisecond when the NTP server is available. We can also see how the measurement system converges after the end of the NTP service failure.
Setup IIa
Setup IIa is the same as Setup II with only one difference: the parameter 'maxpoll = 6' is omitted from the NTP daemon configuration. Here is the same set of graphs.
Direction Cesnet -> Heanet
Figure 3.1: Measured and exact OWD
Figure 3.2: Corrected OWD
Direction Heanet -> Cesnet
Here are the same graphs for the opposite direction.
Figure 3.3: Measured and exact OWD
Figure 3.4: Corrected OWD
Time offset of sites of measurement
Figure 3.5: NTP reported offset
Figure 3.6: Error of NTP reported offset
Figure 3.7: NTP reported offset
Figure 3.8: Error of NTP reported offset
Conclusions of Setup IIa
The following graph displays the absolute error of our OWD measurement, i.e.
OWD_n - OWD_e
Figure 3.9: Error of OWD measurement
We see worse results than in Setup II: the OWD error is most of the time in the interval from -3 ms to +1 ms. This is evidence that the polling interval of the NTP daemon has to be kept small.
Plans for Future
We plan to continue with Setup III, where secondary NTP servers will be used. This configuration is the most general and assumes that the measurement site is not located directly in a PoP but rather in a customer network.
Another planned activity is to extend the measurement to more sites.