Discussion:
[Linuxptp-users] PTP - MAC time
Ian Thompson
2017-04-05 14:34:14 UTC
All

I have another question where I reveal my ignorance of IEEE-1588.

Why is the time that gets put into the PTP registers in the STM MAC, Unix time rather than PTP time?
Am I missing a configuration setting for this?

Regards
Ian T.
Richard Cochran
2017-04-06 05:48:34 UTC
Post by Ian Thompson
Why is the time that gets put into the PTP registers in the STM MAC, Unix time rather than PTP time?
See below.
Post by Ian Thompson
Possibly following on from David’s post.
We have a system with 18 boards in a rack. Each board has an Altera SoC with the STM Ethernet MAC, connected via gigabit Ethernet to an Arista PTP-aware switch and then to a Spectracom GrandMaster.
The boards are running Linux kernel 3.15.0.
That HW puts the time stamps into the buffer descriptor, and so in
theory it should never miss a time stamp. This is most likely a
driver bug. Looking at the git log I see:

v4.11-rc1~124^2~171^2~12 deeb637 net: stmmac: remove freesoftware address
v4.9-rc7~33^2~33^2~1 ba1ffd7 stmmac: fix PTP support for GMAC4
v4.9-rc7~33^2~33^2~2 d204205 stmmac: update the PTP header file
v4.9-rc4~28^2~68 c30a70d stmmac: fix and review the ptp registration.
v4.9-rc4~28^2~96 50756eb stmmac: fix an error code in stmmac_ptp_register()
v4.9-rc1~28^2~10 7086605 stmmac: fix error check when init ptp
v4.9-rc1~127^2~108 efee95f ptp_clock: future-proofing drivers against PTP subsystem becoming optional
v4.1-rc1~128^2~100^2~5 e7ea55b ptp: stmmac: use helpers for converting ns to timespec.
v4.1-rc1~128^2~119^2~6 3f6c465 ptp: stmmac: convert to the 64 bit get/set time methods.
v3.17-rc5~41^2~38 5566401 stmmac: ptp: fix the reference clock
v3.17-rc5~41^2~50 f95f404 stmmac: set ptp_clock to NULL while unregister
v3.15-rc1~113^2~108^2~5 4986b4f0 ptp: drivers: set the number of programmable pins.
v3.13-rc7~13^2 7cd0139 stmmac: Fix incorrect spinlock release and PTP cap detection.
v3.10-rc1~66^2~195 32ceabc stmmac: improve/review and fix kernel-doc
v3.10-rc1~66^2~327^2~1 92ba688 stmmac: add the support for PTP hw clock driver
v3.10-rc1~66^2~327^2~2 891434b stmmac: add IEEE PTPv1 and PTPv2 support.

Especially ba1ffd7 looks suspicious.
Post by Ian Thompson
Apr 4 13:42:04 localhost user.info ptp4l: [537.164] rms 123 max 599 freq +255 +/- 39 delay 7362 +/- 48
Apr 4 13:42:29 localhost user.err ptp4l: [561.387] timed out while polling for tx timestamp
Apr 4 13:42:29 localhost user.err ptp4l: [561.387] increasing tx_timestamp_timeout may correct this issue, but it is likely caused by a driver bug
Apr 4 13:42:29 localhost user.err ptp4l: [561.387] port 1: send delay request failed
Apr 4 13:42:29 localhost user.notice ptp4l: [561.387] port 1: SLAVE to FAULTY on FAULT_DETECTED (FT_UNSPECIFIED)
Apr 4 13:42:45 localhost user.notice ptp4l: [577.388] port 1: FAULTY to LISTENING on FAULT_CLEARED
Apr 4 13:42:45 localhost user.warn ptp4l: [577.414] clockcheck: clock jumped backward or running slower than expected!
Apr 4 13:42:45 localhost user.notice ptp4l: [577.414] port 1: new foreign master 000cec.fffe.0a085d-1
Apr 4 13:42:47 localhost user.notice ptp4l: [579.414] selected best master clock 000cec.fffe.0a085d
Apr 4 13:42:47 localhost user.notice ptp4l: [579.414] port 1: LISTENING to UNCALIBRATED on RS_SLAVE
Apr 4 13:42:54 localhost user.notice ptp4l: [587.164] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Apr 4 13:46:46 localhost user.info ptp4l: [818.414] rms 2312500092 max 37000001557 freq +246 +/- 250 delay 7358 +/- 46
Apr 4 13:51:02 localhost user.info ptp4l: [1074.413] rms 116 max 681 freq +256 +/- 48 delay 7373 +/- 88
Does this imply that one lost delay request can do this, or is there a retry mechanism?
One lost delay request shouldn't introduce such a large error. This
is a driver bug. Notice that the time error is 37 seconds, which is the
UTC/TAI offset.
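
To make the arithmetic explicit: ptp4l reports offsets in nanoseconds, so the "max 37000001557" in the log above is 37 s plus roughly 1.5 microseconds. The following is only a minimal sketch of that check, not anything from ptp4l itself. The function names are hypothetical, and the 37 s constant is the TAI-UTC offset in 2017, hard-coded here as an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Assumption: TAI-UTC offset in seconds as of 2017. */
#define TAI_UTC_OFFSET_SEC 37LL

/* Round an offset in nanoseconds to the nearest whole second. */
int64_t offset_whole_seconds(int64_t offset_ns)
{
    int64_t half = NSEC_PER_SEC / 2;

    if (offset_ns >= 0)
        return (offset_ns + half) / NSEC_PER_SEC;
    return -((-offset_ns + half) / NSEC_PER_SEC);
}

/*
 * Return 1 if the magnitude of the offset is within tolerance_ns of the
 * TAI-UTC offset, which hints at a UTC/TAI timescale mix-up rather than
 * an ordinary synchronization error. Return 0 otherwise.
 */
int looks_like_timescale_jump(int64_t offset_ns, int64_t tolerance_ns)
{
    int64_t expected = TAI_UTC_OFFSET_SEC * NSEC_PER_SEC;
    int64_t magnitude = offset_ns < 0 ? -offset_ns : offset_ns;
    int64_t residual = magnitude - expected;

    assert(tolerance_ns >= 0);
    if (residual < 0)
        residual = -residual;
    return residual <= tolerance_ns;
}
```

With a tolerance of 1 ms, the 37000001557 ns value from the log is flagged, while the normal "max 681" value is not.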

When resetting the fault, ptp4l re-initializes HW time stamping.

The function, stmmac_hwtstamp_ioctl(), in

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c

programs the system time (UTC) into the PHC every time HW time
stamping is enabled. It shouldn't do that.
Post by Ian Thompson
We have a lot of traffic leaving the boards but only PTP traffic
coming in. As we increase the off board transfer rates the problem
seems to occur more often.
That could indicate a driver or a HW issue, or both.

HTH,
Richard
Ian Thompson
2017-04-10 15:18:32 UTC
Richard

Thanks for the response.
I now have one board running kernel 3.18 and another running kernel 4.9.

I still see the issue with 3.18 but I haven't yet seen it on 4.9.
Unfortunately, we have a proprietary driver for a device on the PCIe bus that doesn't yet support 4.x kernels, and it is this device that generates (via an application) most of the network traffic.
I might have to port all of the stmmac changes back to 3.18.

If I add 37 seconds to getnstimeofday then the effect of the "glitch" is less pronounced.
Kernel 3.18 introduced timekeeping.c, with timekeeping_get_tai_offset(), which I thought might give me the UTC offset but it returns 0 at the point I call it.
Is there a call within the kernel to find the UTC offset?

Regards
Ian T.

Richard Cochran
2017-04-11 13:26:44 UTC
Post by Ian Thompson
I still see the issue with 3.18 but I haven't yet seen it on 4.9.
Unfortunately, we have a proprietary driver for a device on the pcie bus which doesn't yet support 4.x kernels and it is this that generates (via an application) most of the network traffic.
I might have to port all of the stmmac changes back to 3.18.
Maybe I wasn't clear enough in my answer, but the loss of one time
stamp is unfortunate but understandable, and it normally should be
tolerable.

The resetting of the PHC time (the cause of the 37 second error) in the
driver is just plain wrong, and you should fix that.
Post by Ian Thompson
If I add 37 seconds to getnstimeofday then the effect of the "glitch" is less pronounced.
Kernel 3.18 introduced timekeeping.c, with timekeeping_get_tai_offset(), which I thought might give me the UTC offset but it returns 0 at the point I call it.
Is there a call within the kernel to find the UTC offset?
The offset has to be provided to the kernel by user space by calling
adjtimex() with the ADJ_TAI mode set.
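
For illustration, a minimal user-space sketch of both directions follows: querying the offset the kernel currently holds, and setting it with ADJ_TAI. The two function names are hypothetical. The `tai` and `constant` fields and the ADJ_TAI flag come from <sys/timex.h>; setting the offset requires CAP_SYS_TIME, and daemons such as ntpd or chronyd may already do this on a given system, so treat this as a sketch rather than a recommended procedure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/timex.h>

/*
 * Read the kernel's current TAI-UTC offset in seconds without changing
 * anything. Returns 0 on success, -1 on failure. An offset of 0 usually
 * means that user space has not provided one yet.
 */
int read_tai_offset(int *tai_offset)
{
    struct timex tx;

    assert(tai_offset != NULL);
    memset(&tx, 0, sizeof(tx));
    tx.modes = 0;                 /* modes == 0 is a read-only query */
    if (adjtimex(&tx) == -1)
        return -1;
    *tai_offset = tx.tai;
    return 0;
}

/*
 * Tell the kernel the TAI-UTC offset in seconds (37 as of 2017).
 * Requires CAP_SYS_TIME. Returns 0 on success, -1 on failure.
 */
int set_tai_offset(int tai_offset)
{
    struct timex tx;

    memset(&tx, 0, sizeof(tx));
    tx.modes = ADJ_TAI;
    tx.constant = tai_offset;     /* ADJ_TAI takes the offset from 'constant' */
    return adjtimex(&tx) == -1 ? -1 : 0;
}
```

If read_tai_offset() reports 0, that would be consistent with timekeeping_get_tai_offset() returning 0 inside the kernel: nothing has supplied the offset yet.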

Thanks,
Richard
Ian Thompson
2017-04-12 14:30:51 UTC
Richard

Sorry, more questions.
If I do "adjtimex -p", I get a raw time e.g. 1492005529.681665099.
This is system time, but is it TAI, Unix time or UTC?
I was assuming it was seconds from 00:00:00 UTC without leap seconds.

We are trying to timestamp some data collection with GPS time.
The stmicro MAC has an auxiliary register that can be latched with the contents of the clock registers on an external event.
We clock that event at 4 kHz and use the values in the auxiliary registers to form the timestamp, which is a single 64-bit value in microseconds.
This value always seems to be 37 seconds off, even if I "correct" stmmac_hwtstamp_ioctl to not use UTC.
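
The timestamp arithmetic described here can be sketched as below. This is an illustration under stated assumptions, not the actual driver or application code: the function names are hypothetical, the registers are assumed to hold seconds and nanoseconds on the PTP (TAI) timescale counted from 1970, TAI-UTC is taken as 37 s (the 2017 value), and TAI-GPS is the fixed 19 s. The GPS conversion only shifts the timescale; it does not rebase the count to the GPS epoch of 1980.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC      1000000ULL
#define NSEC_PER_USEC     1000U
#define NSEC_PER_SEC      1000000000U

/* Assumptions: TAI-UTC was 37 s in 2017; TAI-GPS is a constant 19 s. */
#define TAI_MINUS_UTC_SEC 37ULL
#define TAI_MINUS_GPS_SEC 19ULL

/* Combine seconds and nanoseconds register values into 64-bit microseconds. */
uint64_t regs_to_usec(uint32_t sec, uint32_t nsec)
{
    assert(nsec < NSEC_PER_SEC);
    return (uint64_t)sec * USEC_PER_SEC + nsec / NSEC_PER_USEC;
}

/* Shift a TAI-based microsecond count onto the UTC timescale. */
uint64_t tai_usec_to_utc_usec(uint64_t tai_usec)
{
    return tai_usec - TAI_MINUS_UTC_SEC * USEC_PER_SEC;
}

/* Shift a TAI-based microsecond count onto the GPS timescale (same epoch). */
uint64_t tai_usec_to_gps_usec(uint64_t tai_usec)
{
    return tai_usec - TAI_MINUS_GPS_SEC * USEC_PER_SEC;
}
```

A constant 37 s discrepancy would be consistent with one side of the comparison being on the UTC timescale and the other on TAI, while a discrepancy of 18 s or 19 s would point at a GPS versus UTC or TAI mix-up instead.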

Is there something wrong with my thinking?

Thanks
Ian T.

Richard Cochran
2017-04-12 17:37:42 UTC
Post by Ian Thompson
If I do "adjtimex -p", I get a raw time e.g. 1492005529.681665099.
I assume that is some wrapper program around the adjtimex system call?
Post by Ian Thompson
This is system time, but is it TAI, Unix time or UTC?
The system call...

ADJTIMEX(2)            Linux Programmer's Manual            ADJTIMEX(2)

NAME
       adjtimex - tune kernel clock

SYNOPSIS
       #define _BSD_SOURCE      /* See feature_test_macros(7) */
       #include <sys/timex.h>

       int adjtimex(struct timex *buf);

... returns UTC.
Post by Ian Thompson
I was assuming it was seconds from 00:00:00 UTC without leap seconds.
UTC always includes leap seconds by definition.
Post by Ian Thompson
We are trying to timestamp some data collection with GPS time.
The stmicro MAC has an auxiliary register that can be latched with the contents of the clock registers on an external event.
We clock that event at 4 kHz and use the values in the auxiliary registers to form the timestamp, which is a single 64-bit value in microseconds.
This value always seems to be 37 seconds off, even if I "correct" stmmac_hwtstamp_ioctl to not use UTC.
Is there something wrong with my thinking?
I am not sure. Do you read /dev/ptpX to get those time stamps?
What does 'testptp -c' say about your device?
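
For reference, a rough sketch of what such a query looks like from user space: it opens a PHC character device, reads its capabilities with the PTP_CLOCK_GETCAPS ioctl (roughly the information 'testptp -c' prints), and reads the PHC time through the dynamic clock ID derived from the file descriptor. The function name phc_query is hypothetical, and the device path is whatever /dev/ptpX the driver registered. The ioctl, the struct and the clock ID encoding are taken from <linux/ptp_clock.h> and the kernel's testptp example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <time.h>
#include <unistd.h>
#include <linux/ptp_clock.h>

/* Dynamic POSIX clock ID for an open PHC file descriptor. */
#define FD_TO_CLOCKID(fd) ((clockid_t)((((unsigned int)~(fd)) << 3) | 3))

/*
 * Query the capabilities and the current time of the PHC at 'path'
 * (for example "/dev/ptp0"). Returns 0 on success, -1 on any failure.
 */
int phc_query(const char *path, struct ptp_clock_caps *caps,
              struct timespec *now)
{
    int fd;
    int rc = -1;

    assert(path != NULL && caps != NULL && now != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    memset(caps, 0, sizeof(*caps));
    if (ioctl(fd, PTP_CLOCK_GETCAPS, caps) == 0 &&
        clock_gettime(FD_TO_CLOCKID(fd), now) == 0)
        rc = 0;

    close(fd);
    return rc;
}
```

On success, caps->n_ext_ts gives the number of external time stamp channels the driver exposes, and now->tv_sec can be compared against CLOCK_REALTIME to see whether the PHC is 37 s ahead of system time (TAI) or equal to it (UTC).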

Thanks,
Richard
