Discussion:
[Linuxptp-users] issue with NetXtreme II BCM57810
Luke Bigum
2016-10-04 10:49:46 UTC
Hello,

I have a hardware timestamping issue with a Broadcom NIC that I'd like a few opinions on. I'm 95% sure it's a NIC / driver problem, but I need some expert opinions before I go down the vendor / manufacturer route.

Both ptp4l and a build of ptpd using the PHC subsystem exhibit similar problems: either daemon continuously thinks the PHC needs to be stepped forward, by anywhere from 1 to 1.5 seconds for each second of wall clock time.

If I do some simple tests with linux/Documentation/ptp/testptp.c, it appears the PHC is spinning at about half the Hz it should be doing (it takes roughly 2 real seconds for the PHC to advance 1 second). While it is an anecdotal observation, half as fast seems rather uniform a problem, like there's a /2 somewhere gone wrong in the driver code... If I wanted to dig into the bnx2x source, any pointers on where to start looking?
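
A minimal sketch of the same kind of rate check, written in Python rather than with testptp.c. It reads the PHC through the dynamic POSIX clock id that the kernel derives from an open file descriptor and compares the elapsed PHC time with CLOCK_MONOTONIC_RAW. The function names are hypothetical, the device path /dev/ptp1 is assumed from the "PTP Hardware Clock: 1" line reported for this NIC, and read-only access to the device is assumed to be enough for reading the clock.

```python
import os
import time


def fd_to_clockid(fd: int) -> int:
    """Mirror the kernel's FD_TO_CLOCKID macro: ((~fd) << 3) | 3."""
    return (~fd << 3) | 3


def rate_ratio(phc_start_ns: int, sys_start_ns: int,
               phc_end_ns: int, sys_end_ns: int) -> float:
    """Return PHC nanoseconds elapsed per system-clock nanosecond elapsed."""
    return (phc_end_ns - phc_start_ns) / (sys_end_ns - sys_start_ns)


def measure_phc_rate(device: str = "/dev/ptp1", interval: float = 5.0) -> float:
    """Sample the PHC and CLOCK_MONOTONIC_RAW twice, `interval` seconds apart.

    A healthy, unadjusted PHC should give a ratio close to 1.0; a clock
    running at half speed should give a ratio close to 0.5.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        clkid = fd_to_clockid(fd)
        phc0 = time.clock_gettime_ns(clkid)
        sys0 = time.clock_gettime_ns(time.CLOCK_MONOTONIC_RAW)
        time.sleep(interval)
        phc1 = time.clock_gettime_ns(clkid)
        sys1 = time.clock_gettime_ns(time.CLOCK_MONOTONIC_RAW)
    finally:
        os.close(fd)
    return rate_ratio(phc0, sys0, phc1, sys1)
```

Calling measure_phc_rate() with no PTP daemon adjusting the clock gives a figure that can be compared directly with the "2 real seconds per PHC second" observation.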



The rest of this email is log output plus relevant details about the OS and NIC...


Oct 3 15:38:36 isptpsl01 ptp4l: [14201.187] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Oct 3 15:38:37 isptpsl01 ptp4l: [14202.185] master offset -1453645423 s0 freq -31000000 path delay 142741226
Oct 3 15:38:37 isptpsl01 ptp4l: [14202.185] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
Oct 3 15:38:38 isptpsl01 ptp4l: [14203.185] master offset -1938178522 s0 freq -31000000 path delay 142741226
Oct 3 15:38:39 isptpsl01 ptp4l: [14204.185] master offset -2422706382 s1 freq -31000000 path delay 142741226
Oct 3 15:38:41 isptpsl01 ptp4l: [14206.185] master offset -962038200 s2 freq -31000000 path delay 135714969
Oct 3 15:38:41 isptpsl01 ptp4l: [14206.187] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Oct 3 15:38:42 isptpsl01 ptp4l: [14207.185] master offset -1446569992 s0 freq -31000000 path delay 135714969
Oct 3 15:38:42 isptpsl01 ptp4l: [14207.185] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
Oct 3 15:38:43 isptpsl01 ptp4l: [14208.185] master offset -1931098770 s0 freq -31000000 path delay 135714969
Oct 3 15:38:44 isptpsl01 ptp4l: [14209.185] master offset -2422655704 s1 freq -31000000 path delay 142741226
Oct 3 15:38:45 isptpsl01 ptp4l: [14210.185] master offset -484531919 s2 freq -31000000 path delay 142741226
Oct 3 15:38:45 isptpsl01 ptp4l: [14210.187] port 1: UNCALIBRATED to SLAVE on MASTER_CLOCK_SELECTED
Oct 3 15:38:46 isptpsl01 ptp4l: [14211.185] master offset -969076973 s2 freq -31000000 path delay 142741226
Oct 3 15:38:47 isptpsl01 ptp4l: [14212.185] master offset -1446579628 s0 freq -31000000 path delay 135714969
Oct 3 15:38:47 isptpsl01 ptp4l: [14212.185] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT
Oct 3 15:38:48 isptpsl01 ptp4l: [14213.185] master offset -1925293801 s0 freq -31000000 path delay 129888779
Oct 3 15:38:49 isptpsl01 ptp4l: [14214.186] master offset -2370870770 s1 freq -31000000 path delay 90938984
Oct 3 15:38:51 isptpsl01 ptp4l: [14216.186] master offset -969077157 s2 freq -31000000 path delay 90938984
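
The drift can be read straight off the log. As a rough sketch, the snippet below parses three consecutive "master offset" samples copied from the log above (taken one second apart, with the freq value unchanged) and computes how much the offset changes per second. The helper names are hypothetical, and the implied rate assumes the samples are spaced exactly one master second apart.

```python
import re

# Four lines copied from the ptp4l log; the port state line is skipped by the parser.
LOG_LINES = [
    "Oct 3 15:38:37 isptpsl01 ptp4l: [14202.185] master offset -1453645423 s0 freq -31000000 path delay 142741226",
    "Oct 3 15:38:37 isptpsl01 ptp4l: [14202.185] port 1: SLAVE to UNCALIBRATED on SYNCHRONIZATION_FAULT",
    "Oct 3 15:38:38 isptpsl01 ptp4l: [14203.185] master offset -1938178522 s0 freq -31000000 path delay 142741226",
    "Oct 3 15:38:39 isptpsl01 ptp4l: [14204.185] master offset -2422706382 s1 freq -31000000 path delay 142741226",
]

OFFSET_RE = re.compile(r"\[(\d+\.\d+)\] master offset\s+(-?\d+) s\d")


def parse_offsets(lines):
    """Return (ptp4l timestamp in seconds, master offset in ns) pairs."""
    samples = []
    for line in lines:
        match = OFFSET_RE.search(line)
        if match:
            samples.append((float(match.group(1)), int(match.group(2))))
    return samples


def offset_deltas(samples):
    """Change in master offset (ns) between consecutive samples."""
    return [later[1] - earlier[1] for earlier, later in zip(samples, samples[1:])]


samples = parse_offsets(LOG_LINES)
deltas = offset_deltas(samples)
# Implied local clock rate relative to the master, for one-second spacing.
rates = [1 + delta / 1e9 for delta in deltas]

print(deltas)  # [-484533099, -484527860]
print([round(rate, 3) for rate in rates])
```

The offset falls by roughly 0.48 seconds every second, so with the servo's frequency setting held constant the PHC advances only about 0.515 seconds per master second, which is in line with a clock running at roughly half speed.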


The controller is:

01:00.7 Ethernet controller: Broadcom Corporation NetXtreme II BCM57810 10 Gigabit Ethernet Multi Function (rev 10)
Subsystem: Dell Device 0636


Tried two kernels and drivers:

kernel: 3.18.14
driver: bnx2x
version: 1.710.51-0
firmware-version: FFV7.12.15 bc 7.12.4

kernel: 4.6.5
driver: bnx2x
version: 1.712.30-0
firmware-version: FFV7.12.15 bc 7.12.4



And apparent capabilities of the NIC:

Time stamping parameters for em2_1:

Capabilities:
hardware-transmit (SOF_TIMESTAMPING_TX_HARDWARE)
software-transmit (SOF_TIMESTAMPING_TX_SOFTWARE)
hardware-receive (SOF_TIMESTAMPING_RX_HARDWARE)
software-receive (SOF_TIMESTAMPING_RX_SOFTWARE)
software-system-clock (SOF_TIMESTAMPING_SOFTWARE)
hardware-raw-clock (SOF_TIMESTAMPING_RAW_HARDWARE)
PTP Hardware Clock: 1
Hardware Transmit Timestamp Modes:
off (HWTSTAMP_TX_OFF)
on (HWTSTAMP_TX_ON)
Hardware Receive Filter Modes:
none (HWTSTAMP_FILTER_NONE)
ptpv1-l4-event (HWTSTAMP_FILTER_PTP_V1_L4_EVENT)
ptpv1-l4-sync (HWTSTAMP_FILTER_PTP_V1_L4_SYNC)
ptpv1-l4-delay-req (HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ)
ptpv2-l4-event (HWTSTAMP_FILTER_PTP_V2_L4_EVENT)
ptpv2-l4-sync (HWTSTAMP_FILTER_PTP_V2_L4_SYNC)
ptpv2-l4-delay-req (HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ)
ptpv2-l2-event (HWTSTAMP_FILTER_PTP_V2_L2_EVENT)
ptpv2-l2-sync (HWTSTAMP_FILTER_PTP_V2_L2_SYNC)
ptpv2-l2-delay-req (HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ)
ptpv2-event (HWTSTAMP_FILTER_PTP_V2_EVENT)
ptpv2-sync (HWTSTAMP_FILTER_PTP_V2_SYNC)
ptpv2-delay-req (HWTSTAMP_FILTER_PTP_V2_DELAY_REQ)



ptp4l config file (I put in stupidly big step thresholds just to see what happens):

[global]
#
# Default Data Set
#
twoStepFlag 1
slaveOnly 1
priority1 128
priority2 128
domainNumber 0
clockClass 248
clockAccuracy 0xFE
offsetScaledLogVariance 0xFFFF
free_running 0
freq_est_interval 1
#
# Port Data Set
#
logAnnounceInterval 1
logSyncInterval 0
logMinDelayReqInterval 0
logMinPdelayReqInterval 0
announceReceiptTimeout 3
syncReceiptTimeout 0
delayAsymmetry 0
fault_reset_interval 4
neighborPropDelayThresh 20000000
#
# Run time options
#
assume_two_step 0
logging_level 6
path_trace_enabled 0
follow_up_info 0
hybrid_e2e 0
tx_timestamp_timeout 1
use_syslog 1
verbose 0
summary_interval 0
kernel_leap 1
check_fup_sync 0
#
# Servo Options
#
pi_proportional_const 0.0
pi_integral_const 0.0
pi_proportional_scale 0.0
pi_proportional_exponent -0.3
pi_proportional_norm_max 0.7
pi_integral_scale 0.0
pi_integral_exponent 0.4
pi_integral_norm_max 0.3
step_threshold 1.0
first_step_threshold 1.0
max_frequency 900000000
clock_servo pi
sanity_freq_limit 1000000000
ntpshm_segment 0
#
# Transport options
#
transportSpecific 0x0
ptp_dst_mac 01:1B:19:00:00:00
p2p_dst_mac 01:80:C2:00:00:0E
udp_ttl 8
udp6_scope 0x0E
uds_address /var/run/ptp4l
#
# Default interface options
#
network_transport UDPv4
delay_mechanism E2E
time_stamping hardware
tsproc_mode filter
delay_filter moving_median
delay_filter_length 10
egressLatency 0
ingressLatency 0
boundary_clock_jbod 0
#
# Clock description
#
productDescription ;;
revisionData ;;
manufacturerIdentity 00:00:00
userDescription ;
timeSource 0xA0
--
Luke Bigum
Senior Systems Engineer

Information Systems
---

LMAX Exchange, Yellow Building, 1A Nicholas Road, London W11 4AN
http://www.LMAX.com/

Recognised by the most prestigious business and technology awards

2016 Best Trading & Execution, HFM US Technology Awards
2016, 2015, 2014, 2013 Best FX Trading Venue - ECN/MTF, WSL Institutional Trading Awards

2015 Winner, Deloitte UK Technology Fast 50
2015, 2014, 2013, One of the UK's fastest growing technology firms, The Sunday Times Tech Track 100
2015 Winner, Deloitte EMEA Technology Fast 500
2015, 2014, 2013 Best Margin Sector Platform, Profit & Loss Readers' Choice Awards

---

FX and CFDs are leveraged products that can result in losses exceeding your deposit. They are not suitable for everyone so please ensure you fully understand the risks involved.

This message and its attachments are confidential, may not be disclosed or used by any person other than the addressee and are intended only for the named recipient(s). This message is not intended for any recipient(s) who based on their nationality, place of business, domicile or for any other reason, is/are subject to local laws or regulations which prohibit the provision of such products and services. This message is subject to the following terms (http://lmax.com/pdf/general-disclaimers.pdf), if you cannot access these, please notify us by replying to this email and we will send you the terms. If you are not the intended recipient, please notify the sender immediately and delete any copies of this message.

LMAX Exchange is the trading name of LMAX Limited. LMAX Limited operates a multilateral trading facility. LMAX Limited is authorised and regulated by the Financial Conduct Authority (firm registration number 509778) and is a company registered in England and Wales (number 6505809).

LMAX Hong Kong Limited is a wholly-owned subsidiary of LMAX Limited. LMAX Hong Kong is licensed by the Securities and Futures Commission in Hong Kong to conduct Type 3 (leveraged foreign exchange trading) regulated activity with CE Number BDV088.
Richard Cochran
2016-10-04 13:28:39 UTC
Post by Luke Bigum
If I do some simple tests with linux/Documentation/ptp/testptp.c, it
appears the PHC is spinning at about half the Hz it should be doing
(it takes roughly 2 real seconds for the PHC to advance 1
second). While it is an anecdotal observation, half as fast seems
rather uniform a problem, like there's a /2 somewhere gone wrong in
the driver code... If I wanted to dig into the bnx2x source, any
pointers on where to start looking?

I don't have this card nor was I involved in reviewing the driver.
The file, drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c, contains
the PHC stuff.

Looking at bnx2x_init_cyclecounter(), they have mult = shift = 1.
That implies that

nanoseconds = ticks / 2

and that their internal clock runs at 2 GHz!

Maybe they meant to have shift=0, or maybe your card has a
(model-specific) 1 GHz clock.
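
To make the arithmetic concrete, here is a simplified model of the cyclecounter conversion, in which nanoseconds are computed as (cycles * mult) >> shift. It is a sketch, not the driver's code: it ignores the fractional-remainder handling that newer kernels add, and the 1 GHz tick rate is an assumption used only for illustration.

```python
def cyc2ns(cycles: int, mult: int, shift: int) -> int:
    """Simplified cyclecounter conversion: ns = (cycles * mult) >> shift."""
    return (cycles * mult) >> shift


# Ticks counted in one real second by a hypothetical 1 GHz hardware clock.
TICKS_PER_SECOND = 1_000_000_000

# mult = shift = 1, as described for bnx2x_init_cyclecounter(): half speed.
print(cyc2ns(TICKS_PER_SECOND, mult=1, shift=1))  # 500000000

# mult = 1, shift = 0: one nanosecond per tick.
print(cyc2ns(TICKS_PER_SECOND, mult=1, shift=0))  # 1000000000
```

If the hardware really ticks at 1 GHz, mult = shift = 1 would make the PHC advance half a second per real second, which matches the testptp observation.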

Anyhow, the testptp result is clear enough. You should take this up
with the driver maintainers.

Thanks,
Richard
Luke Bigum
2016-10-04 14:18:05 UTC
Thanks Richard, that's helpful. I'll track it down.
Luke Bigum
2016-10-11 09:13:28 UTC
For the Internet record, QLogic has fixed this in version 7.14.05 of their 10Gig driver pack. I've compiled this bnx2x driver against a 3.18 kernel and the PHC now advances at the correct speed. The source and some binaries are available here:

http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/SearchByProduct.aspx?ProductCategory=322&Product=1249&Os=65

I don't know when the fix will be added to the mainline kernel.



-Luke
