Discussion:
[etherlab-dev] [PATCH] Default branch patchset
Gavin Lambert
2016-05-03 06:35:41 UTC
Permalink
Hi all,

So I've finally managed to get some time to try bringing my patchset (which
still includes several patches by others) over to the default branch
(specifically 42b62867574d). Since that already incorporates some of them
I've been able to drop a few. :) Just for reference I'll quickly cover
those first:

frank-04-string-download.patch: in changeset a26dee45c467
frank-07-sdo-up-download.patch: in changeset 1aee02c1e294
frank-08-mrproper.patch: in changeset ae24ede76e16
frank-16-frame-corruption.patch: in changeset a380cce7d6f0
frank-29-init-restart.patch: in changeset ecef88726fc3
gavinl-0001-deactivate_unmap.patch: in changeset f99e5b11806c
gavinl-0002-dc_refclk_not_op.patch: in changeset 559f2f9c5b08
gavinl-0003-refclk_nxio.patch: in changeset 3affe9cd0b66
gavinl-0004-abort_slave_config_reg_requests.patch: in changeset
f2bc4000e47a
gavinl-0005-abort_detached_requests.patch: in changeset 0e4d098db815
gavinl-1002-foe_read_debug.patch: in changeset 764801a0f2aa
knud-0003-Eoe-mac-address-now-derived-from-unique-mac.patch: in
changeset e25af8bd3957
knud-0014-Maximum-length-of-foe-filename-extended-to-255.patch: in
changeset d123727b805b

knud-0015-Internal-SDO-requests-now-synchronized-with-external.patch: in
changeset a2701af27fde

Thanks to Dave Page and Florian Pose for getting the patches in, and of
course Frank Heckenbach and Knud Baastrup for the original patches.

There's a couple more that I used to have but I dropped because they didn't
apply cleanly on default, I wasn't sure of the best way to fix them, and in
my personal use case I didn't think I needed them (though I could be wrong);
they include:

knud-0001-Use-call-back-functions.patch
knud-0002-Input-output-counter-not-incremented-for-lrw.patch

I've replaced knud-0004-Semaphores-replaced-with-mutexes.patch (more on that
later). And I also dropped gavinl-1006-sdo_dict_fast_arrays.patch because
(as mentioned previously) the performance saving isn't really relevant with
knud-0010-Sdo-directory-now-only-fetched-on-request.patch applied.

Looking through the other changes in default, I did have some
concerns/questions about one particular changeset:
- 1a969896d52e "Added --enable-loop-control to make use of the loop
control registers."
In fsm_master.c: ec_fsm_master_state_open_port, it looks like the
state machine will become wedged if the received datagram WC isn't 1.
Shouldn't it either restart or move on to the next slave in this case?


Finally, that brings us to the actual patches included in here. They've all
been defuzzed (and in some cases edited slightly) for default, so they'll be
a little different from my previous stable-1.5 patchset. I've verified that
everything compiles after each individual patch, and that the corpus at
least basically still behaves itself (but I haven't given them a full soak
yet, so tread cautiously).

For the most part they can be applied in any order, though there are some
dependencies and I've renumbered them according to the intended minimum-fuzz
application order (so the names have changed slightly from the previous
release) -- and a series file is included so it can just be dropped in as an
HG-MQ or quilt patch queue. They include commit messages at the top of the
patch with more detail (and I've described most of them in prior posts), so
I'll just summarise here:

- 0001-graemef-dc_fixes_and_helpers.patch:
New patch; from Graeme Foot's email
http://lists.etherlab.org/pipermail/etherlab-users/2016/002916.html. Adds
ecrt_master_setup_domain_memory and ecrt_master_deactivate_slaves; fixes up
application-selected reference clocks.

- 0002-junyuan-dc_sync_issues.patch:
New patch; from the same email as above. "This sorts out some timing
issues to do with slave dc syncing."

- 0003-frank-index-reuse.patch:
New patch (was frank-14-index-reuse.patch); do not reuse the index of a
pending datagram.

- 0004-gavinl-Semaphores-replaced-with-mutexes.patch:
New patch -- replaces knud-0004-Semaphores-replaced-with-mutexes.patch;
this is the same basic idea (allow using rt-mutexes) but instead of being
replaced unconditionally this introduces a thin abstraction so that you can
select regular semaphores or rt-mutexes at configure time. The default is
semaphores, use --enable-rtmutex to get rt-mutexes. (I'm hoping that making
them optional will make this easier to merge to mainline.)

- Knud patches (0005-0013, 0016, 0017):
Previous patches; these improve mailbox handling and SII scan
performance. They're mostly the same as before except that I've updated the
locking for the above and added changes for the new EoE FSM; I don't use EoE
myself so I haven't tested these, but it should be consistent with the rest
of the mailbox FSMs at least.

- 0101-gavinl-mbox-finish-sm.patch:
Previous patch; this fixes the exit condition of the mailbox FSMs --
previously they returned whether they were sending a datagram or not, and
the parent assumed that this meant they were done if they didn't want to
send a datagram. Since the Knud patches (among other things) can cause idle
cycles now, this could cause unexpected early exit, so they now return
whether they're complete or not explicitly.

- 0102-gavinl-sdo_logging_verbosity.patch
New patch; this makes changeset a2701af27fde "Internal SDO requests now
synchronized with external requests." a bit less noisy.

- 0103-gavinl-clear_on_deactivate.patch:
New patch; it clears the master configuration when
ecrt_master_deactivate() is called even if ecrt_master_activate() has not
been called prior (though it still logs a warning). In particular, this
allows creating slave configs and request objects and then discarding them.
(I wanted this so that I could use the sdo_request API to perform some
requests asynchronously via the idle thread prior to activating the master.)

- 0104-gavinl-quick_op_watchdog.patch:
Previous patch; allows a slave in SAFEOP+ERR with Sync Watchdog Timeout
state (0x001B) to transition straight back to OP (do not pass through PREOP,
do not reconfigure). This lets a device recover faster from a comms
interruption that doesn't cause a network structure change.

- 0105-gavinl-more_state.patch:
Previous patch; adds some extra fields to the ecrt.h interface so the
application can know a bit more about what's going on.

- 0106-gavinl-topology_bypass.patch:
Previous patch; when doing delay measurement, detects ports that are
completely bypassed (eg. by a slave-based redundant ring topology), allowing
them to be ignored rather than using invalid timestamps.

- 0107-gavinl-topology_upstream.patch:
Previous patch; indicates the most-likely upstream port for each device.
Normally this should always be 0, but it can sometimes be 1-3 if a device
has been connected backwards accidentally or as a result of active
redundancy. Useful for diagnostics.

- 0108-gavinl-redundant_device.patch:
Previous patch; fixes some of the log messages so that they're
unambiguous when using master-side redundancy.

- 0109-gavinl-e1000e_watchdog.patch:
Previous patch; fixes the e1000e driver watchdog (particularly helps for
master-side redundancy, but also kicks it off the realtime thread).

- 0110-gavinl-reboot.patch:
Previous patch; adds feature to perform software reboot of slaves that
support this (via register 0x0040).

- 0111-gavinl-reg_readwrite.patch:
Previous patch; adds feature to perform register read+write requests;
useful for reading and clearing error counter registers without races.

- 0112-gavinl-reg_req_info.patch:
Previous patch; adds some additional level 1 logging for register
requests.

- 0113-gavinl-sdo_complete_upload.patch:
Previous patch; adds feature for SDO Complete Upload requests (sync
only); handy for reading large arrays/records from a slave more efficiently.

- 0114-gavinl-sdo_write_size.patch:
Previous patch; SDO write requests now have to explicitly specify the
desired write size, instead of inheriting it from the most recent read or
the initial capacity when created. [Breaking API change]

- 0115-gavinl-sdo_async_complete.patch:
Previous patch; extends 0113 to support async Complete Access as well.
[Breaking API change]

- 0116-gavinl-dc-sync-vs-sys-time.patch:
Previous patch; when resyncing the time of a DC slave in SAFEOP or OP
(during a network rescan as a result of the number of slaves changing),
deactivate sync generation before and reactivate it afterwards. This is
probably only safe for some slaves (it will cause several cycles with no DC
activations) but without either this, doing a full PREOP reconfigure, or
using a different method to sync the time, slaves would sometimes completely
stop DC sync generation forever when resynced (specifically, if the time is
adjusted to a point after the expected next activation time, and the slave
is using 64-bit DC; for 32-bit DC it will cause an up to 4 second halt
instead of halting forever). (Another way to avoid this is to disable scans
once in OP, but that makes it harder to find actual network changes.)

As always, I'm open to suggestions for improvements to any of the patches
(eg. avoiding API breakage, or any other suggestion or query or additional
patch, or reordering, splitting, or combining them or providing them in a
different format if that makes it easier to merge).

Also note that as before several of the gavinl patches alter the IOCTL
interface and so if applied the EC_IOCTL_VERSION_MAGIC value should be
incremented; I haven't included patches for this to avoid bumping it up too
many times if only a subset of patches are applied.

I'm preparing to make a fairly major change soon. :)
Knud Baastrup
2016-05-03 09:42:59 UTC
Permalink
Thanks, it is great to see the patches already merged into the default branch and we really hope that the remaining patches can be included as well (maybe this year?)

Gavin, concerning the two patches not included in your patch serie:

knud-0001-Use-call-back-functions.patch:
I have re-applied this still important patch on top of your patches and attach the patch.

knud-0002-Input-output-counter-not-incremented-for-lrw.patch:
I can see that this code was completely refactored by David Page in commit f0fdcc on default branch. The patch solve an issue where the number of working counter changes were calculated wrongly if domain data were spit in 2 datagrams where the last datagram only included output data and therefore used a LWR command instead of a LRW command. We will need to test if the refactored code solve the issue and eliminates the need for this patch.

Thanks for introducing the config option for rt- mutexes. I agree with you that this will make it more likely to me included on the default branch.


BR, Knud


-----Original Message-----
From: Gavin Lambert [mailto:***@compacsort.com]
Sent: 3. maj 2016 08:36
To: Florian Pose; 'David Page'; Knud Baastrup
Cc: etherlab-***@etherlab.org
Subject: [PATCH] Default branch patchset

Hi all,

So I've finally managed to get some time to try bringing my patchset (which still includes several patches by others) over to the default branch (specifically 42b62867574d). Since that already incorporates some of them I've been able to drop a few. :) Just for reference I'll quickly cover those first:

frank-04-string-download.patch: in changeset a26dee45c467
frank-07-sdo-up-download.patch: in changeset 1aee02c1e294
frank-08-mrproper.patch: in changeset ae24ede76e16
frank-16-frame-corruption.patch: in changeset a380cce7d6f0
frank-29-init-restart.patch: in changeset ecef88726fc3
gavinl-0001-deactivate_unmap.patch: in changeset f99e5b11806c
gavinl-0002-dc_refclk_not_op.patch: in changeset 559f2f9c5b08
gavinl-0003-refclk_nxio.patch: in changeset 3affe9cd0b66
gavinl-0004-abort_slave_config_reg_requests.patch: in changeset f2bc4000e47a
gavinl-0005-abort_detached_requests.patch: in changeset 0e4d098db815
gavinl-1002-foe_read_debug.patch: in changeset 764801a0f2aa
knud-0003-Eoe-mac-address-now-derived-from-unique-mac.patch: in changeset e25af8bd3957
knud-0014-Maximum-length-of-foe-filename-extended-to-255.patch: in changeset d123727b805b

knud-0015-Internal-SDO-requests-now-synchronized-with-external.patch: in changeset a2701af27fde

Thanks to Dave Page and Florian Pose for getting the patches in, and of course Frank Heckenbach and Knud Baastrup for the original patches.

There's a couple more that I used to have but I dropped because they didn't apply cleanly on default, I wasn't sure of the best way to fix them, and in my personal use case I didn't think I needed them (though I could be wrong); they include:

knud-0001-Use-call-back-functions.patch
knud-0002-Input-output-counter-not-incremented-for-lrw.patch

I've replaced knud-0004-Semaphores-replaced-with-mutexes.patch (more on that later). And I also dropped gavinl-1006-sdo_dict_fast_arrays.patch because (as mentioned previously) the performance saving isn't really relevant with knud-0010-Sdo-directory-now-only-fetched-on-request.patch applied.

Looking through the other changes in default, I did have some concerns/questions about one particular changeset:
- 1a969896d52e "Added --enable-loop-control to make use of the loop control registers."
In fsm_master.c: ec_fsm_master_state_open_port, it looks like the state machine will become wedged if the received datagram WC isn't 1.
Shouldn't it either restart or move on to the next slave in this case?


Finally, that brings us to the actual patches included in here. They've all been defuzzed (and in some cases edited slightly) for default, so they'll be a little different from my previous stable-1.5 patchset. I've verified that everything compiles after each individual patch, and that the corpus at least basically still behaves itself (but I haven't given them a full soak yet, so tread cautiously).

For the most part they can be applied in any order, though there are some dependencies and I've renumbered them according to the intended minimum-fuzz application order (so the names have changed slightly from the previous
release) -- and a series file is included so it can just be dropped in as an HG-MQ or quilt patch queue. They include commit messages at the top of the patch with more detail (and I've described most of them in prior posts), so I'll just summarise here:

- 0001-graemef-dc_fixes_and_helpers.patch:
New patch; from Graeme Foot's email
http://lists.etherlab.org/pipermail/etherlab-users/2016/002916.html. Adds ecrt_master_setup_domain_memory and ecrt_master_deactivate_slaves; fixes up application-selected reference clocks.

- 0002-junyuan-dc_sync_issues.patch:
New patch; from the same email as above. "This sorts out some timing issues to do with slave dc syncing."

- 0003-frank-index-reuse.patch:
New patch (was frank-14-index-reuse.patch); do not reuse the index of a pending datagram.

- 0004-gavinl-Semaphores-replaced-with-mutexes.patch:
New patch -- replaces knud-0004-Semaphores-replaced-with-mutexes.patch;
this is the same basic idea (allow using rt-mutexes) but instead of being replaced unconditionally this introduces a thin abstraction so that you can select regular semaphores or rt-mutexes at configure time. The default is semaphores, use --enable-rtmutex to get rt-mutexes. (I'm hoping that making them optional will make this easier to merge to mainline.)

- Knud patches (0005-0013, 0016, 0017):
Previous patches; these improve mailbox handling and SII scan performance. They're mostly the same as before except that I've updated the locking for the above and added changes for the new EoE FSM; I don't use EoE myself so I haven't tested these, but it should be consistent with the rest of the mailbox FSMs at least.

- 0101-gavinl-mbox-finish-sm.patch:
Previous patch; this fixes the exit condition of the mailbox FSMs -- previously they returned whether they were sending a datagram or not, and the parent assumed that this meant they were done if they didn't want to send a datagram. Since the Knud patches (among other things) can cause idle cycles now, this could cause unexpected early exit, so they now return whether they're complete or not explicitly.

- 0102-gavinl-sdo_logging_verbosity.patch
New patch; this makes changeset a2701af27fde "Internal SDO requests now synchronized with external requests." a bit less noisy.

- 0103-gavinl-clear_on_deactivate.patch:
New patch; it clears the master configuration when
ecrt_master_deactivate() is called even if ecrt_master_activate() has not been called prior (though it still logs a warning). In particular, this allows creating slave configs and request objects and then discarding them.
(I wanted this so that I could use the sdo_request API to perform some requests asynchronously via the idle thread prior to activating the master.)

- 0104-gavinl-quick_op_watchdog.patch:
Previous patch; allows a slave in SAFEOP+ERR with Sync Watchdog Timeout state (0x001B) to transition straight back to OP (do not pass through PREOP, do not reconfigure). This lets a device recover faster from a comms interruption that doesn't cause a network structure change.

- 0105-gavinl-more_state.patch:
Previous patch; adds some extra fields to the ecrt.h interface so the application can know a bit more about what's going on.

- 0106-gavinl-topology_bypass.patch:
Previous patch; when doing delay measurement, detects ports that are completely bypassed (eg. by a slave-based redundant ring topology), allowing them to be ignored rather than using invalid timestamps.

- 0107-gavinl-topology_upstream.patch:
Previous patch; indicates the most-likely upstream port for each device.
Normally this should always be 0, but it can sometimes be 1-3 if a device has been connected backwards accidentally or as a result of active redundancy. Useful for diagnostics.

- 0108-gavinl-redundant_device.patch:
Previous patch; fixes some of the log messages so that they're unambiguous when using master-side redundancy.

- 0109-gavinl-e1000e_watchdog.patch:
Previous patch; fixes the e1000e driver watchdog (particularly helps for master-side redundancy, but also kicks it off the realtime thread).

- 0110-gavinl-reboot.patch:
Previous patch; adds feature to perform software reboot of slaves that support this (via register 0x0040).

- 0111-gavinl-reg_readwrite.patch:
Previous patch; adds feature to perform register read+write requests; useful for reading and clearing error counter registers without races.

- 0112-gavinl-reg_req_info.patch:
Previous patch; adds some additional level 1 logging for register requests.

- 0113-gavinl-sdo_complete_upload.patch:
Previous patch; adds feature for SDO Complete Upload requests (sync only); handy for reading large arrays/records from a slave more efficiently.

- 0114-gavinl-sdo_write_size.patch:
Previous patch; SDO write requests now have to explicitly specify the desired write size, instead of inheriting it from the most recent read or the initial capacity when created. [Breaking API change]

- 0115-gavinl-sdo_async_complete.patch:
Previous patch; extends 0113 to support async Complete Access as well.
[Breaking API change]

- 0116-gavinl-dc-sync-vs-sys-time.patch:
Previous patch; when resyncing the time of a DC slave in SAFEOP or OP (during a network rescan as a result of the number of slaves changing), deactivate sync generation before and reactivate it afterwards. This is probably only safe for some slaves (it will cause several cycles with no DC
activations) but without either this, doing a full PREOP reconfigure, or using a different method to sync the time, slaves would sometimes completely stop DC sync generation forever when resynced (specifically, if the time is adjusted to a point after the expected next activation time, and the slave is using 64-bit DC; for 32-bit DC it will cause an up to 4 second halt instead of halting forever). (Another way to avoid this is to disable scans once in OP, but that makes it harder to find actual network changes.)

As always, I'm open to suggestions for improvements to any of the patches (eg. avoiding API breakage, or any other suggestion or query or additional patch, or reordering, splitting, or combining them or providing them in a different format if that makes it easier to merge).

Also note that as before several of the gavinl patches alter the IOCTL interface and so if applied the EC_IOCTL_VERSION_MAGIC value should be incremented; I haven't included patches for this to avoid bumping it up too many times if only a subset of patches are applied.

I'm preparing to make a fairly major change soon. :)
Graeme Foot
2016-05-04 00:41:57 UTC
Permalink
Hi,

Jesper originally needed the patch due to his module requiring a 3kb SII (which he generated from the TwinCAT xml), but the module only had 2kb of memory. The patch loads the first 32 bytes of the xml:
- alias
- vendor_id
- product_code
- revision
- serial number

from this it tries to find first a revision based file, then a generic product file
- 1st: ethercat/ec_%08x_%08x_%08x.bin (vendor_id, product_code, revision)
- 2nd: ethercat/ec_%08x_%08x. bin (vendor_id, product_code)

The alias is read from SII, so that is fine. Revision specific SII's is also covered. If a file can't be found then the remaining SII information is read.


I needed this patch due to our early Yaskawa amps having a faulty SII. The amp also did not allow new SII's to be uploaded. Downloading the SII from a newer amp and reading it from file for the older amp worked well.


So the cases where it is needed is if the SII memory is not big enough, or the SII memory is faulty and the module does not allow new SII information to be uploaded.


I have attached the patch.


Regards,
Graeme.


-----Original Message-----
From: Gavin Lambert [mailto:***@compacsort.com]
Sent: Wednesday, 4 May 2016 11:13 a.m.
To: Graeme Foot
Subject: RE: [etherlab-dev] [PATCH] Default branch patchset
Are you interested in a patch that can read the SII information for a
slave from
disk (if the slave has SII problems)?
It can look up the SII "firmware" using the hotplug/udev framework
(original
patch from Jesper Smith) or directly from a given directory via a
couple
of
defines (my changes).
It's an interesting idea, certainly. It's probably not something I'd make use of myself (all the slaves I'm using have good SII data) but I know there are slaves out there where the vendor has forgotten to program the EEPROM fully.

The naming convention bothers me, though; I would expect the SII data needs to differ between different revisions of the slave as well (it certainly does for my slaves).

Another problem is that without real SII data on the slaves themselves, they can't hold unique aliases, which is something else that I consider mandatory on my networks, at least. So I'm inclined to think that the "correct"
reaction is to reprogram such slaves with their correct SII rather than trying to patch it at runtime.

Although at least part of the SII must be working, otherwise you wouldn't be able to get the vendor/product ids (unless you're using an ENI file like TwinCAT, but that's a massive departure from the Etherlab model).

I'm not opposed to including it anyway (if for no other reason than keeping handy patches in one place makes them less likely to get lost in the mists of time) but I'm curious what your motivating cases were?
Loading...