Discussion:
[etherlab-dev] ec_lock_* vs. ec_ioctl_lock in master/ioctl.c
Esben Haabendal
2018-02-27 15:39:10 UTC
Hi

I have been fixing a number of locking-related issues, and have a hard
time figuring out how to handle locking in general, and in
master/ioctl.c in particular.

As of patch
base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch
there is now the macro pair of
ec_ioctl_lock_down_interruptible() and ec_ioctl_lock_up(), which maps to
ec_lock_down_interruptible() and ec_lock_up() for the non-RTDM use case.
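
(For readers outside the patchset: the pair presumably maps along these
lines; a sketch reconstructed from the description above, not the
literal patch text.)

#ifdef EC_IOCTL_RTDM
/* RTDM: a hard-RT thread must not take Linux semaphores, so the ioctl
 * fast path compiles the locking out entirely. */
#define ec_ioctl_lock_down_interruptible(sem) (0)
#define ec_ioctl_lock_up(sem) do {} while (0)
#else
/* Non-RTDM: map to the normal master locking primitives. */
#define ec_ioctl_lock_down_interruptible(sem) ec_lock_down_interruptible(sem)
#define ec_ioctl_lock_up(sem) ec_lock_up(sem)
#endif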

But looking at the master/ioctl.c file as of patchset version 20171108,
there are the following number of calls:

8 ec_lock_down()
52 ec_lock_down_interruptible()
129 ec_lock_up()
6 ec_ioctl_lock_up()
4 ec_ioctl_lock_down_interruptible()

When should the ec_lock_* functions be called directly, and when should
they be wrapped (and thus compiled out for RTDM)?

And how is this supposed to work for RTDM in the first place?
I mean, there is code outside of ioctl.c, called from ioctl.c, which
takes locks. Isn't this kind of defeating the purpose of this idea?

Also, as far as I can see, EC_IOCTL_RTDM is only defined when compiling
rtdm-ioctl.c. When and how is master/ioctl.c supposed to be compiled
with EC_IOCTL_RTDM defined?

Best regards,
Esben Haabendal
Graeme Foot
2018-02-27 22:38:33 UTC
Hi,

The ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() calls were added to protect the following functions when multiple send/receive loops are running:
- ec_master_send()
- ec_master_receive()
- ec_master_domain_process()
- ec_master_domain_queue()

In my opinion any locking on these functions should be at the application level instead. However, if they are being called from multiple processes (rather than threads) then you need to use something like a named semaphore so that all processes share the same lock. Of course if you are using callbacks (for EoE) you are probably doing that anyway.
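(A minimal user-space sketch of the named-semaphore idea; the semaphore
name and the domain handling are made up for illustration, and error
handling is omitted:)

#include <fcntl.h>      /* O_CREAT */
#include <semaphore.h>
#include <ecrt.h>

static sem_t *master_lock;

/* Every process opens the same named semaphore, so they all share one lock. */
void init_master_lock(void)
{
    master_lock = sem_open("/ethercat-master0", O_CREAT, 0644, 1);
    /* SEM_FAILED check omitted for brevity */
}

void cyclic_task(ec_master_t *master, ec_domain_t *domain)
{
    sem_wait(master_lock);
    ecrt_master_receive(master);
    ecrt_domain_process(domain);
    /* ... exchange process data ... */
    ecrt_domain_queue(domain);
    ecrt_master_send(master);
    sem_post(master_lock);
}
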

The reason that they have been named differently is that they are the functions that are called from the realtime send/receive loop, and the define allows them to be ignored. RTAI (and Xenomai?) are NOT allowed to call standard Linux blocking functions from a hard realtime thread (to maintain hard realtime). The EC_IOCTL_RTDM define turns off these locks for (RTAI / Xenomai) RTDM applications to avoid this problem. They should probably also be turned off for RTAI / Xenomai in general and, as I said above, application level locking used instead.

You can pass --enable-rtdm when compiling to enable RTDM (and compile rtdm-ioctl.o).

So in summary:
ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() are for the hard realtime loop function calls, otherwise use the standard lock calls.


Regards,
Graeme Foot


Esben Haabendal
2018-02-28 07:23:07 UTC
Post by Graeme Foot
The ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() calls were added
to protect the following functions when multiple send/receive loops are running:
- ec_master_send()
- ec_master_receive()
- ec_master_domain_process()
- ec_master_domain_queue()
In my opinion any locking on these functions should be at the application
level instead.
Well, I have a different opinion on that.

Getting locking right is inherently difficult. You have to guard
against race conditions while avoiding deadlocks. But you also do not
want to block for too long.

While I acknowledge that in the trivial case, where there is only a
single application, it is not a big problem for that application to
maintain synchronization itself, I don't think it is a good solution to
let each application developer in the more complicated situations (like
multiple independent processes) do this without any help from
etherlabmaster. Forcing all (or at least some) application developers
to solve the same problem again and again should not be the best we can
do.
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore so that
all processes share the same lock. Of course if you are using
callbacks (for EoE) you are probably doing that anyway.
You easily have multiple processes. Even with just a single
application, you have EtherCAT-OP, EtherCAT-EoE and the application,
all using some of the same shared data structures. Throwing in more
applications just adds to that, but I think it is critical enough with
just a single application.
Post by Graeme Foot
The reason that they have been named differently is that they are the
functions that are called from the realtime send/receive loop and the define
allows them to be ignored. RTAI (and Xenomai?) are NOT allowed to call
standard Linux blocking functions from a hard realtime thread (to maintain
hard realtime).
Yes, I get that.
Post by Graeme Foot
The EC_IOCTL_RTDM define turns off these locks for (RTAI / Xenomai)
RTDM applications to avoid this problem.
Yes, EC_IOCTL_RTDM turns off these locks. But it does not really do
anything, as it is never defined.
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in general
and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define EC_RTDM macro and
enable the automake conditional ENABLE_RTDM.

This will trigger Kbuild to build rtdm-ioctl.o (from rtdm-ioctl.c), and
this will be done with EC_IOCTL_RTDM macro defined.

But ioctl.c will be compiled as always, and that is without
EC_IOCTL_RTDM defined. Was it supposed to be defined? If so, it should
be easy to fix, but someone should definitely do some real testing of
it.
Post by Graeme Foot
ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() are for the hard
realtime loop function calls, otherwise use the standard lock calls.
And these are the closed set of

- ec_ioctl_send()
- ec_ioctl_receive()
- ec_ioctl_domain_process()
- ec_ioctl_domain_queue()

?

/Esben
Graeme Foot
2018-03-01 00:02:10 UTC
Post by Esben Haabendal
Post by Graeme Foot
The ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() calls
were added to protect the following functions when multiple
send/receive loops are running:
- ecrt_master_send()
- ecrt_master_receive()
- ecrt_master_domain_process()
- ecrt_master_domain_queue()
In my opinion any locking on these functions should be at the
application level instead.
Well, I have a different opinion on that.
Getting locking right is inherently difficult. You have to guard against race conditions while avoiding deadlocks.
But you also do not want to block for too long.
While I acknowledge that in the trivial case, where there is only a single application, it is not a big problem for that application to maintain synchronization itself, I don't think it is a good solution to let each application developer in the more complicated situations (like multiple independent processes) do this without any help from etherlabmaster. Forcing all (or at least some) application developers to solve the same problem again and again should not be the best we can do.
These 4 functions are special. The master should be written (and mostly seems to be) in a fashion that between an ecrt_master_activate() and ecrt_master_deactivate() the above calls do not require any locks to synchronize with the master code, except for the EOE thread which uses callbacks. And the reason it uses callbacks is because only your application knows if it is appropriate to allow the EOE thread to call ecrt_master_receive() and ecrt_master_send_ext() etc at any particular time.

However, if your application has multiple send/receive loops then they need to be synchronized with each other (see next comment). There are a few more functions such as the distributed clock calls that are also in the send/receive loops. They also do not require ethercat level locks as they should be safe to call between the activate and deactivate. They also do not need application level locking as they should really only be called by one send/receive loop per master. All other ethercat function calls should not be in a hard realtime context.
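(For reference, one such send/receive loop typically has this shape; a
sketch using the standard ecrt API, with app_time_ns supplied by the
application's own clock source:)

#include <stdint.h>
#include <ecrt.h>

void cycle(ec_master_t *master, ec_domain_t *domain, uint64_t app_time_ns)
{
    ecrt_master_receive(master);
    ecrt_domain_process(domain);

    /* ... read inputs, run the control logic, write outputs ... */

    ecrt_domain_queue(domain);

    /* The distributed clock calls live in the same loop and, like the
     * calls above, are safe without EtherCAT-level locks as long as
     * only one loop per master issues them. */
    ecrt_master_application_time(master, app_time_ns);
    ecrt_master_sync_reference_clock(master);
    ecrt_master_sync_slave_clocks(master);

    ecrt_master_send(master);
}
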
Post by Graeme Foot
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore so that
all processes share the same lock. Of course if you are using
callbacks (for EoE) you are probably doing that anyway.
You easily have multiple processes. Even with just a single application, you have EtherCAT-OP, EtherCAT-EoE and the application, all using some of the same shared data structures.
Throwing in more applications just adds to that, but I think it is critical enough with just a single application.
Even if you have multiple processes (rather than threads), it is your design decision as to which process takes priority and whether a particular send/receive loop should wait a bit even if it could get the lock now. You can only do that if your application controls the locking of these functions.
Post by Graeme Foot
Post by Graeme Foot
The reason that they have been named differently is that they are the
functions that are called from the realtime send/receive loop and the
define allows them to be ignored. RTAI (and Xenomai?) are NOT allowed
to call standard Linux blocking functions from a hard realtime thread
(to maintain hard realtime).
Yes, I get that.
Post by Graeme Foot
The EC_IOCTL_RTDM define turns off these locks for (RTAI / Xenomai)
RTDM applications to avoid this problem.
Yes, EC_IOCTL_RTDM turns off these locks. But it does not really do anything, as it is never defined.
EC_IOCTL_RTDM gets defined for "rtdm-ioctl.c" if --enable-rtdm is used (as you say below). I don't know if you noticed, but "rtdm-ioctl.c" is just a link to "ioctl.c". So the rtdm version gets it, but the standard version does not.
Post by Graeme Foot
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in general
and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define EC_RTDM macro and enable the automake conditional ENABLE_RTDM.
This will trigger Kbuild to build rtdm-ioctl.o (from rtdm-ioctl.c), and this will be done with EC_IOCTL_RTDM macro defined.
But ioctl.c will be compiled as always, and that is without EC_IOCTL_RTDM defined. Was it supposed to be defined? If so, it should be easy to fix, but someone should definitely do some real testing of it.
I think the locks should be disabled in both "rtdm-ioctl.c" and "ioctl.c" if using RTAI, but I use RTDM so haven't confirmed this. Further to that I don't think they should be there at all. Simple applications have one send/receive loop so don't need locks. Applications with multiple send/receive loops or EOE need to control their own locking for optimal results anyway.

Also, because these lock functions use master_sem, the send/receive functions block unnecessarily against non-realtime EtherCAT function calls. At a minimum they should be locking on their own semaphore.
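(Concretely, the four fast-path ioctls would then look roughly like
this; a sketch only, assuming a dedicated queue semaphore such as the
existing io_sem, and the macro pair described earlier in the thread:)

/* ioctl handler for ecrt_master_send(): take only the queue semaphore,
 * not master_sem, so configuration ioctls cannot stall the RT path. */
if (ec_ioctl_lock_down_interruptible(&master->io_sem))
    return -EINTR;
ecrt_master_send(master);
ec_ioctl_lock_up(&master->io_sem);
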
Post by Graeme Foot
Post by Graeme Foot
ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() are for the
hard realtime loop function calls, otherwise use the standard lock calls.
And these are the closed set of
- ec_ioctl_send()
- ec_ioctl_receive()
- ec_ioctl_domain_process()
- ec_ioctl_domain_queue()
?
Yes, they are only for these 4 calls, as these are the ones used in the hard realtime send/receive loops, so they need to be disabled for RTAI.
Post by Graeme Foot
/Esben
Regards,
Graeme.
Esben Haabendal
2018-03-02 09:27:03 UTC
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
The ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() calls
were added to protect the following functions when multiple
send/receive loops are running:
- ecrt_master_send()
- ecrt_master_receive()
- ecrt_master_domain_process()
- ecrt_master_domain_queue()
In my opinion any locking on these functions should be at the
application level instead.
Well, I have a different opinion on that.
Getting locking right is inherently difficult. You have to guard
against race conditions while avoiding deadlocks.
But you also do not want to block for too long.
While I acknowledge that in the trivial case, where there is only a
single application, it is not a big problem for that application to
maintain synchronization itself, I don't think it is a good solution
to let each application developer in the more complicated situations
(like multiple independent processes) do this without any help from
etherlabmaster. Forcing all (or at least some) application developers
to solve the same problem again and again should not be the best we
can do.
These 4 functions are special. The master should be written (and
mostly seems to be) in a fashion that between an ecrt_master_activate()
and ecrt_master_deactivate() the above calls do not require any locks
to synchronize with the master code, except for the EOE thread which
uses callbacks.
Well, mostly is the same as not in my world. If they mostly do not
require any locks, they do require locks.

I guess some of the confusion I have is caused by the fact that it is
unclear exactly which functions are allowed to be called between
ecrt_master_activate() and ecrt_master_deactivate(), and which functions
are not. Do we have such a list? Or how can I create such a list
exactly? And would it be possible to enforce this directly in the API,
so if you call a function you are not allowed to call, you get an error
back instead of introducing random breakage by modifying unsynchronized
shared data structures?

As for the EOE thread, I might be overlooking something obvious. But
how are you supposed to use callbacks when using the library API?

As far as I read the code, if you are using EoE and not RTDM, you will
always use the standard ec_master_internal_send_cb() and
ec_master_internal_receive_cb() callbacks. See ec_ioctl_activate().

And with them, I don't see how you are supposed to do locking at the
application level. I will not be able to specify application level
locks that are used by both application and eoe, unless they are
implemented directly in the master (kernel).

And with ec_master_eoe_thread() running inside master, you really need
to do locking to synchronize access to master->datagram_queue (i.e. the
io_sem lock). If not, you will have race conditions between eoe and 3
of the 4 functions:

ecrt_master_send()
ecrt_master_receive()
ecrt_domain_queue()

So we really do need to do locking in at least these 3 when using EOE.

Or maybe the whole master->datagram_queue should be refactored to something
that can be synchronized safely in a lock-free way?
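(One way to sketch that: stage datagrams on a lock-free llist and let
the single sender drain it. The field names llnode and pending_queue
are hypothetical; the real datagram_queue is a plain list:)

#include <linux/llist.h>

/* Producers (application ioctls, EoE) push without taking any lock. */
void ec_master_queue_datagram_lockfree(ec_master_t *master,
        ec_datagram_t *datagram)
{
    llist_add(&datagram->llnode, &master->pending_queue);
}

/* The single send path drains the staging list in one atomic step and
 * owns the resulting batch from then on. */
void ec_master_send_pending(ec_master_t *master)
{
    struct llist_node *batch = llist_del_all(&master->pending_queue);
    /* ... splice 'batch' into the private send queue and transmit ... */
}
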
Post by Graeme Foot
And the reason it uses callbacks is because only your application
knows if it is appropriate to allow the EOE thread to call
ecrt_master_receive() and ecrt_master_send_ext() etc at any particular
time.
I see what you mean. But I just don't see how it is possible to
accomplish this for user space applications.
Post by Graeme Foot
However, if your application has multiple send/receive loops then they need to
be synchronized with each other (see next comment). There are a few more
functions such as the distributed clock calls that are also in the
send/receive loops. They also do not require ethercat level locks as they
should be safe to call between the activate and deactivate. They also do not
need application level locking as they should really only be called by one
send/receive loop per master. All other ethercat function calls should not be
in a hard realtime context.
Ok. But you imply a strict rule for how to design applications doing
EtherCAT communication: "There can only be one send/receive loop per
master".

That might not be (and in my case is not) accepted by all application
developers. I need to continue to support applications that use
multiple send/receive loops for multiple EtherCAT domains, each with a
different cycle time.

One solution could be to create a single send/receive loop per master,
and then let the multiple application loops communicate with that
instead of directly with the master. That might actually be a really
good idea. But, if that is done, I think it should fit very well as a
general feature of the EtherCAT master. It should be possible to
implement without any application specific code.



The final "problem" is as you say a design question. Should etherlabmaster
support synchronization of access from multiple threads/processes, or should
this be left entirely to the application developer? But if
master->datagram_queue is changed to something lock-free, this "problem" will
likely be solved also.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore so that
all processes share the same lock. Of course if you are using
callbacks (for EoE) you are probably doing that anyway.
You easily have multiple processes. Even with just a single
application, you have EtherCAT-OP, EtherCAT-EoE and the application,
all using some of the same shared data structures. Throwing in more
applications just adds to that, but I think it is critical enough with
just a single application.
Even if you have multiple processes (rather than threads), it is your design
decision as to which process takes priority and whether a particular
send/receive loop should wait a bit even if it could get the lock now. You
can only do that if your application controls the locking of these functions.
While that sounds reasonable, I believe it is not entirely true. You
could write a layer responsible for handling this, which gets
information from the application(s) for how to deal with this. Think of
it like a kind of scheduler.

Example from an application I have seen. You have two applications
(processes), each driving its own EtherCAT domain. One
application/domain is running every 2 ms, the other application/domain
every 10 ms. In addition, EoE slaves are also used. For each
application, you could specify the priority, the cycle time and the
start time. EoE should also be given a priority. This EtherCAT
"scheduler" would then be able to decide which EtherCAT communication to
do and when.

So while I agree that it might be most obvious to implement the
decision of when to send/receive directly in the application, I think
it could also be implemented in a more generic way.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
The reason that they have been named differently is that they are the
functions that are called from the realtime send/receive loop and the
define allows them to be ignored. RTAI (and Xenomai?) are NOT allowed
to call standard Linux blocking functions from a hard realtime thread
(to maintain hard realtime).
Yes, I get that.
Post by Graeme Foot
The EC_IOCTL_RTDM define turns off these locks for (RTAI / Xenomai)
RTDM applications to avoid this problem.
Yes, EC_IOCTL_RTDM turns off these locks. But it does not really do
anything, as it is never defined.
EC_IOCTL_RTDM gets defined for "rtdm-ioctl.c" if --enable-rtdm is used (as you
say below). I don't know if you noticed, but "rtdm-ioctl.c" is just a link to
"ioctl.c". So the rtdm version gets it, but the standard version does not.
Argh, I didn't notice that. Thanks for pointing that out.
It makes much more sense then :)
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in general
and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define EC_RTDM macro and enable
the automake conditional ENABLE_RTDM.
This will trigger Kbuild to build rtdm-ioctl.o (from rtdm-ioctl.c), and this
will be done with EC_IOCTL_RTDM macro defined.
But ioctl.c will be compiled as always, and that is without EC_IOCTL_RTDM
defined. Was it supposed to be defined? If so, it should be easy to fix,
but someone should definitely do some real testing of it.
I think the locks should be disabled in both "rtdm-ioctl.c" and "ioctl.c" if
using RTAI, but I use RTDM so haven't confirmed this.
I don't see how it could ever be disabled for ioctl.c with the current
code.
Post by Graeme Foot
Further to that I don't think they should be there at all. Simple
applications have one send/receive loop so don't need locks.
Applications with multiple send/receive loops or EOE need to control
their own locking for optimal results anyway.
Again, I don't fully agree with you on that.

But more importantly, it is not possible for user-space applications in
combination with EoE, due to inability to set application callbacks.
Post by Graeme Foot
Also, because these lock functions use master_sem, the send/receive
functions block unnecessarily against non-realtime EtherCAT function
calls. At a minimum they should be locking on their own semaphore.
True. I have patches for fixing that. The master_sem is definitely a
no-go for real-time.

/Esben
Graeme Foot
2018-03-04 23:02:41 UTC
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
The ec_ioctl_lock_up() and ec_ioctl_lock_down_interruptible() calls
were added to protect the following functions when multiple
send/receive loops are running:
- ecrt_master_send()
- ecrt_master_receive()
- ecrt_master_domain_process()
- ecrt_master_domain_queue()
In my opinion any locking on these functions should be at the
application level instead.
Well, I have a different opinion on that.
Getting locking right is inherently difficult. You have to guard
against race conditions while avoiding deadlocks.
But you also do not want to block for too long.
While I acknowledge that in the trivial case, where there is only a
single application, it is not a big problem for that application to
maintain synchronization itself, I don't think it is a good solution
to let each application developer in the more complicated situations
(like multiple independent processes) do this without any help from
etherlabmaster. Forcing all (or at least some) application developers
to solve the same problem again and again should not be the best we
can do.
These 4 functions are special. The master should be written (and
mostly seems to be) in a fashion that between an ecrt_master_activate()
and ecrt_master_deactivate() the above calls do not require any locks
to synchronize with the master code, except for the EOE thread which
uses callbacks.
Well, mostly is the same as not in my world. If they mostly do not require any locks, they do require locks.
When I say mostly, I mean the vanilla Etherlab version is ok, but the patch that added ec_ioctl_lock_up() and ec_ioctl_lock_down() broke this. I do not use this patch.
I guess some of the confusion I have is caused by the fact that it is unclear exactly which functions are allowed to be called between
ecrt_master_activate() and ecrt_master_deactivate(), and which functions are not. Do we have such a list? Or how can I create such a list exactly? And would it be possible to enforce this directly in the API, so if you call a function you are not allowed to call, you get an error back instead of introducing random breakage by modifying unsynchronized shared data structures?
The functions that I think should not require locks between ecrt_master_activate() and ecrt_master_deactivate() are:
- ecrt_master_receive()
- ecrt_domain_process()
- ecrt_domain_state()
- ecrt_master_state()
- ecrt_domain_queue()
- ecrt_master_reference_clock_time()
- ecrt_master_sync_slave_clocks()
- ecrt_master_sync_reference_clock()
- ecrt_master_64bit_reference_clock_time_queue()
- ecrt_master_64bit_reference_clock_time()
- ecrt_master_application_time()
- ecrt_master_send()
- ecrt_master_send_ext()
- ecrt_master_deactivate_slaves()

And with some of my recent patches
- ecrt_master_eoe_is_open()
- ecrt_master_eoe_process()
- ecrt_master_rt_slave_requests()
- ecrt_master_exec_slave_requests()

There may be some others I don't use.
As for the EOE thread, I might be overlooking something obvious. But how are you supposed to use callbacks when using the library API?
As far as I read the code, if you are using EoE and not RTDM, you will always use the standard ec_master_internal_send_cb() and
ec_master_internal_receive_cb() callbacks. See ec_ioctl_activate().
If you do not set EoE callbacks (with ecrt_master_callbacks()) there will be NO callbacks once the master goes active. (It will not use ec_master_internal_send_cb() and ec_master_internal_receive_cb(). It will not start an EoE processing thread. You will get a "No EoE processing because of missing callbacks!" message in your system log.)

If your application is in kernel space then you can specify callback functions and then use application level locking. If your application is in user space, then you cannot specify callback functions as the kernel cannot call back into user space. This is why I created my latest patch "0002-eoe-via-rtdm.patch". This patch lets you create your own EoE processing thread in your own application. This of course also lets you use application level locking.
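(With that patch, the application-side EoE loop would look roughly like
this. Signatures are assumed from the function names mentioned in this
thread, not checked against the patch; running, app_lock()/app_unlock()
and wait_for_next_eoe_period() are hypothetical application pieces:)

extern volatile int running;          /* application shutdown flag */

void eoe_thread(ec_master_t *master)
{
    while (running) {
        if (ecrt_master_eoe_is_open(master)) {
            app_lock();                   /* the application's own lock */
            ecrt_master_receive(master);
            ecrt_master_eoe_process(master);
            ecrt_master_send_ext(master); /* queues EoE datagrams, then sends */
            app_unlock();
        }
        wait_for_next_eoe_period();       /* hypothetical pacing call */
    }
}
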
And with them, I don't see how you are supposed to do locking at the application level. I will not be able to specify application level locks that are used by both application and eoe, unless they are implemented directly in the master (kernel).
The io_sem lock is only used by the master when it is idle. Once you activate the master it is up to your application to provide the locking.
ecrt_master_send()
ecrt_master_receive()
ecrt_domain_queue()
So we really do need to do locking in at least these 3 when using EOE.
Or maybe the whole master->datagram_queue should be refactored to something that can be synchronized safely in a lock-free way?
The point of the callback functions is so that you can make sure your application does not call any of ecrt_master_receive(), ecrt_master_send(), ecrt_master_send_ext(), ecrt_domain_process(), ecrt_domain_queue() for a given master at the same time. If you don't use EoE you don't need the callbacks, but you still need to protect ecrt_master_receive(), ecrt_master_send(), ecrt_domain_process(), ecrt_domain_queue().
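(The kernel-space pattern, for completeness; ecrt_master_callbacks() is
the real API with this signature, while app_lock()/app_unlock() stand
in for whatever lock the application uses:)

static void my_send_cb(void *cb_data)
{
    ec_master_t *master = (ec_master_t *) cb_data;
    app_lock();
    ecrt_master_send_ext(master);   /* EoE datagrams + regular send */
    app_unlock();
}

static void my_receive_cb(void *cb_data)
{
    ec_master_t *master = (ec_master_t *) cb_data;
    app_lock();
    ecrt_master_receive(master);
    app_unlock();
}

/* Registered once, in non-RT context, before ecrt_master_activate(): */
ecrt_master_callbacks(master, my_send_cb, my_receive_cb, master);
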
Post by Graeme Foot
And the reason it uses callbacks is because only your application
knows if it is appropriate to allow the EOE thread to call
ecrt_master_receive() and ecrt_master_send_ext() etc at any particular
time.
I see what you mean. But I just don't see how it is possible to accomplish this for user space applications.
Exactly why I created my patch "0002-eoe-via-rtdm.patch".
Post by Graeme Foot
However, if your application has multiple send/receive loops then they
need to be synchronized with each other (see next comment). There are
a few more functions such as the distributed clock calls that are also
in the send/receive loops. They also do not require ethercat level
locks as they should be safe to call between the activate and
deactivate. They also do not need application level locking as they
should really only be called by one send/receive loop per master. All
other ethercat function calls should not be in a hard realtime context.
Ok. But you imply a strict rule for how to design applications doing EtherCAT communication: "There can only be one send/receive loop per master".
That might not be (and in my case is not) accepted by all application developers.
I need to continue to support applications that use multiple send/receive loops for multiple EtherCAT domains, each with a different cycle time.
One solution could be to create a single send/receive loop per master, and then let the multiple application loops communicate with that instead of directly with the master. That might actually be a really good idea. But, if that is done, I think it should fit very well as a general feature of the EtherCAT master. It should be possible to implement without any application specific code.
That's not what I said. You can have multiple send/receive loops per master, but it is up to your application to make sure they don't call any of the ecrt functions I listed at the top at the same time (per master). If your application is a single process with multiple threads you can use an application semaphore (for example), but if it is multiple processes you will need a named semaphore (or similar) that all of the processes share.
The final "problem" is as you say a design question. Should etherlabmaster support synchronization of access from multiple threads/processes, or should this be left entirely to the application developer. But if
master->datagram_queue is changed to something lock-free, this "problem"
master->will
likely be solved also.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore so
that all processes share the same lock. Of course if you are using
callbacks (for EoE) you are probably doing that anyway.
You easily have multiple processes. Even with just a single
application, you have EtherCAT-OP, EtherCAT-EoE and the application,
all using some of the same shared data structures. Throwing in more
applications just adds to that, but I think it is critical enough
with just a single application.
Even if you have multiple processes (rather than threads), it is your
design decision as to which process takes priority and whether a
particular send/receive loop should wait a bit even if it could get
the lock now. You can only do that if your application controls the locking of these functions.
While that sounds reasonable, I believe it is not entirely true. You could write a layer responsible for handling this, which gets information from the application(s) for how to deal with this. Think of it like a kind of scheduler.
Example from an application I have seen. You have two applications (processes), each driving its own EtherCAT domain. One application/domain is running every 2 ms, the other application/domain every 10 ms. In addition, EoE slaves are also used. For each application, you could specify the priority, the cycle time and the start time. EoE should also be given a priority. This EtherCAT "scheduler" would then be able to decide which EtherCAT communication to do and when.
So while I agree that it might be most obvious to implement the decision of when to send/receive directly in the application, I think it could also be implemented in a more generic way.
You don't need the EtherCAT master to do that scheduling. That is already available in Linux / RTAI or whatever you use. But once one of the above ecrt function calls is in progress you need to ensure no other process / thread will call one at the same time.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
The reason that they have been named differently is that they are
the functions that are called from the realtime send/receive loop
and the define allows them to be ignored. RTAI (and Xenomai?) are
NOT allowed to call standard Linux blocking functions from a hard
realtime thread (to maintain hard realtime).
Yes, I get that.
Post by Graeme Foot
The EC_IOCTL_RTDM define turns off these locks for (RTAI / Xenomai)
RTDM applications to avoid this problem.
Yes, EC_IOCTL_RTDM turns off these locks. But it does not really do
anything, as it is never defined.
EC_IOCTL_RTDM gets defined for "rtdm-ioctl.c" if --enable-rtdm is used
(as you say below). I don't know if you noticed, but "rtdm-ioctl.c"
is just a link to "ioctl.c". So the rtdm version gets it, but the standard version does not.
Argh, I didn't notice that. Thanks for pointing that out.
It makes much more sense then :)
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in
general and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and
compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define EC_RTDM macro and
enable the automake conditional ENABLE_RTDM.
This will trigger Kbuild to build rtdm-ioctl.o (from rtdm-ioctl.c),
and this will be done with EC_IOCTL_RTDM macro defined.
But ioctl.c will be compiled as always, and that is without
EC_IOCTL_RTDM defined. Was it supposed to be defined? If so, it
should be easy to fix, but someone should definitely do some real testing of it.
I think the locks should be disabled in both "rtdm-ioctl.c" and
"ioctl.c" if using RTAI, but I use RTDM so haven't confirmed this.
I don't see how it could ever be disabled for ioctl.c with the current code.
By not using the patch that added them.
Post by Graeme Foot
Further to that I don't think they should be there at all. Simple
applications have one send/receive loop so don't need locks.
Applications with multiple send/receive loops or EOE need to control
their own locking for optimal results anyway.
Again, I don't fully agree with you on that.
But more importantly, it is not possible for user-space applications in combination with EoE, due to inability to set application callbacks.
Post by Graeme Foot
Also, because these lock functions use master_sem, the send/receive
functions block unnecessarily against non-realtime EtherCAT function
calls. At a minimum they should be locking on their own semaphore.
True. I have patches for fixing that. The master_sem is definitely a no-go for real-time.
/Esben
So, I believe the following two patches are the problem:
- base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch
- base/0018-Use-call-back-functions.patch

My new EoE patch rolls back the changes from "base/0018-Use-call-back-functions.patch", and I use EC_IOCTL_RTDM, so I ignore the changes in "base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch".

Regards,
Graeme.
Esben Haabendal
2018-03-05 09:58:00 UTC
Hi Graeme

Need to get one thing straight first.

RTDM user-space is not the same as "plain" Linux user-space.
I know that you are 100% aware of that, but your replies to me seem less
clear about it.

Your 0002-eoe-via-rtdm.patch does not seem usable for "plain" user-space
applications, so does not apply to the applications we support.
Post by Graeme Foot
Post by Graeme Foot
These 4 functions are special. The master should be written (and
mostly seems to be) in a fashion that between an
and ecrt_master_deactivate() the above calls do not require any locks
to synchronize with the master code, except for the EOE thread which
uses callbacks.
Well, mostly is the same as not in my world. If they mostly do not require
any locks, they do require locks.
When I say mostly, I mean the vanilla Etherlab version is ok, but the patch
that added ec_ioctl_lock_up() and ec_ioctl_lock_down() broke this. I do not
use this patch.
Do you mean that the patches (and I guess you mean
0017-Master-locks-to-avoid-corrupted-datagram-queue.patch and
0018-Use-call-back-functions.patch) introduce the need for locking?

That does not sound right to me. As I read them (and your writing in
this thread), the need for locking is there with and without these
patches. Without them, locking needs to be handled at application
level. With the patches, the master tries to handle the locking.

And that is clearly a design change, and one which you clearly do not
agree with.

But if you really mean that these patches broke etherlabmaster (and not
"just" changed the design), would you please explain it a bit more
detailed?
Post by Graeme Foot
As for the EOE thread, I might be overlooking something obvious. But how
are you supposed to use callbacks when using the library API?
As far as I read the code, if you are using EoE and not RTDM, you will
always use the standard ec_master_internal_send_cb() and
ec_master_internal_receive_cb() callbacks. See ec_ioctl_activate().
If you do not set EoE callbacks (with ecrt_master_callbacks()) there will be
NO callbacks once the master goes active. (It will not use
ec_master_internal_send_cb() and ec_master_internal_receive_cb(). It will not
start an EoE processing thread. You will get a "No EoE processing because of
missing callbacks!" message in your system log.)
In ec_ioctl_activate() there is:

#ifndef EC_IOCTL_RTDM
    ecrt_master_callbacks(master, ec_master_internal_send_cb,
            ec_master_internal_receive_cb, master);
#endif

So at least for (non-RTDM) user-space applications that activate the
master with ioctl, EoE callbacks are always set to
ec_master_internal_send_cb() and ec_master_internal_receive_cb().
Post by Graeme Foot
If your application is in kernel space then you can specify callback functions
and then use application level locking. If your application is in user space,
then you cannot specify callback functions as the kernel cannot call back into
user space. This is why I created my latest patch "0002-eoe-via-rtdm.patch".
This patch lets you create your own EoE processing thread in your own
application. This of course also lets you use application level locking.
But only for RTDM. The functions added are guarded by

#if !defined(__KERNEL__) && defined(EC_RTDM) && (EC_EOE)

which means that they do not apply when using non-RTDM user-space.
Post by Graeme Foot
And with them, I don't see how you are supposed to do locking at the
application level. I will not be able to specify application level locks
that are used by both application and eoe, unless they are implemented
directly in the master (kernel).
And with ec_master_eoe_thread() running inside master, you really need to do
locking to synchronize access to master->datagram_queue (i.e. the io_sem
lock). If not, you will have race conditions between eoe and 3 of the 4
The io_sem lock is only used by the master when it is idle. Once you activate
the master it is up to your application to provide the locking.
Ok. But I dare to say that it is an open question if that is the right
design for etherlabmaster. I believe it has both pros and cons.

Pro: Application developers can implement synchronization as they find
most optimal for them.

Cons: Application developers need to figure out and implement
synchronization without much (any) help from the etherlabmaster code.
And getting synchronization right is IMHO often tricky (at least for
many developers).
Post by Graeme Foot
ecrt_master_send()
ecrt_master_receive()
ecrt_domain_queue()
So we really do need to do locking in at least these 3 when using EOE.
Or maybe the whole master->datagram_queue should be refactored to something
that can be synchronized safely in a lock-free way?
The point of the callback functions is so that you can make sure your
application does not call any of ecrt_master_receive(), ecrt_master_send(),
ecrt_master_send_ext(), ecrt_domain_process(), ecrt_domain_queue() for a given
master at the same time. If you don't use EoE you don't need the callbacks,
but you still need to protect ecrt_master_receive(),
ecrt_master_send(), ecrt_domain_process(), ecrt_domain_queue().
Which I personally see as a trivial omission in the etherlabmaster
design (requiring all API users to spend time writing the same code to
protect the API from itself).
Post by Graeme Foot
Post by Graeme Foot
And the reason it uses callbacks is because only your application
knows if it is appropriate to allow the EOE thread to call
ecrt_master_receive() and ecrt_master_send_ext() etc at any
time.
I see what you mean. But I just don't see how it is possible to accomplish
this for user space applications.
Exactly why I created my patch "0002-eoe-via-rtdm.patch".
And again, which does not extend to non-RTDM user-space.
Post by Graeme Foot
Post by Graeme Foot
However, if your application has multiple send/receive loops then they
need to be synchronized with each other (see next comment). There are
a few more functions such as the distributed clock calls that are also
in the send/receive loops. They also do not require ethercat level
locks as they should be safe to call between the activate and
deactivate. They also do not need application level locking as they
should really only be called by one send/receive loop per master. All
other ethercat function calls should not be in a hard realtime context.
Ok. But you imply a strict rule for how to design applications doing
EtherCAT communication: "There can only be one send/receive loop per
master".
That might not be (and in my case is not) accepted by all application
developers. I need to continue to support applications that use multiple
send/receive loops for multiple EtherCAT domains, each with a different
cycle time.
One solution could be to create a single send/receive loop per master, and
then let the multiple application loops communicate with that instead of
directly with the master. That might actually be a really good idea. But,
if that is done, I think it should fit very well as a general feature of the
EtherCAT master. It should be possible to implement without any application
specific code.
That's not what I said. You can have multiple send/receive loops per master,
but it is up to your application to make sure they don't call any of the ecrt
functions I listed at the top at the same time (per master). If your
application is a single process with multiple threads you can use an
application semaphore (for example), but if it is multiple processes you will
need a named semaphore (or similar) that all of the processes share.
For the single process, multiple threads application, it is not that
bad. It is requiring all such application developers to do the same over
and over again, wasting time and introducing the same bugs again and
again.

The multiple process world can be different though. You basically end
up with a new EtherCAT API. A combination of the etherlabmaster API and
a custom named semaphore API. Without this, applications will not work
properly together. Why not include such a feature directly in
etherlabmaster? Without it, I think we are making the user-space
applications (non-RTDM) into second-class citizens.

It might not matter to you, but as this is how we use etherlabmaster, it
matters to me.
Post by Graeme Foot
The final "problem" is as you say a design question. Should etherlabmaster
support synchronization of access from multiple threads/processes, or should
this be left entirely to the application developer? But if
master->datagram_queue is changed to something lock-free, this "problem" will
likely be solved also.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore so
that all processes share the same lock. Of course if you are using
callbacks (for EoE) you are probably doing that anyway.
You easily have multiple processes. Even with just a single
application, you have EtherCAT-OP, EtherCAT-EoE and the application,
all using some of the same shared data structures. Throwing in more
applications just adds to that, but I think it is critical enough
with just a single application.
Even if you have multiple processes (rather than threads), it is your
design decision as to which process takes priority and whether a
particular send/receive loop should wait a bit even if it could get
the lock now. You can only do that if your application controls the
locking of these functions.
While that sounds reasonable, I believe it is not entirely true. You could
write a layer responsible for handling this, which gets information from the
application(s) for how to deal with this. Think of it like a kind of
scheduler.
Example from an application I have seen. You have two applications
(processes), each driving its own EtherCAT domain. One application/domain
is running every 2 ms, the other application/domain every 10 ms. In
addition, EoE slaves are also used. For each application, you could specify
the priority, the cycle time and the start time. EoE should also be given a
priority. This EtherCAT "scheduler" would then be able to decide which
EtherCAT communication to do and when.
So while I agree that it might be most obvious to implement the decision of
when to send/receive directly in the application, I think it could also be
implemented in a more generic way.
You don't need the EtherCAT master to do that scheduling. That is already
available in Linux / RTAI or whatever you use.
What do you mean?
Post by Graeme Foot
But once one of the above ecrt function calls is in progress you need
to ensure no other process / thread will call one at the same time.
Yes, that is your preferred design (that **I** need to ensure it). I
prefer a design where the master will do that for me (actually the
application developers using our platform).
Post by Graeme Foot
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in
general and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and
compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define EC_RTDM macro and
enable the automake conditional ENABLE_RTDM.
This will trigger Kbuild to build rtdm-ioctl.o (from rtdm-ioctl.c),
and this will be done with EC_IOCTL_RTDM macro defined.
But ioctl.c will be compiled as always, and that is without
EC_IOCTL_RTDM defined. Was it supposed to be defined? If so, it
should be easy to fix, but someone should definitely do some real testing of it.
I think the locks should be disabled in both "rtdm-ioctl.c" and
"ioctl.c" if using RTAI, but I use RTDM so haven't confirmed this.
I don't see how it could ever be disabled for ioctl.c with the current code.
By not using the patch that added them.
Of course. But back to the same discussion again.
Post by Graeme Foot
Post by Graeme Foot
Further to that I don't think they should be there at all. Simple
applications have one send/receive loop so don't need locks.
Applications with multiple send/receive loops or EOE need to control
their own locking for optimal results anyway.
Again, I don't fully agree with you on that.
But more importantly, it is not possible for user-space applications in
combination with EoE, due to inability to set application callbacks.
Post by Graeme Foot
Also, because these lock functions use master_sem, the send/receive
functions block unnecessarily against non-realtime EtherCAT function
calls. At a minimum they should be locking on their own semaphore.
True. I have patches for fixing that. The master_sem is definitely a no-go
for real-time.
- base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch
- base/0018-Use-call-back-functions.patch
Well, that depends on how you look at it. Removing those patches
removes the problem from etherlabmaster. But then I just have to do
exactly the same thing on top of etherlabmaster, and I will be back to
square one.
Post by Graeme Foot
My new EoE patch rolls back the changes from
"base/0018-Use-call-back-functions.patch" and I use EC_IOCTL_RTDM so ignore
the changes in
"base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch".
Ok. I think you could have been a bit more explicit about that
roll-back.

/Esben
Graeme Foot
2018-03-06 00:39:50 UTC
Hi,

The email's getting a little hard to read so I'll try to summarise what I'm doing here first and reply to specifics below.

The primary problem with RTAI is that you cannot use Linux locks in RTAI hard realtime calls. So the base 0017 and 0018 patches are useless for RTAI. Secondly, RTAI / RTDM cannot use callbacks, so EoE does not work for this framework.

I put the patches in the features branch of Gavin's patchset as they should be optional and especially patch 2 (0002-eoe-via-rtdm.patch) is specifically for RTAI / RTDM for EoE. I haven't looked too hard at making this one generic as it does break the previous patches I mentioned (base 0017 and 0018). But it also needs to break them as the whole point of it is to use an application level lock (i.e. RTAI lock) instead (as per the design of the vanilla Etherlab master). So no, the patch does not support "plain" user-space applications (i.e. not using RTDM) as my focus was only for RTDM applications. Patch 2 could probably have a few more defines to make sure it only affects RTDM users and keeps 0017/0018 active for non-RTDM installs.

Patch 1 (0001-eoe-addif-delif-tools.patch) only looks at changing the lifetime options of the EoE interfaces, so anyone should be able to use this.

Patch 2 (0002-eoe-via-rtdm.patch) is targeted at adding the ability for RTDM users to provide a user space alternative to the callbacks framework. If people prefer this method to allow "plain" Linux user space applications to supply their own locking then it could be extended. But it does mean that the base patches "0017-Master-locks-to-avoid-corrupted-datagram-queue.patch" and "0018-Use-call-back-functions.patch" do need to be removed.

So using patch 2 is a matter of preference: Patch 2 uses the Vanilla EtherLab EtherCAT master approach of "the user application knows best" as to when a lock can be acquired, but also covers the "realtime RTAI can't use Linux locks" issue; OR don't use patch 2 and go with the patches 0017/0018 approach where the master handles this locking (which as I said is not an option for RTAI).
Post by Esben Haabendal
Hi Graeme
Need to get one thing straight first.
RTDM user-space is not the same as "plain" Linux user-space.
I know that you are 100% aware of that, but your replies to me seem less clear about it.
Your 0002-eoe-via-rtdm.patch does not seem usable for "plain" user-space applications, so does not apply to the applications we support.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
These 4 functions are special. The master should be written (and
mostly seems to be) in a fashion that between an
ecrt_master_activate() and ecrt_master_deactivate() the above calls
do not require any locks to synchronize with the master code,
except for the EOE thread which uses callbacks.
Well, mostly is the same as not in my world. If they mostly do not
require any locks, they do require locks.
When I say mostly, I mean the vanilla Etherlab version is ok, but the
patch that added ec_ioctl_lock_up() and ec_ioctl_lock_down() broke
this. I do not use this patch.
Do you mean that the patches (and I guess you mean 0017-Master-locks-to-avoid-corrupted-datagram-queue.patch and
0018-Use-call-back-functions.patch) introduce the need for locking?
That does not sound right to me. As I read them (and your writing in this thread), the need for locking is there with and without these patches. Without them, locking needs to be handled at application level. With the patches, the master tries to handle the locking.
And that is clearly a design change, and one which you clearly do not agree with.
But if you really mean that these patches broke etherlabmaster (and not "just" changed the design), would you please explain it in a bit more detail?
Locking is required if multiple processes / threads will access these functions, regardless of which method you prefer.

The vanilla EtherLab approach is: the user application knows best when to lock, so leave it up to it. But the EoE thread runs in the master, so the user application may provide callbacks so that the EoE thread can obtain the lock in the user application and call the appropriate send/receive function.

Gavin's patchset approach (base patches 0017/0018) is to change this to provide the locking within the master.
- Problem 1) Patch base/0017 uses master->master_sem which can be held by a non-realtime process
- Problem 2) Patch base/0017 uses master->master_sem which can't be used by RTAI, so for RTAI it is disabled anyway
- Problem 3) the callbacks are designed to allow the EoE send/receive to be synced with your application. Patch base/0018 repurposes this so that all user space ecrt_master_send() calls use the send callback and user space ecrt_master_receive() calls use the receive callback. The send callback is designed to call ecrt_master_send_ext() specifically for EoE processing, not ecrt_master_send(). The receive callback is designed to call ecrt_master_receive() so no real problem there.

Note: ecrt_master_send_ext() will queue any external datagrams and then call ecrt_master_send(). The drawbacks to calling ecrt_master_send_ext() rather than ecrt_master_send() from your main send/receive loops are:
1) If you have a very fast and tight send/receive loop this may add extra overhead you don't want
2) You can get extra jitter between calling ecrt_master_application_time() and the frame going onto the wire
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
As for the EOE thread, I might be overlooking something obvious. But
how are you supposed to use callbacks when using the library API?
As far as I read the code, if you are using EoE and not RTDM, you
will always use the standard ec_master_internal_send_cb() and
ec_master_internal_receive_cb() callbacks. See ec_ioctl_activate().
If you do not set EoE callbacks (with ecrt_master_callbacks()) there
will be NO callbacks once the master goes active. (It will not use
ec_master_internal_send_cb() and ec_master_internal_receive_cb(). It
will not start an EoE processing thread. You will get a "No EoE
processing because of missing callbacks!" message in your system log.)
#ifndef EC_IOCTL_RTDM
    ecrt_master_callbacks(master, ec_master_internal_send_cb,
            ec_master_internal_receive_cb, master);
#endif
So at least for (non RTDM) user-space applications, that activate the master with ioctl, EoE callbacks are always set to
ec_master_internal_send_cb() and ec_master_internal_receive_cb().
Sorry, I misread your comment and the code. I'm using RTDM so don't notice this.

For RTAI/Xenomai the internal callbacks are not allowed as they use master->io_sem, which is a non-RTAI-compatible semaphore. The reason for patch base/0018 is to allow master->io_sem to be shared by the user space send/receive calls and the EoE thread (which will be created as the callbacks are defined). But it will break if your application also uses a kernel process, as it will not be using the master->io_sem semaphore (not likely I know).

The above ecrt_master_callbacks() in ec_ioctl_activate() was added way back in 2012 (rev 2433) when RTDM became more fully supported. I don't know what behaviour it had before that but that is probably irrelevant as it has been like this for so long.
Post by Esben Haabendal
Post by Graeme Foot
If your application is in kernel space then you can specify callback
functions and then use application level locking. If your application
is in user space, then you cannot specify callback functions as the
kernel cannot call back into user space. This is why I created my latest patch "0002-eoe-via-rtdm.patch".
This patch lets you create your own EoE processing thread in your own
application. This of course also lets you use application level locking.
But only for RTDM. The functions added are guarded by
#if !defined(__KERNEL__) && defined(EC_RTDM) && (EC_EOE)
which means that they do not apply when using non-RTDM user-space.
Correct.
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
And with them, I don't see how you are supposed to do locking at the
application level. I will not be able to specify application level
locks that are used by both application and eoe, unless they are
implemented directly in the master (kernel).
And with ec_master_eoe_thread() running inside master, you really
need to do locking to synchronize access to master->datagram_queue
(i.e. the io_sem lock). If not, you will have race conditions
between eoe and 3 of the 4
The io_sem lock is only used by the master when it is idle. Once you
activate the master it is up to your application to provide the locking.
Ok. But I dare say it is an open question whether that is the right design for etherlabmaster. I believe it has both pros and cons.
Pro: Application developers can implement synchronization as they find most optimal for them.
Cons: Application developers need to figure out and implement synchronization without much (any) help from the etherlabmaster code.
And getting synchronization right is IMHO something that is often tricky (at least for many developers).
Generally agree, but RTAI can't use EtherCAT master locking in hard realtime, so it needs an alternate solution anyway. And the vanilla Etherlab design is to use application level locking (except for the ecrt_master_callbacks() call in ec_ioctl_activate() which I suspect snuck through).
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
ecrt_master_send()
ecrt_master_receive()
ecrt_domain_queue()
So we really do need to do locking in at least these 3 when using EoE.
Or maybe the whole master->datagram_queue should be refactored to
something that can be synchronized safely in a lock-free way?
The point of the callback functions is so that you can make sure your
application does not call any of ecrt_master_receive(),
ecrt_master_send(), ecrt_master_send_ext(), ecrt_domain_process(),
ecrt_domain_queue() for a given master at the same time. If you don't
use EoE you don't need the callbacks, but you still need to protect
ecrt_master_receive(), ecrt_master_send(), ecrt_domain_process(), ecrt_domain_queue().
Which I personally see as a trivial omission in the etherlabmaster design (requiring all API users to spend time writing the same code to protect the API from itself).
Not a trivial omission, as locking can't be done in the master for RTAI hard realtime calls, so it needs an alternative. Also I suspect most simple applications only require one send/receive loop, so don't require locking anyway. I've only needed to start adding locks since starting to use EoE.
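As a concrete sketch of that application level locking for a plain (non-RTAI) user-space process with multiple cyclic threads, assuming a single mutex is acceptable; the names io_lock and locked_cycle are illustrative:

    #include <pthread.h>
    #include "ecrt.h"

    static pthread_mutex_t io_lock = PTHREAD_MUTEX_INITIALIZER;

    void locked_cycle(ec_master_t *master, ec_domain_t *domain)
    {
        pthread_mutex_lock(&io_lock);
        ecrt_master_receive(master);
        ecrt_domain_process(domain);
        pthread_mutex_unlock(&io_lock);

        /* process data work happens outside the lock */

        pthread_mutex_lock(&io_lock);
        ecrt_domain_queue(domain);
        ecrt_master_send(master);
        pthread_mutex_unlock(&io_lock);
    }

Each cyclic thread calls locked_cycle() with its own domain, so the four calls never overlap per master.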
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
And the reason it uses callbacks is because only your application
knows if it is appropriate to allow the EOE thread to call
ecrt_master_receive() and ecrt_master_send_ext() etc. at any
particular time.
I see what you mean. But I just don't see how it is possible to
accomplish this for user space applications.
Exactly why I created my patch "0002-eoe-via-rtdm.patch".
And again, which does not extend to non-RTDM user-space.
Agreed, but it could be extended; that would require application level locks for those that don't currently use them (for non-RTDM user-space apps).
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
However, if your application has multiple send/receive loops then
they need to be synchronized with each other (see next comment).
There are a few more functions such as the distributed clock calls
that are also in the send/receive loops. They also do not require
ethercat level locks as they should be safe to call between the
activate and deactivate. They also do not need application level
locking as they should really only be called by one send/receive
loop per master. All other ethercat function calls should not be in a hard realtime context.
Ok. But you imply a strict rule for how to design applications doing
EtherCAT communication: "There can only be one send/receive loop per
master".
That might not be (and in my case is not) accepted by all application developers.
I need to continue support for applications that use multiple
send/receive loops for multiple EtherCAT domains, each with different cycle time.
One solution could be to create a single send/receive loop per
master, and then let the multiple application loops communicate with
that instead of directly with the master. That might actually be a
really good idea. But, if that is done, I think it should fit very
well as a general feature of the EtherCAT master. It should be
possible to implement without any application specific code.
That's not what I said. You can have multiple send/receive loops per
master, but it is up to your application to make sure they don't call
any of the ecrt functions I listed at the top at the same time (per
master). If your application is a single process with multiple
threads you can use an application semaphore (for example), but if it
is multiple processes you will need a named semaphore (or similar) that all of the processes share.
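A sketch of that named-semaphore variant for the multiple-process case; the semaphore name "/ecat0_io" is arbitrary, it only matters that every cooperating process opens the same one:

    #include <fcntl.h>
    #include <semaphore.h>
    #include "ecrt.h"

    /* opened once per process, e.g. during initialisation:
       sem_t *io_sem = sem_open("/ecat0_io", O_CREAT, 0660, 1); */
    extern sem_t *io_sem;

    void send_receive(ec_master_t *master, ec_domain_t *domain)
    {
        sem_wait(io_sem);              /* excludes the other processes */
        ecrt_master_receive(master);
        ecrt_domain_process(domain);
        ecrt_domain_queue(domain);
        ecrt_master_send(master);
        sem_post(io_sem);
    }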
For the single process, multiple threads application, it is not that bad. But it requires all such application developers to do the same thing over and over again, wasting time and introducing the same bugs again and again.
The multiple process world can be different though. You basically end up with a new EtherCAT API. A combination of the etherlabmaster API and a custom named semaphore API. Without this, applications will not work properly together. Why not include such a feature directly in etherlabmaster? Without it, I think we are making the user-space applications (non-RTDM) into a second-class citizen.
It might not matter to you, but as this is how we use etherlabmaster, it matters to me.
It always matters, and that's why the patch is optional.

For RTAI, application level locks are the only allowable option. For non-RTAI apps either approach is possible, but most would probably prefer the built-in EtherCAT master level locks (until they aren't powerful enough anymore).
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
The final "problem" is as you say a design question. Should
etherlabmaster support synchronization of access from multiple
threads/processes, or should this be left entirely to the application
developer. But if
master->datagram_queue is changed to something lock-free, this "problem"
master->will
likely be solved also.
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
However, if they are being called from multiple processes (rather than
threads) then you need to use something like a named semaphore
so that all processes share the same lock. Of course if you are
using callbacks (for EoE) you are probably doing that anyway.
We easily have multiple processes. Even with just a
single application, you have EtherCAT-OP, EtherCAT-EoE and the application,
all using some of the same shared data structures. Throwing in more
applications just adds to that, but I think it is critical
enough with just a single application.
Even if you have multiple processes (rather than threads), it is
your design decision as to which process takes priority and whether
a particular send/receive loop should wait a bit even if it could
get the lock now. You can only do that if your application
controls the locking of these functions.
While that sounds reasonable, I believe it is not entirely true. You
could write a layer responsible for handling this, which gets
information from the
application(s) about how to deal with it. Think of it like a kind of
scheduler.
Example from an application I have seen. You have two applications
(processes), each driving its own EtherCAT domain. One
application/domain is running every 2 ms, the other
application/domain every 10 ms. In addition, EoE slaves are also
used. For each application, you could specify the priority, the
cycle time and the start time. EoE should also be given a priority.
This EtherCAT "scheduler" would then be able to decide which EtherCAT communication to do and when.
So while I agree that it might be most obvious to implement the
decision of when to send/receive directly in the application, I
think it could also be implemented in a more generic way.
You don't need the EtherCAT master to do that scheduling. That is
already available in Linux / RTAI or whatever you use.
What do you mean?
In Linux you can give processes various priorities (via their nice values). For RTAI you can also give each task a priority, but I think it is stricter than standard Linux, where higher priority tasks will fully run before lower priority tasks. If you have a single CPU (or a single CPU allocated to all your EtherCAT processes) then the higher priority process will get more CPU and be woken up before a lower priority process if they are both waiting on a lock. Even with multiple CPUs the higher priority task should always get a lock before a lower priority task.

So you don't need the EtherCAT master to have a scheduler on top of this.
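For example, on plain Linux the two loops from the 2 ms / 10 ms scenario above could simply be given different SCHED_FIFO priorities; the thread names and priority values here are illustrative:

    #include <pthread.h>
    #include <sched.h>

    static int set_rt_priority(pthread_t thread, int prio)
    {
        struct sched_param param = { .sched_priority = prio };
        return pthread_setschedparam(thread, SCHED_FIFO, &param);
    }

    /* e.g. set_rt_priority(fast_loop_thread, 80);    2 ms cycle
            set_rt_priority(slow_loop_thread, 70);   10 ms cycle
       When both block on the shared lock, the higher-priority waiter
       is generally woken first. */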
Post by Esben Haabendal
Post by Graeme Foot
But once one of the above ecrt function calls is in progress you need
to ensure no other process / thread will call one at the same time.
Yes, that is your preferred design (that **I** need to ensure it). I prefer a design where the master does that for me (or rather, for the application developers using our platform).
For RTAI it's not preferred, it's required. You can't use Linux locks in RTAI hard realtime calls.
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
They should probably also be turned off for RTAI / Xenomai in
general and as I said above use application level locking.
You can pass --enable-rtdm when compiling to enable RTDM (and
compile rtdm-ioctl.o).
Passing --enable-rtdm to ./configure will define the EC_RTDM macro and
enable the automake conditional ENABLE_RTDM.
This will trigger Kbuild to build rtdm-ioctl.o (from
rtdm-ioctl.c), and this will be done with EC_IOCTL_RTDM macro defined.
But ioctl.c will be compiled as always, and that is without
EC_IOCTL_RTDM defined. Was it supposed to be defined? If so, it
should be easy to fix, but someone should definitely do some real
testing of it.
I think the locks should be disabled in both "rtdm-ioctl.c" and
"ioctl.c" if using RTAI, but I use RTDM so haven't confirmed this.
I don't see how it could ever be disabled for ioctl.c with the current code.
By not using the patch that added them.
Of course. But back to the same discussion again.
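For context, the wrapper pair behaves roughly as below when EC_IOCTL_RTDM is defined; this is reconstructed from the patch description, so the exact definitions in base/0017 may differ:

    #ifdef EC_IOCTL_RTDM
    /* compiled out for RTDM: no Linux locking in the ioctl path */
    #define ec_ioctl_lock_down_interruptible(sem) (0)
    #define ec_ioctl_lock_up(sem) do {} while (0)
    #else
    #define ec_ioctl_lock_down_interruptible(sem) \
            ec_lock_down_interruptible(sem)
    #define ec_ioctl_lock_up(sem) ec_lock_up(sem)
    #endif

    /* ioctl.c is built without EC_IOCTL_RTDM, so there the wrappers
       always expand to the real locks; only rtdm-ioctl.c is built
       with the macro defined. */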
Post by Graeme Foot
Post by Esben Haabendal
Post by Graeme Foot
Further to that I don't think they should be there at all. Simple
applications have one send/receive loop so don't need locks.
Applications with multiple send/receive loops or EOE need to
control their own locking for optimal results anyway.
Again, I don't fully agree with you on that.
But more importantly, it is not possible for user-space
applications in combination with EoE, due to the inability to set application callbacks.
Post by Graeme Foot
Also, having these lock functions use master_sem, the send/receive
functions block unnecessarily against non-realtime ethercat function
calls. At a minimum they should be locking on their own semaphore.
True. I have patches for fixing that. The master_sem is definitely
a no-go for real-time.
- base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch
- base/0018-Use-call-back-functions.patch
Well, that depends on how you look at it. Removing those patches removes the problem from etherlabmaster. But then I just have to do exactly the same thing on top of etherlabmaster, and I will be back to square one.
Post by Graeme Foot
My new EoE patch rolls back the changes from
"base/0018-Use-call-back-functions.patch" and I use EC_IOCTL_RTDM so
ignore the changes in
"base/0017-Master-locks-to-avoid-corrupted-datagram-queue.patch".
Ok. I think you could have been a bit more explicit about that roll-back.
/Esben
Regards,
Graeme.
Graeme Foot
2018-03-06 21:52:30 UTC
Permalink
Hi Gavin,

I have attached updated eoe patches to maintain compatibility with your patchset for non-RTAI / RTDM applications.

Patch 0001:
This will now continue with the existing behaviour of auto-create and auto-delete of eoe ports as the eoe slaves are detected and removed. It does contain the fix for removing the lock from the transmit callback so eoe users will probably want this update even if they want to keep existing behaviour.

Patch 0002:
Still only targeted at RTAI / RTDM users, but will now maintain existing behaviour (internal locking via the internal callbacks) for Linux user space applications.


Regards,
Graeme.

Graeme Foot
2018-05-07 04:16:49 UTC
Permalink
Hi,

I found a couple of bugs in my patches. New patches attached.

Graeme.



Gavin Lambert
2018-03-16 01:03:54 UTC
Permalink
Post by Esben Haabendal
The multiple process world can be different though. You basically end up
with a new EtherCAT API. A combination of the etherlabmaster API and a
custom named semaphore API. Without this, applications will not work
properly together. Why not include such a feature directly in
etherlabmaster? Without it, I think we are making the user-space
applications (non-RTDM) into a second-class citizen.
If you have multiple independent processes then they must (of necessity) either operate on separate masters (in which case no locking beyond what the kernel already does is required) or they must all communicate (through some mechanism of your own devising) with a single process that "owns" the master. The master library does not allow you to reserve or activate a single master concurrently in separate processes.

If you have multiple tasks within a single process operating on the same master (eg. multiple cycles with different intervals) then they _should_ operate on different domains, and *must* coordinate their calls to the ECRT APIs in some fashion. In upstream Etherlab, this requires application-level locking. In the current patchset, the locking is done for you (except for RTDM), but I'm not entirely convinced this is the correct design choice (see my other reply).

If you have one process that is running a realtime application loop and another process that only performs non-realtime tasks (eg. injecting CoE requests), even on the same master, then all versions of Etherlab handle this for you without requiring additional locking. This is how you can still use the "ethercat" command line tool while running a realtime application.