[etherlab-dev] Userspace fork of Etherlab

Discussion:

Frank Heckenbach

2014-11-26 05:31:18 UTC

Hi everybody,

this is to announce my plan to port the Etherlab code to userspace.
I'll explain my reasons and roadmap below. If someone is interested
in this or has some comments, please let me know. Otherwise, I
expect to proceed on my own and publish the result on my web site
when finished.

Reasons:

- I had to make a number of changes to the Etherlab code to fix some
bugs and make it useable to us. The Etherlab developers are
obviously not interested in those changes, so I have to maintain
them myself. As was discussed on this list some months ago,
keeping up with newer Etherlab versions would cost me additional
maintenance and testing effort (already now since my changes are
based on 1.5.0, whereas 1.5.2 has some conflicting changes), so
I'd use my own fork of 1.5.0 which is known to work for us rather
than 1.5.2 anyway.

- Keeping up with new kernel versions is also not always easy
(especially for the drivers which are patched files from the
standard kernel, but also other kernel interfaces are known to
change often), whereas userspace code is much easier to maintain
(incompatible library changes are quite rare).

- So far we've been using RTAI for our realtime code. But that's
also always been a bit troublesome (kernel version dependencies,
high crash potential in case of problems, additional code with its
own set of bugs, etc.), so we'd rather try to get rid of it
anyway. Meanwhile the RT capabilities of the standard kernel have
improved in recent years, and due to the wide availability of SMP,
we can, if necessary, increase RT-ability by using CPU affinity
(reserve one CPU for RT code, leave the other CPUs for the rest --
of course, kernel code, esp. network drivers might need special
consideration here).

- Our application code uses a lot of floating point which is
supported in RTAI (though with some quirks), not in non-RTAI
kernel mode, but of course easily in userspace.

- Userspace code is generally much easier to debug.

- If I had known and considered all this back then, I might have
started with other code instead of Etherlab which is already
userspace based, but has different interfaces (and possibly
different bugs). But as things are now, since my code is tightly
bound to the Etherlab interface, and well tested with (the patched
version of) it, it seems easier to port this code to userspace
than change my application code.

Roadmap:

- Start with 1.5.0 plus my patches.

- Replace kernel infrastructure by corresponding userspace
functionality (e.g. kthread -> pthread, semaphore ->
pthread_mutex, kmalloc -> malloc, printk -> fprintf); copy kernel
utilities that have no actual kernel dependencies (in particular
kernel lists).

- Turn the code into a library which the application can use, so the
userspace application thread(s) will do what the (normal or RTAI)
kernel application thread(s) did and, being part of the same
process, they can access the master's data structures (e.g. PDO
memory, datagram queues) without explicit shared memory or such.
(Though the library will start some threads of its own, e.g.
master, EoE and cdev (see below), just like the master now starts
some kthreads.) This will not allow the master to stay resident
while the application is restarted, so it will need to re-scan the
bus every time. This is not a high-priority problem for me
(because we rarely restart our application code, and when we do,
the hardware reinitialization takes longer than the bus scan
anyway). If needed, one could solve this problem by putting a
layer in userspace in between which would then need to do IPC
and/or use shared memory for communication.

- Use the generic backend driver, which also shouldn't require too
many changes since it already uses a raw ethernet socket (which is
also available from userspace).

- Implement EoE via a tap/tun device.

- Replace the cdev ioctl interface by a (normal TCP) socket and
adapt the library to use it. Basically wrap each ioctl data
structure as-is together with its cmd value in a data packet.

- Omit the mmap for the cdev because I don't need it. (If someone
has a need for it, it could be supported via userspace shared
memory.)

- Omit the tty because I don't need it. (If someone has a need for
it, it could be supported via a pty.)

- Generally, make as few changes to the code as possible
(e.g. #define stuff in a common header rather than making global
replacements wherever possible).
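
A minimal sketch of the packet framing described in the cdev item above,
assuming the 32-bit cmd value is sent first (in host byte order, since
both ends run on the same machine) and the argument structure follows
as-is. struct demo_arg, DEMO_IOCTL_QUERY and both function names are
hypothetical and not taken from the Etherlab sources; only _IOWR and
_IOC_SIZE are the standard Linux ioctl macros.

```c
#include <assert.h>
#include <linux/ioctl.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ioctl argument and command, for illustration only. */
struct demo_arg {
    uint32_t slave_position;
    uint32_t value;
};
#define DEMO_IOCTL_QUERY _IOWR('E', 0x01, struct demo_arg)

/*
 * Copy cmd followed by its argument structure into buf.
 * The argument size is taken from the size field encoded in cmd.
 * Returns the packet length, or 0 if buf is too small.
 */
size_t pack_ioctl(uint8_t *buf, size_t cap, uint32_t cmd, const void *arg)
{
    size_t n = _IOC_SIZE(cmd);

    if (cap < sizeof(cmd) + n)
        return 0;
    memcpy(buf, &cmd, sizeof(cmd));
    if (n > 0)
        memcpy(buf + sizeof(cmd), arg, n);
    return sizeof(cmd) + n;
}

/*
 * Reverse of pack_ioctl(): extract cmd and the argument structure.
 * Returns the number of bytes consumed, or 0 if the packet is
 * incomplete or arg is too small.
 */
size_t unpack_ioctl(const uint8_t *buf, size_t len, uint32_t *cmd,
                    void *arg, size_t arg_cap)
{
    size_t n;

    if (len < sizeof(*cmd))
        return 0;
    memcpy(cmd, buf, sizeof(*cmd));
    n = _IOC_SIZE(*cmd);
    if (len < sizeof(*cmd) + n || arg_cap < n)
        return 0;
    if (n > 0)
        memcpy(arg, buf + sizeof(*cmd), n);
    return sizeof(*cmd) + n;
}
```

Since TCP is a byte stream, a receiver built on this would first read
the 4 cmd bytes and then exactly _IOC_SIZE(cmd) further bytes before
dispatching the request.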

Regards,
Frank

Florian Pose

2014-11-26 08:49:28 UTC

Hallo Frank,

Post by Frank Heckenbach
this is to announce my plan to port the Etherlab code to userspace.
I'll explain my reasons and roadmap below. If someone is interested
in this or has some comments, please let me know. Otherwise, I
expect to proceed on my own and publish the result on my web site
when finished.

we discussed many of your arguments before, so let me comment on them
below.

Post by Frank Heckenbach
- I had to make a number of changes to the Etherlab code to fix some
bugs and make it useable to us. The Etherlab developers are
obviously not interested in those changes, so I have to maintain
them myself. As was discussed on this list some months ago,
keeping up with newer Etherlab versions would cost me additional
maintenance and testing effort (already now since my changes are
based on 1.5.0, whereas 1.5.2 has some conflicting changes), so
I'd use my own fork of 1.5.0 which is known to work for us rather
than 1.5.2 anyway.

I wrote that I *am* willing to include your patches (I already prepared
the default branch to include them). Due to my running projects I am
a little later than expected, but I'm sure I will manage it this year.
So please be patient, this is an open-source project.

Post by Frank Heckenbach
- Keeping up with new kernel versions is also not always easy
(especially for the drivers which are patched files from the
standard kernel, but also other kernel interfaces are known to
change often), whereas userspace code is much easier to maintain
(incompatible library changes are quite rare).

The problem is that using a socket (generic driver) is not always
sufficient. My first goal is to keep stable interfaces for RTAI, Xenomai
(Kernel) and lxrt, posix (Userspace). Using a socket automatically uses
the lower network stack layers and is *not* realtime-capable up to now
(even with an RT-preempt-patched kernel). So the generic driver is one
option. It may work, but there are setups where it definitely does not.
That's why we keep the native drivers and have to stay in kernel (at
least with the driver layer).

Post by Frank Heckenbach
- So far we've been using RTAI for our realtime code. But that's
also always been a bit troublesome (kernel version dependencies,
high crash potential in case of problems, additional code with its
own set of bugs, etc.), so we'd rather try to get rid of it
anyway. Meanwhile the RT capabilities of the standard kernel have
improved in recent years, and due to the wide availability of SMP,
we can, if necessary, increase RT-ability by using CPU affinity
(reserve one CPU for RT code, leave the other CPUs for the rest --
of course, kernel code, esp. network drivers might need special
consideration here).

We also dropped RTAI a long time ago for our projects, but you have to
always be careful in what you support, because there are many users
that still require RTAI, lxrt, etc.

Post by Frank Heckenbach
- Our application code uses a lot of floating point which is
supported in RTAI (though with some quirks), not in non-RTAI
kernel mode, but of course easily in userspace.

No one forces you to have your application in kernel space. The
userspace-library offers the same API in userspace.

Post by Frank Heckenbach
- Userspace code is generally much easier to debug.
- If I had known and considered all this back then, I might have
started with other code instead of Etherlab which is already
userspace based, but has different interfaces (and possibly
different bugs). But as things are now, since my code is tightly
bound to the Etherlab interface, and well tested with (the patched
version of) it, it seems easier to port this code to userspace
than change my application code.

Your argument sounds a little as if you don't know about the
userspace library at all. Please take a look at the documentation and
the lib/ subdirectory of the stable-1.5 branch. Our realtime
applications of the last years are all in userspace.

I had the idea to pull the master statemachine out of kernel-space but I
dropped it because of the additional interface complexity coming with
it.

As long as the generic driver is not completely realtime capable and the
kernel interfaces shall be maintained, there are strong arguments to
keep a significant part of the master code in kernel-space.

--
Best regards,
Florian Pose

http://etherlab.org

Armin Steinhoff

2014-11-26 22:18:54 UTC

Post by Florian Pose
Hallo Frank,
[ clip]
As long as the generic driver is not completely realtime capable and the
kernel interfaces shall be maintained, there are strong arguments to
keep a significant part of the master code in kernel-space.

I'm not convinced that a significant part of the master code must be
in kernel space!
I recently integrated the openPowerlink stack with the
userspace driver for Intel's i210 board as used within the DPDK project.
It is working in real time under PREEMPT_RT Linux, after solving problems
with the different buffer systems.

A similar solution should also be possible with EtherCAT ..

Cheers

Armin Steinhoff

Frank Heckenbach

2015-01-16 13:00:18 UTC

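Post by Florian Pose
I wrote that I *am* willing to include your patches (I already prepared
the default branch to include them), but due to my running projects I am
a little later than expected, but I'm sure I will manage it this year.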

Just out of curiosity, did you actually do it? I searched the
repository ("default", "stable-1.5") for things from my patches, but
didn't find anything. (Or did I get confused by the versioning
again?)

Well, it's probably too late now anyway. I'll need to really start
my project next week.

We're also upgrading to a new kernel version and everything, which
means I'd first have to build a new RTAI kernel, which has always
been a bit problematic in my experience. OTOH, Martin Troxler sent
me patches from his started userspace port. So even if I had a
working 1.5.2 with my patches integrated, the userspace port in fact
seems the easier option to me now (especially in the long run).

Post by Florian Pose
So please be patient, this is an open-source project.

This means there are always two options, integrate or fork. Well ...

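Post by Florian Pose
Using a socket automatically uses
the lower network stack layers and is *not* realtime-capable up to now
(even with an RT-preempt-patched kernel).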

Latency was my main worry, so I did some timing tests. I used a
simple C program that sends and receives valid trivial EtherCAT
packets. I ran it in soft-realtime with CPU affinity (both of which
proved necessary to get the results I did). During the test I
exerted heavy load on the CPU, disk I/O and another ethernet
interface.
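
A minimal sketch of what "soft-realtime with CPU affinity" could look
like for such a test program, assuming Linux, SCHED_FIFO as the
scheduling policy and memory locking to avoid page faults. The function
name and the choice of policy are assumptions; the original test program
is not shown in this thread.

```c
#define _GNU_SOURCE /* for CPU_SET() and sched_setaffinity() */
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Pin the calling thread to one CPU, switch it to SCHED_FIFO and lock
 * the process memory. Returns 0 on success or a negative errno value.
 */
int enter_soft_realtime(int cpu, int priority)
{
    cpu_set_t set;
    struct sched_param sp;

    /* Validate both arguments first so a bad call changes nothing. */
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -EINVAL;
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO))
        return -EINVAL;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -errno;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return -errno; /* typically EPERM without realtime privileges */

    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}
```

Keeping other tasks off the chosen CPU is a separate step (for example
the isolcpus= boot parameter or cpusets) and is not covered here.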

My result was that with very small packets, I can get cycle times of
1500ms without overruns. As the size of the packet (or packets if
larger than MTU) per cycle increases, so does the cycle time needed to
run reliably, and with really large packets I can get cycle times such
that a bit more than 50% of theoretical bandwidth is used in either
direction (which seems quite reasonable since it's what I've
experienced and seen recommended for other communication protocols,
and it also proves full-duplex works, otherwise no more than 50%
would be possible).

Our project runs at 2000ms (500Hz), and at that rate, I could get
packets of 5KB/cycle without overruns which is way above what we
require. So the userspace port will only be for "slow" cycles (up to
500Hz), but that's what we need. Maybe in a few years, the standard
kernel's RT features will improve and the userspace code will allow
for faster cycle times without many changes.
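
A quick check of these figures, as a sketch under stated assumptions: a
100 Mbit/s link (the EtherCAT line rate), 5 KB taken as 5000 bytes,
frame and protocol overhead ignored, and the cycle time read as 2000
microseconds, which is what 500 Hz implies. The function names are made
up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 100 Mbit/s link, i.e. 100 bits per microsecond. */
#define LINK_BITS_PER_US 100u

/* Cycle frequency in Hz for a cycle time given in microseconds. */
unsigned cycle_rate_hz(unsigned cycle_us)
{
    assert(cycle_us > 0);
    return 1000000u / cycle_us;
}

/*
 * Share of the link used per direction, in permille, when
 * bytes_per_cycle bytes are sent every cycle_us microseconds.
 */
unsigned link_load_permille(unsigned bytes_per_cycle, unsigned cycle_us)
{
    uint64_t bits, capacity;

    assert(cycle_us > 0);
    bits = (uint64_t)bytes_per_cycle * 8u;
    capacity = (uint64_t)cycle_us * LINK_BITS_PER_US;
    return (unsigned)(bits * 1000u / capacity);
}
```

With these assumptions, link_load_permille(5000, 2000) evaluates to 200,
i.e. roughly 20% of the link in each direction, which is consistent with
staying below the 50% mark mentioned above.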

Post by Florian Pose
We also dropped RTAI a long time ago for our projects, but you have to
always be careful in what you support, because there are many users
that still require RTAI, lxrt, etc.

I'm not criticizing your choice to still support RTAI, just saying
that I hope I won't need it anymore.

Post by Florian Pose

Post by Frank Heckenbach
- Our application code uses a lot of floating point which is
supported in RTAI (though with some quirks), not in non-RTAI
kernel mode, but of course easily in userspace.

No one forces you to have your application in kernel space. The
userspace-library offers the same API in userspace.

I know about the userspace library. But moving to userspace is not
my main goal. My main goal is better maintainability for my
application, and getting rid of kernel dependencies is a step
towards this goal. Using the kernelspace Etherlab code with a
userspace application wouldn't help in any significant way since I'd
have the same amount of kernel dependencies (my application does not
contain many).

Besides, I'm not sure whether to trust the userspace library and the
cdev it uses. Don't take this personally, but given the number of
bugs I've found and fixed in the code, and the fact that the two
versions sometimes use parallel code to achieve the same things (cf.
my comment about my patch #27), you must understand that I'm at
least a bit skeptical. And even though I didn't use it much, I did
already seem to find some locking bugs in the cdev code (see my
patch #19). So realistically, when using this for my application,
I'd have to expect a long time of testing before I could consider it
stable ...

Regards,
Frank

--
Dipl.-Math. Frank Heckenbach <***@fh-soft.de>
Stubenlohstr. 6, 91052 Erlangen, Germany, +49-9131-21359
Systems Programming, Software Development, IT Consulting

Gavin Lambert

2015-01-18 22:39:15 UTC

Post by Frank Heckenbach
My result was that with very small packets, I can get cycle times of
1500ms without overruns. As the size of the packet (or packets if
larger than MTU) per cycle increases, so does the cycle time to run
reliably, and with really large packets I can get cycle times such
that a bit more than 50% of theoretical bandwidth is used in either
direction (which seems quite reasonable since it's what I've
experienced and seen recommended for other communication protocols,
and it also proves full-duplex works, otherwise no more than 50%
would be possible).
Our project runs at 2000ms (500Hz), and at that rate, I could get
packets of 5KB/cycle without overruns which is way above what we
require. So the userspace port will only be for "slow" cycles (up to
500Hz), but that's what we need. Maybe in a few years, the standard
kernel's RT features will improve and the userspace code will allow
for faster cycle times without many changes.

I assume you meant microseconds here, which are usually shortened to µs or
us, not ms (which is milliseconds). Cycle times of 1500ms would be quite
bad for most applications. :)

Post by Frank Heckenbach
I know about the userspace library. But moving to userspace is not
my main goal. My main goal is better maintainability for my
application, and getting rid of kernel dependencies is a step
towards this goal. Using the kernelspace Etherlab code with a
userspace application wouldn't help in any significant way since I'd
have the same amount of kernel dependencies (my application does not
contain many).

Given that one of your requirements (judging from your previous patches) is
EoE support, that might be tricky without kernel support.

Frank Heckenbach

2015-01-18 22:43:28 UTC

Post by Gavin Lambert
I assume you meant microseconds here, which are usually shortened to µs or
us, not ms (which is milliseconds). Cycle times of 1500ms would be quite
bad for most applications. :)

Yes, of course, sorry.

Post by Gavin Lambert
Given that one of your requirements (judging from your previous patches) is
EoE support, that might be tricky without kernel support.

Why? As I wrote, I plan to use a tap/tun device for that.
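
A minimal sketch of opening such a device from userspace, assuming a tap
(layer 2) interface is used because EoE carries whole Ethernet frames.
The function name is hypothetical; /dev/net/tun, TUNSETIFF, IFF_TAP and
IFF_NO_PI are the standard Linux TUN/TAP interface. Creating the
interface normally requires CAP_NET_ADMIN.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/*
 * Create (or attach to) the tap interface called name and return its
 * file descriptor, or a negative errno value. Each read() on the
 * descriptor then yields one Ethernet frame the host wants to send,
 * and each write() injects one frame as if it had been received.
 */
int open_tap(const char *name)
{
    struct ifreq ifr;
    int fd, err;

    /* Reject names that do not fit before touching the device. */
    if (name == NULL || strlen(name) >= IFNAMSIZ)
        return -EINVAL;

    fd = open("/dev/net/tun", O_RDWR);
    if (fd < 0)
        return -errno;

    memset(&ifr, 0, sizeof(ifr));
    ifr.ifr_flags = IFF_TAP | IFF_NO_PI; /* raw frames, no extra header */
    memcpy(ifr.ifr_name, name, strlen(name) + 1);

    if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
        err = errno;
        close(fd);
        return -err;
    }
    return fd;
}
```

An EoE thread could then shuttle frames between this descriptor and the
slave's EoE mailbox, one tap interface per EoE-capable slave.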

Regards,
Frank

Jeroen Van den Keybus

2014-11-26 09:38:51 UTC

Dear Frank,

Post by Frank Heckenbach
The Etherlab developers are
obviously not interested in those changes, so I have to maintain
them myself.

I'm not sure. They are rather in a continuous state of business, I would
say.

Post by Frank Heckenbach
- If I had known and considered all this back then, I might have
started with other code instead of Etherlab which is already
userspace based, but has different interfaces (and possibly
different bugs).

It is often instructive to review why you made the initial choice. You
could be forgetting important positive arguments. For me, that would
include the fact that Etherlab is a clean, well-structured project which
has a sizeable user group.

Post by Frank Heckenbach
But as things are now, since my code is tightly
bound to the Etherlab interface, and well tested with (the patched
version of) it, it seems easier to port this code to userspace
than change my application code.

If I get it correctly, you consider porting your code that is running in
kernel space to user space. I would say that, depending on your platform,
that conversion is a no-brainer and Etherlab has little to do with that.

Post by Frank Heckenbach
- Replace kernel infrastructure by corresponding userspace
functionality (e.g. kthread -> pthread, semaphore ->
pthread_mutex, kmalloc -> malloc, printk -> fprintf); copy kernel
utilities that have no actual kernel dependencies (in particular
kernel lists).

Have you considered using Xenomai ? It has a POSIX skin and, since version
3, you should be able to run under RT-Preempt as well as the native I-pipe
using the same code. It is a very high quality project that is well
maintained.

I do not know what your application is and how time-critical it is, but you
will occasionally have to deal with increased latencies and I fear that
dealing with these occurrences may cost you much more in terms of
catch-coding than selecting a real-time environment in the first place.

Post by Frank Heckenbach
- Use the generic backend driver, which also shouldn't require too
many changes since it already uses a raw ethernet socket (which is
also available from userspace).

The raw ethernet socket has limited zero-copy capability and not every
driver is optimized for it. It's great (and designed) for capturing massive
amounts of traffic (mostly Wireshark), but low latency was never the
objective. Be warned.
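
For reference, a minimal sketch of the kind of raw ethernet socket under
discussion, assuming an AF_PACKET socket bound to one interface and
filtered on the EtherCAT EtherType 0x88A4. The function and macro names
are made up for this example and this is not the generic driver's code.
Opening such a socket normally requires CAP_NET_RAW.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <net/if.h>
#include <netpacket/packet.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define ECAT_ETHERTYPE 0x88A4 /* EtherType used by EtherCAT frames */

/*
 * Open a raw packet socket that only sees EtherCAT frames on the given
 * interface. Returns the socket descriptor or a negative errno value.
 */
int open_ethercat_socket(const char *ifname)
{
    struct sockaddr_ll addr;
    unsigned int ifindex;
    int fd, err;

    fd = socket(AF_PACKET, SOCK_RAW, htons(ECAT_ETHERTYPE));
    if (fd < 0)
        return -errno;

    ifindex = if_nametoindex(ifname);
    if (ifindex == 0) {
        err = errno ? errno : ENODEV;
        close(fd);
        return -err;
    }

    /* Bind to the interface so frames from other adapters are ignored. */
    memset(&addr, 0, sizeof(addr));
    addr.sll_family = AF_PACKET;
    addr.sll_protocol = htons(ECAT_ETHERTYPE);
    addr.sll_ifindex = (int)ifindex;
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        err = errno;
        close(fd);
        return -err;
    }
    return fd;
}
```

Every send() and recv() on this descriptor still passes through the
kernel's packet layer and the regular NIC driver, which is where the
latency concern raised above comes from.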

I'm trying to convince you to reconsider your choice to fork, because I
think that the choice may not be based on the correct arguments, but also
because Etherlab would benefit from as large a user base as it can get. In
a sense, it is already a niche project as it is.

Jeroen.

Frank Heckenbach

2015-01-16 13:00:24 UTC

Post by Jeroen Van den Keybus
Post by Frank Heckenbach
The Etherlab developers are
obviously not interested in those changes, so I have to maintain
them myself.
I'm not sure. They are rather in a continuous state of business, I would
say.

OK, I shouldn't speculate about motivations, but the result to me is
the same.

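Post by Jeroen Van den Keybus
Post by Frank Heckenbach
- If I had known and considered all this back then, I might have
started with other code instead of Etherlab which is already
userspace based, but has different interfaces (and possibly
different bugs).
It is often instructive to review why you made the initial choice. You
could be forgetting important positive arguments. For me, that would
include the fact that Etherlab is a clean, well-structured project which
has a sizeable user group.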

I suppose so (though I never researched the alternatives too closely
-- at the time, RTAI seemed the natural choice and Etherlab was the
only one that supported it). OTOH, considering the amount of time
(several weeks in total) I spent finding, debugging and fixing the
various bugs, I could probably also have gotten one of the
alternatives to a useful state, even if they might not have been so
clean and well-structured.

But all that's academic now. I don't really consider porting my
application to another code base. Porting Etherlab to userspace
seems much easier in comparison: Most of the changes will (almost :)
follow the old saying "if it compiles, ship it", since when
replacing/adjusting kernel functionality, compile errors will show
immediately what's missing; whereas the complex logic (e.g. my
changes WRT mailbox dispatching, but also the master and slave logic
of the original code) will stay mostly untouched.
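
A minimal sketch of the common-header approach, assuming that semaphores
are only ever used as mutexes (locked and released by the same thread)
and that log output may simply go to stderr. The shims below are
illustrative stand-ins and not code from the port; counter_add() is a
made-up piece of kernel-style code that compiles unchanged against them.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Memory: GFP flags have no userspace meaning and are ignored. */
#define GFP_KERNEL 0
#define kmalloc(size, flags) malloc(size)
#define kfree(ptr) free(ptr)

/* Logging: the level prefix stays in the string, output goes to stderr. */
#define KERN_ERR "<3>"
#define printk(...) fprintf(stderr, __VA_ARGS__)

/* Locking: a semaphore that is only used as a mutex. */
struct semaphore {
    pthread_mutex_t mutex;
};

static inline void sema_init(struct semaphore *sem, int val)
{
    assert(val == 1); /* this sketch only covers the binary case */
    pthread_mutex_init(&sem->mutex, NULL);
}

static inline void down(struct semaphore *sem)
{
    pthread_mutex_lock(&sem->mutex);
}

static inline void up(struct semaphore *sem)
{
    pthread_mutex_unlock(&sem->mutex);
}

/* Kernel-style code that builds unchanged against the shims above. */
static struct semaphore counter_sem;
static int counter;

int counter_add(int delta)
{
    int *scratch = kmalloc(sizeof(*scratch), GFP_KERNEL);
    int result;

    if (!scratch) {
        printk(KERN_ERR "counter_add: out of memory\n");
        return -1;
    }
    down(&counter_sem);
    *scratch = counter + delta;
    counter = *scratch;
    result = counter;
    up(&counter_sem);
    kfree(scratch);
    return result;
}
```

Anything the header does not yet cover shows up as a compile or link
error, which matches the "compile errors will show immediately what's
missing" workflow described above.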

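Post by Jeroen Van den Keybus
If I get it correctly, you consider porting your code that is running in
kernel space to user space. I would say that, depending on your platform,
that conversion is a no-brainer and Etherlab has little to do with that.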

No, I actually plan to port the whole of Etherlab to userspace.
That's a little more involved. I think I can handle the interface
differences. The only thing I was really unsure about is packet
latency (which was of course the only reason to use RTAI in the
first place back then).

Post by Jeroen Van den Keybus
Post by Frank Heckenbach
- Use the generic backend driver, which also shouldn't require too
many changes since it already uses a raw ethernet socket (which is
also available from userspace).
The raw ethernet socket has limited zero-copy capability and not every
driver is optimized for it.

Do you know which drivers work best with it?

(For a fair comparison, I should note that the non-generic Etherlab
drivers also limit the choice of network adapters. For me, the only
viable choice ATM is e1000; others report rtl8169 works better, but
in my initial tests it didn't; though this might have been due to
the other bugs I found later; after I found and fixed the bugs in
the e1000 driver, it's been working well for me, and I didn't go
back to rtl8169. In any case, the choice of network adapters for
EtherCAT is quite limited, and as long as at least one popular model
is among them, I can live with that.)

Post by Jeroen Van den Keybus
It's great (and designed) for capturing massive
amounts of traffic (mostly Wireshark), but low latency was never the
objective. Be warned.

Please correct me if I'm wrong, but isn't zero-copy about
performance rather than latency? Performance is not really an issue
for me (our peak load is well below 1 MByte/s), and a fixed latency
overhead due to copying shouldn't matter much either -- in other
words, I don't care if the CPU is busy dealing with the network
adapter while sending out and receiving packets; it would be idle
during this time otherwise; my application code runs in the time
between.

What worried me more is network packets delayed by other
(non-cyclic, unplanned) activity such as network traffic on another
adapter, disk I/O, etc., but I did some timing tests which look
alright (see my answer to Florian).

Post by Jeroen Van den Keybus
I'm trying to convince you to reconsider your choice to fork, because I
think that the choice may not be based on the correct arguments, but also
because Etherlab would benefit from as large a user base as it can get. In
a sense, it is already a niche project as it is.

This sounds nice in theory, but in practice my experience has been
a little different.

What good is a large user base? Right, you get more tests, bug
reports and fixes. But if those get ignored, well ...

Regards,
Frank
