Re: [openss7] SCTP Proto
I have included some comments. I am going to compile all my settings,
and send another message either today or tomorrow. I will look into
releasing the code. Some really good comments in here.
Thanks,
Chuck
"Brian F. G. Bidulock" wrote:
>
> Chuck,
>
> First: Thank you for performing this testing!
>
> Yes the results are very interesting. I have a number of questions and
> comments:
>
> Q: Would it be possible for me to post up your results on the
> OpenSS7 site? (Our list server rejected the size of your
> attachment as too large and I am sure that others would like to
> look at the results.)
I will check with my manager.
>
> Q: Can you share the test code? At least the portion which
> interfaces directly with the socket, sends and receives data and
> makes the time measurements?
I will check with my manager.
>
> Q: How does the test application operate? Does it send one
> forward message (timestamping) and then poll for an
> acknowledgment (timestamping)? Or, does it send a stream of
> forward messages (timestamping) and then correlate the
> (timestamped) responses?
It grabs a timestamp, sends a message of n bytes, waits for an
acknowledgement, then grabs another timestamp. It takes the difference,
records the result, and repeats.
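A minimal sketch of that loop, for readers who want to reproduce the measurement. It is not the test code discussed in this thread: it uses TCP over loopback from the Python standard library, and the names `recv_exact`, `ack_server`, `measure_rtts` and `run_loopback` are made up for illustration.

```python
import socket
import threading
import time


def recv_exact(sock, n):
    """Read exactly n bytes, since a stream socket may deliver fewer per call."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def ack_server(listener, size, rounds):
    """Accept one connection and answer each size-byte message with a 1-byte ack."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(rounds):
            recv_exact(conn, size)
            conn.sendall(b"\x01")


def measure_rtts(host, port, size, rounds):
    """Timestamp, send size bytes, wait for the ack, timestamp, record the diff."""
    payload = b"x" * size
    rtts_usec = []
    with socket.create_connection((host, port)) as sock:
        for _ in range(rounds):
            start = time.perf_counter()
            sock.sendall(payload)
            recv_exact(sock, 1)
            rtts_usec.append((time.perf_counter() - start) * 1e6)
    return rtts_usec


def run_loopback(size=64, rounds=100):
    """Run the ack server in a thread and measure against it on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(
        target=ack_server, args=(listener, size, rounds), daemon=True
    )
    server.start()
    try:
        return measure_rtts("127.0.0.1", port, size, rounds)
    finally:
        server.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    rtts = run_loopback(size=64, rounds=100)
    print("min %.1f usec, max %.1f usec" % (min(rtts), max(rtts)))
```

Because each round waits for the reply before sending again, only one message is ever in flight, so the numbers reflect per-message latency rather than throughput.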
>
> Q: What were the settings of the various SCTP configuration options
> and socket options for the test? What were the settings of the TCP
> configuration options and socket options for the test? (e.g.
> was TCP_NODELAY set on the TCP socket, was
> CONFIG_SCTP_SLOW_VERIFICATION set? etc.)
I am going to go through the settings much more closely today, so I
will keep you posted.
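For reference, a minimal sketch of how TCP_NODELAY can be set on a TCP socket and read back to confirm it. It says nothing about what the test in this thread actually used; the helper names `make_tcp_socket` and `nodelay_enabled` are hypothetical.

```python
import socket


def make_tcp_socket(nodelay=True):
    """Create a TCP socket and set TCP_NODELAY (disables the Nagle algorithm)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1 if nodelay else 0)
    return sock


def nodelay_enabled(sock):
    """Read the option back so the effective setting can be logged with the results."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = make_tcp_socket(nodelay=True)
    print("TCP_NODELAY enabled:", nodelay_enabled(s))
    s.close()
```

Logging the value read back from the socket, rather than the value requested, makes the reported settings easier to trust when comparing runs.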
>
> Q: Is it possible to get Ethereal dumps generated by a third
> box snooping between the two?
Good idea! I will try to get that set up by the end of the week.
>
> C: Although using a single-byte reply is quite applicable to TCP,
> it is a rather unfair comparison for SCTP. TCP can place a
> 1-byte acknowledgement into a 21-byte IP payload. SCTP, when
> bundling SACK with DATA chunks, requires a 12-byte message header,
> a 12-byte SACK chunk, a 12-byte DATA chunk header and 4 bytes of
> (padded) data, for a grand total of a 40-byte IP payload.
>
> That is, it is far more complicated for SCTP to generate a one
> byte reply than it is for TCP. A fairer comparison might be an
> echo test where the receiving side merely echoes the data back
> to the originator.
I will look into this. Seems very reasonable.
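The arithmetic in the comment above can be sketched as follows. The constants are the sizes as quoted in the message; the 20-byte TCP header is inferred from the 21-byte figure and assumes no TCP options. RFC 2960 gives 16 bytes as the minimum size for both the SACK chunk and the DATA chunk header, so the sizes are left as parameters. The function names are made up for illustration.

```python
TCP_HEADER = 20              # assumed: minimal TCP header, no options

# Sizes in bytes as quoted in the message above (adjust if needed).
SCTP_COMMON_HEADER = 12
SCTP_SACK_CHUNK = 12
SCTP_DATA_CHUNK_HEADER = 12


def pad4(n):
    """Round n up to a multiple of 4, as SCTP pads chunk data."""
    return (n + 3) // 4 * 4


def tcp_ip_payload(data_len):
    """IP payload size of a TCP segment carrying data_len bytes."""
    return TCP_HEADER + data_len


def sctp_ip_payload(data_len, sack=SCTP_SACK_CHUNK,
                    data_header=SCTP_DATA_CHUNK_HEADER, bundle_sack=True):
    """IP payload size of an SCTP packet with one DATA chunk and an optional SACK."""
    total = SCTP_COMMON_HEADER + data_header + pad4(data_len)
    if bundle_sack:
        total += sack
    return total


if __name__ == "__main__":
    print(tcp_ip_payload(1))                            # 21
    print(sctp_ip_payload(1))                           # 40
    print(sctp_ip_payload(1, sack=16, data_header=16))  # 48
```

With either set of sizes the one-byte reply costs SCTP roughly twice the bytes it costs TCP, which is the point of the suggested echo test.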
>
> C: In the test results for SCTP, it appears that 50% of the RTTs
> were exactly 1000 usecs. This Dirac delta function in the
> results makes me suspect a bug in the code, the test code or the
> method of generating the graphs. It is even more suspicious
> that this spike at 1000 occurs at all frame sizes.
>
> I seriously doubt that one could write a software clock that was
> this accurate at 1 MHz.
That is in my code: there was a max of 1000 usecs, because I wasn't
expecting much over that. If I had set the code to output more
outliers, you would see a very long list of times over 1000 usecs. I
will try it again with a higher setting.
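A small sketch of why a maximum like this produces a spike: every sample above the cap lands on the cap itself. The `histogram` function, its bin width and the sample values are illustrative only and are not taken from the test results.

```python
from collections import Counter


def histogram(samples_usec, bin_width=100, cap=1000):
    """Bin RTT samples; if cap is set, clamp larger samples down to it."""
    counts = Counter()
    for sample in samples_usec:
        value = sample if cap is None else min(sample, cap)
        counts[int(value // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    samples = [300, 350, 900, 1500, 2200, 4000]  # made-up values
    print(histogram(samples))            # {300: 2, 900: 1, 1000: 3}
    print(histogram(samples, cap=None))  # {300: 2, 900: 1, 1500: 1, 2200: 1, 4000: 1}
```

Raising or removing the cap spreads the spike back out into the tail it was hiding.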
>
> C: The extremely large variances in the RTT make me wonder
> whether SCTP is getting itself into a retransmission scenario.
> Ethereal traces would be very helpful.
>
> C: It is interesting that the SCTP minimums are consistently about
> double that of TCP's minimums. SCTP SACKs only every second data
> chunk received unless it sends data. SACKs are bundled ahead of
> DATA in SCTP messages. The receiving stack may be introducing a
> delay by processing the SACK before the DATA. Again, Ethereal
> dumps and the testing code would help here. If you are sending
> one DATA chunk and waiting for the one byte reply, this might be
> exactly what is happening.
>
> C: The kernel crash above 512 bytes is a good debugging lead. I will
> chase that one down and release a patch. It would be very
> interesting to see comparisons with TCP over 1024 (TCP's default
> MSS) when TCP is forced to fragment, or comparisons of packet
> sizes greater than the MTU.
512 is not a hard limit, but as the packet sizes get larger we run into
more problems.
>
> Overall your testing indicates that there might be some problems in poll
> handling, sleeping or waking processes, acknowledgement handling, etc.,
> but some strong numbers down at the 300 usec side of the histograms
> indicate that it is quite possible to get this SCTP stack performing as
> well as TCP and even outperforming it at larger message sizes. A little
> more information (test code, Ethereal dumps) would make things quite
> easy to chase down.
>
> There are about 5 places in the code that I know of where significant
> speed improvements can be made once these quirks are found. They are:
>
> 1) Rework the copy_and_checksum_from_user. I turned it off in
> France due to some problems in generating incorrect
> checksums. As it stands, data is copied from the user and
> the checksum is recalculated on the data each time that it
> is retransmitted.
>
> 2) Rework cloning of sk_buffs when bundling DATA chunks. As
> it stands, data is copied too many times.
>
> 3) Place stream data structures into kmem caches. Currently
> the stream data is kmalloc'ed and kfree'd rather than being
> placed in a hardware aligned kmem cache. This data
> structure is accessed on every DATA chunk transmission and
> should really be cached.
>
> 4) There is really no need to perform slow verification. The
> option should be removed.
>
> 5) The module is compiled with -O2 and I'm not sure that the
> compiler is inlining everything that needs to be inlined. I
> can check this and rewrite as macros those things which are
> not being inlined. I particularly suspect the
> established fast path for receive data.
>
> In France we were hoping to get the conformance correct before
> addressing performance. I'm sure that it will not take too much to get
> this stack running as fast as you would like.
>
> --Brian
>
> Chuck Winters wrote: Tue, 29 May 2001 18:10:45
> >
> > Hey,
> > I recompiled my kernel to use only one processor. I have been
> > doing some preliminary testing of the protocol, and have found it to be
> > quite slow. I am getting an average RTT of about 1900 microseconds. I am
> > including two preliminary tests, one on TCP and one on SCTP.
> > You will notice that the SCTP one only went to 512, but that is only
> > because the kernel crashes every time at that point. These are only
> > preliminary, but I thought they may be interesting.
> >
> > Thanks,
> > Chuck
> >
> > --
> > Chuck Winters | Email: cwinters@atl.lmco.com
> > Distributed Processing Laboratory | Phone: 856-338-3987
> > Lockheed Martin Advanced Technology Labs |
> > 1 Federal St - A&E-3W |
> > Camden, NJ 08102 |
>
> --
> Brian F. G. Bidulock ¦ The reasonable man adapts himself to the ¦
> bidulock@openss7.org ¦ world; the unreasonable one persists in ¦
> http://www.openss7.org/ ¦ trying to adapt the world to himself. ¦
> ¦ Therefore all progress depends on the ¦
> ¦ unreasonable man. -- George Bernard Shaw ¦
--
Chuck Winters | Email: cwinters@atl.lmco.com
Distributed Processing Laboratory | Phone: 856-338-3987
Lockheed Martin Advanced Technology Labs |
1 Federal St - A&E-3W |
Camden, NJ 08102 |