Opened 8 years ago

Closed 8 years ago

#672 closed defect (fixed)

UDP crashing network stack

Reported by: Stanislav Gálfy Owned by:
Priority: major Milestone:
Component: helenos/net/udp Version: mainline
Keywords: Cc:
Blocker for: Depends on:
See also:

Description

When there is no UDP association on which udp message can be received the whole receiving part of network stack crashes. The reason is assertion at line 413 in /uspace/srv/net/udp/assoc.c in function udp_assoc_find_ref.

static udp_assoc_t *udp_assoc_find_ref(inet_ep2_t *epp)
{

int rc;
void *arg;
udp_assoc_t *assoc;

log_msg(LOG_DEFAULT, LVL_DEBUG, "udp_assoc_find_ref(%p)", epp);
fibril_mutex_lock(&assoc_list_lock);

rc = amap_find_match(amap, epp, &arg);
if (rc != EOK) {

assert(rc == ENOMEM);

fibril_mutex_unlock(&assoc_list_lock);
return NULL;

}

assoc = (udp_assoc_t *)arg;
udp_assoc_addref(assoc);

fibril_mutex_unlock(&assoc_list_lock);
return assoc;

}

Here amap_find_match returns ENOENT, so next assert(rc == ENOMEM); crashes udp module and EHANGUP is propagated all the way down to driver, stopping the invocation of callbacks when receiving.

This is also the cause of bug #667 ( http://www.helenos.org/ticket/667 ) where unix on the other side was broadcasting some udp packets that crashed the network stack, leading to ping not working afterwards.

Attachments (1)

Screenshot from 2017-02-20 23-07-28.png (231.1 KB ) - added by Stanislav Gálfy 8 years ago.
UDP log screenshot

Download all attachments as: .zip

Change History (13)

comment:1 by Jakub Jermář, 8 years ago

I wonder if UDP broadcast has to do anything with this. Also, can you collect some additional debugging info by increasing log level for udp via:

# logset udp 5

before reproducing the issue and attaching /log/udp afterwards?

by Stanislav Gálfy, 8 years ago

UDP log screenshot

comment:2 by Stanislav Gálfy, 8 years ago

Log added in attachment. I think the bug is obvious from these three lines:

rc = amap_find_match(amap, epp, &arg);
if (rc != EOK) {
    assert(rc == ENOMEM);
...

rc = amap_find_match(amap, epp, &arg); - amap_find_match will not find association matching endpoint and returns ENOENT (endpoint is given by received packet, associations are given by clients listening to UDP service). This corresponds to the last line in log - [udp] debug2: No match.
if (rc != EOK) { - rc is ENOENT, so we go inside the if
assert(rc == ENOMEM); - here the assertion crashes module if rc is anything but ENOMEM and rc is ENOENT.

Everything works fine after commenting out the line with assertion. Having the line assert(rc == ENOMEM); there makes no sense to me.

Last edited 8 years ago by Stanislav Gálfy (previous) (diff)

in reply to:  2 ; comment:3 by Jakub Jermář, 8 years ago

Replying to StanislavGalfy:

I think the bug is obvious from these three lines:

Well the immediate cause is indeed obvious, but the question is why the association wasn't found. For example, is this because the local address is 255.255.255.255?

Everything works fine after commenting out the line with assertion. Having the line assert(rc == ENOMEM); there makes no sense to me.

It may be a superfluous assert, but we first need to understand what went wrong and why there was an assumption that the association must always be found.

in reply to:  3 ; comment:4 by Stanislav Gálfy, 8 years ago

Replying to jermar:

Well the immediate cause is indeed obvious, but the question is why the association wasn't found. For example, is this because the local address is 255.255.255.255?

More important is the port, in this case 68. Association is created when some client asks for it. Then it is added to map. When creating association, client can specify local port, local address, remote port, remote address. Here is api for creating association from /uspace/lib/c/inet/udp.c :

/** Create new UDP association.
 *
 * Create a UDP association that allows sending and receiving messages.
 *
 * @a epp may specify remote address and port, in which case only messages
 * from that remote endpoint will be received. Also, that remote endpoint
 * is used as default when @c NULL is passed as destination to
 * udp_assoc_send_msg.
 *
 * @a epp may specify a local link or address. If it does not, the association
 * will listen on all local links/addresses. If @a epp does not specify
 * a local port number, a free dynamic port number will be allocated.
 *
 * The caller is informed about incoming data by invoking @a cb->recv_msg
 *
 * @param udp    UDP client
 * @param epp    Internet endpoint pair
 * @param cb     Callbacks
 * @param arg    Argument to callbacks
 * @param rassoc Place to store pointer to new association
 *
 * @return EOK on success or negative error code.
 */
int udp_assoc_create(udp_t *udp, inet_ep2_t *epp, udp_cb_t *cb, void *arg,
    udp_assoc_t **rassoc)
...

If there is no client above udp, that created association with particular "local port" or "local port/local address" or "local port/local address/remote port/remote address" and packet addressed to it is received, this bug occurs. Surely it is not meant to take down udp module and half the network stack with it when there is no recipient for udp message above. The message should be just simply discarded which is what happens when the assert is removed.

in reply to:  4 ; comment:5 by Jakub Jermář, 8 years ago

Replying to StanislavGalfy:

If there is no client above udp, that created association with particular "local port" or "local port/local address" or "local port/local address/remote port/remote address" and packet addressed to it is received, this bug occurs. Surely it is not meant to take down udp module and half the network stack with it when there is no recipient for udp message above. The message should be just simply discarded which is what happens when the assert is removed.

Please excuse my ignorance, but how may this situation legally happen that the client is receiving UDP packets it didn't ask for? Am I missing something? The only situation I could imagine is a broadcast, but that does not reproduce it for me.

comment:6 by Jakub Jermář, 8 years ago

Okay, here are simplified steps to reproduce. Start HelenOS via the tools/ew.py script. In HelenOS, start netecho:

# netecho -l 8080

From your favorite desktop, run nc:

$ echo foo | nc -u 255.255.255 8081

in reply to:  5 ; comment:7 by Stanislav Gálfy, 8 years ago

Replying to jermar:

Please excuse my ignorance, but how may this situation legally happen that the client is receiving UDP packets it didn't ask for? Am I missing something? The only situation I could imagine is a broadcast, but that does not reproduce it for me.

When a udp packet is received it goes: driver→ethip→inetsrv(here is decided if it is addressed to one of local addresses, 255.255.255.255 being broadcast qualifies)→udp(here is decided if there exists a client listening for messages addressed to endpoint - local port + local address - where the message is addressed). By client i mean some application using udp server. Analogicaly ethip is client of driver, inetserv is client of ethip, udp is client of inetsrv. The problem is quite opposite - bug occurs when there does not exist a client to receive the message

Last edited 8 years ago by Stanislav Gálfy (previous) (diff)

in reply to:  7 ; comment:8 by Jakub Jermář, 8 years ago

Replying to StanislavGalfy:

When a udp packet is received it goes: driver→ethip→inetsrv(here is decided if it is addressed to one of local addresses, 255.255.255.255 being broadcast qualifies)→udp(here is decided if there exists a client listening for messages addressed to endpoint

Got it. And udp_assoc_find_ref is the function where this is decided. Anyway, it seems to me that the assert is not superfluous, but contains a typo. IMHO it should read:

=== modified file 'uspace/srv/net/udp/assoc.c'
--- uspace/srv/net/udp/assoc.c	2015-06-07 12:36:44 +0000
+++ uspace/srv/net/udp/assoc.c	2017-02-21 20:12:22 +0000
@@ -410,7 +410,7 @@
 
 	rc = amap_find_match(amap, epp, &arg);
 	if (rc != EOK) {
-		assert(rc == ENOMEM);
+		assert(rc == ENOENT);
 		fibril_mutex_unlock(&assoc_list_lock);
 		return NULL;
 	}

Asserting ENOMEM does not make any sense here, but ENOENT is perfectly ok.

in reply to:  8 ; comment:9 by Stanislav Gálfy, 8 years ago

Replying to jermar:

Got it. And udp_assoc_find_ref is the function where this is decided.

Correct.

Anyway, it seems to me that the assert is not superfluous, but contains a typo. IMHO it should read:

=== modified file 'uspace/srv/net/udp/assoc.c'
--- uspace/srv/net/udp/assoc.c	2015-06-07 12:36:44 +0000
+++ uspace/srv/net/udp/assoc.c	2017-02-21 20:12:22 +0000
@@ -410,7 +410,7 @@
 
 	rc = amap_find_match(amap, epp, &arg);
 	if (rc != EOK) {
-		assert(rc == ENOMEM);
+		assert(rc == ENOENT);
 		fibril_mutex_unlock(&assoc_list_lock);
 		return NULL;
 	}

Asserting ENOMEM does not make any sense here, but ENOENT is perfectly ok.

Well, it will do no harm, but still, rc can only end up being EOK or ENOENT.

Version 0, edited 8 years ago by Stanislav Gálfy (next)

in reply to:  9 ; comment:10 by Jakub Jermář, 8 years ago

Replying to StanislavGalfy:

Well, it will do no harm, but still, rc can only end up being EOK or ENOENT.

Yes, that's the point here I guess. I will commit the fix shortly. Thanks for spotting this.

in reply to:  10 comment:11 by Stanislav Gálfy, 8 years ago

Replying to jermar:

Replying to StanislavGalfy:

Well, it will do no harm, but still, rc can only end up being EOK or ENOENT.

Yes, that's the point here I guess. I will commit the fix shortly. Thanks for spotting this.

After the fix both this ticket and ticket #667 can be closed.

comment:12 by Jakub Jermář, 8 years ago

Resolution: fixed
Status: newclosed

Fixed in mainline,2583.

Note: See TracTickets for help on using tickets.