Network Services Start Before Network Is Up Since Migrating To 7.2


Hello all, I updated two of my servers to CentOS 7.2 (1511) two days ago. Since then, on one of them, the network services are started (and fail to start) before the network interfaces are online.

Excerpts from “journalctl” after the last reboot:

déc. 17 10:21:44 myserver kernel: NET: Registered protocol family 40
déc. 17 10:21:45 myserver sshd[700]: error: Bind to port 22 on 172.20.XX.XX failed: Cannot assign requested address.
déc. 17 10:21:45 myserver sshd[700]: fatal: Cannot bind any address.
déc. 17 10:21:45 myserver systemd[1]: sshd.service: main process exited, code=exited, status=255/n/a
déc. 17 10:21:45 myserver systemd[1]: Unit sshd.service entered failed state.
déc. 17 10:21:45 myserver systemd[1]: sshd.service failed.
déc. 17 10:21:45 myserver sssd[729]: Starting up
déc. 17 10:21:45 myserver kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
déc. 17 10:21:34 myserver systemd[1]: Time has been changed
déc. 17 10:21:35 myserver iptables.init[699]: iptables: Applying firewall rules: [ OK ]
déc. 17 10:21:35 myserver systemd[1]: Started IPv4 firewall with iptables.
déc. 17 10:21:35 myserver systemd[1]: Starting LSB: Bring up/down networking...
déc. 17 10:21:35 myserver network[790]: Activation de l’interface loopback : [ OK ]
déc. 17 10:21:36 myserver httpd[686]: (99)Cannot assign requested address: AH00072: make_sock: could not bind to address 172.19.XX.XX:443
déc. 17 10:21:36 myserver httpd[686]: no listening sockets available, shutting down
déc. 17 10:21:36 myserver httpd[686]: AH00015: Unable to open logs
déc. 17 10:21:36 myserver systemd[1]: httpd.service: main process exited, code=exited, status=1/FAILURE
déc. 17 10:21:36 myserver kernel: vmxnet3 0000:03:00.0 ens160: intr type 3, mode 0, 2 vectors allocated
déc. 17 10:21:36 myserver kernel: vmxnet3 0000:03:00.0 ens160: NIC Link is Up 10000 Mbps
déc. 17 10:21:36 myserver kill[924]: kill: cannot find process ""
déc. 17 10:21:36 myserver systemd[1]: httpd.service: control process exited, code=exited status=1
déc. 17 10:21:36 myserver systemd[1]: Failed to start The Apache HTTP Server.
déc. 17 10:21:36 myserver systemd[1]: Unit httpd.service entered failed state.
déc. 17 10:21:36 myserver systemd[1]: httpd.service failed.
déc. 17 10:21:36 myserver postfix/postfix-script[959]: starting the Postfix mail system
déc. 17 10:21:36 myserver postfix/master[961]: daemon started -- version 2.10.1, configuration /etc/postfix
déc. 17 10:21:36 myserver systemd[1]: Started Postfix Mail Transport Agent.
déc. 17 10:21:36 myserver snmpd[704]: Turning on AgentX master support.
déc. 17 10:21:36 myserver snmpd[704]: Error opening specified endpoint "udp:172.19.XX.XX:161"
déc. 17 10:21:36 myserver snmpd[704]: Server Exiting with code 1
déc. 17 10:21:36 myserver systemd[1]: snmpd.service: main process exited, code=exited, status=1/FAILURE
déc. 17 10:21:36 myserver systemd[1]: Failed to start Simple Network Management Protocol (SNMP) Daemon..
déc. 17 10:21:36 myserver systemd[1]: Unit snmpd.service entered failed state.
déc. 17 10:21:36 myserver systemd[1]: snmpd.service failed.
(…)
déc. 17 10:21:38 myserver network[790]: Activation de l’interface ens160 : [ OK ]
déc. 17 10:21:38 myserver kernel: vmxnet3 0000:0b:00.0 ens192: intr type 3, mode 0, 2 vectors allocated
déc. 17 10:21:38 myserver kernel: vmxnet3 0000:0b:00.0 ens192: NIC Link is Up 10000 Mbps
déc. 17 10:21:39 myserver ntpd[694]: Listen normally on 1 ens160 172.19.XX.XX UDP 123
déc. 17 10:21:39 myserver ntpd[694]: new interface(s) found: waking up resolver
déc. 17 10:21:40 myserver ntpd[694]: 0.0.0.0 c61c 0c clock_step +11.002914 s
déc. 17 10:21:51 myserver ntpd[694]: 0.0.0.0 c614 04 freq_mode
déc. 17 10:21:51 myserver systemd[1]: Time has been changed
déc. 17 10:21:51 myserver network[790]: Activation de l’interface ens192 : [ OK ]
déc. 17 10:21:51 myserver systemd[1]: Started LSB: Bring up/down networking.
déc. 17 10:21:51 myserver systemd[1]: Reached target Network is Online.
déc. 17 10:21:51 myserver systemd[1]: Starting Network is Online.
déc. 17 10:21:51 myserver systemd[1]: Reached target Multi-User System.
déc. 17 10:21:51 myserver systemd[1]: Starting Multi-User System.
déc. 17 10:21:51 myserver systemd[1]: Starting Update UTMP about System Runlevel Changes...
déc. 17 10:21:51 myserver systemd[1]: Started Stop Read-Ahead Data Collection 10s After Completed Startup.
déc. 17 10:21:51 myserver systemd[1]: Started Update UTMP about System Runlevel Changes.
déc. 17 10:21:51 myserver systemd[1]: Startup finished in 650ms (kernel) + 2.623s (initrd) + 13.647s (userspace) = 16.922s.

I found a workaround: replacing “After=network.target” with “After=network-online.target” in the failing services’ units. But I want to understand the root problem, and what the difference is between my two servers… So far, I have found nothing.

Any ideas?

Sylvain CANOINE.

Think ENVIRONMENT: print only if necessary

23 thoughts on - Network Services Start Before Network Is Up Since Migrating To 7.2

  • Well it looks like you are using the network service rather than the recommended NetworkManager …

    The network service does not block the boot flow, so it is launched and systemd carries on …

    From systemd’s point of view, the service counts as running as soon as /etc/init.d/network start has been called. As you can see from your logs, lots of other services also start before the network interface itself is up.

    There are a few different ways of accomplishing what you want …

    Keep in mind that you must not edit files in /usr/lib/systemd/ if you want to maintain your sanity for future updates… use overrides in /etc/systemd/system/foo.service.d

    The real reason httpd/sshd/snmpd failed there is that, unlike their default configuration, you aren’t listening on all addresses (:: or 0.0.0.0) but on a specific 172.X address … which isn’t present until the network adaptor is up and configured.

    So how to solve this…

    1) Have the services bind on :: (or 0.0.0.0), as in the default configuration, rather than on a specific IP, so that they do not depend on the network being up with a specific IP on the interface
    2) Set the sysctl net.ipv4.ip_nonlocal_bind so that the services can bind to IPs not yet on the system (if the service uses a systemd socket, you can set FreeBind= for that socket in an override rather than set this globally)
    3) Provide overrides for each service to order it after network-online.target (which is effectively when the configured IP address can be found on the interface), as documented in the systemd.special man page.

    Look at man systemd.special for more detail on this …
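As a rough sketch of option 3, the script below generates one drop-in per service that both orders the service after network-online.target and pulls that target in. The `STAGING_ROOT` variable, the `write_online_dropin` helper and the `10-network-online.conf` file name are illustrative choices, not anything from the thread; the staging directory stands in for /etc/systemd/system so the script can be tried without touching a live system.

```shell
#!/bin/sh
# Sketch only: write drop-ins into a staging directory instead of /etc/systemd/system.
# STAGING_ROOT and the drop-in file name are assumptions made for this example.
STAGING_ROOT="${STAGING_ROOT:-$(mktemp -d)}"

# Create <unit>.d/10-network-online.conf under the staging root and print its path.
write_online_dropin() {
    unit="$1"                                   # e.g. sshd.service
    dropin_dir="${STAGING_ROOT}/${unit}.d"
    mkdir -p "${dropin_dir}"
    # After= only orders the units; Wants= is what pulls the target into the boot.
    printf '[Unit]\nWants=network-online.target\nAfter=network-online.target\n' \
        > "${dropin_dir}/10-network-online.conf"
    echo "${dropin_dir}/10-network-online.conf"
}

# The three services that failed in the journal excerpt.
for unit in sshd.service httpd.service snmpd.service; do
    write_online_dropin "${unit}"
done
```

To use the result for real, the generated `*.service.d` directories would be copied under /etc/systemd/system, followed by `systemctl daemon-reload`.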

    Incidentally, I just tried a quick test in a VM, and it would appear that NetworkManager.service completed with an IP on the network interface before network.target was considered reached. You may want to test this on your system to see whether it’s a race condition or whether it really works out that way for you: systemctl cat NetworkManager indicates it should be ordered before network, and it looks like it may block progress until it’s on dbus …

  • Hello James,

    Yes. That’s how our security experts built the models I use to set up my servers. I’ll test a migration to NetworkManager, and ask for their advice on it.

    I understand this, but why only on one of my servers? Is the order in which the services start only a question of latencies?

    Ok. Thank you for the tip. I’m trying to avoid this workaround, anyway.

    It is by design, for security considerations. So I can’t make the services listen on all interfaces.

    I’ll take a look at this.

    Ok, I’ll try, and see if that solves my problem. Thank you.

    Sylvain CANOINE.


  • Our security experts don’t want me to use NetworkManager… It’s even uninstalled on the models, so I understand better why all the required files are not here :

    # systemctl status NetworkManager-wait-online.service
    ● NetworkManager-wait-online.service
    Loaded: not-found (Reason: No such file or directory)
    Active: inactive (dead)

    So I made a crappy but easy-to-deploy script to make the services start after network is online :

    # For every packaged unit ordered after network.target, add a drop-in
    # that also orders it after network-online.target.
    for fic in $(grep -rl 'After=.*network\.target' /lib/systemd/system | cut -d/ -f5 | grep -v 'network-online\.target')
    do
        mkdir -pv "/etc/systemd/system/${fic}.d"
        printf '[Unit]\nAfter=network-online.target\n' > "/etc/systemd/system/${fic}.d/local-network-online.conf" && echo "/etc/systemd/system/${fic}.d/local-network-online.conf"
    done
    # Make systemd pick up the new drop-ins.
    systemctl daemon-reload

    That’s working as is, so I’ll keep this workaround for now.

    Sylvain.

  • “experts” … I’m sorry …

    What a horrible work around but I’m glad you got something in place that works for you.

    On 21-12-2015 14:24, James Hogarth wrote:

    Agreed. Sylvain, if possible, please elaborate on their reasoning for this, because it just seems like a case of “we fear what we don’t know”, so they are recommending to stick to old habits instead.

    Or have they identified real attack vectors in NM? If yes, we would love to hear that so it can be fixed.

    Marcelo

    ----- Original message -----

    In short, “you don’t need it, so don’t use it”. They said NM is more of a desktop-oriented tool, has already had privilege escalation issues in the past (I didn’t check whether they’re right), has too many dependencies (such as wpa_supplicant and avahi, which are, of course, also forbidden), needs extra mechanisms (PAM? Polkit?) to keep users from changing its settings, and needs D-Bus just to work, so it is far too complex just to set static IP addresses on network interfaces. They said that the multiple administrator actions needed to set it up, and the potential for human error, may be a security risk…

    Sylvain.

  • Also known as “we have our policies for EL6 and we haven’t paid any attention to EL7 to see how things have changed” … Wonder if they have read my NM blog article yet …

    Honestly, any ‘security’ people banning wpa_supplicant need their heads examined, given that it is used for 802.1x authentication … which, if they care about security, they should be paying attention to.

    As for polkit and dbus … well they have to be there in EL7 and systemd relies on these mechanisms.

    That said if they’re having kittens about NM, polkit, dbus and wpa_supplicant they probably hate systemd and frankly I’m surprised they permit EL7 at all ;)

    Note that by default a non administrator user cannot change system network configuration … bah idiots …

  • James Hogarth wrote:

    Really? Why?

    a) All the servers I’ve ever dealt with (and I don’t mean a large tower under someone’s desk) are racked in locked rooms and hardwired.

    b) NONE I’ve ever seen has any wifi, so I’ve never understood why avahi, and the firewall hole for it, was installed in the “server” version by default.

    c) wpa-supplicant – again, why? If it’s hardwired, and behind switches and firewalls, why PNAC if every server is running firewalls?

    mark “let’s *please* NOT talk about NAC via Cisco,
    and people who allegedly know and have planned
    rolling it out….”

    ----- Original message -----

    I’m confused. I updated two more servers this afternoon, and… all is working well. The services start in correct order. Even after three reboots. So only one of the (now) five updated servers doesn’t start properly.

    Then what is the difference ? All I see for now is the network.target unit seems not active on the failing server.

    (failing) # systemctl list-units|grep network
    network.service loaded active exited LSB: Bring up/down networking
    rhel-import-state.service loaded active exited Import network configuration from initramfs
    network-online.target loaded active active Network is Online
    (failing) # systemctl status network
    ● network.service – LSB: Bring up/down networking
    Loaded: loaded (/etc/rc.d/init.d/network)
    Active: active (exited) since lun. 2015-12-21 12:49:31 CET; 1 day 5h ago
    Docs: man:systemd-sysv-generator(8)

    déc. 21 12:49:35 (failing) systemd[1]: Starting LSB: Bring up/down networking...
    déc. 21 12:49:26 (failing) network[747]: Activation de l’interface loopback : [ OK ]
    déc. 21 12:49:28 (failing) network[747]: Activation de l’interface ens160 : [ OK ]
    déc. 21 12:49:31 (failing) network[747]: Activation de l’interface ens192 : [ OK ]
    déc. 21 12:49:31 (failing) systemd[1]: Started LSB: Bring up/down networking.

    (correct) # systemctl list-units|grep network
    network.service loaded active exited LSB: Bring up/down networking
    rhel-import-state.service loaded active exited Import network configuration from initramfs
    network-online.target loaded active active Network is Online
    network.target loaded active active Network
    (correct) # systemctl status network
    ● network.service – LSB: Bring up/down networking
    Loaded: loaded (/etc/rc.d/init.d/network)
    Active: active (exited) since mar. 2015-12-22 17:42:15 CET; 33min ago
    Docs: man:systemd-sysv-generator(8)
    Process: 753 ExecStart=/etc/rc.d/init.d/network start (code=exited, status=0/SUCCESS)

    déc. 22 17:42:07 (correct) systemd[1]: Starting LSB: Bring up/down networking...
    déc. 22 17:42:10 (correct) network[753]: Activation de l’interface loopback : [ OK ]
    déc. 22 17:42:13 (correct) NET[935]: /etc/sysconfig/network-scripts/ifup-post : updated /etc/resolv.conf
    déc. 22 17:42:13 (correct) network[753]: Activation de l’interface ens160 : [ OK ]
    déc. 22 17:42:15 (correct) network[753]: Activation de l’interface ens192 : [ OK ]
    déc. 22 17:42:15 (correct) systemd[1]: Started LSB: Bring up/down networking.

    To be continued…

    Sylvain.


  • yeah, gotta get rid of those pesky humans, they always mess things up. And, get rid of the computers too, they’ve always had security problems.

    voila, problem solved!!

  • more likely their policies were developed in the days of RHEL <= 4, and have only begrudgingly been brought forward to support 6.

  • Ha-ha! I like it. But I always remember what one of my friends says: All systems suck. And thanks to that I got my job ;-)

    Valeri

    ++++++++++++++++++++++++++++++++++++++++
    Valeri Galtsev Sr System Administrator Department of Astronomy and Astrophysics Kavli Institute for Cosmological Physics University of Chicago Phone: 773-702-4247
    ++++++++++++++++++++++++++++++++++++++++

  • Yamaban wrote:

    I beg your pardon. What *possible* reason is there for a server, hardwired, to “announce” itself to anything, other than DHCP? Everywhere I’ve worked, and what I know, is that servers are assigned IP addresses, they don’t just take whatever’s offered, willy-nilly. And if they do… I do *not* want to work there. That’s not only unprofessional, it’s an insane security risk. Suppose someone puts their laptop on the intranet, and has *it* running a DHCP server?

    mark

  • You do know there’s more to life than static IP webapp servers, right?

    How about an internal media server cluster used in a professional video editing environment, with workstations running various sorts of editing software, monitors doing streaming playback and such? That world relies heavily on UPnP, Bonjour, etc.

    In my development lab environment, most of my servers (75% VMs) are DHCP configured (using static and/or long lease time reservations), which makes doing PXE and such much easier. A foreign DHCP server would quickly be detected by the corporate IDS and cut off from the network.

  • John R Pierce wrote:

    You mean, like dhcp-served IP addresses that are tied to MAC addresses for compute nodes, and heavy-duty research servers? No, really?

    Sorry, I believe I’ve mentioned here, before, that we only have a couple-three VMs… we run the o/s on bare metal, because we need every cycle.

    Though I will admit that the system that I had to power cycle this morning, where one of my user’s week-long job had toasted, top showing a load of (I’m not making this up) 286, and no response on the console, is an extreme case. Normal for some of these week and two week-long jobs is 30-75….

    mark

  • … I’m a little confused, too. But, it might be more informative to query the system for “network.target” than “network.service” since the former is the one missing.

    # rpm -V systemd
    # locate network.target
    /usr/lib/systemd/system/network.target
    # systemctl status network.target
    ● network.target – Network
    Loaded: loaded (/usr/lib/systemd/system/network.target; static; vendor preset: disabled)
    Active: active since Wed 2015-12-16 20:19:26 PST; 6 days ago
    Docs: man:systemd.special(7)
    http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget

    Dec 16 20:19:26 x systemd[1]: Reached target Network.
    Dec 16 20:19:26 x systemd[1]: Starting Network.

    ----- Original message -----

    # rpm -V systemd
    S.5....T.  c /etc/rc.d/rc.local

    Ok, normal…

    # ll /usr/lib/systemd/system/network.target
    -rw-r--r--. 1 root root 480 20 nov. 05:49 /usr/lib/systemd/system/network.target
    # cat /usr/lib/systemd/system/network.target
    # This file is part of systemd.
    #
    # systemd is free software; you can redistribute it and/or modify it
    # under the terms of the GNU Lesser General Public License as published by
    # the Free Software Foundation; either version 2.1 of the License, or
    # (at your option) any later version.

    [Unit]
    Description=Network
    Documentation=man:systemd.special(7)
    Documentation=http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
    After=network-pre.target
    RefuseManualStart=yes
    # systemctl status network.target
    ● network.target – Network
    Loaded: loaded (/usr/lib/systemd/system/network.target; static; vendor preset: disabled)
    Active: inactive (dead)
    Docs: man:systemd.special(7)
    http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget

    Dead ? Hmmm…

    Sylvain.

    On 22-12-2015 13:53, m.roth@5-cent.us wrote:

    It’s the same reason you think that adding one layer of management (dbus & co.) adds more risk than not adding it. It’s another wall to be crossed, if anything happens. Some think firewalls are enough, some don’t.

    On 22-12-2015 08:33, Sylvain CANOINE wrote:

    Gotta say, this policy is very subjective. These reasons, they fit pretty much everything else too. If memory serves, sudo also had privilege escalation issues in the past, but it’s needed. NM is just a newborn and soon will be required. They are free to wait for it to mature more, yes, but just keep in mind that at least for now, that’s a certain future, NM is getting more and more mainstream.

    NM already can be used only during startup, with no daemon running after that. That helps a lot already with the reasoning they presented.

    Thanks for sharing that.

    Marcelo

    ----- Original message -----

    OK, I found what is different about the failing servers (I updated one more this morning, and the same symptom appeared): the failing ones don’t need to mount NFS shares. So I didn’t install nfs-utils, so there is no rpc-statd-notify.service, whose unit file contains “Requires=network.target”… And so no service “requires” network.target at all!

    So now I’m wondering:
    1/ why “After=foo” does not imply “Requires=foo” in systemd, even though that seems the obvious behaviour;
    2/ why “After=foo” does not imply “Requires=foo” in systemd 219, while it appeared to in systemd 208. Either it’s a regression, or the behaviour of 208, although logical, was a bug.

    Anyway, for those who avoid NetworkManager, it may be worthwhile to add a “Requires=network.target” to the unit of a common network service, such as sshd or ntpd… Or, better, to the network-online.target unit.

    I chose another solution: I made a symlink to /usr/lib/systemd/system/network.target in the /etc/systemd/system/multi-user.target.wants/ directory (“systemctl enable network.target” flatly refused). And voilà.
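The diagnosis above (nothing pulls in network.target once rpc-statd-notify.service is absent) can be approximated with a small scan of unit files. This is only a sketch: `UNIT_DIR`, the helper name and the two sample units are made up for the example, and the scan looks only at Wants= and Requires= lines inside unit files, so it does not see `*.wants/` symlinks or drop-ins.

```shell
#!/bin/sh
# Sketch only: report which unit files pull in network.target via Wants=/Requires=.
# UNIT_DIR is an assumption; when unset, a throwaway directory with two
# made-up sample units is used so the script runs anywhere.
UNIT_DIR="${UNIT_DIR:-}"
if [ -z "${UNIT_DIR}" ]; then
    UNIT_DIR="$(mktemp -d)"
    printf '[Unit]\nAfter=network.target\n' > "${UNIT_DIR}/sshd.service"
    printf '[Unit]\nRequires=network.target\nAfter=network.target\n' \
        > "${UNIT_DIR}/rpc-statd-notify.service"
fi

# Print the names of units with network.target on a Wants= or Requires= line.
units_pulling_in_network_target() {
    grep -lE '^(Wants|Requires)=(.*[[:space:]])?network\.target([[:space:]]|$)' "$1"/* 2>/dev/null \
        | while IFS= read -r path; do basename "${path}"; done
}

pullers="$(units_pulling_in_network_target "${UNIT_DIR}")"
if [ -n "${pullers}" ]; then
    echo "network.target is pulled in by:"
    echo "${pullers}"
else
    echo "nothing pulls in network.target, so After=network.target has nothing to wait for"
fi
```

Pointing `UNIT_DIR` at /usr/lib/systemd/system gives a first impression on a real host; on a running system, `systemctl list-dependencies --reverse network.target` should give the authoritative answer.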

    Sylvain.


  • I’m not entirely certain, but “After=” is independent of “Requires=”, as documented on an up-to-date install of CentOS 7.
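To make that independence concrete, here is a small sketch that lists the units a file names in After= without also naming them in Wants= or Requires=. The sample unit is hypothetical, as are the `values_of` helper and the variable names; the parsing is deliberately naive (one directive per line, space-separated values).

```shell
#!/bin/sh
# Sketch only: find "ordering-only" dependencies in a single unit file.
# The unit below is a made-up example, not taken from any package.
unit_file="$(mktemp)"
cat > "${unit_file}" <<'EOF'
[Unit]
Description=Hypothetical example service
After=network.target network-online.target
Wants=network-online.target
EOF

# Print the values of one directive (first argument), one unit name per line.
values_of() {
    sed -n "s/^$1=//p" "$2" | tr -s ' ' '\n' | sed '/^$/d'
}

ordering_only=""
for dep in $(values_of After "${unit_file}"); do
    # Keep the dependency if neither Wants= nor Requires= mentions it.
    if ! { values_of Wants "${unit_file}"; values_of Requires "${unit_file}"; } | grep -qxF "${dep}"; then
        ordering_only="${ordering_only}${ordering_only:+ }${dep}"
    fi
done
echo "ordered after, but not pulled in by this unit: ${ordering_only}"
```

For the sample unit, network.target is reported: the service will wait for it if something else activates it, but will not cause it to be activated.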

    ----- Original message -----

    http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ says :
    “Services using the network should hence simply place an After=network.target dependency in their unit files, and avoid any Wants=network.target or even Requires=network.target.”

    But all the other related explanations I found on the web either say nothing about the relationship between “After=” and “Requires=”/“Wants=”, or confirm that there is none. For example, in http://www.freedesktop.org/software/systemd/man/systemd.unit.html :
    “Note that this setting (editor’s note: “After=” or “Before=”) is independent of and orthogonal to the requirement dependencies as configured by Requires=.”

    I didn’t find the related CentOS documentation, but I suppose it’s correct. I suppose it mentions NetworkManager, anyway.

    I can understand that systemd isn’t designed to link “After=” and “Requires=”… But why design it like that? Being able to order a service before or after another one that is disabled makes no sense.

    But none of that gives any clue about the different behaviour of the two systemd versions mentioned. I think an additional “Requires=network.target” in the network-online.target unit by default, or at least a note to users, would be appreciated.

    Sylvain.
