Parsing Out Adjacent Text


hey all,

I’m trying to figure out how to use apache’s mod_status module to figure out which of the web servers in a farm of six are processing more requests than others.

I’m writing a script to grep out requests per second from the status module like this:

[root@uszmpwslp014lc ~]# GET http://$(hostname -i)/server-status | grep -i requests/sec

4.08 requests/sec - 80.9 kB/second - 19.8 kB/request

That works ok. And next I’m grepping it back down and awking it to just the part I’m interested in:

[root@uszmpwslp014lc ~]# GET http://$(hostname -i)/server-status | grep -i -e request -e requests/sec | grep -i -v -e currently -e code -e ss | awk '{print $1}'

4.08

But now I need to get rid of just the HTML tag in front of the 4.08.

I think I may be able to use the ‘cut’ command to do this, but I’m unsure how.

Any thoughts?

Thanks Tim

11 thoughts on - Parsing Out Adjacent Text

  • On 03-06-14 15:18, Tim Dunphy wrote:
    cut --delimiter=">" --fields=2

    You could even drop the awk and pipe your grep to cut --delimiter=">" --fields=2 | cut --delimiter=" " --fields=1

    But there are many different ways to solve this.

    greetings Patrick
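
The two-step cut above can be tried on a saved line before pointing it at a live server. This is only a sketch: the `sample` variable is a hypothetical stand-in for one line of server-status HTML, and the `<dt>` tag in it is an assumption, since the tag itself does not appear in the thread.

```shell
#!/bin/bash
# Hypothetical sample line; the <dt> tag is an assumption about the HTML
# around the figure, not something shown in the thread.
sample='<dt>4.08 requests/sec - 80.9 kB/second - 19.8 kB/request</dt>'

# Field 2 of a '>'-delimited line is the text after the first '>';
# field 1 of a space-delimited line is then the leading number.
rate=$(printf '%s\n' "$sample" | cut -d '>' -f 2 | cut -d ' ' -f 1)
echo "$rate"
```

The short options `-d` and `-f` are the same as `--delimiter` and `--fields` in GNU cut.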

  • Guys,

    Thank you all for your input. I can’t believe how helpful this list is and I’m very grateful. Ok so here is what I have so far of my script to get the number of apache requests to a given host:

    #!/bin/bash

    # this script parses mod_status to see which hosts are getting the most requests

    echo "Time:" >> /tmp/apache_request_log
    /usr/bin/ts >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    echo "hostname" >> /tmp/apache_request_log
    /bin/hostname -f >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    echo "hostname ip" >> /tmp/apache_request_log
    /bin/hostname -i >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    echo "Requests per second:" >> /tmp/apache_request_log
    /usr/bin/GET http://$(/bin/hostname -i)/server-status | /bin/grep -i -e request -e requests/sec | grep -i -v -e currently -e code -e ss | awk '{print $1}' | cut -d'>' -f2 >> /tmp/apache_request_log
    echo "Requests processed / Idle workers:" >> /tmp/apache_request_log
    /usr/bin/GET http://$(hostname -i)/server-status | /bin/grep -i -e requests -e currently | grep -v -e requests/sec | cut -d'>' -f2 | cut -d'<' -f1 >> /tmp/apache_request_log
    echo -e "\n\n" >> /tmp/apache_request_log
    /bin/sleep 60

    So now my question is: is there any way to limit the size of the output log from within the script without having to use logrotate? I can use that, but I would prefer to do it from within the script if that’s possible.

    Thanks

    Tim


    GPG me!!

    gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B

  • Try accessing the stats with the additional “?auto” suffix; it is meant to be machine-readable and is much shorter, e.g.:

    http://$(hostname -i)/server-status/?auto


    Marios Zindilis

  • Awesome tip! This is some of the output I get when I run this command:

    [root@uszmpwslp014lc script]# GET $(hostname -f)/server-status/?auto
    Total Accesses: 1371927
    Total kBytes: 27060974
    CPULoad: 2.70778
    Uptime: 333370
    ReqPerSec: 4.11533
    BytesPerSec: 83122.2
    BytesPerReq: 20198.2
    BusyWorkers: 7
    IdleWorkers: 44

    This will be way easier to parse!
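
Since every line of the ?auto output is a `Key: value` pair, a single field can be pulled out with awk. A minimal sketch, assuming the output has already been captured into a variable: the `status` text is the sample from the post above, and `get_stat` is a hypothetical helper name.

```shell
#!/bin/bash
# Sample ?auto output copied from the post above. In a real script it would
# come from something like: status=$(GET "$(hostname -f)/server-status/?auto")
status='Total Accesses: 1371927
Total kBytes: 27060974
CPULoad: 2.70778
Uptime: 333370
ReqPerSec: 4.11533
BytesPerSec: 83122.2
BytesPerReq: 20198.2
BusyWorkers: 7
IdleWorkers: 44'

# get_stat KEY: split each line on ": " and print the value for an exact key match.
get_stat() {
    printf '%s\n' "$status" | awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

req_per_sec=$(get_stat ReqPerSec)
idle=$(get_stat IdleWorkers)
echo "ReqPerSec=$req_per_sec IdleWorkers=$idle"
```

Matching the key exactly, rather than grepping case-insensitively for a substring, avoids picking up a second line that happens to contain the same word.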



  • Ok! So this is where my script is at this point:

    #!/bin/bash
    # this script parses mod_status to see which hosts are getting the most requests

    while true
    do
    echo "Time and date:" >> /tmp/apache_request_log
    /bin/date +"%D %H:%M:%S" >> /tmp/apache_request_log
    echo -e "\n"
    echo "hostname:" >> /tmp/apache_request_log
    /bin/hostname -f >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    echo "host ip" >> /tmp/apache_request_log
    /bin/hostname -i >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    echo "Server Stats:" >> /tmp/apache_request_log
    GET $(hostname -f)/server-status/?auto | egrep -i "(accesses|kbytes)" >> /tmp/apache_request_log
    echo -e "\n" >> /tmp/apache_request_log
    /bin/sleep 60
    done

    Now the only problems I am having are some output issues. This is the output I’ve gotten from this:

    [root@webhost014lc ~]# tail -f /tmp/apache_request_log
    Time and date:
    06/03/14 10:24:09
    “hostname:”
    webhost014lc.west.dmz-nbcuni.com
    “n”
    “host ip”
    10.10.1.98
    “n”
    Server Stats:
    Total Accesses: 1383898
    Total kBytes: 27198225
    “n”
    Time and date:
    06/03/14 10:25:09
    “hostname:”
    webhost014lc.west.dmz-nbcuni.com
    “n”
    “host ip”
    10.10.1.98
    “n”
    Server Stats:
    Total Accesses: 1384666
    Total kBytes: 27206570
    “n”

    What I need to figure out at this point is how to get the time and date info on the same line as its category, i.e. get

    Time and date: 06/03/14 10:24:09

    instead of

    Time and date:
    06/03/14 10:24:09

    As it is now.

    Also I’m trying to print out newlines with echo -e "\n" but somehow that isn’t working, though I think I’ve gotten that to work in the past.

    If someone could please help me fix these minor formatting issues that would be great and appreciated.

    Thanks Tim
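
The literal “n” and “hostname:” lines in the log output above are what a shell produces when the quotes in the script are typographic (curly) rather than plain ASCII, which can happen when a script is pasted from a mail client or word processor. Whether that is what happened here is an assumption; the sketch below only demonstrates the behaviour by showing the argument a command actually receives in each case.

```shell
#!/bin/bash
# ASCII double quotes: the shell keeps the backslash, so the command
# receives the two characters \ and n, which echo -e would turn into a newline.
straight_arg=$(printf '%s' "\n")

# Typographic quotes are ordinary characters to the shell, so \n is unquoted:
# the backslash is removed and the command receives the quotes and a plain n.
curly_arg=$(printf '%s' “\n”)

printf 'straight: %s\n' "$straight_arg"
printf 'curly:    %s\n' "$curly_arg"
```

Running `grep -n '[“”]' script.sh` on the script file is a quick way to check for this.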



  • On 03-06-14 16:32, Tim Dunphy wrote:

    printf "Time and date: $(/bin/date +"%D %H:%M:%S")\n"

    solves both problems here.
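
A slightly more defensive variant of the printf suggestion keeps the format string fixed and passes the date as an argument, so that a stray `%` or backslash in the data is never interpreted as a format directive. A minimal sketch; the variable names `stamp` and `line` are illustrative.

```shell
#!/bin/bash
# Capture the timestamp once, then hand it to printf as data (%s).
stamp=$(date +"%D %H:%M:%S")

# Label and value end up on one line; \n in the format ends the line.
line=$(printf 'Time and date: %s\n' "$stamp")
echo "$line"
```

In the logging script the same printf call would simply be redirected with `>> /tmp/apache_request_log`.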

  • Ok this is what I came up with:

    #!/bin/bash
    # this script parses mod_status to see which hosts are getting the most requests

    while true
    do
    echo "Time and date: $(/bin/date +"%D %H:%M:%S")" >> /tmp/apache_request_log >> /tmp/apache_request_log
    echo "hostname: $(/bin/hostname -f)\n" >> /tmp/apache_request_log
    echo "host ip: $(/bin/hostname -i)" >> /tmp/apache_request_log
    echo "Server Stats: $(/usr/bin/GET `hostname -f`/server-status/?auto | /bin/egrep -i 'kbytes')" >> /tmp/apache_request_log
    echo "Server Stats: $(/usr/bin/GET `hostname -f`/server-status/?auto | /bin/egrep -i 'ReqPerSec')" >> /tmp/apache_request_log
    echo -e "\n"
    sleep 60
    done

    Still can’t get the echo -e "\n" statement to print a new line for some reason. Other than that I’m good. And thanks for everyone’s help!

    Tim



  • Look at this code structure:

    while true
    do
      {
        echo Time and date: $(date +"%D %H:%M:%S")
        echo Hostname: $(hostname -f)
        echo Hostname IP: $(hostname -i)

        # Leave two blank lines
        echo
        echo
      } >> /tmp/apache_request_log
      sleep 60
    done

    Note how we’re only doing one redirect; this makes the code easier to read and less likely to make a mistake (and more efficient).

    That’s one of the mistakes; you forgot the >> /tmp/apache_request_log on the echo line. But “echo” on its own without anything else leaves a blank line.

    The next clever bit is to not call “GET” twice; why make apache do twice the work? Call it once and store the results in a variable

    stat=$(GET $(hostname -f)/server-status/?auto)
    echo Server Stats: $(echo "$stat" | grep -i kbytes)
    echo Server Stats: $(echo "$stat" | grep -i ReqPerSec)

    (You can get even more clever, but that’s a little more involved; we’ll start with some basics :-))

    So we end up with something like:

    #!/bin/bash

    # These never change…
    name=$(hostname -f)
    ip=$(hostname -i)

    # Once a minute, record some stats
    while true
    do
      {
        echo Time and date: $(date +"%D %H:%M:%S")
        echo Hostname: $name
        echo Hostname IP: $ip
        stat=$(GET $name/server-status/?auto)
        echo Server Stats: $(echo "$stat" | grep -i kbytes)
        echo Server Stats: $(echo "$stat" | grep -i ReqPerSec)
        echo
        echo
      } >> /tmp/apache_request_log
      sleep 60
    done

  • ‘echo’ on its own should print a new line. If you want two, why not just use two echo lines?

    Also, you are redirecting everything else to the apache_request_log except for the last echo line. Is the problem simply that you forgot to redirect that to the log file?

    echo -e "\n" >> /tmp/apache_request_log


    Bowie

  • Tim Dunphy writes:

    You could add a counter to the script and email the log after a set number of polls, or, on each poll, use tail to rewrite the file so that the oldest entry is dropped.
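
The tail-based idea can be sketched as a small function called once per poll. This is an illustration under stated assumptions, not a drop-in replacement for logrotate: `log`, `max_lines`, and `trim_log` are hypothetical names, and a temporary file stands in for /tmp/apache_request_log so the example can be run safely.

```shell
#!/bin/bash
# Hypothetical names; a temp file stands in for the real log.
log=$(mktemp)
max_lines=100

# Simulate a log that has grown past the limit (250 numbered lines).
seq 1 250 > "$log"

# Keep only the newest max_lines lines. The result goes to a temporary file
# first, because redirecting tail straight onto its own input would
# truncate the log before tail has read it.
trim_log() {
    local tmp
    tmp=$(mktemp) || return 1
    tail -n "$max_lines" "$log" > "$tmp" && cat "$tmp" > "$log"
    rm -f "$tmp"
}

trim_log
wc -l < "$log"
```

Choosing `max_lines` as a multiple of the number of lines per entry keeps the oldest remaining entry whole. Writing back with `cat` rather than `mv` preserves the original file, so a running `tail -f` keeps following it.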