Parsing Out Adjacent Text
hey all,
I’m trying to figure out how to use apache’s mod_status module to figure out which of the web servers in a farm of six are processing more requests than others.
I’m writing a script to grep out requests per second from the status module like this:
[root@uszmpwslp014lc ~]# GET http://$(hostname -i)/server-status | grep -i requests/sec
That works ok. And next I’m grepping it back down and awking it to just the part I’m interested in:
[root@uszmpwslp014lc ~]# GET http://$(hostname -i)/server-status | grep -i -e request -e requests/sec | grep -i -v -e currently -e code -e ss | awk '{print $1}'
But now I need to get rid of just the
I think I may be able to use the ‘cut’ command to do this, but I’m unsure how.
Any thoughts?
Thanks Tim
11 thoughts on - Parsing Out Adjacent Text
On 03-06-14 15:18, Tim Dunphy wrote:
cut --delimiter=">" --fields=2
You could even get rid of the awk and pipe your grep to cut --delimiter=">" --fields=2 | cut --delimiter=" " --fields=1
But there are many different ways to solve this.
greetings Patrick
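To make the two-stage cut concrete, here is a minimal sketch. The sample line is an assumption about what the HTML status page returns for the requests/sec entry (the thread does not show it), and -d / -f are the short forms of --delimiter / --fields.

```shell
#!/bin/bash
# Assumed sample line; the real server-status HTML may differ.
line='<dt>4.12 requests/sec - 19.7 kB/second - 4.8 kB/request</dt>'

# First cut: keep what follows the first ">", dropping the opening tag.
# Second cut: keep the first space-separated word, i.e. the number.
rate=$(printf '%s\n' "$line" | cut -d'>' -f2 | cut -d' ' -f1)

echo "requests/sec: $rate"
```

If the page layout differs from the assumed line, the field numbers would need adjusting.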
Guys,
Thank you all for your input. I can’t believe how helpful this list is and I’m very grateful. Ok so here is what I have so far of my script to get the number of apache requests to a given host:
#!/bin/bash
# this script parses mod_status to see which hosts are getting the most requests
echo "Time:" >> /tmp/apache_request_log
/usr/bin/ts >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
echo "hostname" >> /tmp/apache_request_log
/bin/hostname -f >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
echo "hostname ip" >> /tmp/apache_request_log
/bin/hostname -i >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
echo "Requests per second:" >> /tmp/apache_request_log
/usr/bin/GET http://$(/bin/hostname -i)/server-status | /bin/grep -i -e request -e requests/sec | grep -i -v -e currently -e code -e ss | awk '{print $1}' | cut -d'>' -f2 >> /tmp/apache_request_log
echo "Requests processed / Idle workers:" >> /tmp/apache_request_log
/usr/bin/GET http://$(hostname -i)/server-status | /bin/grep -i -e requests -e currently | grep -v -e requests/sec | cut -d'>' -f2 | cut -d'<' -f1 >> /tmp/apache_request_log
echo -e "\n\n" >> /tmp/apache_request_log
/bin/sleep 60
So now my question is, is there any way to limit the size of the output log from within the script without having to use logrotate? I can use that, but I would prefer to do that from within the script if that’s possible.
Thanks
Tim
--
GPG me!!
gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Try accessing the stats with the additional “?auto” suffix. It is meant to be machine-readable and is much shorter, e.g.:
http://$(hostname -i)/server-status/?auto
--
Marios Zindilis
Awesome tip! This is some of the output I get when I run this command:
[root@uszmpwslp014lc script]# GET $(hostname -f)/server-status/?auto
Total Accesses: 1371927
Total kBytes: 27060974
CPULoad: 2.70778
Uptime: 333370
ReqPerSec: 4.11533
BytesPerSec: 83122.2
BytesPerReq: 20198.2
BusyWorkers: 7
IdleWorkers: 44
This will be way easier to parse!
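As a rough sketch of why the ?auto format is easier to work with: every line is a "Key: value" pair, so a single awk call can pick out one value by name. The helper name status_field is made up for illustration, and the sample text is copied from the output above rather than fetched from a live server.

```shell
#!/bin/bash
# Sample copied from the ?auto output quoted in this thread; a real script
# would capture it from the server instead.
sample='Total Accesses: 1371927
Total kBytes: 27060974
CPULoad: 2.70778
Uptime: 333370
ReqPerSec: 4.11533
BytesPerSec: 83122.2
BytesPerReq: 20198.2
BusyWorkers: 7
IdleWorkers: 44'

# Hypothetical helper: read "Key: value" lines on stdin and print the value
# whose key matches the first argument exactly.
status_field() {
    awk -F': ' -v key="$1" '$1 == key { print $2 }'
}

req_per_sec=$(printf '%s\n' "$sample" | status_field "ReqPerSec")
busy=$(printf '%s\n' "$sample" | status_field "BusyWorkers")

echo "ReqPerSec=$req_per_sec BusyWorkers=$busy"
```

Matching on the whole key avoids the chain of grep -v filters needed for the HTML page.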
Ok! So this is where my script is at this point:
#!/bin/bash
# this script parses mod_status to see which hosts are getting the most requests
while true
do
echo "Time and date:" >> /tmp/apache_request_log
/bin/date +"%D %H:%M:%S" >> /tmp/apache_request_log
echo -e "\n"
echo "hostname:" >> /tmp/apache_request_log
/bin/hostname -f >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
echo "host ip" >> /tmp/apache_request_log
/bin/hostname -i >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
echo "Server Stats:" >> /tmp/apache_request_log
GET $(hostname -f)/server-status/?auto | egrep -i "(accesses|kbytes)" >> /tmp/apache_request_log
echo -e "\n" >> /tmp/apache_request_log
/bin/sleep 60
done
Now the only problems I am having are some output issues. This is the output I’ve gotten from this:
[root@webhost014lc ~]# tail -f /tmp/apache_request_log
Time and date:
06/03/14 10:24:09
“hostname:”
webhost014lc.west.dmz-nbcuni.com
“n”
“host ip”
10.10.1.98
“n”
Server Stats:
Total Accesses: 1383898
Total kBytes: 27198225
“n”
Time and date:
06/03/14 10:25:09
“hostname:”
webhost014lc.west.dmz-nbcuni.com
“n”
“host ip”
10.10.1.98
“n”
Server Stats:
Total Accesses: 1384666
Total kBytes: 27206570
“n”
What I need to figure out at this point is how to get the time and date info on the same line as its category, i.e. get
Time and date: 06/03/14 10:24:09
instead of
Time and date:
06/03/14 10:24:09
As it is now.
Also I’m trying to print out newlines with echo -e “\n” but somehow that isn’t working. Tho I think I’ve gotten that to work in the past.
If someone could please help me fix these minor formatting issues that would be great and appreciated.
Thanks Tim
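The “n” lines in the log hint at one possible cause of the newline problem: if some lines of the script file contain typographic (curly) quotes instead of ASCII double quotes, bash does not treat them as quotes at all. This is an assumption about the script file, not something the thread confirms. A minimal sketch of the difference:

```shell
#!/bin/bash
# Typographic quotes are ordinary characters to bash, so the backslash is
# unquoted: it escapes the n, and the quote characters are printed literally.
curly=$(echo -e “\n”)

# With ASCII double quotes the backslash survives, and echo -e turns \n into
# a newline (in addition to the newline echo always appends).
straight_lines=$(echo -e "\n" | wc -l | tr -d ' ')

printf 'curly quotes printed: %s\n' "$curly"
printf 'ASCII quotes printed %s newline(s)\n' "$straight_lines"
```

Retyping the quotes in a plain-text editor would be one way to rule this out.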
I strongly suggest that if you are writing a program, you use a better language. bash is really painful for this sort of task. Here’s a Perl script that queries /server-status:
http://www.perlmonks.org/?node_id=465848
And modify to taste.
--keith
On 03-06-14 16:32, Tim Dunphy wrote:
printf "Time and date: $(/bin/date +"%D %H:%M:%S")\n" solves both problems here
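A small variation on the printf suggestion, offered as a sketch: keeping the command output out of the format string and passing it as an argument means any % characters in the data are never interpreted as format directives. The variable name stamp_line is only for illustration.

```shell
#!/bin/bash
# The format string is a fixed literal; the date is substituted through %s.
stamp_line=$(printf 'Time and date: %s\n' "$(date +'%D %H:%M:%S')")

echo "$stamp_line"
```

For date output the two forms behave the same; the difference only matters for text that may contain a percent sign.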
Ok this is what I came up with:
#!/bin/bash
# this script parses mod_status to see which hosts are getting the most requests
while true
do
echo "Time and date: $(/bin/date +"%D %H:%M:%S")" >> /tmp/apache_request_log
echo "hostname: $(/bin/hostname -f)\n" >> /tmp/apache_request_log
echo "host ip: $(/bin/hostname -i)" >> /tmp/apache_request_log
echo "Server Stats: $(/usr/bin/GET `hostname -f`/server-status/?auto | /bin/egrep -i 'kbytes')" >> /tmp/apache_request_log
echo "Server Stats: $(/usr/bin/GET `hostname -f`/server-status/?auto | /bin/egrep -i 'ReqPerSec')" >> /tmp/apache_request_log
echo -e "\n"
sleep 60
done
Still can’t get the echo -e “\n” statement to print a new line for some reason. Other than that I’m good. And thanks for everyone’s help!
Tim
Look at this code structure:
while true
do
  {
    echo Time and date: $(date +"%D %H:%M:%S")
    echo Hostname: $(hostname -f)
    echo Hostname IP: $(hostname -i)
    …
    …
    # Leave two blank lines
    echo
    echo
  } >> /tmp/apache_request_log
  sleep 60
done
Note how we’re only doing one redirect; this makes the code easier to read, less error-prone, and more efficient.
That’s one of the mistakes; you forgot the >> /tmp/apache_request_log on the echo line. But “echo” on its own without anything else leaves a blank line.
The next clever bit is to not call “GET” twice; why make apache do twice the work? Call it once and store the results in a variable:
stat=$(GET $(hostname -f)/server-status/?auto)
echo Server Stats: $(echo "$stat" | grep -i kbytes)
echo Server Stats: $(echo "$stat" | grep -i ReqPerSec)
(You can get even more clever, but that’s a little more involved; we’ll start with some basics :-))
So we end up with something like:
#!/bin/bash
# These never change…
name=$(hostname -f)
ip=$(hostname -i)
# Once a minute, record some stats
while true
do
  {
    echo Time and date: $(date +"%D %H:%M:%S")
    echo Hostname: $name
    echo Hostname IP: $ip
    stat=$(GET $name/server-status/?auto)
    echo Server Stats: $(echo "$stat" | grep -i kbytes)
    echo Server Stats: $(echo "$stat" | grep -i ReqPerSec)
    echo
    echo
  } >> /tmp/apache_request_log
  sleep 60
done
‘echo’ on its own should print a new line. If you want two, why not just use two echo lines?
Also, you are redirecting everything else to the apache_request_log except for the last echo line. Is the problem simply that you forgot to redirect that to the log file?
echo -e "\n" >> /tmp/apache_request_log
--
Bowie
Tim Dunphy writes:
You could add a counter to the script and send the log off by email after polling so many times, or on each poll, tail the file, omitting so many lines as to remove the first entry.
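The tail idea could be sketched roughly as follows. The helper name trim_log and the line counts are made up for illustration, and the demo runs on a scratch file rather than the real log.

```shell
#!/bin/bash
# Hypothetical helper: keep only the newest max_lines lines of a log file.
trim_log() {
    local log=$1 max_lines=$2 tmp
    tmp=$(mktemp) || return 1
    # Copy the tail aside, then write it back over the original so the
    # file keeps its name and permissions.
    tail -n "$max_lines" "$log" > "$tmp" && cat "$tmp" > "$log"
    rm -f "$tmp"
}

# Demo on a scratch file instead of the real log.
demo_log=$(mktemp)
seq 1 100 > "$demo_log"
trim_log "$demo_log" 10
wc -l < "$demo_log"
```

Inside the polling loop, a call such as trim_log /tmp/apache_request_log 1000 after each append would cap the log at the most recent 1000 lines; the limit is an arbitrary choice.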