Convert Webserver.log to CSV

A security guy at a campus wanted our web server log file in CSV format. The original file has lines that look something like:

machine.usg.edu: webserver.log13646:2010-11-30        11:08:32        0.0010  999.999.999.999    b7tPM1hTgGYMn90bLTM1    200     GET     /webct/urw/lc987189066271.tp1333853785371/blank.html    -       262     "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4" username:0:0

Turns out I only need three sed edits to make it look the way I want:

sed 's|:2009-|,2009-|g' testfile.txt | sed 's|\t|,|g' | sed 's|: |,|g'

The first converts the colon between the end of the file name and the year into a comma. The pattern is tied to the year, so a log from 2010 like the sample line above needs :2010- instead of :2009-. The second converts all the tabs into commas, and the last changes the colon-space between the host name and webserver.log into a comma.
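The same three edits can also run in a single sed process. This is only a sketch: log_to_csv is a made-up function name, the year is passed in as an argument instead of being hard-coded, and the tab is built with printf because \t inside a sed pattern is a GNU extension that other seds may not understand.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original one-liner):
# reads log lines on stdin and writes comma-separated lines on stdout.
# $1 is the four-digit year that appears right after the log file name.
log_to_csv() {
    year=$1
    tab=$(printf '\t')   # a literal tab character for the second expression

    # 1. colon between the log file name and the year -> comma
    # 2. every tab -> comma
    # 3. colon-space between the host name and the log name -> comma
    sed -e "s|:${year}-|,${year}-|g" \
        -e "s|${tab}|,|g" \
        -e 's|: |,|g'
}
```

Usage would be something like: log_to_csv 2010 < testfile.txt > testfile.csv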

Easy enough. That line from the web server log now looks like:

machine.usg.edu,webserver.log13646,2010-11-30,11:08:32,0.0010,999.999.999.999,b7tPM1hTgGYMn90bLTM1,200,GET,/webct/urw/lc987189066271.tp1333853785371/blank.html,-,262,"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4",username:0:0

I love regular expressions.

I have a feeling I’ll need to make a primer on the fields for this guy too. 🙁 Here are the columns, in order:

Hostname, Log Name, Date, Time, Seconds to Process, Load Balancer IP, Session ID, HTTP Response Code, HTTP Method, URI, URI Parameters, Bytes Returned, User Agent, Username:Transactions Read:Transactions Written
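
If the primer should travel with the file, one option is to put those column names on the first line of the CSV. A rough sketch, assuming a made-up helper called add_header; the names are written without spaces after the commas so that parsers which keep leading blanks don't carry them into the column names.

```shell
#!/bin/sh
# Hypothetical helper: writes the column names as the first line,
# then passes the CSV lines from stdin through unchanged.
add_header() {
    printf '%s\n' 'Hostname,Log Name,Date,Time,Seconds to Process,Load Balancer IP,Session ID,HTTP Response Code,HTTP Method,URI,URI Parameters,Bytes Returned,User Agent,Username:Transactions Read:Transactions Written'
    cat
}
```

For example: add_header < testfile.csv > testfile_with_header.csv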